Skip to content

mkrum/stonefish

Repository files navigation

🪨🐟 stonefish: Transformers (and more) For Chess

This repository contains code for training, evaluating, and deploying deep learning models for playing chess.

  • beat random
  • beat stockfish at depth 1
  • beat myself (write up)
  • beat someone good
  • beat everyone

Search algorithms are considered cheating.

Quick Start

# Train locally
just train-local configs/train_resnet_small.yml

The code can also be deployed to a GKE cluster.

# Deploy to k8s
just up
just train-deploy

See the justfile for more details.

Setup

# Clone and install
git clone https://github.com/mkrum/stonefish.git
cd stonefish
pip install -e .[dev,test]

There is also a dockerfile, which you can build with:

just build-local

Training

# Local training
just train-local configs/train_resnet_big.yml

# Local with custom parameters
python -m stonefish.train configs/train_convnet_small.yml output_dir

# Distributed training (4 GPUs)
torchrun --standalone --nproc_per_node=4 -m stonefish.train configs/train_resnet_big.yml output_dir

Evaluation

# Model vs Random
python -m stonefish.eval --agent1 model:configs/train_resnet_big.yml:checkpoint.pth --agent2 random --games 100

# Model vs Stockfish
python -m stonefish.eval --agent1 model:configs/train_convnet_big.yml:model.pth --agent2 stockfish:depth=3 --games 50

# Model vs Model
python -m stonefish.eval --agent1 model:config1.yml:model1.pth --agent2 model:config2.yml:model2.pth --games 200

Benchmarking

# Benchmark model throughput
just benchmark configs/train_resnet_big.yml

# With specific batch sizes
python stonefish/benchmark.py configs/train_convnet_big.yml --batch-sizes 1 8 32 128 --device cuda

Kubernetes

# Full deployment
just up                  # Create cluster
just train-deploy        # Start training
kubectl logs -f job/stonefish-training  # Watch logs
just cleanup            # Tear down

# Debug shell
just shell

About

chess with transformers

Topics

Resources

Stars

Watchers

Forks

Contributors