RL Puzzle Agents

This codebase supports exploration and practice with reinforcement learning (RL) in structured, logic-based environments. The goal is to build modular, extensible agents that learn to solve grid-based puzzles through placement rather than dynamic interaction.

These environments are all placement games: once an object is placed, it stays. Like placing a queen in 8 Queens or a number in Sudoku: no backsies. The challenge is to learn effective or optimal placement strategies from reward signals rather than handcrafted rules.
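To make the "placed objects stay" rule concrete, here is a minimal sketch of what a placement environment could look like, using queens on a small board. The class name `QueensPlacementEnv`, its `reset`/`step`/`legal_actions` methods, and the reward values are all hypothetical illustrations, not the interfaces used in this repository.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class QueensPlacementEnv:
    """Hypothetical placement environment: one queen per step, no removals."""

    size: int = 4
    queens: list = field(default_factory=list)

    def reset(self) -> tuple:
        """Clear the board and return the (empty) state."""
        self.queens = []
        return tuple(self.queens)

    def legal_actions(self) -> list:
        """Every empty cell; placed queens are never moved or removed."""
        occupied = set(self.queens)
        return [
            (r, c)
            for r in range(self.size)
            for c in range(self.size)
            if (r, c) not in occupied
        ]

    @staticmethod
    def _attacks(a, b) -> bool:
        """True if two cells share a row, a column, or a diagonal."""
        return a[0] == b[0] or a[1] == b[1] or abs(a[0] - b[0]) == abs(a[1] - b[1])

    def step(self, action):
        """Place a queen and return (state, reward, done).

        Assumed reward scheme: -1 and episode end on a conflict,
        +1 when `size` queens are placed without conflict, 0 otherwise.
        """
        if action in self.queens:
            raise ValueError("cell already occupied; placements are permanent")
        conflict = any(self._attacks(action, q) for q in self.queens)
        self.queens.append(action)
        if conflict:
            return tuple(self.queens), -1.0, True
        solved = len(self.queens) == self.size
        return tuple(self.queens), (1.0 if solved else 0.0), solved
```

An agent would call `reset()` once per episode and then pick from `legal_actions()` until `done` is true. There is deliberately no "undo" action, so a bad placement can only be learned from, not repaired.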

Each game includes multiple difficulty stages. For example, a 4×4 Sudoku with 12 given numbers may be trivial, but one with only 2 cannot have a unique solution, since a 4×4 grid needs at least 4 givens to pin one down. Agents should learn to generalize across these variants.
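One way to picture difficulty stages is as the same solved grid with progressively fewer givens. The sketch below is an illustration under stated assumptions: `SOLVED_4X4`, `make_puzzle`, and `STAGES` are hypothetical names, the solved grid is one fixed example, and random masking does not guarantee that the resulting puzzle has a unique solution.

```python
import random

# One valid 4x4 Sudoku solution, used here as a fixed example (assumption).
SOLVED_4X4 = [
    [1, 2, 3, 4],
    [3, 4, 1, 2],
    [2, 1, 4, 3],
    [4, 3, 2, 1],
]

# Hypothetical curriculum: number of givens per stage, easiest first.
STAGES = [12, 8, 6, 4]


def make_puzzle(givens: int, rng: random.Random) -> list:
    """Keep `givens` randomly chosen cells of the solved grid; 0 marks an empty cell."""
    if not 0 <= givens <= 16:
        raise ValueError("givens must be between 0 and 16 for a 4x4 grid")
    cells = [(r, c) for r in range(4) for c in range(4)]
    keep = set(rng.sample(cells, givens))
    return [
        [SOLVED_4X4[r][c] if (r, c) in keep else 0 for c in range(4)]
        for r in range(4)
    ]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so runs are repeatable
    for givens in STAGES:
        print(givens, make_puzzle(givens, rng))
```

Training on the stages in order and evaluating on all of them would be one simple way to check whether an agent generalizes across variants rather than memorizing a single layout.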

The aim is not to build one universal agent, but rather to understand what it takes for an agent to solve each class of puzzle effectively.

🧩 Games

Gridworld (RL fundamentals; not a placement game, included to learn the basics)
Tic-Tac-Toe (strategy and rewards)
8 Queens (search and constraints)
Tic-Tac-Logic (rule-driven grids)
Small Sudoku (structured logic with partial observability)
Large Sudoku (full challenge)

❓ Open Questions

Can we understand the rules the agent has learned, post-training?
Can we interpret the agent’s steps to understand how it solves puzzles?
How well are the agents doing — and by what metrics?
In what ways do human strategies differ from learned strategies?
How does reward timing (immediate vs delayed feedback) impact learning?
How do we know when an agent has "finished" learning?
Are these agents useful — or just a fun way to replicate solved problems?
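The reward-timing question above can be made concrete by scoring the same sequence of queen placements in two ways. This is a minimal sketch, not the reward scheme used in this repository: the function names `immediate_rewards` and `delayed_rewards` and the +1/-1/0 values are assumptions chosen for illustration.

```python
from typing import List, Tuple

Cell = Tuple[int, int]


def attacks(a: Cell, b: Cell) -> bool:
    """True if two cells share a row, a column, or a diagonal."""
    return a[0] == b[0] or a[1] == b[1] or abs(a[0] - b[0]) == abs(a[1] - b[1])


def immediate_rewards(placements: List[Cell]) -> List[float]:
    """One reward per placement: +1 if it conflicts with no earlier queen, else -1."""
    rewards = []
    for i, cell in enumerate(placements):
        ok = not any(attacks(cell, earlier) for earlier in placements[:i])
        rewards.append(1.0 if ok else -1.0)
    return rewards


def delayed_rewards(placements: List[Cell]) -> List[float]:
    """Zero until the last placement, then +1 for a conflict-free board, else -1."""
    if not placements:
        return []
    clean = all(r > 0 for r in immediate_rewards(placements))
    return [0.0] * (len(placements) - 1) + [1.0 if clean else -1.0]
```

With immediate feedback the agent learns which individual placement caused a conflict. With delayed feedback it only sees the outcome of the whole board and has to work out credit assignment itself, which is the trade-off the question asks about.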

How to run

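Create a virtual environment and install the requirements:

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Then run a module from the repository root, where <file> is the module's dotted path without the .py extension:

python3 -m <file>
# or
PYTHONPATH=. python3 -m <file>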

jasminenoack/Reinforcement-Learning

Folders and files

Latest commit

History

Repository files navigation

RL Puzzle Agents

🧩 Games

❓ Open Questions

How to run

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 3

Uh oh!

Languages

Packages