A lightweight reinforcement learning framework for coffee optimization using transformer models and PEFT (Parameter-Efficient Fine-Tuning). Train AI models to optimize coffee brewing parameters through reinforcement learning and compare their performance.
Preview of the Gradio web interface and Experiment Lab for fine-tuning coffee parameters
- 🤖 Reinforcement Learning: PPO-based training with TRL library
- 📊 Model Comparison: Comprehensive performance comparison between models
- 🔄 Batch Training: Automated weekly batch training with data accumulation
- 🤗 Hugging Face Integration: Model versioning and sharing via HF Hub
- 📈 Performance Tracking: Detailed metrics and reward calculation
- 🧪 Comprehensive Testing: Full test suite with 100% coverage for core components
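The reward used for Performance Tracking is project-specific; purely as an illustration, here is a hypothetical shaped reward that scores brewing parameters by their distance from target values (the targets and scaling below are assumptions, not the project's actual values):

```python
def brew_reward(temp_c: float, brew_time_s: float, ratio: float) -> float:
    """Hypothetical reward: 1.0 at the assumed targets, decaying with distance.

    Assumed targets for illustration: 93 degC water, 150 s brew time,
    1:16 coffee-to-water ratio.
    """
    # Normalized squared errors against the assumed targets
    temp_err = ((temp_c - 93.0) / 10.0) ** 2
    time_err = ((brew_time_s - 150.0) / 60.0) ** 2
    ratio_err = ((ratio - 16.0) / 4.0) ** 2
    # Reward in (0, 1], highest when all parameters hit their targets
    return 1.0 / (1.0 + temp_err + time_err + ratio_err)
```

A smooth, bounded reward like this gives PPO a useful gradient signal everywhere in parameter space, unlike a hard pass/fail score.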
This project uses UV for fast Python package management.
```bash
# Clone the repository
git clone <repository-url>
cd CoffeRL

# Install dependencies with UV
uv sync

# Check platform compatibility
uv run python config/platform_config.py

# Launch the application
uv run python main.py
```

For full feature support, including quantization, use Docker:
```bash
# Quick start with Docker Compose
docker-compose --profile prod up

# With GPU support (requires nvidia-docker)
docker-compose --profile gpu up
```

Launch the Gradio web interface for interactive coffee optimization:
```bash
uv run python main.py
```

This provides:
- Chatbot Interface: Interactive coffee brewing assistant
- Experiment Lab: Fine-tune coffee parameters and compare results
- Model Comparison: Visual comparison between different model versions
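Conceptually, the Experiment Lab sweeps candidate brewing parameters and ranks them by a scoring function. A minimal sketch of that idea (the grid values and scorer here are illustrative assumptions, not the app's actual logic):

```python
from itertools import product

def sweep(score, temps, times, ratios):
    """Score every parameter combination and return them best-first."""
    candidates = [
        {"temp_c": t, "brew_time_s": s, "ratio": r}
        for t, s, r in product(temps, times, ratios)
    ]
    return sorted(candidates, key=lambda p: score(**p), reverse=True)

# Illustrative scorer: prefer parameters near assumed targets (93 degC, 150 s, 1:16)
def score(temp_c, brew_time_s, ratio):
    return -(abs(temp_c - 93) + abs(brew_time_s - 150) / 10 + abs(ratio - 16))

best = sweep(score, temps=[88, 93, 96], times=[120, 150, 180], ratios=[15, 16, 17])[0]
```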
Compare performance between different model versions:
```bash
# Quick comparison (10 samples for fast testing)
uv run python src/model_comparator.py \
  --model1 checkpoints/batch_training/batch_1 \
  --model2 checkpoints/batch_training/batch_2 \
  --max-samples 10

# Compare HuggingFace Hub models
uv run python src/model_comparator.py \
  --model1 batch-1 --model1-hf \
  --model2 batch-2 --model2-hf \
  --dataset data/processed/coffee_validation_dataset
```

Manage automated training workflows:
```bash
# Check training status
uv run python src/batch_trainer.py status

# Run batch training
uv run python src/batch_trainer.py train --episodes 500

# View training history
uv run python src/batch_trainer.py history
```

Train models with custom parameters:
```bash
# Train with default settings
uv run python src/train_rl.py --episodes 100

# Train with custom dataset
uv run python src/train_rl.py \
  --dataset data/processed/coffee_training_dataset \
  --episodes 200 \
  --save-freq 50
```

**macOS (Apple Silicon)**:

- ✅ Supported: All core ML libraries with MPS acceleration
- ✅ Apple Silicon: Optimized for M1/M2/M3 chips
- ❌ Quantization: Limited support (use Docker for full features)
**Linux (CUDA / Docker)**:

- ✅ Full Support: All libraries including quantization
- ✅ CUDA: GPU acceleration supported
- ✅ Quantization: 4-bit and 8-bit model quantization available
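A platform check like `config/platform_config.py` usually boils down to selecting the best available compute device. A hedged sketch of that logic (the function name is ours, and it degrades gracefully to CPU when PyTorch is not installed):

```python
def pick_device() -> str:
    """Return 'cuda', 'mps', or 'cpu', preferring GPU backends when available."""
    try:
        import torch
    except ImportError:
        return "cpu"  # torch not installed: fall back to CPU
    if torch.cuda.is_available():
        return "cuda"  # NVIDIA GPU (the Linux/Docker path)
    if getattr(torch.backends, "mps", None) and torch.backends.mps.is_available():
        return "mps"  # Apple Silicon acceleration on M1/M2/M3
    return "cpu"
```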
- Contributing Guide - Development setup and guidelines
- CLI Tools Reference - Comprehensive CLI documentation
- Pipeline Documentation - Training and evaluation pipelines
- Deployment Guide - Production deployment instructions
- QLoRA Setup - Quantization and fine-tuning setup
```bash
# Start with a quick training run
uv run python src/train_rl.py --episodes 10

# Compare with baseline
uv run python src/model_comparator.py \
  --model1 checkpoints/batch_training/batch_1 \
  --model2 checkpoints/batch_training/batch_2 \
  --max-samples 10
```

- Launch the application: `uv run python main.py`
- Open your browser to the displayed URL
- Use the Chatbot tab for interactive coffee advice
- Use the Experiment Lab tab to fine-tune brewing parameters
- Compare different models and their recommendations
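Model comparison ultimately reduces to aggregating a metric per model over a shared evaluation set. A simplified, hypothetical sketch of that bookkeeping (the project's actual comparator in `src/model_comparator.py` is more involved):

```python
from statistics import mean

def compare(rewards_a, rewards_b, name_a="model1", name_b="model2"):
    """Compare two models by mean reward over the same evaluation samples."""
    mean_a, mean_b = mean(rewards_a), mean(rewards_b)
    winner = name_a if mean_a >= mean_b else name_b
    return {"means": {name_a: mean_a, name_b: mean_b}, "winner": winner}

# Example: per-sample rewards from two hypothetical checkpoints
result = compare([0.6, 0.7, 0.8], [0.5, 0.6, 0.7], "batch_1", "batch_2")
```

Evaluating both models on the same samples (as the `--dataset` flag implies) keeps the comparison paired, so differences reflect the models rather than the data split.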
We welcome contributions! Please see our Contributing Guide for:
- Development setup instructions
- Code style guidelines
- Testing requirements
- Pull request process
- Issues: Report bugs and request features via GitHub Issues
- Documentation: Check the `docs/` directory for detailed guides
- Platform Issues: Use Docker for full feature compatibility
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.


