Batchman Usage Guide

Quick Start

1. Basic Processing

Run the batch processor on your input file:

python main.py

What you'll see:

================================================================================
🚀 BATCHMAN - Ollama Batch Processor
================================================================================
📋 Model: gemma3:1b
🔗 Server: http://localhost:11434
👷 Workers: 5
⏱️  Timeout: 120s
================================================================================
📖 Loading prompt and input...
✅ Loaded 17 lines to process
================================================================================
⚡ Processing with 5 parallel workers...

┌─ Progress:  58.82% [10/17] │ Avg Response: 1.85s │ Elapsed: 0:00:18 │ ETA: 0:00:13 ┐

Output:

output.jsonl - Your results (one JSON object per line, in the same order as the input lines)
errors.log - Any errors that occurred during processing

2. Finding Optimal Performance

Run the benchmark to test different worker counts:

python benchmark.py

Interactive prompts:

Worker counts to test (comma-separated, default=1,3,5,10,15,20):
Test with all input lines? (y/n, default=n for faster testing): n
How many lines to test with? (default=10): 10

Results:

================================================================================
📊 BENCHMARK RESULTS
================================================================================
Workers    Total Time      Throughput           Avg Time        Success
--------------------------------------------------------------------------------
1          0:00:25        0.40 items/s         2.50s           10/10
3          0:00:12        0.83 items/s         1.20s           10/10
5          0:00:08        1.25 items/s         0.80s           10/10      ⭐ BEST
10         0:00:09        1.11 items/s         0.90s           10/10
================================================================================

💡 RECOMMENDATION:
   Set PARALLEL_WORKERS = 5 in config.py
   Expected throughput: 1.25 items/second
   Speedup vs 1 worker: 3.13x faster
   Parallel efficiency: 62.5%
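
The recommendation figures can be reproduced from the table. The sketch below assumes that speedup is the throughput ratio against the 1-worker run and that parallel efficiency is speedup divided by the worker count; this matches the sample numbers, but benchmark.py may compute them differently.

```python
# Worked arithmetic for the sample benchmark table above.
# Assumption: speedup = throughput(N workers) / throughput(1 worker)
#             efficiency = speedup / N

def speedup(throughput_n: float, throughput_1: float) -> float:
    return throughput_n / throughput_1

def parallel_efficiency(throughput_n: float, throughput_1: float, workers: int) -> float:
    return speedup(throughput_n, throughput_1) / workers

s = speedup(1.25, 0.40)                         # 5 workers vs 1 worker
e = parallel_efficiency(1.25, 0.40, workers=5)
print(f"Speedup: {s:.3f}x, efficiency: {e:.1%}")
```

An efficiency well below 100% is expected here: the workers share one Ollama server, so adding workers past the sweet spot (10 in the sample) only adds contention.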

Configuration

config.py Settings

# Model Configuration
OLLAMA_MODEL = "gemma3:1b"              # Your Ollama model
OLLAMA_BASE_URL = "http://localhost:11434"
OLLAMA_CONTEXT = 4096                   # Context window size

# Performance
PARALLEL_WORKERS = 5                    # Concurrent workers (optimize with benchmark)
REQUEST_TIMEOUT = 120                   # Max seconds per request

# Files
PROMPT_FILE = "prompt.txt"              # Your prompt template
INPUT_FILE = "input.txt"                # Input data (one per line)
OUTPUT_FILE = "output.jsonl"            # Output results
ERROR_FILE = "errors.log"               # Error log
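
This guide does not show how Batchman talks to Ollama, so the following is only a sketch of how these settings could map onto a request to Ollama's `/api/generate` endpoint. The helper names `build_request` and `generate` are hypothetical, and passing OLLAMA_CONTEXT as the `num_ctx` option is an assumption.

```python
import json
import urllib.request

# Values mirror the config.py settings above.
OLLAMA_MODEL = "gemma3:1b"
OLLAMA_BASE_URL = "http://localhost:11434"
OLLAMA_CONTEXT = 4096
REQUEST_TIMEOUT = 120

def build_request(prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {
        "model": OLLAMA_MODEL,
        "prompt": prompt,
        "stream": False,                         # one JSON reply instead of chunks
        "options": {"num_ctx": OLLAMA_CONTEXT},  # assumed mapping of OLLAMA_CONTEXT
    }
    return urllib.request.Request(
        f"{OLLAMA_BASE_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(prompt: str) -> str:
    """Send the prompt and return the model's text (needs a running Ollama server)."""
    with urllib.request.urlopen(build_request(prompt), timeout=REQUEST_TIMEOUT) as resp:
        return json.loads(resp.read())["response"]
```

Only `generate` touches the network; `build_request` can be inspected offline to check what a changed setting would send.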

Prompt Template

Your prompt.txt should contain the {INPUT} placeholder, which is replaced with each line from the input file:

You are an expert music classifier. You are given a file path.
Your job is to determine the metadata correctly from the path.
You will now return a valid JSON in the format:
{
    "artist": "",
    "album": "",
    "year": "",
    "track_number": "",
    "track_name": ""
}
Now tell me the JSON for this file: {INPUT}
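
A minimal sketch of how the placeholder could be filled, assuming plain string replacement (Batchman's own substitution code is not shown in this guide, and `build_prompt` is a hypothetical name). The template contains literal JSON braces, so `str.format` would not work on it as written.

```python
# Hypothetical helper illustrating {INPUT} substitution.
def build_prompt(template: str, item: str) -> str:
    # str.replace leaves the literal JSON braces in the template untouched,
    # whereas template.format(INPUT=item) would raise on them.
    return template.replace("{INPUT}", item)

template = 'Return JSON like {"artist": ""} for this file: {INPUT}'
print(build_prompt(template, r"C:\Users\Sam\Music\2Pac\01 - Letter To The President.opus"))
```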

Input Format

input.txt - one item per line:

C:\Users\Sam\Music\10cc\CD14\12 24 Hours (Edit).opus
C:\Users\Sam\Music\2Pac\01 - Letter To The President.opus
...

Performance Tuning

Worker Count Guidelines

| System Type | Recommended Workers | Notes |
|---|---|---|
| Laptop (4-8 cores) | 3-5 | Balance speed & resources |
| Desktop (8-16 cores) | 5-10 | Good parallelization |
| Server (16+ cores) | 10-20+ | Maximum throughput |

Model Size Impact

| Model Size | Recommended Workers | Why |
|---|---|---|
| Small (1b-3b) | 10-20 | Fast inference, high concurrency |
| Medium (7b-13b) | 5-10 | Balance memory & speed |
| Large (30b+) | 2-5 | Memory intensive |

Best Practices

Run benchmark first: Find your system's optimal worker count
```
python benchmark.py
```
Monitor resources: Watch CPU, RAM, and GPU usage while processing
```
# Windows: Task Manager
# Linux/macOS: htop or top
```
Start conservative: Begin with 5 workers and increase only if the system handles the load well
Test with subset: Use benchmark's line limit feature for quick tests
Adjust based on model: Smaller models = more workers, larger models = fewer workers

Output Format

output.jsonl

JSONL format - one JSON object per line:

{"artist": "10cc", "album": "20th Anniversary", "year": "", "track_number": "12", "track_name": "24 Hours (Edit)"}
{"artist": "2Pac", "album": "Still I Rise", "year": "2006", "track_number": "01", "track_name": "Letter To The President"}

Line correspondence: Line N in output matches line N in input (even if processing was parallel)
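
One common way to keep output order stable under parallel execution is to remember each task's input index and write its result into the matching slot. The sketch below illustrates that idea with a stand-in for the model call; it is an assumption about the technique, not Batchman's actual code, and `fake_llm_call` and `process_in_order` are hypothetical names.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def fake_llm_call(line: str) -> str:
    """Stand-in for the real model call: sleeps a random time, then answers."""
    time.sleep(random.uniform(0.0, 0.05))
    return line.upper()

def process_in_order(lines: list[str], workers: int = 5) -> list[str]:
    results = [None] * len(lines)            # one slot per input line
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Remember which input index each future belongs to.
        futures = {pool.submit(fake_llm_call, line): i for i, line in enumerate(lines)}
        for future in as_completed(futures):  # completion order is arbitrary
            results[futures[future]] = future.result()
    return results

print(process_in_order(["a", "b", "c", "d"]))  # ['A', 'B', 'C', 'D']
```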

errors.log

Errors are logged with context:

2025-11-13 14:23:45,123 - Line 5: JSON Parse Error - Expecting value: line 1 column 1
Input: C:\Users\Sam\Music\BadPath\file.opus
Response: I cannot determine the metadata from this path.
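
An entry shaped like the one above could be produced with Python's standard logging module. This is a sketch under assumptions: the logger name and the `parse_or_log` helper are hypothetical, and the format string is chosen only because it reproduces the timestamp prefix in the sample.

```python
import json
import logging

# Assumed logger setup; "%(asctime)s - %(message)s" yields the
# "2025-11-13 14:23:45,123 - ..." prefix shown in the sample entry.
error_logger = logging.getLogger("batchman.errors")   # hypothetical logger name
handler = logging.FileHandler("errors.log", encoding="utf-8", delay=True)
handler.setFormatter(logging.Formatter("%(asctime)s - %(message)s"))
error_logger.addHandler(handler)
error_logger.setLevel(logging.ERROR)

def parse_or_log(line_no: int, item: str, response: str):
    """Return the parsed JSON value, or log the failure with context and return None."""
    try:
        return json.loads(response)
    except json.JSONDecodeError as exc:
        error_logger.error(
            "Line %d: JSON Parse Error - %s\nInput: %s\nResponse: %s",
            line_no, exc, item, response,
        )
        return None
```

Logging the raw response next to the input is what makes the file useful for prompt tuning: it shows what the model said instead of JSON.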

Progress Metrics Explained

┌─ Progress:  58.82% [10/17] │ Avg Response: 1.85s │ Elapsed: 0:00:18 │ ETA: 0:00:13 ┐

Progress: Percentage complete
[10/17]: Completed items / total items
Avg Response: Average time per LLM call (helpful for tuning)
Elapsed: Time since start
ETA: Estimated time remaining
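
The percentage and ETA in the sample line are consistent with a simple linear estimate. The sketch below assumes that remaining items take as long on average as the finished ones; the real progress bar may smooth its estimate differently, and `progress_metrics` is a hypothetical name.

```python
from datetime import timedelta

def progress_metrics(done: int, total: int, elapsed_s: float):
    """Return (percent complete, estimated seconds remaining)."""
    percent = 100.0 * done / total
    # Linear estimate: average wall-clock time per finished item times items left.
    eta_s = elapsed_s / done * (total - done) if done else float("inf")
    return percent, eta_s

percent, eta_s = progress_metrics(done=10, total=17, elapsed_s=18)
print(f"{percent:.2f}% ETA: {timedelta(seconds=round(eta_s))}")  # 58.82% ETA: 0:00:13
```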

Common Issues & Solutions

Issue: Connection Refused

Problem: Can't connect to Ollama

Error: Connection refused to http://localhost:11434

Solution:

Start Ollama: ollama serve
Verify it's running: ollama list
Check OLLAMA_BASE_URL in config.py
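
The same check can be scripted before a long run. This sketch queries Ollama's `/api/tags` endpoint, which returns the locally available models; the function name `ollama_is_up` and the 3-second timeout are illustrative choices.

```python
import urllib.request

def ollama_is_up(base_url: str = "http://localhost:11434", timeout: float = 3.0) -> bool:
    """Return True if the Ollama HTTP API answers at base_url."""
    try:
        # /api/tags lists local models, the same data `ollama list` shows.
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:   # covers URLError, connection refused, and timeouts
        return False

if not ollama_is_up():
    print("Ollama is not reachable; start it with `ollama serve`.")
```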

Issue: Slow Performance

Problem: Processing is taking too long

Solutions:

Run benchmark to find optimal workers
Use a smaller model (e.g., gemma3:1b instead of gemma3:4b)
Increase PARALLEL_WORKERS (if CPU has headroom)
Check if Ollama is using GPU acceleration

Issue: JSON Parsing Errors

Problem: Many errors in errors.log

Solutions:

Improve prompt clarity - explicitly request JSON
Add example in prompt
Check errors.log to see what LLM is returning
Try different model (some are better at following formats)
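
If the model tends to wrap its JSON in extra prose, a more lenient parser can rescue many responses. The sketch below is one possible approach, not part of Batchman as described here: it scans for the first substring that decodes as a JSON object. The name `extract_json` is hypothetical.

```python
import json

def extract_json(response: str):
    """Return the first JSON object found in response, or None."""
    decoder = json.JSONDecoder()
    start = response.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever text follows it.
            obj, _ = decoder.raw_decode(response, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass
        start = response.find("{", start + 1)
    return None

reply = 'Sure! Here is the JSON: {"artist": "2Pac", "year": ""} Hope that helps.'
print(extract_json(reply))  # {'artist': '2Pac', 'year': ''}
```

Lenient parsing hides prompt problems, so it is worth fixing the prompt first and using this only as a fallback.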

Issue: Memory Errors

Problem: System runs out of memory

Solutions:

Reduce PARALLEL_WORKERS
Use smaller model
Increase system swap space
Process input in batches
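
For the last point, a large input file can be split into smaller files ahead of time. This is a sketch with a hypothetical helper and file-naming scheme; each batch file would then need to be set as INPUT_FILE (or copied to input.txt) in turn, with the output file renamed between runs so results are not overwritten.

```python
from itertools import islice
from pathlib import Path

def split_into_batches(input_path: str, batch_size: int) -> list[Path]:
    """Write input_path's lines into numbered batch files and return their paths."""
    src = Path(input_path)
    batch_paths = []
    with src.open(encoding="utf-8") as f:
        batch_no = 1
        while True:
            chunk = list(islice(f, batch_size))  # next batch_size lines, fewer at EOF
            if not chunk:
                break
            out = src.with_name(f"{src.stem}_batch{batch_no:03d}{src.suffix}")
            out.write_text("".join(chunk), encoding="utf-8")
            batch_paths.append(out)
            batch_no += 1
    return batch_paths
```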

Example Workflows

Workflow 1: First-Time Setup

# 1. Install dependencies
pip install -r requirements.txt

# 2. Pull your model
ollama pull gemma3:1b

# 3. Prepare your data
# Edit input.txt - add your data
# Edit prompt.txt - customize for your task

# 4. Find optimal workers
python benchmark.py
# When prompted: test with 10 lines, try workers: 1,5,10,15

# 5. Update config with recommended workers
# Edit config.py: PARALLEL_WORKERS = <recommended>

# 6. Process full dataset
python main.py

Workflow 2: Large Dataset Processing

# 1. Test with sample first
head -n 100 large_input.txt > input.txt
python main.py

# 2. Check results
head output.jsonl
cat errors.log

# 3. If good, process full dataset
cp large_input.txt input.txt
python main.py

# 4. Monitor progress
# Watch the progress bar and ETA
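
The result check in step 2 can also be scripted, which helps on systems without `head` and `cat`. The sketch below counts lines in both files and how many output lines parse as JSON objects. How failed items appear in output.jsonl is not specified in this guide, so treat the counts as a rough health check; `summarize_run` is a hypothetical name.

```python
import json

def summarize_run(input_path: str = "input.txt", output_path: str = "output.jsonl") -> dict:
    """Count input lines, output lines, and output lines that parse as JSON objects."""
    with open(input_path, encoding="utf-8") as f:
        n_input = sum(1 for _ in f)
    n_output = n_valid = 0
    with open(output_path, encoding="utf-8") as f:
        for line in f:
            n_output += 1
            try:
                if isinstance(json.loads(line), dict):
                    n_valid += 1
            except json.JSONDecodeError:
                pass  # blank or malformed line, possibly a failed item
    return {"input": n_input, "output": n_output, "valid_json": n_valid}
```

If `input` and `output` differ, or `valid_json` is much lower than `output`, inspect errors.log before starting the full run.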

Workflow 3: Performance Optimization

# 1. Benchmark current setup
python benchmark.py
# Note the throughput

# 2. Try different model
# Edit config.py: OLLAMA_MODEL = "mistral:7b"
ollama pull mistral:7b
python benchmark.py

# 3. Compare results
# Check benchmark_results.json for both runs

# 4. Choose best config
# Update config.py with winning combination

Tips for Maximum Speed

Use SSD: Faster disk = faster model loading
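Keep model warm: Set the OLLAMA_KEEP_ALIVE environment variable on the Ollama server to a long duration (for example 30m) so the model is not unloaded and reloaded between requests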
GPU acceleration: Ensure Ollama uses GPU if available
Batch input: Process large batches to amortize overhead
Simple prompts: Shorter prompts = faster processing
Optimize workers: Run benchmark to find sweet spot

Advanced: Processing Strategy

For 1,000+ items:

Benchmark with 50 items: Quick test to find optimal workers
Test run with 100 items: Verify accuracy and error rate
Full run: Process complete dataset with optimal settings
Monitor: Watch progress, adjust if needed

For heterogeneous data:

If some inputs take much longer, consider:
- Splitting input by complexity
- Using different worker counts for each batch
- Implementing timeout handling
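
As an illustration of the first option, input lines could be partitioned by length as a rough proxy for complexity. This is only a sketch: the threshold of 120 characters and the name `split_by_length` are arbitrary assumptions. Original line indices are kept because splitting the input would otherwise break the line-N-to-line-N correspondence with the output.

```python
def split_by_length(lines: list[str], threshold: int = 120):
    """Partition lines into (short, long) lists of (original_index, line) pairs."""
    short, long_ = [], []
    for index, line in enumerate(lines):
        # Character count is a crude stand-in for how long the model will take.
        (short if len(line) <= threshold else long_).append((index, line))
    return short, long_

short, long_ = split_by_length(["abc", "x" * 200, "de"], threshold=10)
print(short)                          # [(0, 'abc'), (2, 'de')]
print([i for i, _ in long_])          # [1]
```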

Performance Expectations

Based on typical hardware:

| Setup | Items/sec | Time for 1,000 items |
|---|---|---|
| Laptop + 1b model + 5 workers | 1-2 | 8-17 min |
| Desktop + 7b model + 10 workers | 0.5-1 | 17-33 min |
| Server + 1b model + 20 workers | 3-5 | 3-6 min |

Actual performance varies by hardware, model, and prompt complexity.

Support & Debugging

Enable verbose logging:

Add to top of main.py:

import logging
logging.basicConfig(level=logging.DEBUG)

Check Ollama logs:

# Check the Ollama server logs for errors or warnings

# Linux (systemd service):
journalctl -u ollama

# macOS:
cat ~/.ollama/logs/server.log

# Windows: open server.log under %LOCALAPPDATA%\Ollama

Verify model works:

ollama run gemma3:1b "Test prompt"

Test with minimal workers:

# config.py
PARALLEL_WORKERS = 1  # Simplifies debugging

Summary

✅ Use python main.py for normal processing
✅ Use python benchmark.py to optimize performance
✅ Start with 5 workers, adjust based on benchmark
✅ Monitor progress bar for live feedback
✅ Check errors.log if issues occur
✅ Line numbers always match between input and output

Happy batch processing! 🚀
