
Fix hardcoded TinyStories data path in train_large/train_large_ane #29

Merged
maderix merged 1 commit into maderix:main from nabbilkhan:contrib/fix-training-data-paths on Mar 4, 2026


Contributor

@nabbilkhan nabbilkhan commented Mar 3, 2026

Why I worked on this

First, thank you for building this project. ANE is honestly awesome, and I’m actively using it in real workflows.

I’m running these training pipelines as part of my Open Claw agent environment across multiple Apple machines and different launch contexts (manual runs, scripted runs, and restart-driven runs). In that setup, I kept hitting the same issue: the static trainers expected token data at a fixed path.

A run would work in one context, then break in another, and it could also fail after an exec() restart because the path context was not explicit. That created unnecessary friction for real usage and for onboarding other people.

What this PR changes

  • Adds --data PATH to train_large and train_large_ane
  • Replaces hardcoded token data path with a runtime-configurable path
  • Preserves --data across exec() restart so resumed training keeps the same dataset
  • Improves missing-data error text with clear next steps
  • Updates training/README.md with examples and flag documentation
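
For readers unfamiliar with the pattern, the flag handling and restart forwarding listed above can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: the default path `tokens.bin`, the `--resume` flag, and both function names are hypothetical placeholders.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical default; the trainers' real default path is not shown in this PR text. */
#define DEFAULT_DATA_PATH "tokens.bin"

/* Return the value that follows "--data" in argv, or the default when the
 * flag is absent. The loop bound skips a trailing "--data" with no value. */
const char *parse_data_path(int argc, char **argv) {
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], "--data") == 0) {
            return argv[i + 1];
        }
    }
    return DEFAULT_DATA_PATH;
}

/* Fill `out` with a NULL-terminated argument vector for re-executing the
 * trainer, so the restarted process reads the same dataset.
 * "--resume" is a hypothetical flag used only for illustration.
 * Returns the number of arguments written, or -1 if `out` is too small. */
int build_restart_argv(const char *prog, const char *data_path,
                       const char **out, int cap) {
    if (cap < 5) {
        return -1;
    }
    out[0] = prog;
    out[1] = "--resume";
    out[2] = "--data";
    out[3] = data_path;
    out[4] = NULL;
    return 4;
}
```

An array built this way could be handed to `execv(prog, (char *const *)out)`. With `execl()`, the same values would instead be listed as variadic arguments ending in `(char *)NULL`. Either way, the point is that the data path travels in the argument list instead of depending on the working directory of the restarted process.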

Why this helps the community

  • Makes the project easier to run from any working directory
  • Supports custom dataset locations without source edits
  • Prevents restart-related path regressions
  • Reduces first-run setup failures for new contributors
  • Improves compatibility with scripted/automated setups

Validation

I validated this on Apple Silicon macOS with explicit absolute paths:

  • train_large --steps 11 --data <abs-path> (forces restart path)
  • train_large_ane --no-ane-extras --steps 11 --data <abs-path> (same restart validation)
  • Missing-path negative check confirms clear error guidance

Both restart paths resumed correctly and continued reading token data as expected.
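
The missing-path negative check exercises a failure branch that might look something like the sketch below. The function name and the message wording are assumptions made for illustration; the actual error text added by this PR is not quoted here.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Try to open the token file for binary reading. On failure, print the
 * system error plus a hint about the --data flag, then return NULL so the
 * caller can exit cleanly. The message wording is illustrative only. */
FILE *open_token_data(const char *path) {
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        fprintf(stderr,
                "Cannot open token data '%s': %s\n"
                "Pass the file location with --data PATH "
                "(an absolute path avoids working-directory surprises).\n",
                path, strerror(errno));
    }
    return f;
}
```

Reporting both the attempted path and the flag that overrides it lets a first-time user see what was tried and how to correct it without reading the source.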

Personal note

I’m very excited about this project and would love to keep contributing to its growth. I’ve contributed multiple times to Open Claw, and I’m using ANE in serious, practical workflows, so I plan to keep sending real-world fixes and useful benchmark data upstream.

@maderix maderix merged commit 032f866 into maderix:main Mar 4, 2026
Owner

maderix commented Mar 4, 2026

Thanks — threading --data through execl() restarts was the tricky part and you got it right. Merged!

2 participants