
@kywch
Contributor

@kywch kywch commented Nov 24, 2025

The early-stop thresholding is done in 3 phases:

  • Safe zone: no thresholding until a run's cost reaches early_stop_min_cost.
  • Quantile-based thresholding: after early_stop_min_cost, runs that fall below early_stop_quantile (30%) are stopped. The threshold increases as cost increases.
  • Hard zone: above the cost upper bound, a run must exceed the max score so far, or it gets stopped.

This stops the training runs that are frustrating to watch and saves some compute for the sweep.
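
For readers who want the three phases in one place, here is a minimal sketch of the decision rule, not the code in this PR. The function name, its arguments (`min_cost`, `upper_cost`, `peer_scores`), and the choice to compute the quantile over scores of other runs at a comparable cost are all assumptions made for illustration; the actual sweep may derive the rising threshold differently.

```python
import numpy as np


def should_stop(cost, score, peer_scores, max_score,
                min_cost, upper_cost, quantile=0.3):
    """Hypothetical three-phase early-stop check (illustrative only).

    peer_scores: scores of other runs at a comparable cost (assumption).
    min_cost / upper_cost: stand-ins for the safe-zone limit and the
    cost upper bound mentioned above.
    """
    # Phase 1, safe zone: never stop a run this early.
    if cost < min_cost:
        return False

    # Phase 3, hard zone: past the upper bound a run has to beat
    # the best score seen so far.
    if cost >= upper_cost:
        return bool(score <= max_score)

    # Phase 2, quantile zone: stop runs below the chosen quantile
    # of their peers. With no peers there is nothing to compare to.
    if len(peer_scores) == 0:
        return False
    threshold = np.quantile(peer_scores, quantile)
    return bool(score < threshold)
```

Because peer scores usually grow with cost, a quantile taken over comparable-cost peers rises as the run gets more expensive, which is one way to get the "threshold increases as cost increases" behavior.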

Breakout 500 runs:
image

Tetris:
image

@kywch
Contributor Author

kywch commented Nov 26, 2025

Changes after the last discussion:

  • Lowered the threshold for long runs from 1.05 × self.max_score to 0.9 × self.max_score, and compared it against max(metric, running_mean) so that an unlucky dip won't kill a run.
  • Because of the above, there will be more long runs. To limit the cost increase, cost normalization now uses np.quantile(log_c, 0.97) instead of max(log_c), and the default expansion rate was lowered from 0.25 to 0.1.
  • Removed the early_stop_min_cost config; it is now set automatically (around 30% of the upper bound). Completely removing min_allowed_cost (the safe zone) interfered with Tetris, which has a slow take-off, so a lot of runs got killed in 10s.
  • Added support for a percentile-based target metric with metric_distribution = percentile. If it is set, the sweep logit-transforms incoming percentile scores and uses them for GP regression and early stopping. The transformed numbers are visible in the dashboard's environment/early_stop_threshold panel.
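
The numeric pieces of these changes can be sketched as below. This is an illustration under stated assumptions, not the merged code: the function names are made up, the normalization is assumed to be min-max style with the 0.97 quantile standing in for the max, and percentile scores are assumed to arrive in [0, 1].

```python
import numpy as np


def long_run_should_stop(metric, running_mean, max_score, factor=0.9):
    """Hypothetical long-run check: compare the better of the latest
    metric and its running mean against factor * max_score, so one
    unlucky dip does not stop the run."""
    return max(metric, running_mean) < factor * max_score


def normalize_log_cost(costs, q=0.97, eps=1e-6):
    """Assumed min-max style normalization of log cost, using the
    q-quantile instead of the max as the upper reference point."""
    log_c = np.log(np.asarray(costs, dtype=float))
    lo = log_c.min()
    hi = np.quantile(log_c, q)
    # Costs above the quantile map to values greater than 1.
    return (log_c - lo) / (hi - lo + eps)


def logit_percentile(p, eps=1e-6):
    """Logit transform of percentile scores, assumed to lie in [0, 1].
    Clipping keeps exact 0 and 1 finite."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return np.log(p / (1.0 - p))
```

Using a high quantile rather than the max means a few very expensive runs no longer stretch the normalized cost scale, and the logit spreads out scores near 0 and 1 so that the regression sees an unbounded target.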

Protein got sophisticated, so yeah I should probably lay out all the details and try to ablate... later.

image

@kywch
Contributor Author

kywch commented Nov 26, 2025

I have some stuff to PR for tower climb, but will do that separately.

@kywch
Contributor Author

kywch commented Nov 28, 2025

Tower climb, 20k maps, 300 runs. It's getting good hyperparameters.

image

@jsuarez5341 jsuarez5341 merged commit 33661e7 into PufferAI:3.0 Nov 28, 2025
12 checks passed
@kywch kywch deleted the stop-train branch November 28, 2025 22:02