Update Python CI versions by btaba · Pull Request #663 · google/brax

btaba · 2026-03-03T22:31:55Z

Drop 3.10 and add 3.12, 3.13 to matrix.

* Update Python versions in CI to 3.11, 3.12, and 3.13
* Update ci.yml

* Replace unicode escaped characters in ipynb files
  PiperOrigin-RevId: 856196218
  Change-Id: I42b3faac6a8f923078c55fc431b526656c19cbfd
* Add soft-sign clipping (mean_clip_scale) and configurable mean_kernel_init to PolicyModuleWithStd
  Add two new optional parameters to PolicyModuleWithStd:
  - mean_clip_scale: Applies softsign clipping to mean output: scale * (mean / (1 + |mean|))
  - mean_kernel_init: Configurable kernel initializer for the final mean Dense layer
  Also adds policy_network_kwargs to make_ppo_networks for generic pass-through of additional options.
  PiperOrigin-RevId: 862772852
  Change-Id: I063c99d08ce7fc6ce1c5c6b94cb636c7d4cf4bbf
* Flatten params.
  PiperOrigin-RevId: 862801302
  Change-Id: If14899543ab36dc983235ffdd6f5b76493324538
* Add mean_kernel_init_fn to checkpoint serialization keywords
  PiperOrigin-RevId: 862855364
  Change-Id: I0fad9dac158a7be2215b1db0c1ce38931be018f0
* Check more general exception type in assert.
  PiperOrigin-RevId: 866629105
  Change-Id: I6a7e2fb51af09589c106268eba29ba48dce86bb9
* [pmap] Prepare brax ES agent for jax_pmap_shmap_merge=True.
  PiperOrigin-RevId: 867693155
  Change-Id: Ibf11ef57bcd9588d1cd55f1124d2c90b809ed1f3
* Fix checkpoint.save
  PiperOrigin-RevId: 868221106
  Change-Id: I81a24e66f1fac351553bbc79c955c0c06a5b76ae
* Fix for brax es train, which was broken by cl/869007637.
  PiperOrigin-RevId: 869379212
  Change-Id: Idacecde1a97a0612fd475e79a861d0086f252652
* Bump brax to 0.14.1
  PiperOrigin-RevId: 869389467
  Change-Id: Icac5463f0f2c6c4499a724cd70c16f6bec112bd1
* [pmap] Remove `jax.config.pmap_shmap_merge`.
  `jax.config.pmap_shmap_merge` was deprecated as of JAX v0.9.0 in January 2025 and will be removed in JAX v0.10.0 in April 2025.
  PiperOrigin-RevId: 875621716
  Change-Id: I50df76fe83f1f5ee0cc69ced2ddb79b6a27ade97
* Vision update (google#662)
  * Add configurable CNN
  * Fix spatial softmax
  * Fix spatial softmax shape
  * Fix leading batch dimensions
  * Remove spatial softmax
  Change-Id: I00d018ca08226e60ba4ddff03a1a3df48e69dced
* Fix Brax external tests by updating jax.sharding.PartitionSpec
  PiperOrigin-RevId: 878160775
  Change-Id: I822c76523cff1d72a7c2135ac9332d39f158434e
* Import Brax PR google#662: Vision update
  PiperOrigin-RevId: 878174910
  Change-Id: I8088a5aee048b07566ce65e9da6f580814630e60
* Import Brax PR google#662: Vision update
  PiperOrigin-RevId: 878202976
  Change-Id: I8c81f958868a9cc4f3d218b8df1a0e8df8b39d51
* Update Python CI versions (google#663)
  * Update Python versions in CI to 3.11, 3.12, and 3.13
  * Update ci.yml
* Add CNN kernel init and spatial softmax (h/t zakka)
  PiperOrigin-RevId: 879785561
  Change-Id: I0e9dc87128ecad6f63953f06b533713f4e6994c8
* Add modifications for WujiHand training
  - training.py: Add curriculum support
  - acting.py: Add full_reset support
  - wrappers/training.py: Inherit info across episodes
  - base.py: Minor adjustments
* Fix curriculum param naming and remove unused metric
  - Rename global_step to total_training_steps for clarity
  - Remove eval/epoch_eval_time metric (redundant with eval/sps)
* Add CLAUDE.md documenting wuji-custom branch modifications
  Documents the curriculum learning kwargs passthrough architecture, modified files, data flow, and maintenance guide for future rebases.
* Add merge rules and known limitations to CLAUDE.md
* Fix CI: make curriculum_params injection opt-in
  Add inject_curriculum_params parameter (default False) to train(). Previously step_kwargs were always injected, breaking envs that don't accept **kwargs (e.g. InvertedPendulum in tests). Now only injected when explicitly enabled by the caller.
* Update CLAUDE.md: document inject_curriculum_params opt-in behavior
* Fix CI: avoid closure capture in vmap when kwargs is empty
  When no kwargs are passed, use the original jax.vmap path without closure to preserve identical numerical behavior with upstream. The closure-based path is only used when curriculum_params is injected.
  Fixes: training_test.py::test_domain_randomization_wrapper (255 != 256)
* fix: separate code paths in DomainRandomizationVmapWrapper for empty kwargs
  Previous fix used conditional inside closure (kwargs still captured in scope). Now define completely separate step_fn closures - the no-kwargs path never references kwargs, producing identical JAX trace to upstream.
* fix: eliminate kwargs closure capture in all wrapper step methods
  The previous fix only addressed DomainRandomizationVmapWrapper, but the test call chain is AutoResetWrapper → EpisodeWrapper → DRVmapWrapper. EpisodeWrapper's jax.lax.scan closure and AutoResetWrapper's env.step call also captured the empty kwargs dict, changing JAX trace behavior on Python 3.11. Now ALL wrappers use `if not kwargs` guard to ensure the no-kwargs path never references kwargs at all, producing identical traces to upstream.
* feat(brax): add bounds_loss to PPO loss function
  Soft quadratic penalty on |mu| > 1.1 (rl_games industry standard). Controlled by bounds_loss_coef parameter (default 0.0 = off). Uses dist.loc to extract mean, compatible with both NormalDistribution and NormalTanhDistribution. Metrics: bounds_loss, bounds_loss_scaled, bounds_violation_rate.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(brax): thread bounds_loss_coef through PPO train()
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test(brax): add bounds_loss unit tests for PPO
  Tests cover pure math verification, distribution compatibility (normal and tanh_normal), gradient finiteness, and smoke training integration across distribution types and coefficient values.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(brax): add period-level episode metrics for curriculum decisions
  - logger.py: accumulate episode length/reward per eval period, expose pop_curriculum_summary() for one-shot consumption
  - train.py: log training-side episode metrics when log_training_metrics enabled
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: CLAUDE.md wrapper ordering + checkpoint.py None guard for load_config
  - Fix wrapper call chain diagram: AutoResetWrapper is outermost, not EpisodeWrapper
  - Add None guard in load_config() for kernel init fn fields to prevent KeyError when loading checkpoints saved with None kernel init fn
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* style: CLAUDE.md nitpicks — remove hardcoded line number, add code block lang
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: CLAUDE.md add eval path curriculum_params note
  Document that curriculum_params injection only occurs in training rollouts, not in eval path (brax/training/acting.py eval does not pass step_kwargs).
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: document multi-host curriculum sync limitation
  Curriculum update runs only on process_id==0. Single-host multi-GPU (our current setup) is unaffected. Multi-host would need broadcast.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Matej Aleksandrov <maleksandrov@google.com>
Co-authored-by: Baruch Tabanpour <btaba@google.com>
Co-authored-by: Erik Frey <erikfrey@google.com>
Co-authored-by: Daniel Suo <dsuo@google.com>
Co-authored-by: Taylor Howell <taylorhowell@google.com>
Co-authored-by: Brax Team <no-reply@google.com>
Co-authored-by: Mustafa H <34825877+StafaH@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
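
Two formulas quoted in the commit log above can be illustrated with a small standalone sketch: the softsign clipping described for `mean_clip_scale`, and the quadratic `bounds_loss` penalty on |mu| > 1.1. This is plain Python, not the brax implementation. The function names `softsign_clip` and `bounds_penalty` are hypothetical, and summing the penalty over action dimensions is an assumption, since the log does not state the reduction.

```python
from typing import Sequence


def softsign_clip(mean: float, scale: float) -> float:
    """Softsign clipping as quoted in the log: scale * (mean / (1 + |mean|)).

    The result always lies strictly inside (-scale, scale).
    """
    return scale * (mean / (1.0 + abs(mean)))


def bounds_penalty(mu: Sequence[float], bound: float = 1.1) -> float:
    """Quadratic penalty on the part of |mu| that exceeds `bound`.

    Elements with |mu| <= bound contribute nothing. Summing over the
    elements is an assumption of this sketch, not taken from the log.
    """
    return sum(max(abs(m) - bound, 0.0) ** 2 for m in mu)


if __name__ == "__main__":
    # A mean of 1.0 with scale 2.0 maps to 2.0 * (1.0 / 2.0).
    print(softsign_clip(1.0, 2.0))
    # Only -1.6 and 2.1 lie outside the bound and are penalized.
    print(bounds_penalty([0.5, -1.6, 2.1]))
```

In the log, `bounds_loss_coef` (default 0.0) scales the penalty, so with the default the term would have no effect on the PPO loss.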

btaba added 2 commits March 3, 2026 14:31

Update Python versions in CI to 3.11, 3.12, and 3.13

b29a590

Update ci.yml

0c43c46

btaba merged commit 300ca1a into main Mar 3, 2026
6 checks passed

btaba added a commit that referenced this pull request Mar 4, 2026

Update Python CI versions (#663)

d2cb645

* Update Python versions in CI to 3.11, 3.12, and 3.13
* Update ci.yml

btaba merged 2 commits into main from update-python-ci-versions
