Member

@zingale zingale commented Oct 22, 2025

This allows the amount of loop unrolling to be set via NET_LOOP_UNROLL_LEN.
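
For readers unfamiliar with the idea, here is a minimal sketch of an unroll length controlled by a compile-time constant. Only the name NET_LOOP_UNROLL_LEN comes from this PR; the default of 4, the helpers `unrolled_chunk` and `unrolled_for`, and the example `sum_first_n` are hypothetical and are not the code in this change. It assumes C++17 (fold expressions).

```cpp
#include <cstddef>
#include <utility>

// Assumption: the unroll length is a preprocessor constant that can be
// overridden at build time, e.g. -DNET_LOOP_UNROLL_LEN=8.
#ifndef NET_LOOP_UNROLL_LEN
#define NET_LOOP_UNROLL_LEN 4
#endif

static_assert(NET_LOOP_UNROLL_LEN >= 1, "unroll length must be at least 1");

// Hypothetical helper: call f(start + 0), ..., f(start + U - 1), expanded
// at compile time by a fold expression, so the chunk body has no loop.
template <typename F, std::size_t... Is>
void unrolled_chunk(std::size_t start, F& f, std::index_sequence<Is...>) {
    (f(start + Is), ...);
}

// Hypothetical helper: visit i = 0 .. n-1 in chunks of NET_LOOP_UNROLL_LEN,
// then finish any leftover iterations one at a time.
template <typename F>
void unrolled_for(std::size_t n, F f) {
    constexpr std::size_t U = NET_LOOP_UNROLL_LEN;
    std::size_t i = 0;
    for (; i + U <= n; i += U) {
        unrolled_chunk(i, f, std::make_index_sequence<U>{});
    }
    for (; i < n; ++i) {
        f(i);
    }
}

// Example use: sum the first n entries of an array.
inline double sum_first_n(const double* a, std::size_t n) {
    double sum = 0.0;
    unrolled_for(n, [&](std::size_t i) { sum += a[i]; });
    return sum;
}
```

The trip count does not need to be a multiple of the unroll length, since the trailing loop handles the remainder. Which length is fastest depends on the loop body and the hardware, so it is worth measuring rather than assuming.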

Member Author

zingale commented Oct 24, 2025

Unrolling by 4 seems to be a much better option. On a test_react run with ase (using 32^3 boxes), this is 12% faster on groot/CUDA than development.

Member Author

zingale commented Oct 25, 2025

I tested this on a 2D Castro flame_wave run with the cno-he-burn-34am network on Frontier (48 nodes), and the entire simulation runs 5% faster with this change.

@zingale zingale merged commit e711ad6 into AMReX-Astro:development Oct 27, 2025
33 of 34 checks passed
