PaintNinja commented Dec 27, 2024

This PR is the first step towards fixing EventBus' benchmarking infrastructure to reduce overhead and improve accuracy, based on work I've done locally on NovaBus prototypes over the past year or so.

Comparisons show roughly a 5-25% reduction in like-for-like post times depending on the benchmark, without any changes to EventBus itself.

The two main changes are:

  1. Use static finals everywhere for the hot paths, to better simulate scenarios such as MinecraftForge.EVENT_BUS.post(...)
  2. Generate entire benchmark classes at runtime rather than only the listeners
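As a rough illustration of the first point, here is a minimal sketch of the "static finals for the hot path" pattern. Everything in it (`StaticFinalSketch`, `SketchBus`, `createBus`, `postStatic`) is a hypothetical stand-in written for this example, not the EventBus API or the actual benchmark code in this PR. The idea is that the bus and the event are created once during class initialisation and stored in `static final` fields, which the JIT can typically treat as constants, much like a call through `MinecraftForge.EVENT_BUS.post(...)`. In a real JMH benchmark, `postStatic()` would be the method annotated with `@Benchmark`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class StaticFinalSketch {
    /** Hypothetical minimal bus; it only stands in for a real event bus. */
    static final class SketchBus {
        private final List<Consumer<Object>> listeners = new ArrayList<>();

        void addListener(Consumer<Object> listener) {
            listeners.add(listener);
        }

        /** Calls every listener and returns how many were notified. */
        int post(Object event) {
            for (Consumer<Object> listener : listeners) {
                listener.accept(event);
            }
            return listeners.size();
        }
    }

    // Counter incremented by the listeners so the work has a visible effect.
    static int handled = 0;

    // All setup happens once, at class initialisation, and lands in static finals.
    static final SketchBus BUS = createBus(3);
    static final Object EVENT = new Object();

    /** Builds a bus with an arbitrary number of listeners (assumed helper). */
    static SketchBus createBus(int listenerCount) {
        SketchBus bus = new SketchBus();
        for (int i = 0; i < listenerCount; i++) {
            bus.addListener(event -> handled++);
        }
        return bus;
    }

    /** The hot path: only static final fields are read here. */
    static int postStatic() {
        return BUS.post(EVENT);
    }

    public static void main(String[] args) {
        int notified = postStatic();
        System.out.println(notified + " listeners notified, handled = " + handled);
    }
}
```

Because the listener count is just an argument to the hypothetical `createBus` helper, the same shape extends to any number of listeners rather than a fixed set of presets.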

Due to the nature of JMH, this sadly increases the amount of repetitive code, although this isn't as bad as it could've been thanks to the benchmark generation logic where applicable. As an added bonus, it's fairly trivial to benchmark arbitrary numbers of listeners now without being restricted to hardcoded presets.

The classloader variant benchmarks have been removed, as this mode is only functional when using ModLauncher anyway and the results are pretty much the same. The registration benchmarks are kept for now and avoid unnecessary synchronisation, but still have pretty high variance which limits their usefulness.

Future work includes revisiting the Groovy code and adding a debug mode that re-enables JFR on request (a kind of non-benchmarking mode). I've left those for now to get something functional out sooner rather than stalling further. I've been trying out multiple optimisation ideas for the current EventBus, comparing before and after on both the old and new benchmark systems, and the differences are much more visible on the new one.

Bear with me here... this is needed for a future commit in this PR
- BenchmarkBase is replaced by static calls to BenchmarkUtils
- As much setup as possible is done statically and stored in static finals
- Update SecureModules and avoid the deprecated constructors

Still to do:
- Reduce the amount of repetitive code
- CacheConcurrent, CacheCopy and CacheClassLoader benchmarks
- Benchmark class generation for arbitrary post sizes, instead of hardcoding 3 presets (single, dozen, hundred)
Allows for arbitrary multipliers of listeners with less code duplication
These are adding significant overhead on NoLoader results
This field is added at runtime
@PaintNinja PaintNinja marked this pull request as ready for review January 14, 2025 19:42
@PaintNinja PaintNinja added the enhancement New feature or request label Jan 16, 2025
@PaintNinja PaintNinja merged commit da2bba6 into MinecraftForge:master Jan 17, 2025
@PaintNinja PaintNinja mentioned this pull request Mar 9, 2025
4 tasks