@0xEllie 0xEllie commented Sep 22, 2025

Implementation of Ephemery auto restart feature

Fixes Issue(s) #9083 and #8180

This PR addresses the feedback on my previous PR here.

  • To avoid adding more overhead to BesuCommand.java, the main implementation lives in Runner.java.

  • The implementation aims to leave BesuCommand.java untouched while keeping the changes to Runner.java small as well.

  • Almost every change is guarded by IfNetworkEphemery, so other networks are unaffected, including mainnet, which we care about most.

I appreciate your feedback!

  • spotless: ./gradlew spotlessApply
  • unit tests: ./gradlew build
  • acceptance tests: ./gradlew acceptanceTest
  • integration tests: ./gradlew integrationTest
  • reference tests: ./gradlew ethereum:referenceTests:referenceTests
  • hive tests: Engine or other RPCs modified?

0xEllie commented Oct 18, 2025

Description for commit: ephemeryReset: optimize restart lifecycle and improve logging.
Fixes issue(s) #9083 and #8180

I reviewed my last PR

On Ephemery restart, several initialization steps are now skipped because their
configuration parameters don't change across restart cycles. Here are the changes and the reasoning behind them:

  • Vertx: no need to recreate it, as no metrics parameters change on an Ephemery restart. The Vertx stop and shutdown are therefore also omitted from the stop-services step inside the Runner class. This is a bonus, as shutting it down can be problematic.

    Note that with this change, peer discovery does not carry over while Ephemery is restarting. The networkRunner object is stopped at each restart and replaced with a new one, so the p2pNetwork derived from networkRunner is also new. The screenshots below show peer discovery starting over at each restart.

  • Print and exit: better kept outside the Ephemery restart. Since we read from and write to the Besu data path, it is better to confirm it is accessible in the first place.

  • ValidateOptions: no command-line options change, so there is no need to validate them again at each restart.

  • InstantiateSignatureAlgorithmFactory: not needed, as it reads EcCurve from genesis, which does not change on an Ephemery restart.

  • SetMergeConfigOptions: not needed, as it sets merge to true or false based on whether the total genesis difficulty attribute is present, which does not change across Ephemery cycles.
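
The split between one-time steps and per-cycle steps described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the code in this PR: the class `RestartLifecycleSketch`, its methods, and the step labels are hypothetical, and the real steps are replaced by strings recorded in a list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a restart loop that runs one-time setup only once. */
final class RestartLifecycleSketch {

  // Records the order in which steps run, so the split is easy to inspect.
  private final List<String> executedSteps = new ArrayList<>();
  private boolean oneTimeSetupDone = false;

  /** Starts a cycle; the one-time steps run on the first call only. */
  void startCycle() {
    if (!oneTimeSetupDone) {
      // Assumed one-time steps: their inputs do not change between cycles.
      executedSteps.add("validateOptions");
      executedSteps.add("instantiateSignatureAlgorithmFactory");
      executedSteps.add("setMergeConfigOptions");
      executedSteps.add("createVertx");
      oneTimeSetupDone = true;
    }
    // Per-cycle step: the network runner is replaced on every restart.
    executedSteps.add("startNetworkRunner");
  }

  /** Stops a cycle without shutting down the long-lived Vertx instance. */
  void stopCycle() {
    executedSteps.add("stopNetworkRunner");
  }

  /** Returns a copy of the recorded steps. */
  List<String> executedSteps() {
    return new ArrayList<>(executedSteps);
  }
}
```

Calling `startCycle()`, `stopCycle()`, `startCycle()` records the one-time steps once and the network runner start twice, which mirrors why peer discovery starts over at each restart.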

Logging improvements:

  • Logs on Ephemery restart show the time remaining until the next restart: in days, in hours if less than one day remains, and in minutes if less than an hour remains.
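
The unit selection for the remaining-time log can be sketched as follows. This is a minimal illustration rather than the code in this PR: the class `RemainingTimeFormatter`, the method `format`, and the exact output wording are assumptions.

```java
import java.time.Duration;

/** Hypothetical helper: reports remaining time in days, hours, or minutes. */
final class RemainingTimeFormatter {

  private RemainingTimeFormatter() {}

  /** Formats the time left until the next restart using the coarsest fitting unit. */
  static String format(final Duration remaining) {
    if (remaining.isNegative()) {
      throw new IllegalArgumentException("remaining must not be negative");
    }
    if (remaining.toDays() >= 1) {
      return remaining.toDays() + " day(s)";
    }
    if (remaining.toHours() >= 1) {
      return remaining.toHours() + " hour(s)";
    }
    // Less than an hour left: fall back to whole minutes.
    return remaining.toMinutes() + " minute(s)";
  }
}
```

Partial units are truncated in this sketch, so 30 hours is reported as one day; whether the actual log rounds or truncates is not stated in the description above.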

Refactoring changes:

  • Imports in BesuCommandTests have been changed to their specific classes.
  • The variable ephemeryCycleId has been renamed to ephemeryNextCycleId, which matches the value it holds.
[Screenshots: peer discovery starting over at each restart]

0xEllie commented Oct 18, 2025

Description for commit: Tests for Ephemery restart feature, related issue(s) #9083 #8180

Tests for Ephemery

These tests are written specifically for the Ephemery feature and shouldn't run in parallel because of the DB concurrency policy.

To run the tests, comment out the @Disabled annotation and run each test one by one. This way, other contributors working on the codebase won't be confused.

macfarla commented Nov 2, 2025

@jflo can you provide Product Owner input as to whether we want this feature in besu itself?

0xEllie commented Nov 3, 2025

PR description for: Enables Besu to become a bootnode, preserves the key over every cycle

This PR keeps the key throughout Ephemery startup and preserves the same key across every Ephemery restart cycle.

Without this PR, the key already stays the same for the lifetime of a running node, as long as Besu is not stopped.

There are two ways to implement it.

  • Read the key from the file at the end of each cycle and store it in the new folder at the start of the next cycle (every cycle's data is saved into a new folder so that it starts on a fresh database).
    Pros: All the logic is handled inside Runner.
    Cons: Reading from a file is not safe, especially when it comes to a key. Also, if the file is accidentally removed or its content is changed, the code would break.

  • Save the initial key at node startup inside keyPair in BesuCommand, then store it in the new folder at the start of each Ephemery cycle in Runner.
    Pros: The key is less exposed to attack, and there is less room for bugs and errors in the code.
    Cons: It changes BesuCommand, which already handles a lot of configuration logic and is not supposed to take on more. This change is very small, and I think it's worth it.

I went with the second option. Would like to hear your opinion.
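
The second option can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the implementation in this PR: the class `CycleKeyKeeper`, the `cycle-<id>` folder naming, the `key` file name, and the raw byte representation of the key are all hypothetical, and Besu's own key types are not used.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: holds the node key in memory and writes it out for each cycle. */
final class CycleKeyKeeper {

  // Captured once at node startup; never re-read from disk afterwards.
  private final byte[] nodeKey;

  CycleKeyKeeper(final byte[] nodeKey) {
    // Defensive copy so later changes to the caller's array do not affect the stored key.
    this.nodeKey = nodeKey.clone();
  }

  /** Creates the data folder for a cycle and stores the in-memory key in it. */
  Path prepareCycleFolder(final Path baseDir, final long cycleId) throws IOException {
    final Path cycleDir = baseDir.resolve("cycle-" + cycleId);
    Files.createDirectories(cycleDir);
    final Path keyFile = cycleDir.resolve("key");
    Files.write(keyFile, nodeKey);
    return keyFile;
  }
}
```

Because the key is only ever written from memory, a deleted or modified key file in an old cycle folder does not affect the next cycle, which is the main advantage claimed for this option.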

(It looks like this PR unintentionally renamed the ephemery.json file. I didn't add that change to the commit and honestly don't know how it ended up in the PR. I'm sorry; I have already opened another PR to fix it.)

@jflo jflo left a comment

From a product perspective, this is a good feature we should have in Besu. It will improve the UX and encourage Ephemery network users to stick around after resets.


/** Tests for {@link BesuCommand}. */
@ExtendWith(MockitoExtension.class)
@Disabled("needs to run each test on a single-run")

This is going to be a problem, because nobody will remember to do this. If this can't be run via CI/CD, it won't actually be checking for regressions.

0xEllie commented Nov 7, 2025

PR description for test(ephemery): enable CI/CD with in-memory storage plus keypair test…

This PR mainly fixes database lock errors that prevented tests from running and also includes tests for keypair persistence validation:

  • Replaced RocksDB with in-memory storage to fix lock conflicts.
  • Implemented a more comprehensive cleanup in tearDown that removes the files added by each test.
  • Added tests to validate keypair persistence across all restart cycles.

Ephemery test result:
image (11)

jflo commented Dec 2, 2025

Seems like this addresses the issues raised on #9084, so I suggest closing #9084.

@github-project-automation github-project-automation bot moved this to Backlog in RC 25.12.0 Dec 4, 2025