feat(autopilot): reduce excessive looping of main loop #3938
Conversation
Reminder: Please update the DB Readme and comment whether migrations are reversible (include rollback scripts if applicable).
Caused by:
crates/autopilot/src/run_loop.rs
Outdated
```rust
            }
        }
    }
    Err(err) => {
```
Putting everything in one huge nested statement makes it quite hard to read. Instead you could just use smaller match statements with early returns like this:
```rust
let mut listener = match sqlx::postgres::PgListener::connect_with(&pool).await {
    Ok(listener) => listener,
    Err(err) => {
        tracing::error!(?err, "failed to create PostgreSQL listener");
        tokio::time::sleep(Duration::from_secs(5)).await;
        continue;
    }
};
```
It looks a bit better with the early return as you suggest, but it's still not that great.
I'm torn. Do you think it would be best to create a nested function for the inner loop and use `?` to resolve the `Result`s? But then we don't get specific error messages for each different place an error could happen. I wonder if there is some other way to make this nice for loops in Rust...
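One possible middle ground, sketched here purely for illustration (the helper name and channel name are made up, not taken from the PR): move the fallible setup into a helper that returns `anyhow::Result` and attach a distinct message to each step with `.context()`, so the loop keeps one small `match` while the log still says exactly which step failed.

```rust
use anyhow::{Context, Result};
use sqlx::postgres::{PgListener, PgPool};

// Hypothetical helper: each `?` keeps its own error message via `.context()`.
async fn connect_listener(pool: &PgPool, channel: &str) -> Result<PgListener> {
    let mut listener = PgListener::connect_with(pool)
        .await
        .context("failed to create PostgreSQL listener")?;
    listener
        .listen(channel)
        .await
        .context("failed to subscribe to notification channel")?;
    Ok(listener)
}
```

The outer loop would then match on the helper's result once, log the contextual error, sleep, and `continue`, much like the snippet above.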
* use Notify from tokio instead of channel
* move the notification function into the persistence module
* update the trigger to return the newly created order ID so that it can be immediately accessed/updated into a cache
and try to make the notify loop more beautiful with an early return
…tocol/services into fix/autopilot-proper-wait
jmg-duarte left a comment
LGTM
crates/autopilot/src/run_loop.rs
Outdated
```rust
while !control.should_shutdown() {
    // Wait for a new block or order before proceeding
    self_arc.wake_notify.notified().await;
```
I would place it after line 175, after

```rust
// caches are warmed up, we're ready to do leader work
if let Some(startup) = self_arc.probes.startup.as_ref() {
    startup.store(true, Ordering::Release);
}
```

so it doesn't wait on startup before it warms up the caches, which takes a lot of time and should be done right away. LGTM otherwise, 👌
Ooh, good idea. But then wouldn't the caches potentially be stale by the time the auction starts?
For networks which are not mainnet it doesn't really matter, because they have super short block times and little traffic. And for mainnet you are right, but the query on startup takes 10+ seconds, so I think we're actually increasing the likelihood of an order getting into the auction earlier. 🤔 Under normal circumstances it doesn't matter, because an autopilot instance which is on standby will become leader and it will have its caches warmed up. tl;dr my suggestion is actually a nit and not very relevant in the grand scheme of things.
OK, so on all networks except mainnet it is not important, and on mainnet it's unclear if it would help or not... your explanation makes sense though, especially considering auctions take more than one block to get going anyway. I will go ahead and move it to after cache warm-up as you suggest.
Should we also account for ethflow orders then?
There was a previous comment about this, and it was discovered that all orders end up in the same table, including ethflow/onchain orders. I just double checked, and the specific table you mentioned appears to be updated in the same loop as the orders table inserts: https://github.com/cowprotocol/services/blob/main/crates/autopilot/src/database/onchain_order_events/mod.rs#L368
…tocol/services into fix/autopilot-proper-wait
squadgazzz left a comment
LGTM
…icks up the cancelled order for the next auction
The test was failing because, with the recent change to put the loop wait after the cache generation, a new block is only minted once before the test verifies the condition (in this case, of an order being cancelled after being placed), and the auction order cache does not get repopulated. For now I have added additional block minting to the wait_for_condition loop. (Btw, I wasn't able to replicate this test failure on my local machine, so it seems it could be considered flaky.)
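For illustration only, and with all names assumed rather than taken from the actual diff (`wait_for_condition`, `mint_block`, and `order_is_cancelled` are stand-ins), the fix amounts to minting a block on each poll so the autopilot's cache refresh runs again before the condition is re-evaluated:

```rust
// Sketch (assumed helper names): mint a block every iteration so the order
// cache is repopulated before the cancellation condition is re-checked.
wait_for_condition(TIMEOUT, || async {
    onchain.mint_block().await;                 // assumed block-minting helper
    order_is_cancelled(&services, &uid).await   // assumed condition check
})
.await
.unwrap();
```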
Description
Currently the `run_forever` loop in autopilot will continue looping without waiting as long as there is no auction to process and the block phase is still early enough. This is of course not efficient, and it makes it difficult to view debug logs in the playground, as they are spammed every time a block is mined. Technically, an event listener is likely to be able to process events faster than the current 50 ms database-scanning loop.
Changes
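Based on the commit messages above (a tokio `Notify`, a notification function in the persistence module, and a DB trigger that reports the newly created order ID), a minimal sketch of how such a listener task could wake the run loop is shown below; the channel name `new_order` and the function name are assumptions, while `wake_notify` mirrors the identifier visible in the review diff.

```rust
use std::{sync::Arc, time::Duration};

use sqlx::postgres::{PgListener, PgPool};
use tokio::sync::Notify;

// Sketch: listen for Postgres NOTIFY events emitted by the order-insert
// trigger and wake the main run loop, instead of the loop polling the
// database every 50 ms.
async fn forward_order_notifications(pool: PgPool, wake_notify: Arc<Notify>) {
    loop {
        let mut listener = match PgListener::connect_with(&pool).await {
            Ok(listener) => listener,
            Err(err) => {
                tracing::error!(?err, "failed to create PostgreSQL listener");
                tokio::time::sleep(Duration::from_secs(5)).await;
                continue;
            }
        };
        if let Err(err) = listener.listen("new_order").await {
            tracing::error!(?err, "failed to subscribe to notification channel");
            tokio::time::sleep(Duration::from_secs(5)).await;
            continue;
        }
        while let Ok(notification) = listener.recv().await {
            // The payload can carry the new order UID (per the trigger change
            // in the commit messages) for immediate cache updates.
            let _order_uid = notification.payload();
            wake_notify.notify_one();
        }
    }
}
```

The run loop side then simply awaits `wake_notify.notified().await` before each iteration, as shown in the review diff above.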
How to test
```sh
docker compose -f playground/docker-compose.fork.yml up --build
docker compose -f playground/docker-compose.fork.yml down --volumes
playground/test_playground.sh
```