fix Synapse base performance (more than 10x speed up) #2161
Conversation
Force-pushed from 24fc51b to 96224f6
thewhaleking left a comment
This looks good, but I want to test it a bit.
@thewhaleking do you think I should add some additional test cases to the test suite?

Additional test cases are always welcome. There was some chatter about cache invalidation, but I don't see how that would apply here. I could be missing something, however.

AFAIK there is no cache invalidation issue here, and a use case where it would be needed has not been presented so far. As for the additional test cases, I believe the one already attached should suffice, but you mentioned some additional testing, so I wondered if I had missed something.
Force-pushed from 96224f6 to 8ec83f7
I just meant testing as in running it locally and seeing if there was any way I could break it. I have not encountered a situation in which this occurs. I would like to get this merged in early this week.
Two weeks have passed since the PR was created. Is there anything I can do to help this move along?
@mjurbanski-reef Thanks for the contribution! Can you please rebase this off of staging so we can merge it? |
The base branch was changed.
Force-pushed from 8ec83f7 to f416bf1
@ibraheem-opentensor rebased. The only failing test is now the same one that fails on staging: https://github.com/opentensor/bittensor/actions/runs/10300011141/job/28508536572
Bug
bittensor Synapse.to_headers() introduces a 100ms+ delay on both the server and the client side, i.e. each bittensor request RTT is overall delayed by 300ms more than it should be.
Description of the Change
While this was the wrong place (inside a loop) to "regenerate" the schema, this line was somehow "okay" in pydantic v1, but in v2 it is much slower: 1640b0d#diff-2a49a90fb91fba6743fa9a754231a89e98730b05e4ff55db2fa532e1915c126cR635
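The general pattern behind the fix is to hoist the expensive per-class introspection out of the per-request path and cache it by class. A minimal sketch of that pattern follows; the function name, the Ping class, and the stand-in "introspection" body are all illustrative, not the actual PR code:

```python
import functools


@functools.lru_cache(maxsize=None)
def cached_required_fields(model_cls):
    """Compute expensive per-class metadata once and reuse it.

    `model_cls` stands in for a Synapse subclass; the body stands in for
    whatever costly introspection (e.g. schema generation) previously ran
    on every call. With lru_cache, it runs once per class.
    """
    return tuple(name for name in dir(model_cls) if not name.startswith("_"))


class Ping:
    message: str = "ping"


# First call computes; subsequent calls for the same class are O(1) lookups.
fields_a = cached_required_fields(Ping)
fields_b = cached_required_fields(Ping)
assert fields_a is fields_b  # lru_cache returns the same cached tuple object
```

Keying the cache on the class (rather than the instance) is what makes it safe here: the schema depends only on the class definition, which does not change at runtime, so no invalidation is needed.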
The problem with this is that some subnets reward the fastest miner, etc., so adopting bittensor v7 would be detrimental for them.
The fixed code in this PR, versus the previous version, results in a 10x speedup for a simple Synapse ping. The difference will likely be greater the more complex the Synapse object used.
A nice thing about it is that once this change is released, it is in miners' best interest to update, due to the same reward-scoring reason mentioned already. I'm pretty sure that after this it will be noticeably faster than v6, but I did not bother to measure it, as everyone needs to update anyhow.
That is a more than 15x improvement.
Realistically, it is best to think about it as a 150ms improvement per side.
Please note this code affects both the client (dendrite) and the server (axon) side, so if only one of them is updated, the improvement will be half of that. This information is important to miners who want to know the difference between using it or not (as they cannot influence the other side).
Alternate Designs
Not using a cache; however, that was more than 3x slower (i.e. ~25ms per side).
Possible Drawbacks
None.
Verification Process
Unit tests already cover these changes, plus performance testing using the following script:
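The original benchmarking script is not included above. A hypothetical micro-benchmark of the same shape (all names and the header-building bodies are illustrative stand-ins, not the script actually used) could look like:

```python
import timeit


def uncached_headers():
    # Stand-in for the old path: rebuild the header metadata on every call.
    return {f"bt_header_{i}": str(i) for i in range(200)}


_cache = {}


def cached_headers():
    # Stand-in for the fixed path: build once, then return the cached result.
    if "headers" not in _cache:
        _cache["headers"] = {f"bt_header_{i}": str(i) for i in range(200)}
    return _cache["headers"]


slow = timeit.timeit(uncached_headers, number=10_000)
fast = timeit.timeit(cached_headers, number=10_000)
print(f"uncached: {slow:.4f}s  cached: {fast:.4f}s  speedup: {slow / fast:.1f}x")
```

Both paths must produce identical output; the benchmark only demonstrates that the cached variant amortizes the construction cost across calls, which is the shape of the improvement this PR claims.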
Release Notes