Fix data races triggered by functional tests. #4247
ecb339b to
0a34a4b
Compare
323dc8d to
fe7bba9
Compare
PastaPastaPasta
left a comment
utACK
UdjinM6
left a comment
Thanks for finding and fixing these issues! 👍
Few notes:
- it's better to avoid holding locks for too long to lower the risk of introducing deadlocks at some point (d876781)
- we should always lock quorumVvecCs whenever we access quorumVvec now that it's GUARDED_BY(quorumVvecCs) (099909a)
- running p2p_quorum_data.py revealed a similar issue on my machine, this time for skShare (da44228)
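The first two notes can be illustrated with a minimal sketch. It assumes plain std::mutex instead of the project's own lock wrappers, and QuorumSketch, SetVvec, VvecSize and the GUARDED_BY_SKETCH macro are hypothetical names used only for illustration; only quorumVvec and quorumVvecCs come from the discussion above.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a thread-safety annotation; it expands to nothing here.
#define GUARDED_BY_SKETCH(x)

class QuorumSketch
{
public:
    // Writer: build the new value first, then hold the lock only for the swap.
    void SetVvec(std::vector<int> vvec)
    {
        std::shared_ptr<const std::vector<int>> fresh =
            std::make_shared<std::vector<int>>(std::move(vvec));
        std::lock_guard<std::mutex> lock(quorumVvecCs);
        quorumVvec = std::move(fresh);
    }

    // Reader: copy the pointer under the lock, then work on the snapshot unlocked.
    std::size_t VvecSize() const
    {
        std::shared_ptr<const std::vector<int>> snapshot;
        {
            std::lock_guard<std::mutex> lock(quorumVvecCs);
            snapshot = quorumVvec;
        }
        return snapshot ? snapshot->size() : 0;
    }

private:
    mutable std::mutex quorumVvecCs;
    // Every access to quorumVvec goes through quorumVvecCs.
    std::shared_ptr<const std::vector<int>> quorumVvec GUARDED_BY_SKETCH(quorumVvecCs);
};
```

Keeping the critical section down to a pointer copy means no other lock is ever taken while quorumVvecCs is held, which is one way to lower the deadlock risk mentioned in the first note.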
3531516 to
a28e1f5
Compare
PastaPastaPasta
left a comment
re-utACK
Needs rebase
9540a56 to
018a7aa
Compare
UdjinM6
left a comment
utACK
This pull request has conflicts, please rebase.
8b82d6a to
be928d9
Compare
da89dcd to
7955fb3
Compare
This pull request has conflicts, please rebase.
4543fab to
3778e3f
Compare
Yes, I finished all the fixes now; waiting for the builds.
3778e3f to
9047608
Compare
There was an additional data race / deadlock that I detected on a run of feature_block.py, which I forwarded to @gabriel-bjg. My understanding is that he's planning to fix that and push.
This pull request has conflicts, please rebase.
9047608 to
d7501e0
Compare
Regarding the data races: they are all triggered by zmq and libzmq. According to Bitcoin's TSan suppressions, zmq and libzmq data races are all suppressed (https://github.com/bitcoin/bitcoin/blob/master/test/sanitizer_suppressions/tsan). As agreed with @PastaPastaPasta, we will ignore them for now and won't fix anything related to them in this PR.
The possible deadlock reported by TSan is also a false positive. I present below the scenario which triggers the lock order inversion. Basically, what TSan detects is T19 locking cs_main (first locking), mempool.cs, m_control_mutex, cs_main (second locking; it is already locked, but it's a recursive mutex) and cgovernancemanager.cs, while T18 is locking cs_main and mempool.cs.
The possible deadlock report comes from the fact that TSan is not able to detect that cs_main has already been locked before mempool.cs on thread 19. So it sees thread 19 locking mempool.cs, m_control_mutex, cs_main, while it sees thread 18 locking cs_main and then mempool.cs.
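The scenario can be sketched in standalone C++. This is only an illustration of the two acquisition orders described above, not the project's code: the mutexes are plain standard-library objects with hypothetical *_sketch names, and ThreadT19Path, ThreadT18Path and RunBothPaths are made up for the example.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the locks named in the comment above.
std::recursive_mutex cs_main_sketch;
std::recursive_mutex mempool_cs_sketch;
std::mutex control_mutex_sketch;
std::recursive_mutex governance_cs_sketch;

// Acquisition order described for T19.
int ThreadT19Path()
{
    std::lock_guard<std::recursive_mutex> main_first(cs_main_sketch);
    std::lock_guard<std::recursive_mutex> pool(mempool_cs_sketch);
    std::lock_guard<std::mutex> control(control_mutex_sketch);
    // Second cs_main acquisition: this thread already owns it, so it cannot block.
    std::lock_guard<std::recursive_mutex> main_second(cs_main_sketch);
    std::lock_guard<std::recursive_mutex> governance(governance_cs_sketch);
    return 1;
}

// Acquisition order described for T18.
int ThreadT18Path()
{
    std::lock_guard<std::recursive_mutex> main_lock(cs_main_sketch);
    std::lock_guard<std::recursive_mutex> pool(mempool_cs_sketch);
    return 1;
}

// Runs both paths concurrently and returns how many passes completed.
int RunBothPaths(int rounds)
{
    int t19_done = 0;
    int t18_done = 0;
    std::thread t19([&] { for (int i = 0; i < rounds; ++i) t19_done += ThreadT19Path(); });
    std::thread t18([&] { for (int i = 0; i < rounds; ++i) t18_done += ThreadT18Path(); });
    t19.join();
    t18.join();
    return t19_done + t18_done;
}
```

In this sketch both paths take cs_main before mempool.cs, so the outer cs_main acts as a gate and the inner re-acquisition never waits on another thread. It says nothing about other call paths in the real code that might take mempool.cs without cs_main.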
Function CWallet::KeepKey requires locking as it has concurrent access to database and member nKeysLeftSinceAutoBackup.
Avoid data race when reading setInventoryTxToSend size by locking the read. If locking happens after the read, the size may change.
Lock cs_mnauth when reading verifiedProRegTxHash.
Make fRPCRunning atomic as it can be read/written from different threads simultaneously.
Make m_masternode_iqr_connection atomic as it can be read/written from different threads simultaneously.
Use a recursive mutex to synchronize concurrent access to quorumVvec.
Make m_masternode_connection atomic as it can be read/written from different threads simultaneously.
Make m_masternode_probe_connection atomic as it can be read/written from different threads simultaneously.
Use a recursive mutex in order to lock access to activeMasterNode.
Use a recursive mutex to synchronize concurrent access to skShare.
Guarded all mnauth fields of a CNode.
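Two of the patterns in this commit message, an atomic flag and a size read taken under the lock, might look roughly like the sketch below. NodeSketch, its methods and the cs_inventory_sketch mutex are hypothetical; only the member names setInventoryTxToSend and m_masternode_connection are taken from the message, and the real types and locking primitives in the codebase may differ.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <set>

class NodeSketch
{
public:
    void QueueTx(int txid)
    {
        std::lock_guard<std::mutex> lock(cs_inventory_sketch);
        setInventoryTxToSend.insert(txid);
    }

    // The size is read while the lock is held, so a concurrent insert
    // cannot race with the read.
    std::size_t PendingTxCount() const
    {
        std::lock_guard<std::mutex> lock(cs_inventory_sketch);
        return setInventoryTxToSend.size();
    }

    // A lone flag needs no mutex: std::atomic makes each read and write race-free.
    void SetMasternodeConnection(bool value) { m_masternode_connection.store(value); }
    bool IsMasternodeConnection() const { return m_masternode_connection.load(); }

private:
    mutable std::mutex cs_inventory_sketch; // hypothetical lock name
    std::set<int> setInventoryTxToSend;
    std::atomic<bool> m_masternode_connection{false};
};
```

An atomic is enough when the flag is independent of other state; once several fields must change together, as with the mnauth fields, a shared mutex is the safer choice.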
d7501e0 to
b85fc7a
Compare
UdjinM6
left a comment
utACK
PastaPastaPasta
left a comment
utACK for squash merge
Function CWallet::KeepKey requires locking as it has concurrent access to database and member nKeysLeftSinceAutoBackup.
Avoid data race when reading setInventoryTxToSend size by locking the read. If locking happens after the read, the size may change.
Lock cs_mnauth when reading verifiedProRegTxHash.
Make fRPCRunning atomic as it can be read/written from different threads simultaneously.
Make m_masternode_iqr_connection atomic as it can be read/written from different threads simultaneously.
Use a recursive mutex to synchronize concurrent access to quorumVvec.
Make m_masternode_connection atomic as it can be read/written from different threads simultaneously.
Make m_masternode_probe_connection atomic as it can be read/written from different threads simultaneously.
Use a recursive mutex in order to lock access to activeMasterNode.
Use a recursive mutex to synchronize concurrent access to skShare.
Guarded all mnauth fields of a CNode.
Co-authored-by: UdjinM6 <UdjinM6@users.noreply.github.com>
Thread sanitizer triggers data races on functional tests. Steps to reproduce:
Fixes: