Describe the bug
The message_tracker is responsible for storing consensus messages that must be processed later, and every second it checks whether any stored message can now be handled by the grandpa engine. There is a possible deadlock in the handleTick() method: it locks the message map and only unlocks it when the method returns.
After taking the lock, handleTick() iterates over the vote messages, which is not a problem, and then iterates over the commit messages, which is where the deadlock can occur.
For each commit message stored by the tracker, we call the grandpa engine's handleMessage method, which processes the commit via handleCommitMessage. In some cases (e.g. the block hash in the commit is not in the block tree yet), handleCommitMessage calls the tracker's addCommit() method, which tries to lock the tracker map that handleTick() already holds. addCommit() then waits for the lock to be released, but the lock is only released when handleTick() returns, and handleTick() cannot return until addCommit() does, so neither call ever completes.
Expected Behavior
- Every second the tracker should retrieve and process the commits
Current Behavior
- In some cases the tracker could hang forever while processing a specific commit
Possible Solution
- We should lock the tracker map, retrieve all the commit messages from it, release the lock, and only then process each commit message.