tail: fix race condition#3798
Conversation
There exists a race condition (RC) that can occur if changes to a path happen after the initial print loop in `uu_tail()`, but before the path is added to the notify-Watcher thread in `follow()`. To minimize the window where the RC can occur, this moves starting the Watcher thread and adding paths to it from `follow()` to the initial print loop in `uu_tail()`. Additionally, to make sure the RC cannot happen in "gnu/tests/tail-2/F-headers.sh", the error message that is used as a trigger in this test, is delayed until the path is added to the Watcher thread.
Remove workarounds for "tests/tail-2/F-headers.sh" which are (presumably) no longer needed because of the race condition fix.
Remove workarounds for "tests/tail-2/F-headers.sh" which are (presumably) no longer needed because of the race condition fix.
|
@niyaznigmatullin how do you feel about this? thanks |
|
I like the change, but can we move the watcher addition before tail actually writes anything? |
|
I think I found another race condition on detecting the orphans. Because it's detected after follow is run, if a file didn't exist and it's parent directory didn't exist, we print that we cannot access it. And then if it's created, we don't watch it, and it's not detecting that it's an orphan, because the directory could have been created by that time. As I understand, the idea was, that we need to make sure for each file either it's being watched, or it's parent is watched or it's added to orphans. So the last is not true sometimes. |
|
@niyaznigmatullin Thanks for the feedback. I think both of your points are valid. I'll implement them soon. |
Move "adding paths to Watcher thread" to its own loop and run this loop before the initial tail-print-loop in order to minimize the window for race conditions.
3bdd064 to
85bfef8
Compare
|
This should be good to go. |
|
I tested all tests with timeouts, everything passed but I propose to use old opened reader, since it stores the file descriptor inside it, so no need to recreate one. Proposed change is here: niyaznigmatullin@fd145ee |
Nice, thanks for testing.
That looks like a good idea, please open a follow-up PR. |
fixes #3765
There exists a race condition (RC) that can occur if changes to a path
happen after the initial print loop in
uu_tail(), but before thepath is added to the notify-Watcher thread in
follow().@niyaznigmatullin found this RC and identified it as the reason why "gnu/tests/tail-2/F-headers.sh" failed sometimes in the CI and he also came up with a fix for it.
The proposed fix manually handles possible events that could have been missed before.
However, this does not reduce the possibility for the RC to occur, and it uses a lot of duplicate code instead of utilizing the functionality of the Watcher thread together with the
handle_event()function.To minimize the window where the RC can occur, this PR moves starting the
Watcher thread and adding paths to it from
follow()to the initialprint loop in
uu_tail().Additionally, to make sure the RC cannot happen in
"gnu/tests/tail-2/F-headers.sh", the error message that is used as a trigger
in this test, is delayed until the path is added to the Watcher thread.
The workarounds for "tests/tail-2/F-headers.sh" in build-gnu.sh should no longer be needed.