
Conversation

@ktf
Member

@ktf ktf commented Jun 30, 2021

In case we are not running, we can afford processing the callbacks
as soon as possible.

@ktf ktf requested a review from a team as a code owner June 30, 2021 10:17
@ktf
Member Author

ktf commented Jun 30, 2021

@davidrohr this should run the callbacks as early as possible. The mutex should guarantee that nothing overlaps, even when a callback is received before we switch to running but only finishes once we are already running.

@davidrohr
Collaborator

ok, thx, will have to try it in FST with DD to see that it doesn't break anything. Then we can merge it and once we have new RPMs we can try it with AliECS.

@davidrohr
Collaborator

I tried this, and it breaks the full system test.
What I see is that the memory registration callback for the unmanaged region from the StfBuilder is only called after the first message with raw data was received, so the memory is not registered yet when the GPU tries to access it.

@ktf
Member Author

ktf commented Jul 8, 2021

I guess it still comes too late? Does it fail without this PR?

@davidrohr
Collaborator

davidrohr commented Jul 8, 2021 via email

@ktf
Member Author

ktf commented Jul 8, 2021

Ok, the issue is that if the event arrives too late, after the event loop is already sitting in uv_run, then the next chance to run the callback is actually after the data has been processed. The latest commit should fix it. We still need a way to stay in "init" until all the regions have been registered, or maybe I need to wake up the loop when a memory region event is triggered, so that the events get processed immediately.

@ironMann @davidrohr Can I assume that all the region events are fired before data starts to be sent?

@ironMann
Contributor

ironMann commented Jul 9, 2021

@ktf I don't know when they are fired, but tfbuilder doesn't go to READY (or send data) until CreateUnmanagedRegion() returns. This currently takes ~20s on EPN. Regions are created in InitTask().

@rbx can clarify about event timings

@ktf
Member Author

ktf commented Jul 9, 2021

@ironMann ok, thanks, but then I do not understand why this PR (or the one before) does not work: if you do not transition to ready, DPL will not transition to START, and therefore all the events should be handled synchronously in the callback, unless they are delivered late. I can easily add something which wakes up the event loop when a region event is delivered, but I have the impression the problem is that FairMQ itself delivers the events too late.

@rbx
Contributor

rbx commented Jul 9, 2021

Region callback for a given region will fire shortly after its creation, theoretically even before CreateUnmanagedRegion() returns. At the end of CreateUnmanagedRegion() a shm condition variable is notified that all region event subscribers listen on (if they subscribe late, they will see and call the "old" events right away).

These callbacks are not synchronized to states.
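
The replay behaviour described above (late subscribers immediately see the "old" events) can be illustrated with a small in-process model. This is only a sketch under assumptions: `RegionEventLog`, `publish` and `subscribe` are hypothetical names, callbacks are invoked synchronously, and nothing here uses shared memory or the FairMQ API.

```cpp
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log: keeps the history so late subscribers can catch up.
class RegionEventLog {
 public:
  using Callback = std::function<void(const std::string&)>;

  // Record a new region event and deliver it to all current subscribers.
  void publish(const std::string& regionName) {
    std::lock_guard<std::mutex> lock(mMutex);
    mHistory.push_back(regionName);
    for (auto& callback : mSubscribers) {
      callback(regionName);
    }
  }

  // Register a subscriber; it first receives every event published so far.
  void subscribe(Callback callback) {
    std::lock_guard<std::mutex> lock(mMutex);
    for (const auto& regionName : mHistory) {
      callback(regionName);
    }
    mSubscribers.push_back(std::move(callback));
  }

 private:
  std::mutex mMutex;  // callbacks run under this lock and must not re-enter
  std::vector<std::string> mHistory;
  std::vector<Callback> mSubscribers;
};
```

Note that neither `publish` nor `subscribe` looks at any device state, which mirrors the point that the callbacks are not synchronized to states.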

@ktf
Member Author

ktf commented Jul 9, 2021

Ok, but how is it possible that DPL is already in the RUN state, waiting in the DPL event loop, when the notification happens? And we know that this is actually the case, because @davidrohr saw the crash with the previous incarnation of this PR.

@rbx
Contributor

rbx commented Jul 9, 2021

  1. Are the states synchronized (= everyone has to be Ready before anyone goes to Running)?
  2. Does DPL wait in InitTask until it sees all the expected events?

If one of the above is not true, then indeed it can happen easily that DPL only sees the events during Running.

@ktf
Member Author

ktf commented Jul 9, 2021

  1. That's my understanding from @teo
  2. No, it doesn't. DPL runs the callback immediately if it is not "Running"; otherwise it queues the callback and delivers it when data is processed. Again, I can wake up the event loop as soon as I get one, but I do not understand why it's not processing them immediately.
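
The "run immediately if not Running, queue otherwise" policy from point 2 can be sketched as follows. This is not the DPL implementation: `RegionEvent`, `RegionCallbackDispatcher` and all method names are hypothetical, and the single mutex only illustrates how a callback that starts before the switch to running can delay that switch until it finishes.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a region event; not the FairMQ type.
struct RegionEvent {
  std::uintptr_t address;
  std::size_t size;
};

// Hypothetical dispatcher: immediate handling outside Running, queued inside.
class RegionCallbackDispatcher {
 public:
  explicit RegionCallbackDispatcher(std::function<void(const RegionEvent&)> handler)
    : mHandler(std::move(handler)) {}

  // Blocks while a handler is executing, so a state change never overlaps it.
  void setRunning(bool running) {
    std::lock_guard<std::mutex> lock(mMutex);
    mRunning = running;
  }

  // Called from the thread that delivers region events.
  void onRegionEvent(const RegionEvent& event) {
    std::lock_guard<std::mutex> lock(mMutex);
    if (mRunning) {
      mPending.push_back(event);  // defer until the processing loop drains
    } else {
      mHandler(event);            // not running: handle right away
    }
  }

  // Called by the processing loop before it touches data.
  // Returns how many queued events were handled.
  std::size_t drainPending() {
    std::lock_guard<std::mutex> lock(mMutex);
    std::vector<RegionEvent> pending;
    pending.swap(mPending);
    for (const auto& event : pending) {
      mHandler(event);
    }
    return pending.size();
  }

 private:
  std::mutex mMutex;  // handlers run under this lock and must not re-enter
  bool mRunning = false;
  std::vector<RegionEvent> mPending;
  std::function<void(const RegionEvent&)> mHandler;
};
```

With this shape, an event that arrives while running is only handled at the next `drainPending()` call, which is exactly the latency discussed in this thread.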

@davidrohr
Collaborator

@ktf : With this fix, the callback is processed before the data.
But I think we still have the problem that the callback from the StfBuilder arrives late, i.e. with the first data.
The callback for the main shm region comes in between these two info messages

[2887173:tpc-tracker]: [10:42:17][STATE] READY ---> RUNNING
[2887173:tpc-tracker]: [10:42:17][ERROR] CALLBACK 0x7fddd4000000 68719476736
[2887173:tpc-tracker]: [10:42:17][INFO] DEVICE: Running...

Can I interpret this as happening during the state transition from READY to RUNNING?
In any case, I think we can merge this PR, the problem that the StfCallback comes late must be fixed somewhere else.

@rbx
Contributor

rbx commented Jul 9, 2021

> 2. No, it doesn't. DPL runs the callback immediately if it is not "Running"; otherwise it queues the callback and delivers it when data is processed. Again, I can wake up the event loop as soon as I get one, but I do not understand why it's not processing them immediately.

The callback is called while holding a shm mutex that is used for the condition variable (and for region creation/destruction). So when multiple devices register, their callback executions are serialized. This would explain why you don't see them right away, at least not on each device.

The idea behind the mutex here (besides the CV) is to prevent the region creator from destroying a region while a peer is still using it in its callback.

@ktf
Member Author

ktf commented Jul 9, 2021

@davidrohr indeed, if that is where the callback is delivered, the state may already be "Running", and that would explain why your code is not executed as the event arrives. What I can improve on my side is making the callback handling wake up the event loop in DPL, so that we do not need to wait for the first data before handling it. That still does not explain why the first invocation of handleRegionCallbacks was skipped, since that happens immediately in ConditionalRun, unless the message [10:42:17][INFO] DEVICE: Running... is somewhat asynchronous...

@ktf
Member Author

ktf commented Jul 9, 2021

@TimoWilken what's up with the CS8 tests? They have been in "Expected — Waiting for status to be reported" since last night.
Merging this, since @davidrohr confirmed it works fine. I will provide a separate PR with the attempt to call the callbacks as early as possible for DPL.

@ktf ktf merged commit e0d179f into AliceO2Group:dev Jul 9, 2021
@ktf
Member Author

ktf commented Jul 9, 2021

#6621 should improve the callback processing latency on DPL side. Not sure if that is enough.
