@nicolovalle
Contributor

@nicolovalle nicolovalle commented Aug 4, 2024

@iravasen, @jovankaas, @IsakovAD, below are the details of the fine-tuning in this PR:

**Outdated (see conversation below):**

The dead map builder currently samples 1 TF in every 350 online, using the condition "CurrentTF % 350 == 0".
I've noticed that a few hiccups may occur, resulting in gaps of 2x, 3x or 4x 350 TFs.

This PR tries to work around this by modifying the sampling condition as follows: "CurrentTF % 350 == 0 OR CurrentTF - LastProcessedTF > 350".

Caveat: depending on the order in which the TFs arrive at the aggregator node, this may result in oversampling, which is good from the physics point of view but increases the size of the object handled by the workflow.
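
For readers following along, the condition quoted above can be sketched in a few lines of C++. This is only an illustration of the logic as written in this comment, not the code in the PR: the struct name, the member names and the `hasProcessed` flag (used to avoid comparing against an undefined "last processed" TF at start-up) are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the quoted condition; not the actual O2 code.
struct FirstProposalSampler {
  std::int64_t samplingPeriod = 350;  // one TF in every 350, as stated above
  std::int64_t lastProcessedTF = 0;   // only meaningful once hasProcessed is true
  bool hasProcessed = false;          // assumption: no gap check before the first accepted TF

  // Returns true if the TF with index currentTF should be processed.
  bool accept(std::int64_t currentTF) {
    assert(currentTF >= 0 && samplingPeriod > 0);
    const bool onGrid = (currentTF % samplingPeriod == 0);
    const bool gapTooLarge =
        hasProcessed && (currentTF - lastProcessedTF > samplingPeriod);
    if (onGrid || gapTooLarge) {
      lastProcessedTF = currentTF;
      hasProcessed = true;
      return true;
    }
    return false;
  }
};
```

With this sketch, feeding TF 701 after TF 350 triggers the gap branch, and a late-arriving TF 700 is then accepted as well because it sits on the grid, which is the oversampling described in the caveat.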

**New implementation in the second commit**, described in a comment below.

@alibuild
Collaborator

alibuild commented Aug 5, 2024

Error while checking build/O2/fullCI for dc1e5d3 at 2024-08-05 04:55:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'

Full log here.

@iravasen
Contributor

iravasen commented Aug 7, 2024

Thanks a lot @nicolovalle, sounds good to me! I do not expect a dramatic increase, right? Otherwise you should count the TFs arriving at the aggregator (that you are doing with mTFCounter) and then process a TF when mTFCounter % 350 == 0. Is there any disadvantage in doing this in your workflow? Thanks

@nicolovalle
Contributor Author

nicolovalle commented Aug 7, 2024

> Thanks a lot @nicolovalle, sounds good to me! I do not expect a dramatic increase, right? Otherwise you should count the TFs arriving at the aggregator (that you are doing with mTFCounter) and then process a TF when mTFCounter % 350 == 0. Is there any disadvantage in doing this in your workflow? Thanks

Hi @iravasen, thanks for the comment. Basing the criterion on the number of arrived TFs rather than on the orbit number does have a disadvantage: it increases the likelihood of larger gaps between map snapshots (gaps measured in wall-clock seconds or in orbit number). The logs show that TFs arrive at the aggregator in random order. As a result, the condition mTFCounter % 350 == 0 ensures that 1/350 of the run is processed on average, but which TFs get sampled remains entirely random.
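
A small sketch may help show why a counter-based criterion does not control the spacing of the sampled TFs. The function below is hypothetical (its name, signature and the choice to test the counter before incrementing it are assumptions); `counter` only plays the role of mTFCounter.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Counter-based criterion: accept every period-th *arrival*,
// regardless of which TF index that arrival carries.
// Hypothetical illustration, not the O2 implementation.
std::vector<std::int64_t> sampleByArrivalCount(
    const std::vector<std::int64_t>& arrivals, std::int64_t period) {
  assert(period > 0);
  std::vector<std::int64_t> accepted;
  std::int64_t counter = 0;  // stands in for mTFCounter
  for (const std::int64_t tf : arrivals) {
    if (counter % period == 0) {
      accepted.push_back(tf);  // the accepted TF index depends only on arrival order
    }
    ++counter;
  }
  return accepted;
}
```

For the made-up arrival order {5, 0, 3, 1, 4, 2, 9, 6} and a period of 4, the accepted TF indices are 5 and 4: the fraction is the requested 1 in 4, but the two samples are adjacent in TF index rather than 4 apart.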

Actually, your comment prompted me to look at the numbers more closely... From the online logs, I see that if the n-th TF arriving at calib0 contains orbit X, then the (n+1)-th TF will contain an orbit approximately in the range [X-100k, X+100k]. That spread is very large, so my current proposal is ineffective too!
I will mark the PR as WIP while I think of a better solution.

@nicolovalle nicolovalle changed the title from "ITS - (and MFT), better TF sampling in the deadmap builder" to "[WIP] ITS - (and MFT), better TF sampling in the deadmap builder" Aug 7, 2024
@iravasen
Contributor

iravasen commented Aug 8, 2024

Thanks a lot for the clarification @nicolovalle! Ok yes, once I'm back from vacation let me know if you need help with some checks or thinking :-)

@nicolovalle
Contributor Author

The improved sampling is now in place.

A new configurable (tolerance) is introduced. The sampling condition returns true if the TF index (calculated as orbit/TF_length) falls within any interval [k * tf_sampling, k * tf_sampling + tolerance) for some integer k, provided no other TFs have been found in the same interval.
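
The condition described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: the struct and member names are hypothetical, the set of already-used intervals is one possible container choice, and orbits are assumed to be non-negative.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Hypothetical sketch of the window-based sampling condition.
struct WindowSampler {
  std::int64_t tfLength;    // orbits per TF (assumed name for TF_length)
  std::int64_t tfSampling;  // sampling period, in TFs
  std::int64_t tolerance;   // window width in TFs, assumed 0 < tolerance <= tfSampling
  std::unordered_set<std::int64_t> usedWindows;  // values of k already sampled

  // Returns true if the TF containing this orbit should be processed.
  bool accept(std::int64_t orbit) {
    assert(orbit >= 0 && tfLength > 0 && tfSampling > 0 && tolerance > 0);
    const std::int64_t tfIndex = orbit / tfLength;
    const std::int64_t k = tfIndex / tfSampling;           // candidate interval
    const std::int64_t offset = tfIndex - k * tfSampling;  // position inside the period
    if (offset >= tolerance) {
      return false;  // outside [k * tfSampling, k * tfSampling + tolerance)
    }
    // insert() reports whether k was new, i.e. no TF was taken from this interval yet
    return usedWindows.insert(k).second;
  }
};
```

As a usage example with illustrative values only, `WindowSampler s{32, 350, 20};` accepts orbit 0, rejects orbit `32 * 5` (same interval, already used) and rejects orbit `32 * 100` (outside any window). Because the decision depends on the TF index and not on arrival order, out-of-order TFs can still fill an interval as long as one of them lands inside its tolerance window.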

This requires additional operations with containers. The code has been tested: with the default values (suitable for online running), the processing time for the accepted TFs (< 5% of the total) remains below 1 ms, comfortably within acceptable limits for online.

(I would merge this if you agree, @iravasen; no need to reply soon :) )

@nicolovalle nicolovalle changed the title from "[WIP] ITS - (and MFT), better TF sampling in the deadmap builder" to "ITS - (and MFT), better TF sampling in the deadmap builder" Aug 10, 2024
@iravasen
Contributor

Thanks a lot @nicolovalle, sounds good to me! I approved it. I'll let @mconcas / @shahor02 merge it.

@mconcas mconcas merged commit 84ee8e1 into AliceO2Group:dev Aug 14, 2024
@nicolovalle nicolovalle deleted the nv-deadmapsampling branch August 30, 2024 08:58