Refactor MC part of AOD producers #12039
|
This PR did not have any update in the last 30 days. Is it still needed? Unless further action is taken, it will be closed in 5 days. |
|
Error while checking build/O2/fullCI for ba227be at 2023-11-11 11:18: Full log here. |
|
Error while checking build/O2/fullCI for a2b651b at 2023-11-11 12:04: Full log here. |
|
Post mortem on checks
Yours, |
|
Error while checking build/O2/fullCI for 80f16d5 at 2023-11-11 23:49: Full log here. |
|
Now also Also, please remove Yours, |
|
Error while checking build/O2/fullCI for ff54b7e at 2023-11-13 14:12: Full log here. |
|
Hi all, I made an MR against ALICE's Yours, |
|
Error while checking build/O2/fullCI for ecca7a1 at 2023-11-14 13:55: Full log here. |
|
Hi all, Any news on this? Yours, |
|
Please advise on next steps to get this merged. Thanks. |
|
Sorry for the delay. We are still quite busy with commissioning work for upcoming PbPb productions and currently fixing known bugs in AOD conversion (which might interfere here). I am afraid, given the amount of changes, a proper review might still take some time but it's definitely on my list. You could do me a favour and clean up your PR. Currently, there are 88 commits and it would be good to rebase on dev and just force-push one single (or few) commit with the proposed changes. |
|
Hi Sandro,
The changes are not that big, and are isolated to the MC part only. That's why I did a thorough test on
Could you give some rather detailed instructions on how to do that? I'm a little worried that things will break. Thanks. Yours, |
Force-pushed from 0c45ab4 to 4a2d202.
|
OK, I figured it out myself: For future reference (ignore for the discussion of the MR):
Note |
Force-pushed from 4a2d202 to dc80f6a.
|
Error while checking build/O2/fullCI for dc80f6a at 2023-12-10 08:40: Full log here. |
|
@cholmcc : I now find some time to take a look into this. But, unfortunately, some merge conflicts need to be fixed first of all. Could you take care of this (and force-push a single commit)? Thx. |
|
Hi Sandro, Sorry for the late reply - been a bit busy the past week or so. I will try to get around to fixing the merge conflicts. Not sure if I can make it before Christmas though. Yours, Christian |
|
Error while checking build/O2/fullCI for b81ed6f at 2024-01-11 23:46: Full log here. |
sawenzel
left a comment
Hi. I started studying the code in this PR. At the moment, more work is needed because
the PR does not actually compile.
Thereafter, please also perform a validation step by running a simple O2DPG MC workflow:
(alibuild build O2sim; alienv enter O2sim; .... then run a script like)
#!/bin/bash
#
# An example workflow MC->RECO->AOD for a simple pp production
# excluding ZDC
NWORKERS=${NWORKERS:-8}
SIMENGINE=${SIMENGINE:-TGeant4}
# create workflow
${O2DPG_ROOT}/MC/bin/o2dpg_sim_workflow.py -eCM 14000 -col pp -gen pythia8 -proc "cdiff" -tf 4 -ns 200 -e ${SIMENGINE} -j ${NWORKERS} \
-run 303000 -seed 624 -interactionRate 50000
# run workflow
${O2DPG_ROOT}/MC/bin/o2_dpg_workflow_runner.py -f workflow.json -tt aod
I would say we can go ahead from my side once this succeeds and the AOD content is compatible with now.
|
OK, yet another error - this time in call to |
Are you using obsolete O2DPG? The requirement of TPC in the sources is recent and is accounted in the
|
Yay! It seemed to work. Looking in the various trees I see data as expected, including the new HepMC auxiliary tables (the HeavyIon table is clearly empty because it was a pp simulation). |
|
I did (code from Rivet for O2 - in particular from followed by or alternatively to get the following plot I think the MR has lived up to its promise 😄 |
Yes, it was a little old - but not terribly.
Nonetheless true - that code suffers from Fortrantitis in a bad way 😄 Typically when I see Python code, even by novice programmers, it is reasonably well structured and clear, which is why this code surprised me. I guess it must be a pain to maintain. I understand how this comes about - I'll do a quick fix here, then another fix here, and oh, it's more important that it worked yesterday than that it will work for eternity, and I will come back and tidy it up - and before you know it, you have a mess on your hands. |
|
Error while checking build/O2/fullCI for 64ebd29 at 2024-01-18 06:00: Full log here. |
|
Error while checking build/O2/fullCI for 979e20a at 2024-01-18 07:57: Full log here. |
|
@cholmcc : Do you believe that insulting people in commit messages and PR discussions is helpful? Part of your challenges come from the fact that you are not using the collaboration rules and the official workflow of using alidist for software distribution, which defines a tested and curated software portfolio. In there, O2DPG is fully up to date and guaranteed to run the example workflow that I gave to you (because we test it nightly). I suggested running the above workflow, because when I took time to review your developments: |
I don't think I insulted anyone - if someone felt that way, well I'd say they have rather thin skin. I certainly didn't point anyone out in particular, only that the code is truly horrible. What I do think is helpful is to point out that some code is really poorly written and it would make sense, not least for maintenance purposes, to fix it.
Should that be considered a threat? Because that is clearly not OK. Look, I did not do this work for my sake - I did it because I was tasked by the collaboration to do a job - namely integrate Rivet into O2. This MR is just one piece of that puzzle. So you are not doing me, but the collaboration, wrong if you choose to stop doing your job.
You mean like issuing veiled threats?
I really don't want to solve the problems of the DPG scripts - those scripts are so obtuse that it would take a considerable effort to do so - in fact, I think it would most likely be more efficient to start over again on those scripts. Trust me, I've pinged many people on Mattermost and elsewhere about how to do a full MC->AOD test. It wasn't until you kindly gave the instruction that I knew how to do it. When it then turned out that that didn't really work because of problems in the DPG scripts - well - then I got even more frustrated than before.
Which rules? There's no hard and fast rule that you need to use
And no, the problems do not stem from me not using
I'm aware that not using As for "tested" - it seems the tests are basically - does it run on LXPLUS, OK, we're good. That's not how one does testing of software. Testing software involves meticulous attention to corner cases and presumptions. Try to deploy
OK, so there seems to be something going on in your test environment that does not comport with a vanilla environment. F.ex., without the option BTW, there's no guarantee - even with
This was due to some other change somewhere else in O2. That the
It would have been helpful if you could have reported the error you saw when you saw it. I have no way of gauging what the problem might have been for you without that input (or rather output 😄) - could be because of missing Anyways, I think we're at a point where this PR is fairly well tested, and I think it will make sense to merge it at this point. Then, I will get out of your hair so you can get around to something that is more fun for you. Thanks for all your help. Yours, Christian |
|
@cholmcc I am afraid you misunderstood what was the job description of Sandro. He is helping you not because this is his task, but because we want to move the things on. So this kind of discussion is not welcome in the PRs. If you have any question on this, I am happy to chat with you privately. At the end we have been working together for about 20 years :) |
|
@cholmcc I propose to run the original code, then run again the AOD creator with these modifications on the same input and see what is the difference between the original AODs and the newly created ones. If we are happy with the result, we can merge. |
That's a non-trivial thing to do, as you know. It means switching to the |
|
Error while checking build/O2/fullCI for 9a2cea6 at 2024-01-18 18:15: Full log here. |
The MC part of the AOD producer workflow
`o2::aodproducer::AODProducerWorkflowDPL` and
`o2::aodmcproducer::AODMcProducerWorkflowDPL` is refactored to use
functions from namespace `o2::aodmchelpers`. The helpers are
- `updateMCCollisions` which takes in the `MCEventHeader` and writes to
the `o2::aod::McCollisions` table.
- `updateHepMCXSection` which takes the `MCEventHeader` and writes to
the `o2::aod::HepMCXSections` table. This uses the predefined key
constants as defined in `o2::dataformats::MCInfoKeys`
- `updateHepMCPdfInfo` similar to `updateHepMCXSection` above
- `updateHepMCHeavyIon` similar to `updateHepMCXSection` above
- `updateParticle` uses information from an `o2::MCTrack` and writes it
to the `o2::aod::McParticles` table
- `updateParticles` loops over `o2::MCTrack` objects and calls
`updateParticle`.
These functions, in particular `updateHepMC...`, use the functions
- `hasKeys` which checks if the `MCEventHeader` has any or all of the
keys queried.
- `getEventInfo` gets auxiliary information from the `MCEventHeader` or
a default value.
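
As a rough illustration of what the two lookup helpers do, here is a minimal sketch. It is not the O2 implementation: the `EventInfo` map is a hypothetical stand-in for the key/value store of the `MCEventHeader`, and the signatures (including the `anyNotAll` flag and the `fallback` parameter) are assumptions made for the example.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <string>

// Hypothetical stand-in for the auxiliary key/value information held by
// an MCEventHeader; the real O2 class is not modelled here.
using EventInfo = std::map<std::string, double>;

// Return true if the header holds any (anyNotAll == true) or all
// (anyNotAll == false) of the queried keys.
inline bool hasKeys(const EventInfo& header,
                    std::initializer_list<std::string> keys,
                    bool anyNotAll)
{
  for (const auto& key : keys) {
    const bool present = header.count(key) > 0;
    if (anyNotAll && present) {
      return true; // one hit is enough for "any"
    }
    if (!anyNotAll && !present) {
      return false; // one miss is enough to fail "all"
    }
  }
  // "any": nothing was found; "all": nothing was missing.
  return !anyNotAll;
}

// Return the value stored under `key`, or `fallback` if it is absent.
inline double getEventInfo(const EventInfo& header,
                           const std::string& key,
                           double fallback)
{
  const auto it = header.find(key);
  return it == header.end() ? fallback : it->second;
}
```

With a header that only carries a cross-section key, the "any" query succeeds, the "all" query fails, and a lookup of a missing key yields the fallback.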
For the `o2::aod::HepMC...` tables: Depending on the policy parameter
passed to the `updateHepMC...` functions, these tables may or may not be
updated.
- If the policy is `HepMCUpdate::never` then the tables are never
  updated.
- If the policy is `HepMCUpdate::always` then the tables are _always_
  updated, possibly with default values.
- If the policy is `HepMCUpdate::anyKey` (default) or `HepMCUpdate::allKeys`, then
  the decision of what to do is taken on the first event seen.
  - If the policy is `HepMCUpdate::anyKey`, then if _any_ of the needed
    keys are present, then updating will be enabled for this and _all_
    subsequent events.
  - If the policy is `HepMCUpdate::allKeys`, then if _all_ of the needed
    keys are present, then updating will be enabled for this and _all_
    subsequent events.

  Note that the availability of keys is _not_ checked after the first
  event. That means, if the criteria are not met on the first event, then
  the tables will _never_ be updated (as if the policy was
  `HepMCUpdate::never`). On the other hand, if the criteria were met, then
  the tables _will_ be updated on all events (as if the policy was
  `HepMCUpdate::always`).
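
The first-event decision described above can be sketched as follows. This is only an illustration of the latching behaviour: the enumerator names follow the text, but the function `updateNeeded` and its boolean arguments are hypothetical and are not taken from the O2 sources.

```cpp
#include <cassert>

// Enumerator names mirror the policies named in the text; everything
// else in this sketch is an assumption.
enum class HepMCUpdate { never, always, anyKey, allKeys };

// Decide whether a HepMC aux table should be filled for this event.
// On the first call, `anyKey` and `allKeys` are latched to `always` or
// `never`, so later events are not inspected for keys again.
inline bool updateNeeded(HepMCUpdate& policy, bool hasAnyKey, bool hasAllKeys)
{
  switch (policy) {
    case HepMCUpdate::never:
    case HepMCUpdate::always:
      break; // already decided, nothing to latch
    case HepMCUpdate::anyKey:
      policy = hasAnyKey ? HepMCUpdate::always : HepMCUpdate::never;
      break;
    case HepMCUpdate::allKeys:
      policy = hasAllKeys ? HepMCUpdate::always : HepMCUpdate::never;
      break;
  }
  return policy == HepMCUpdate::always;
}
```

Passing the policy by reference is what makes the first event decisive: once it has been rewritten to `always` or `never`, the key flags of later events no longer matter.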
Note the slightly tricky template `TableCursor` which allows us to
define a type that corresponds to a table cursor (which is really a
lambda). This template could be moved to `AnalysisDataFormats.h` or
the like.
The applications `o2-aod-producer-workflow` and
`o2-aod-mc-producer-workflow` have been updated (via their respective
implementation classes) to use these tools, thus unifying how the MC
information is propagated to AODs.
The utility `o2-sim-mctracks-to-aod` (`run/o2sim_mctracks_to_aod.cxx`)
has _also_ been updated to use these common tools.
Both `o2-aod-mc-producer-workflow` and `o2-sim-mctracks-to-aod` have been
tested extensively. `o2-aod-producer-workflow` has _not_ been tested
since it is not clear to me how to set up such a test with limited
resources. However, since the changes _only_ affect the MC part, and
that part is now common between the two `o2-aod-{,mc-}producer-workflow`
applications, I believe there is no reason to think that it wouldn't
work.
**Some important (late) bug fixes**
Since the commits are squashed into one, I give some more details here
on some later commits. For future reference.
**HepMC aux tables**
I found two problems with the HepMC aux tables after looking at the
data a little deeper
- I used the BC identifier as the collision index field in the HepMC
  aux tables. This happened because I took my cues from the MC
  producer. The MC producer does not generate a BC ID - even though
  it sort of looked like it - the BC ID in the MC producer is
  essentially an index. I've fixed this by passing the collision
  index (called `collisionID` most places) to the relevant member
  functions and the table updates.
- The producers were set up so that HepMC aux tables would _only_ be
  written if the input had the corresponding data. If a model gave
  the Cross-section, but neither PDF nor Heavy-Ion information, then only
  the cross-section table would be populated. Pythia, for example,
  gives the cross-section and PDF information, but no Heavy-Ion
  information.

  All three tables would be produced, but some may not have any
  entries.

  Later on, when we want to process the events (by Rivet f.ex.), we
  like to access as much HepMC aux information as possible so we may
  build the most complete HepMC event possible. Thus, we would like
  to read in
  - MC Collisions
  - MC particles
  - 3 HepMC aux

  However, if one or more of the AOD trees of the 3 HepMC aux tables
  has no entries, it will cause the producer to crash
```
libO2FrameworkAnalysisSupport.so: std::function<void (o2::framework::TreeToTable&)>::operator()(o2::framework::TreeToTable&) const
libO2FrameworkAnalysisSupport.so: o2::framework::LifetimeHolder<o2::framework::TreeToTable>::release()
libO2FrameworkAnalysisSupport.so: o2::framework::LifetimeHolder<o2::framework::TreeToTable>::~LifetimeHolder()
libO2FrameworkAnalysisSupport.so: o2::framework::DataInputDescriptor::readTree(o2::framework::DataAllocator&, o2::header::DataHeader, int, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, unsigned long&, unsigned long&)
```
I cannot quite figure out why that happens, but I guess it's a
problem triggered by `TreeToTable` or the callback on that object.
The (temporary) solution to the above problem is to set the update
policy on the HepMC aux tables to be `always`. That means we will
always write the tables and give them entries. The real solution to
the problem will be to fix `TreeToTable` or whatever is causing the
above `SIGSEGV`.
**MC track labels**
To get MC labels correct, the member
`AODProducerWorkflowSpec::mToKeep` needs to be updated with the actual
index positions in the output table. This was easily fixed by passing
in the relevant mapping by reference instead of by const reference.
Again, this is a case where I did not see this problem initially
because I was dealing solely with MC data. Thanks to Sandro for
providing the instructions for how to run a full MC->AOD chain.
Also, to reproduce results from `dev`, I had to implement a (faulty)
track selection into `AODMcProducerHelpers::updateParticles`. It is
important to keep in mind that
`AODProducerWorkflowSpec::mToKeep[source][event]` is a mapping from
particle number to storage flag. A zero or negative storage flag
means that the track is _not_ stored.
In `dev`, the algorithm is

    if particle from EG: store it
    else if particle is physical primary: store it
    else if particle is physics: store it

    if particle found in mapping:
        mark mothers and daughters for storage

The important part is the last thing: `if particle found in mapping`.
The particle _may_ be stored in the mapping with a negative flag -
meaning it should not be stored - which means that we may end up
triggering storage of mothers and daughters of a particle that isn't
itself stored. In my test of 100 pp events with roughly 100k
particles stored in total, this happened 25 times.

The correct algorithm is

    if particle not previously marked for storage and
       particle is not from EG and
       particle is not physical primary and
       particle is not physics:
        do not store particle
        go on to next particle

    store particle and its mothers and daughters

In this way, mothers and daughters will _only_ be marked for storage
if the particle itself is in fact stored.

Currently, the selection implements the `dev` algorithm only so that
we can test that `dev` and this MR give the same results. Once this
MR is accepted, the selection should be upgraded to the correct algorithm.
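
To make the difference between the two selections concrete, here is a small self-contained sketch. It is not the code in `AODMcProducerHelpers::updateParticles`: the `Particle` record, the `StoreMap` type and both function names are hypothetical, the loop is a single pass, and it assumes (as the remark about negative flags implies) that only a positive flag marks a particle for storage.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical, minimal particle record used only for this sketch.
struct Particle {
  bool fromEG;                // produced by the event generator
  bool physicalPrimary;       // physical primary
  bool physics;               // produced by a physics process
  std::vector<int> relatives; // indices of mothers and daughters
};

// Particle number -> storage flag. Assumption: flag > 0 means "store",
// zero or negative means "do not store".
using StoreMap = std::map<int, int>;

inline bool selected(const Particle& p)
{
  return p.fromEG || p.physicalPrimary || p.physics;
}

inline void markRelatives(const Particle& p, StoreMap& store)
{
  for (int r : p.relatives) {
    store[r] = 1;
  }
}

// Selection as described for `dev`: relatives are marked whenever the
// particle is present in the mapping, whatever its flag is.
inline void selectDev(const std::vector<Particle>& ps, StoreMap& store)
{
  for (std::size_t n = 0; n < ps.size(); ++n) {
    const int i = static_cast<int>(n);
    if (selected(ps[n])) {
      store[i] = 1;
    }
    if (store.find(i) != store.end()) {
      markRelatives(ps[n], store);
    }
  }
}

// Proposed selection: relatives are marked only if the particle itself
// ends up being stored.
inline void selectFixed(const std::vector<Particle>& ps, StoreMap& store)
{
  for (std::size_t n = 0; n < ps.size(); ++n) {
    const int i = static_cast<int>(n);
    const auto it = store.find(i);
    const bool marked = it != store.end() && it->second > 0;
    if (!marked && !selected(ps[n])) {
      continue; // do not store, go on to the next particle
    }
    store[i] = 1;
    markRelatives(ps[n], store);
  }
}
```

With a particle that sits in the mapping with flag `-1` and has one daughter, `selectDev` marks the daughter for storage even though the particle itself is dropped, whereas `selectFixed` leaves the daughter untouched.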
Force-pushed from 9a2cea6 to f87bad4.
|
Hi all, I pushed a new single commit to this MR. I found a few bugs after comparing to output from Here, I will show you the results of my comparisons between To be transparent about what I did:
What I see is that all comes out the same between I spent quite a bit of time on the I also did a closure test of the event before and after AOD generation. That is, I took the kinematics tree from one timeframe of the simulation and generated a Then, from both this and then illustrated a few events with from my O2Physics Rivet branch. I attach them here in a zip archive. Note that the Of course, it would be useful to also illustrate the event from the output of the EG. However, that data isn't explicitly stored by the DPG script because Pythia is run as an internal thing in the simulations. However, I've done that comparison before (EG and post simulation) and that checked out too. So the chain from EG->SIM->DIGIT->RECO->AOD seems to preserve the event structure - also with this MR 😄 Yours, Christian |
|
@cholmcc : Thanks for doing this validation. As you say, the important part is the agreement of the MC tables. Track tables may indeed vary a bit since some digitisation algos and TPC reconstruction has random number usage with multi-threading that is not completely reproducible between runs. Your script may be useful for doing this validation in more automatic ways in the future. To me all looks good now and if there are no further objections, the PR will be merged end of the day. |
|
Hi Sandro,
It was not (edit) as much work as I first thought it to be 😄
OK, so I guess it's likely that randomness is part of the reason why the
By all means take it and put it somewhere accessible.
That's great. Looking forward to that. Could you have a peek at the issue I raised in the commit log - namely the particle selection, which I believe is currently broken in Thanks again. Yours, Christian |
|
Error while checking build/O2/fullCI for f87bad4 at 2024-01-19 14:39: Full log here. |
|
Error while checking build/O2/fullCI for 636ef7a at 2024-01-19 20:02: Full log here. |
|
Thank you for merging this. We are now one step closer to getting Rivet fully integrated into Thank you. Yours, Christian |
* Refactor MC part of AOD producers
The MC part of the AOD producer workflow
`o2::aodproducer::AODProducerWorkflowDPL` and
`o2::aodmcproducer::AODMcProducerWorkflowDPL` is refactored to use
functions from namespace `o2::aodmchelpers`. The helpers are
- `updateMCCollisions` which takes in the `MCEventHeader` and writes to
the `o2::aod::McCollisions` table.
- `updateHepMCXSection` which takes the `MCEventHeader` and writes to
the `o2::aodHepMCSections` table. This uses the predefined key
constants as defined in `o2::dataformats::MCInfoKeys`
- `updateHepMCPdfInfo` similar to `updateHepMCXSection` above
- `updateHepMCHeavyIon` similar to `updateHepMCXSection` above
- `updateParticle` uses information from an `o2::MCTrack` and writes it
to the `o2::aod::McParticles` table
- `updateParticles` loops over `o2::MCTrack` objects and calls
`updateParticle`.
These functions, in particular `updateHepMC...` uses the functions
- `hasKeys` which checks if the `MCEventHeader` has any or all of the
keys queried.
- `getEventInfo` gets auxiliary information from the `MCEventHeader` or
a default value.
For the `o2::aod::HepMC...` tables: Depending on the policy parameter
passed to the `updateHepMC...` functons, these tables may or may not be
updated.
- If the policy is `HepMCUpdate::never` then the tables are never
updated.
- If the policy is `HepMCUpdate::always` then the tables are _always_
updated, possibly with default values.
- If the policy is `HepMCUpdate::anyKey` (default) or `HepMCUpdate::allKeys`, then
the decision of what to do is taken on the first event seen.
- If the policy is `HepMCUpdate::anyKey`, then if _any_ of the needed
keys are present, then updating will be enabled for this and _all_
subsequent events.
- If the policy is `HepMCUpdate::allKeys`, then if _all_ of the needed
keys are present, then updating will be enabled for this and _all_
subsequent events.
Note that the availability of keys is _not_ checked after the first
event.
That means, if the criteria isn't met on the first event, then
the tables will _never_ be update (as if the policy was
`HepMCUpdate::never`).
On the other hand, if the criteria was met, than the tables _will_ be
update an all events (as if the policy was `HepMCUpdate::always`).
Note the slightly tricky template `TableCursor` which allows us to
define a type that correponds to a table curser (which is really a
lambda). This template could be moved to `AnalysisDataFormats.h` or
the like.
The applications `o2-aod-producer-workflow` and
`o2-aod-mc-producer-workflow` have been updated (via their respective
implementation classes) to use these tools, thus unifying how the MC
information is propagated to AODs.
The utility `o2-sim-mctracks-to-aod` (`run/o2sim_mctracks_to_aod.cxx`)
has _also_ been updated to use these common tools.
Both `o2-aod-mc-producer-workflow` and `o2-sim-mctracks-to-aod` has been
tested extensively. `o2-aod-producer-workflow` has _not_ been tested
since it is not clear to me how to set-up such a test with limited
resources. However, since the changes _only_ effect the MC part, and
that part is now common between the two `o2-aod-{,mc-}producer-workflow`
applications, I believe there is no reason to think that it wouldn't
work.
**Some important (late) bug fixes**
Since the commits are squashed into one, I give some more details here
on some later commits. For future reference.
**HepMC aux tables**
I found two problems with the HepMC aux tables after looking at the
data a little deeper
- I used the BC identifier as the collision index field in the HepMC
aux tables. This happened because I took my clues from the MC
producer. The MC producer does not generate a BC ID - even though
it sort of looked like it - the BC ID in the MC producer is
essentially an index. I've fixed this by passing the collision
index (called `collisionID` most places) to the relevant member
functions and the table updates.
- The producers were set up so that HepMC aux tables would _only_ be
written if the input had the corresponding data. If a model gave
the Cross-section, but not PDF nor Heavy-Ion information, then only
the cross-section table would be populated. Pythia, for example,
gives the cross-section and PDF information, but no Heavy-Ion
information.
All three tables would be produced, but some may not have any
entries.
Later on, when we want to process the events (by Rivet f.ex.), we
like to access as much HepMC aux information as possible so we may
build the most complete HepMC event possible. Thus, we would like
to read in
- MC Collisions
- MC particles
- 3 HepMC aux
However, if one or more of the AOD trees of the 3 HepMC aux tables
had no entries, it will cause the producer to crash
```
libO2FrameworkAnalysisSupport.so: std::function<void (o2::framework::TreeToTable&)>::operator()(o2::framework::TreeToTable&) const
libO2FrameworkAnalysisSupport.so: o2::framework::LifetimeHolder<o2::framework::TreeToTable>::release()
libO2FrameworkAnalysisSupport.so: o2::framework::LifetimeHolder<o2::framework::TreeToTable>::~LifetimeHolder()
libO2FrameworkAnalysisSupport.so: o2::framework::DataInputDescriptor::readTree(o2::framework::DataAllocator&, o2::header::DataHeader, int, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, unsigned long&, unsigned long&)
```
I cannot quite figure out why that happens, but I guess it's a
problem triggered by `TreeToTable` or the call back on that object.
The (temporary) solution to the above problem is to set the update
policy on the HepMC aux tables to be `always`. That means we will
always write the tables and give them entries. The real solution to
the problem will be to fix `TreeToTable` or what ever is causing the
above `SIGSEGV`.
**MC track lables**
To get MC labels correct, the member
`AODProducerWorkflowSpec::mToKeep` needs to be updated with the actual
index positions in the output table. This was easily fixed by passing
in the relevant mapping by reference instead of by const reference.
Again, this is a case where I did not see this problem initially
because I was dealing solely with MC data. Thanks to Sandro for
providing the instructions for how to run a full MC->AOD chain.
Also, to reproduce results from `dev`, I had to implement a (faulty)
track selection into `AODMcProducerHelpers::updateParticles`. It is
important to keep in mind that
`AODProducerWorkflowSpec::mToKeep[source][event]` is a mapping from
particle number to storage flag. A zero or negative storage flag
means that the track is stored.
In `dev`, the algorithm is
if particle from EG: store it
else if particle is physical primary: store it
else if particle is physics: store it
if particle found in mapping:
mark mothers and daughters for storage
The important part is the last thing: `if particle found in mapping`.
The particle _may_ be stored in the mapping with a negative flag -
meaning it should not be stored - which means that we may end up
triggering storage of mothers and daughters of a particle that isn't
it self stored. In my test of 100 pp events with roughly 100k
particles stored in total, this happend 25 times.
The correct algorithm is
if particle not previously marked for storage and
particle is not from EG and
particle is not physical primary and
particle is not physics:
do not store particle
go on to next particle
store particle and its mothers and daughters
In this way, mothers and daughters will _only_ be marked for storage
if the particle it self is in fact stored.
Currently, the selection implements the `dev` algorithm only so that
we can test that `dev` and this MR gives the same results. Once this
MR is accepted, the select upgraded to correct algorithm.
* Refactor MC part of AOD producers
The MC part of the AOD producer workflow
`o2::aodproducer::AODProducerWorkflowDPL` and
`o2::aodmcproducer::AODMcProducerWorkflowDPL` is refactored to use
functions from namespace `o2::aodmchelpers`. The helpers are
- `updateMCCollisions` which takes in the `MCEventHeader` and writes to
the `o2::aod::McCollisions` table.
- `updateHepMCXSection` which takes the `MCEventHeader` and writes to
the `o2::aodHepMCSections` table. This uses the predefined key
constants as defined in `o2::dataformats::MCInfoKeys`
- `updateHepMCPdfInfo` similar to `updateHepMCXSection` above
- `updateHepMCHeavyIon` similar to `updateHepMCXSection` above
- `updateParticle` uses information from an `o2::MCTrack` and writes it
to the `o2::aod::McParticles` table
- `updateParticles` loops over `o2::MCTrack` objects and calls
`updateParticle`.
These functions, in particular `updateHepMC...` uses the functions
- `hasKeys` which checks if the `MCEventHeader` has any or all of the
keys queried.
- `getEventInfo` gets auxiliary information from the `MCEventHeader` or
a default value.
For the `o2::aod::HepMC...` tables: Depending on the policy parameter
passed to the `updateHepMC...` functons, these tables may or may not be
updated.
- If the policy is `HepMCUpdate::never` then the tables are never
updated.
- If the policy is `HepMCUpdate::always` then the tables are _always_
updated, possibly with default values.
- If the policy is `HepMCUpdate::anyKey` (default) or `HepMCUpdate::allKeys`, then
the decision of what to do is taken on the first event seen.
- If the policy is `HepMCUpdate::anyKey`, then if _any_ of the needed
keys are present, then updating will be enabled for this and _all_
subsequent events.
- If the policy is `HepMCUpdate::allKeys`, then if _all_ of the needed
keys are present, then updating will be enabled for this and _all_
subsequent events.
Note that the availability of keys is _not_ checked after the first
event.
That means, if the criteria isn't met on the first event, then
the tables will _never_ be update (as if the policy was
`HepMCUpdate::never`).
On the other hand, if the criteria was met, than the tables _will_ be
update an all events (as if the policy was `HepMCUpdate::always`).
Note the slightly tricky template `TableCursor` which allows us to
define a type that correponds to a table curser (which is really a
lambda). This template could be moved to `AnalysisDataFormats.h` or
the like.
The applications `o2-aod-producer-workflow` and
`o2-aod-mc-producer-workflow` have been updated (via their respective
implementation classes) to use these tools, thus unifying how the MC
information is propagated to AODs.
The utility `o2-sim-mctracks-to-aod` (`run/o2sim_mctracks_to_aod.cxx`)
has _also_ been updated to use these common tools.
Both `o2-aod-mc-producer-workflow` and `o2-sim-mctracks-to-aod` has been
tested extensively. `o2-aod-producer-workflow` has _not_ been tested
since it is not clear to me how to set-up such a test with limited
resources. However, since the changes _only_ effect the MC part, and
that part is now common between the two `o2-aod-{,mc-}producer-workflow`
applications, I believe there is no reason to think that it wouldn't
work.
**Some important (late) bug fixes**
Since the commits are squashed into one, I give some more details here
on some later commits. For future reference.
**HepMC aux tables**
I found two problems with the HepMC aux tables after looking at the
data a little deeper
- I used the BC identifier as the collision index field in the HepMC
aux tables. This happened because I took my clues from the MC
producer. The MC producer does not generate a BC ID - even though
it sort of looked like it - the BC ID in the MC producer is
essentially an index. I've fixed this by passing the collision
index (called `collisionID` most places) to the relevant member
functions and the table updates.
- The producers were set up so that HepMC aux tables would _only_ be
written if the input had the corresponding data. If a model gave
the Cross-section, but not PDF nor Heavy-Ion information, then only
the cross-section table would be populated. Pythia, for example,
gives the cross-section and PDF information, but no Heavy-Ion
information.
All three tables would be produced, but some may not have any
entries.
Later on, when we want to process the events (by Rivet f.ex.), we
like to access as much HepMC aux information as possible so we may
build the most complete HepMC event possible. Thus, we would like
to read in
- MC Collisions
- MC particles
- 3 HepMC aux
However, if one or more of the AOD trees of the 3 HepMC aux tables
had no entries, it will cause the producer to crash
```
libO2FrameworkAnalysisSupport.so: std::function<void (o2::framework::TreeToTable&)>::operator()(o2::framework::TreeToTable&) const
libO2FrameworkAnalysisSupport.so: o2::framework::LifetimeHolder<o2::framework::TreeToTable>::release()
libO2FrameworkAnalysisSupport.so: o2::framework::LifetimeHolder<o2::framework::TreeToTable>::~LifetimeHolder()
libO2FrameworkAnalysisSupport.so: o2::framework::DataInputDescriptor::readTree(o2::framework::DataAllocator&, o2::header::DataHeader, int, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, unsigned long&, unsigned long&)
```
I cannot quite figure out why that happens, but I guess it's a
problem triggered by `TreeToTable` or the call back on that object.
The (temporary) solution to the above problem is to set the update
policy on the HepMC aux tables to be `always`. That means we will
always write the tables and give them entries. The real solution to
the problem will be to fix `TreeToTable` or what ever is causing the
above `SIGSEGV`.
**MC track lables**
To get MC labels correct, the member
`AODProducerWorkflowSpec::mToKeep` needs to be updated with the actual
index positions in the output table. This was easily fixed by passing
in the relevant mapping by reference instead of by const reference.
Again, this is a case where I did not see this problem initially
because I was dealing solely with MC data. Thanks to Sandro for
providing the instructions for how to run a full MC->AOD chain.
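
A minimal sketch of why the mapping has to be passed by non-const reference: the function that writes the rows is also the one that knows the final row index, so it must be able to write that index back. Everything here is hypothetical (`TrackToIndex`, `writeAndRemap`, a plain `std::vector<int>` standing in for the output table); it is not the actual `AODMcProducerHelpers` code, and dropping the non-stored entries is a simplification made for this sketch only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical alias: particle number -> storage flag on input,
// particle number -> row index in the output table on output.
using TrackToIndex = std::map<int, int>;

// Appends every particle with a positive flag to `table` (a stand-in for
// the output table) and replaces its flag by the row it ended up in.
// Entries with a zero or negative flag are removed (sketch-only choice).
// `toStore` is a non-const reference so the caller sees the new indices.
std::size_t writeAndRemap(std::vector<int>& table, TrackToIndex& toStore)
{
  std::size_t written = 0;
  for (auto it = toStore.begin(); it != toStore.end();) {
    if (it->second <= 0) {
      it = toStore.erase(it);
      continue;
    }
    it->second = static_cast<int>(table.size()); // row index of this particle
    table.push_back(it->first);
    ++written;
    ++it;
  }
  return written;
}
```

With a table that already holds two rows and a mapping `{3: 1, 5: -1, 7: 1}`, the call writes two rows and leaves the mapping as `{3: 2, 7: 3}`, which is what a later label lookup needs.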
Also, to reproduce results from `dev`, I had to implement a (faulty)
track selection in `AODMcProducerHelpers::updateParticles`. It is
important to keep in mind that
`AODProducerWorkflowSpec::mToKeep[source][event]` is a mapping from
particle number to storage flag. A zero or negative storage flag
means that the track is _not_ stored.
In `dev`, the algorithm is

    if particle from EG: store it
    else if particle is physical primary: store it
    else if particle is physics: store it
    if particle found in mapping:
        mark mothers and daughters for storage

The important part is the last step: `if particle found in mapping`.
The particle _may_ be stored in the mapping with a negative flag -
meaning it should not be stored - which means that we may end up
triggering storage of mothers and daughters of a particle that isn't
itself stored. In my test of 100 pp events with roughly 100k
particles stored in total, this happened 25 times.
The correct algorithm is

    if particle not previously marked for storage and
       particle is not from EG and
       particle is not physical primary and
       particle is not physics:
        do not store particle
        go on to next particle
    store particle and its mothers and daughters

In this way, mothers and daughters will _only_ be marked for storage
if the particle itself is in fact stored.
Currently, the selection implements the `dev` algorithm, only so that
we can test that `dev` and this MR give the same results. Once this
MR is accepted, the selection can be upgraded to the correct algorithm.
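
The difference between the two selections can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the code in `AODMcProducerHelpers::updateParticles`: `Particle`, `StoreMap`, `markDev`, `markFixed` and `countStored` are made-up names, the particles are visited in a single forward pass, and a positive flag is taken to mean "store" while a negative flag means "do not store" (as the remark on negative flags above implies).

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for o2::MCTrack with only what the selection needs.
struct Particle {
  bool fromEG = false;          // produced by the event generator
  bool physicalPrimary = false; // physical primary
  bool physics = false;         // created by a physics process
  std::vector<int> relatives;   // indices of mothers and daughters
};

// particle number -> storage flag (assumed: > 0 store, <= 0 do not store)
using StoreMap = std::unordered_map<int, int>;

bool selected(const Particle& p)
{
  return p.fromEG || p.physicalPrimary || p.physics;
}

// `dev` behaviour: relatives are marked whenever the particle has an
// entry in the mapping, whatever the value of its flag.
void markDev(const std::vector<Particle>& ps, StoreMap& toStore)
{
  for (int i = 0; i < static_cast<int>(ps.size()); ++i) {
    if (selected(ps[i])) {
      toStore[i] = 1;
    }
    if (toStore.find(i) == toStore.end()) {
      continue;
    }
    for (int r : ps[i].relatives) {
      assert(r >= 0 && r < static_cast<int>(ps.size()));
      toStore[r] = 1;
    }
  }
}

// Corrected behaviour: relatives are marked only if the particle itself
// ends up being stored.
void markFixed(const std::vector<Particle>& ps, StoreMap& toStore)
{
  for (int i = 0; i < static_cast<int>(ps.size()); ++i) {
    auto it = toStore.find(i);
    bool marked = it != toStore.end() && it->second > 0;
    if (!marked && !selected(ps[i])) {
      continue; // do not store, go on to next particle
    }
    toStore[i] = 1;
    for (int r : ps[i].relatives) {
      assert(r >= 0 && r < static_cast<int>(ps.size()));
      toStore[r] = 1;
    }
  }
}

// Number of particles flagged for storage.
int countStored(const StoreMap& toStore)
{
  int n = 0;
  for (const auto& kv : toStore) {
    if (kv.second > 0) {
      ++n;
    }
  }
  return n;
}
```

Take two particles that satisfy none of the criteria, where particle 0 lists particle 1 as a relative and particle 0 is pre-seeded in the mapping with flag `-1`. `markDev` then flags particle 1 for storage although particle 0 is not stored, while `markFixed` stores nothing.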

The MC part of the AOD producer workflows
`o2::aodproducer::AODProducerWorkflowDPL` and
`o2::aodmcproducer::AODMcProducerWorkflowDPL` is refactored to use
functions from namespace `o2::aodmchelpers`. The helpers are

- `updateMCCollisions`, which takes in the `MCEventHeader` and writes to
  the `o2::aod::McCollisions` table.
- `updateHepMCXSection`, which takes the `MCEventHeader` and writes to
  the `o2::aod::HepMCXSections` table. This uses the predefined key
  constants defined in `o2::dataformats::MCInfoKeys`.
- `updateHepMCPdfInfo`, similar to `updateHepMCXSection` above.
- `updateHepMCHeavyIon`, similar to `updateHepMCXSection` above.
- `updateParticle`, which uses information from an `o2::MCTrack` and writes it
  to the `o2::aod::McParticles` table.
- `updateParticles`, which loops over `o2::MCTrack` objects and calls
  `updateParticle`.

These functions, in particular `updateHepMC...`, use the functions

- `hasKeys`, which checks if the `MCEventHeader` has any or all of the
  keys queried.
- `getEventInfo`, which gets auxiliary information from the `MCEventHeader` or
  a default value.
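
To make the roles of the two lookup helpers concrete, here is a small sketch of the any/all key check and the lookup with a default. It assumes a plain `std::map` in place of the `MCEventHeader` info store, and the signatures and the key names used in the example (`"xsec"`, `"pdf"`) are invented for illustration; the real helpers and the constants in `o2::dataformats::MCInfoKeys` may look quite different.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <string>

// Hypothetical stand-in for the auxiliary info held by an MCEventHeader.
using Header = std::map<std::string, double>;

// Returns true if any (anyKey == true) or all (anyKey == false) of the
// queried keys are present in the header.
bool hasKeys(const Header& header,
             std::initializer_list<std::string> keys,
             bool anyKey)
{
  for (const auto& key : keys) {
    bool found = header.count(key) > 0;
    if (anyKey && found) {
      return true; // one hit is enough
    }
    if (!anyKey && !found) {
      return false; // one miss is enough
    }
  }
  // No hit in "any" mode means false; no miss in "all" mode means true.
  return !anyKey;
}

// Returns the stored value for `key`, or `fallback` if the key is absent.
double getEventInfo(const Header& header, const std::string& key, double fallback)
{
  auto it = header.find(key);
  return it == header.end() ? fallback : it->second;
}
```

For a header holding only `"xsec"`, `assert(hasKeys(h, {"xsec", "pdf"}, true))` holds, the same query with `anyKey == false` fails, and `getEventInfo(h, "pdf", -1.0)` returns the fallback.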

For the `o2::aod::HepMC...` tables: Depending on the policy parameter
passed to the `updateHepMC...` functions, these tables may or may not be
updated.

- If the policy is `HepMCUpdate::never`, then the tables are never
  updated.
- If the policy is `HepMCUpdate::always`, then the tables are always
  updated, possibly with default values.
- If the policy is `HepMCUpdate::anyKey` (default) or `HepMCUpdate::allKeys`, then
  the decision of what to do is taken on the first event seen.
  - If the policy is `HepMCUpdate::anyKey`, then if any of the needed
    keys are present, updating will be enabled for this and all
    subsequent events.
  - If the policy is `HepMCUpdate::allKeys`, then if all of the needed
    keys are present, updating will be enabled for this and all
    subsequent events.

  Note that the availability of keys is not checked after the first
  event.
  That means, if the criterion isn't met on the first event, then
  the tables will never be updated (as if the policy was
  `HepMCUpdate::never`).
  On the other hand, if the criterion was met, then the tables will be
  updated on all events (as if the policy was `HepMCUpdate::always`).

Note the slightly tricky template `TableCursor`, which allows us to
define a type that corresponds to a table cursor (which is really a
lambda). This template could be moved to `AnalysisDataFormats.h` or
the like.
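
The update policies described above amount to a decision that is taken once and then frozen. A minimal sketch, assuming `HepMCUpdate` is an `enum class` with the four values named above; the `UpdateGate` class and its `shouldUpdate` method are hypothetical and only illustrate the logic, they are not the interface of the `updateHepMC...` functions.

```cpp
#include <cassert>

// Assumed representation of the policy; only the value names come from
// the description above.
enum class HepMCUpdate { never, always, anyKey, allKeys };

// Hypothetical helper: resolves `anyKey` / `allKeys` on the first event
// and behaves like `always` or `never` from then on.
class UpdateGate
{
 public:
  explicit UpdateGate(HepMCUpdate policy = HepMCUpdate::anyKey) : mPolicy(policy) {}

  // anyPresent / allPresent: outcome of the key check for the current event.
  bool shouldUpdate(bool anyPresent, bool allPresent)
  {
    if (mPolicy == HepMCUpdate::anyKey || mPolicy == HepMCUpdate::allKeys) {
      // First event seen: take the decision and freeze it.
      bool met = (mPolicy == HepMCUpdate::anyKey) ? anyPresent : allPresent;
      mPolicy = met ? HepMCUpdate::always : HepMCUpdate::never;
    }
    return mPolicy == HepMCUpdate::always;
  }

 private:
  HepMCUpdate mPolicy;
};
```

With `UpdateGate gate(HepMCUpdate::allKeys)`, a first event that has only some of the keys makes `gate.shouldUpdate(true, false)` return `false`, and every later call returns `false` as well, even if a later event carries all keys.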

The applications `o2-aod-producer-workflow` and
`o2-aod-mc-producer-workflow` have been updated (via their respective
implementation classes) to use these tools, thus unifying how the MC
information is propagated to AODs.

The utility `o2-sim-mctracks-to-aod` (`run/o2sim_mctracks_to_aod.cxx`)
has also been updated to use these common tools.

Both `o2-aod-mc-producer-workflow` and `o2-sim-mctracks-to-aod` have been
tested extensively. `o2-aod-producer-workflow` has not been tested,
since it is not clear to me how to set up such a test with limited
resources (I tried `prodtest/full_system_test.sh`, but it failed in
unrelated places - I can't remember exactly where - due to some missing
digits or the like). However, since the changes only affect the MC part, and
that part is now common to the two `o2-aod-{,mc-}producer-workflow`
applications, I believe there is no reason to think that it wouldn't
work.

Note: Currently the helper functions, implemented in
`Detectors/AOD/src/AODMcProducerHelpers.cxx`, are compiled directly into
`o2-aod-producer-workflow`, `o2-aod-mc-producer-workflow`, and
`o2-sim-mctracks-to-aod`. Ideally the code in
`Detectors/AOD/src/AODMcProducerHelpers.cxx` should probably live in a
shared library linked to by these three applications. However, at the
moment there's no clear - at least to me - library to put that code into.

BTW, the list of commits looks really long (60+ commits), but that's only
because this MR builds upon two previous MRs that have been merged. The
number of files changed is small (12), though some of the changes are
relatively large. This is because large fractions have been refactored
out of the producers and into `Detectors/AOD/src/AODMcProducerHelpers.cxx`.

Yours,
Christian