Skip to content

Conversation

@cholmcc
Copy link
Contributor

@cholmcc cholmcc commented Oct 7, 2023

The MC part of the AOD producer workflow
o2::aodproducer::AODProducerWorkflowDPL and
o2::aodmcproducer::AODMcProducerWorkflowDPL is refactored to use
functions from namespace o2::aodmchelpers. The helpers are

  • updateMCCollisions which takes in the MCEventHeader and writes to
    the o2::aod::McCollisions table.
  • updateHepMCXSection which takes the MCEventHeader and writes to
    the o2::aodHepMCSections table. This uses the predefined key
    constants as defined in o2::dataformats::MCInfoKeys
  • updateHepMCPdfInfo similar to updateHepMCXSection above
  • updateHepMCHeavyIon similar to updateHepMCXSection above
  • updateParticle uses information from an o2::MCTrack and writes it
    to the o2::aod::McParticles table
  • updateParticles loops over o2::MCTrack objects and calls
    updateParticle.

These functions, in particular updateHepMC... uses the functions

  • hasKeys which checks if the MCEventHeader has any or all of the
    keys queried.
  • getEventInfo gets auxiliary information from the MCEventHeader or
    a default value.

For the o2::aod::HepMC... tables: Depending on the policy parameter
passed to the updateHepMC... functions, these tables may or may not be
updated.

  • If the policy is HepMCUpdate::never then the tables are never
    updated.

  • If the policy is HepMCUpdate::always then the tables are always
    updated, possibly with default values.

  • If the policy is HepMCUpdate::anyKey (default) or HepMCUpdate::allKeys, then
    the decision of what to do is taken on the first event seen.

    • If the policy is HepMCUpdate::anyKey, then if any of the needed
      keys are present, then updating will be enabled for this and all
      subsequent events.
    • If the policy is HepMCUpdate::allKeys, then if all of the needed
      keys are present, then updating will be enabled for this and all
      subsequent events.

    Note that the availability of keys is not checked after the first
    event.

    That means, if the criteria isn't met on the first event, then
    the tables will never be update (as if the policy was
    HepMCUpdate::never).

    On the other hand, if the criteria was met, than the tables will be
    update an all events (as if the policy was HepMCUpdate::always).

Note the slightly tricky template TableCursor which allows us to
define a type that corresponds to a table cursor (which is really a
lambda). This template could be moved to AnalysisDataFormats.h or
the like.

The applications o2-aod-producer-workflow and
o2-aod-mc-producer-workflow have been updated (via their respective
implementation classes) to use these tools, thus unifying how the MC
information is propagated to AODs.

The utility o2-sim-mctracks-to-aod (run/o2sim_mctracks_to_aod.cxx)
has also been updated to use these common tools.

Both o2-aod-mc-producer-workflow and o2-sim-mctracks-to-aod has been
tested extensively. o2-aod-producer-workflow has not been tested
since it is not clear to me how to set-up such a test with limited
resources (I tried the prodtest/full_system_test.sh but it failed in unrelated places - can't remember exactly where - due to some missing digits or the like). However, since the changes only effect the MC part, and
that part is now common between the two o2-aod-{,mc-}producer-workflow
applications, I believe there is no reason to think that it wouldn't
work.

Note Currently the helper functions, implemented in Detectors/AOD/src/AODMcProducerHelpers.cxx are directly compiled into o2-aod-producer-workflow, o2-aod-mc-producer-workflow, and o2-sim-mctracks-to-aod. Ideally the code in Detectors/AOD/src/AODMcProducerHelpers.cxx should probably live in a shared library linked to by these three applications. However, at the moment there's no clear - at least to me - library to put that code into.

BTW, the list of commits looks really long (60+ commits), but that's only because this MR builds upon two previous MRs that have been merged. The number of files changed is small (12), though some of the changes are relatively large. This is because large fractions have been refactored out and in to Detectors/AOD/src/AODMcProducerHelpers.cxx.

Yours,
Christian

@cholmcc cholmcc requested review from a team, benedikt-voelkel and sawenzel as code owners October 7, 2023 09:39
@cholmcc cholmcc marked this pull request as draft October 7, 2023 09:44
@github-actions
Copy link
Contributor

github-actions bot commented Nov 9, 2023

This PR did not have any update in the last 30 days. Is it still needed? Unless further action in will be closed in 5 days.

@github-actions github-actions bot added the stale label Nov 9, 2023
@cholmcc cholmcc marked this pull request as ready for review November 11, 2023 10:07
@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI for ba227be at 2023-11-11 11:18:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
CMake Error at cmake/O2AddExecutable.cmake:95 (add_executable):

Full log here.

@cholmcc cholmcc changed the title WIP: Refactor MC part of AOD producers Refactor MC part of AOD producers Nov 11, 2023
@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI for a2b651b at 2023-11-11 12:04:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/QualityControl-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerWorkflowSpec.cxx:124:24: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerWorkflowSpec.cxx:128:7: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:73:138: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:98:225: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:126:404: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:171:31: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:182:58: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:188:53: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:190:59: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:192:60: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:194:60: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:196:7: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:203:35: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:256:31: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:284:48: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:286:54: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:288:55: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:290:54: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:302:21: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:316:21: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:73:138: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:98:225: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:126:404: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:171:31: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:182:58: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:188:53: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:190:59: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:192:60: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12039-slc8_x86-64/0/Detectors/AOD/src/AODMcProducerHelpers.cxx:194:60: error: statement should be inside braces [readability-braces-around-statements]
[0 more errors; see full log]

Full log here.

@cholmcc
Copy link
Contributor Author

cholmcc commented Nov 11, 2023

Post mortem on checks

  • build/AliceO2/O2/o2/macOS: ALICE fork of Apache arrow does not protect against PREALLOCATE already defined in system headers. Not a problem from this MR but a problem in ALICE fork of arrow - False Negative
  • build/AliceO2/O2/o2/macOS-arm: Same as above - False Negative
  • build/O2/o2-cs8: A QC test multinode-test.json fails due to time-out. Seemingly unrelated to this MR - False Negative
  • build/O2/o2-dataflow-cs8: Three QC tests o2-qc-test-core, testTriggers, and testCheckWorkflow all fail due to time-outs, unrelated to this MR - False negative

Yours,
Christian

@alibuild
Copy link
Collaborator

alibuild commented Nov 11, 2023

Error while checking build/O2/fullCI for 80f16d5 at 2023-11-11 23:49:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/O2Physics-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'

Full log here.

@cholmcc
Copy link
Contributor Author

cholmcc commented Nov 12, 2023

Now also build/O2/o2 and build/O2/fullCI fails due to timeout in unrelated QC test.

Also, please remove stale label from this MR. Thanks.

Yours,
Christian

@pzhristov pzhristov removed the stale label Nov 13, 2023
@alibuild
Copy link
Collaborator

alibuild commented Nov 13, 2023

Error while checking build/O2/fullCI for ff54b7e at 2023-11-13 14:12:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/QualityControl-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/O2-full-system-test-latest/log
task timeout reached .. killing all processes


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
++ GRERR=1
++ [[ 1 == 0 ]]
++ mkdir -p /sw/INSTALLROOT/1cc3fbb077a51d40aa37a11dff411fc5aa671ba0/slc8_x86-64/o2checkcode/1.0-local97/etc/modulefiles
++ cat
--

Full log here.

@cholmcc
Copy link
Contributor Author

cholmcc commented Nov 14, 2023

Hi all,

I made an MR against ALICE's arrow fork to fix the problem with build/AliceO2/O2/o2/macOS{,-arm}. I also opened a bug report for upstream arrow.

Yours,
Christian

@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI for ecca7a1 at 2023-11-14 13:55:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/QualityControl-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/O2Physics-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/O2-full-system-test-latest/log
task timeout reached .. killing all processes
[47240:BadMapCalibSpec]: [12:25:41][ERROR] Insufficient statistics: 7 entries in lowE histo, do nothing
[47297:ctf-writer]: [12:25:43][ERROR] Some Lifetime::Timeframe data got dropped starting at 1
[47353:qc-task-TPC-Tracks]: [12:25:48][ERROR] o2::base::Propagator not properly initialized, MatLUT (0x0) and / or Field (0x0) missing, will not fill DCA histograms


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
++ GRERR=1
++ [[ 1 == 0 ]]
++ mkdir -p /sw/INSTALLROOT/e5e044806167a270985b4e667f8468e596f208c4/slc8_x86-64/o2checkcode/1.0-local76/etc/modulefiles
++ cat
--

Full log here.

@cholmcc
Copy link
Contributor Author

cholmcc commented Nov 28, 2023

Hi all,

Any news on this?

Yours,
Christian

@cholmcc
Copy link
Contributor Author

cholmcc commented Dec 4, 2023

Please advice on next steps to get this merged. Thanks.

@sawenzel
Copy link
Collaborator

sawenzel commented Dec 4, 2023

Sorry for the delay. We are still quite busy with commissioning work for upcoming PbPb productions and currently fixing known bugs in AOD conversion (which might interfere here). I am afraid, given the amount of changes, a proper review might still take some time but it's definitely on my list.

You could do me a favour and clean up your PR. Currently, there are 88 commits and it would be good to rebase on dev and just force-push one single (or few) commit with the proposed changes.

@cholmcc
Copy link
Contributor Author

cholmcc commented Dec 4, 2023

Hi Sandro,

I am afraid, given the amount of changes ...

The changes are not that big, and are isolated to the MC part only. That's why I did a thorough test on o2-aod-mc-publisher-workflow.

You could do me a favour and clean up your PR. Currently, there are 88 commits and it would be good to rebase on
dev and just force-push one single (or few) commit with the proposed changes.

Could you give some rather detailed instructions on how to do that? I'm a little worried that things will break. Thanks.

Yours,
Christian

@cholmcc cholmcc force-pushed the cholmcc_aod_producers_refactor branch from 0c45ab4 to 4a2d202 Compare December 5, 2023 13:09
@cholmcc
Copy link
Contributor Author

cholmcc commented Dec 5, 2023

OK, I figured it out myself:

git reset --soft $(git merge-base dev HEAD)
git commit 
git push --force

For future reference (ignore for the discussion of the MR):

  • git reset --soft $(git merge-base dev HEAD) resets working directory to last head of dev. Changed files are stil staged for commit.
  • git commit commits all changes in working directory as a single commit. Here, I used git log to get the commit message I originally used, and copy'n'pasted that
  • git push --force rewrites the branch history starting from $(git merge-base dev HEAD) and up to the last commit, so that it only contains a single commit.

Note git push --force can be a bit dangerours.

@cholmcc cholmcc force-pushed the cholmcc_aod_producers_refactor branch from 4a2d202 to dc80f6a Compare December 6, 2023 11:49
@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI for dc80f6a at 2023-12-10 08:40:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
/sw/SOURCES/O2/12039-slc8_x86-64/0/DataFormats/Headers/include/Headers/Stack.h:138:16: error: request for member 'size' in 'h', which is of non-class type 'o2::framework::Lifetime'
ninja: build stopped: subcommand failed.

Full log here.

@sawenzel
Copy link
Collaborator

@cholmcc : I now find some time to take a look into this. But, unfortunately, some merge conflicts need to be fixed first of all. Could you take care of this (and force-push a single commit)? Thx.

@cholmcc
Copy link
Contributor Author

cholmcc commented Dec 20, 2023

Hi Sandro,

Sorry for the late reply - been a bit busy the past week or so.

I will try to get around to fixing the merge conflicts. Note sure if I can make it before Christmas though.

Yours,

Christian

@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI for b81ed6f at 2024-01-11 23:46:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
/sw/SOURCES/O2/12039-slc8_x86-64/0/DataFormats/Headers/include/Headers/Stack.h:138:16: error: request for member 'size' in 'h', which is of non-class type 'o2::framework::Lifetime'
ninja: build stopped: subcommand failed.

Full log here.

Copy link
Collaborator

@sawenzel sawenzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi. I started studying the code in this PR. At the moment, more work is needed because
the PR does not actually compile.

Thereafter, please also perform a validation step by running a simple O2DPG MC workflow:
(alibuild build O2sim; alienv enter O2sim; .... then run a script like)

#!/bin/bash

#
# A example workflow MC->RECO->AOD for a simple pp production
# excluding ZDC

NWORKERS=${NWORKERS:-8}
SIMENGINE=${SIMENGINE:-TGeant4}

# create workflow
${O2DPG_ROOT}/MC/bin/o2dpg_sim_workflow.py -eCM 14000 -col pp -gen pythia8 -proc "cdiff" -tf 4  -ns 200 -e ${SIMENGINE} -j ${NWORKERS}                               \
                                                     -run 303000 -seed 624 -interactionRate 50000 

# run workflow
${O2DPG_ROOT}/MC/bin/o2_dpg_workflow_runner.py -f workflow.json -tt aod

I would say we can go ahead from my side once this succeeds and the AOD content is compatible with now.

@cholmcc cholmcc requested a review from jgrosseo as a code owner January 17, 2024 17:51
@cholmcc
Copy link
Contributor Author

cholmcc commented Jan 17, 2024

OK, yet another error - this time in call to o2-secondary-vertexing-workflow. The options listed a lot of track combinations with TPC (TPC-ITS, ...) but never TPC alone. This made the workflow fail immediately with a FATAL signal. Again a small fix to the Python script seemed to fix the issue.

@shahor02
Copy link
Collaborator

OK, yet another error - this time in call to o2-secondary-vertexing-workflow. The options listed a lot of track combinations with TPC (TPC-ITS, ...) but never TPC alone.

Are you using obsolete O2DPG? The requirement of TPC in the sources is recent and is accounted in the o2dpg_sim_workflow.py.

(*)I honestly didn't think it was possible to write that poorly structured Python code - ah well.
better to avoid such comments...

@cholmcc
Copy link
Contributor Author

cholmcc commented Jan 17, 2024

Launching task:  [ -f aodmerge_input.txt ] && rm aodmerge_input.txt; for i in `seq 1 2`; do echo "tf${i}/AO2D.root" >> aodmerge_input.txt; done; o2-aod-merger --input aodmerge_input.txt --output AO2D.root &> aodmerge.log &

**** Pipeline done success (global_runtime : 6.399s) *****

Yay! It seemed to work.

Looking in the various trees I see data as expected, including the new HepMC auxiliary tables (the HeayIon table is clearly empty because it was a pp simulation).

@cholmcc
Copy link
Contributor Author

cholmcc commented Jan 18, 2024

I did

o2-analysis-mm-rivet \
  --hepmc-recenter \
  --rivet-analysis  ALICE_YYYY_I1234567 \
  --rivet-pwd \
  --rivet-sources  ALICE_YYYY_I1234567.cc \
  --rivet-finalize --aod-file AO2D.root \
  --rivet-dump RivetResults.yoda \
  --batch 

(code from Rivet for O2 - in particular from PWGMM/Rivet)

followed by

./ALICE_YYYY_I1234567.py -s RivetResults.yoda 

or alternatively

root -l ../src/O2Physics/PWGMM/Rivet/macros/GetRivetAOs.macro
./ALICE_YYYY_I1234567.py -s RivetResults.yoda 

to get the following plot

RivetResults

I think the MR has lived up to its promise 😄

@cholmcc
Copy link
Contributor Author

cholmcc commented Jan 18, 2024

Are you using obsolete O2DPG? The requirement of TPC in the sources is recent and is accounted in the
o2dpg_sim_workflow.py.

Yes, it was a little old - but not terribly.

(*)I honestly didn't think it was possible to write that poorly structured Python code - ah well.
better to avoid such comments...

None the less true - that code suffers from Fortrantitis in a bad way 😄 Typically when I see Python code, even by novice programmers, it is OK structured and clear, which is why this code surprised me. I guess it must be a pain to maintain. I understand how this comes about - I'll do quick fix here, then another fix here, and oh, its more important that it worked yesterday than it will work for eternity and I will come back and tidy it up - and before you know it, you have a mess on your hands.

@cholmcc cholmcc requested review from sawenzel and shahor02 January 18, 2024 00:41
@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI for 64ebd29 at 2024-01-18 06:00:

## sw/BUILD/o2codechecker-latest/log
100% tests passed, 0 tests failed out of 1


## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/O2-full-system-test-latest/log
task timeout reached .. killing all processes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data FLP/DISTSUBTIMEFRAME/52443 was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data HMP/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data FLP/DISTSUBTIMEFRAME/0 was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data MID/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data CTP/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data TOF/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data TPC/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data ITS/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data MFT/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data MCH/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data PHS/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data TRD/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data CPV/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data EMC/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data FDD/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data FT0/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data FV0/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[213416:raw-file-reader]: [04:30:16][ERROR] Expected Lifetime::Timeframe data ZDC/RAWDATA was not created for timeslice 1 and might result in dropped timeframes


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
++ GRERR=1
++ [[ 1 == 0 ]]
++ mkdir -p /sw/INSTALLROOT/dc5ca44d83e12c656bbbaf57ffff9ba78470585f/slc8_x86-64/o2checkcode/1.0-local29/etc/modulefiles
++ cat
--

Full log here.

@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI for 979e20a at 2024-01-18 07:57:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/O2-full-system-test-latest/log
command /sw/slc8_x86-64/O2/12039-slc8_x86-64-local8/prodtests/full-system-test/dpl-workflow.sh had nonzero exit code 128
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data FLP/DISTSUBTIMEFRAME/52443 was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data HMP/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data FLP/DISTSUBTIMEFRAME/0 was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data MID/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data CTP/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data TOF/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data TPC/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data ITS/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data MFT/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data MCH/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data PHS/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data TRD/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data CPV/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data EMC/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data FDD/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data FT0/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data FV0/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47541:raw-file-reader]: [06:56:50][ERROR] Expected Lifetime::Timeframe data ZDC/RAWDATA was not created for timeslice 1 and might result in dropped timeframes
[47566:BadMapCalibSpec]: [06:57:02][ERROR] Insufficient statistics: 3 entries in lowE histo, do nothing
[47622:ctf-writer]: [06:57:03][ERROR] Some Lifetime::Timeframe data got dropped starting at 1
[47665:qc-task-TPC-Tracks]: [06:57:08][ERROR] o2::base::Propagator not properly initialized, MatLUT (0x0) and / or Field (false) missing, will not fill DCA histograms
[47666:qc-task-TRD-Digits]: [06:57:17][ERROR] Dropping incomplete DS/trdall2/0 Lifetime::qos data in slot 0 with timestamp 0 < 2 as it can never be completed.
[47666:qc-task-TRD-Digits]: [06:57:17][ERROR] Dropping incomplete DS/trdall1/0 Lifetime::qos data in slot 0 with timestamp 0 < 2 as it can never be completed.
[47666:qc-task-TRD-Digits]: [06:57:17][ERROR] Dropping incomplete DS/trdall0/0 Lifetime::qos data in slot 0 with timestamp 0 < 2 as it can never be completed.
[47666:qc-task-TRD-Digits]: [06:57:17][ERROR] Missing DS/trdall3/0 (lifetime:qos) while dropping incomplete data in slot 0 with timestamp 0 < 2.
[47666:qc-task-TRD-Digits]: [06:57:17][ERROR] Missing DS/trdall5/0 (lifetime:qos) while dropping incomplete data in slot 0 with timestamp 0 < 2.
[47666:qc-task-TRD-Digits]: [06:57:17][ERROR] Missing DS/trdall4/0 (lifetime:qos) while dropping incomplete data in slot 0 with timestamp 0 < 2.
[47667:qc-task-TRD-Tracklets]: [06:57:17][ERROR] Dropping incomplete DS/trdall0/0 Lifetime::qos data in slot 0 with timestamp 0 < 2 as it can never be completed.
[47667:qc-task-TRD-Tracklets]: [06:57:17][ERROR] Dropping incomplete DS/trdall1/0 Lifetime::qos data in slot 0 with timestamp 0 < 2 as it can never be completed.
[47667:qc-task-TRD-Tracklets]: [06:57:17][ERROR] Dropping incomplete DS/trdall2/0 Lifetime::qos data in slot 0 with timestamp 0 < 2 as it can never be completed.
[47667:qc-task-TRD-Tracklets]: [06:57:17][ERROR] Missing DS/trdall3/0 (lifetime:qos) while dropping incomplete data in slot 0 with timestamp 0 < 2.
[47667:qc-task-TRD-Tracklets]: [06:57:17][ERROR] Missing DS/trdall4/0 (lifetime:qos) while dropping incomplete data in slot 0 with timestamp 0 < 2.
[47667:qc-task-TRD-Tracklets]: [06:57:17][ERROR] Missing DS/trdall5/0 (lifetime:qos) while dropping incomplete data in slot 0 with timestamp 0 < 2.
[ERROR] Workflow crashed - PID 47666 (qc-task-TRD-Digits) did not exit correctly however it's not clear why. Exit code forced to 128.
[ERROR]  - Device qc-task-TRD-Digits: pid 47666 (exit 128)
[ERROR] SEVERE: Device qc-task-TRD-Digits (47666) returned with 128


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
[0 more errors; see full log]

Full log here.

@sawenzel
Copy link
Collaborator

sawenzel commented Jan 18, 2024

@cholmcc : Do you believe that insulting people in commit messages and PR discussions is helpful?
If it continues like this, my motivation to review your developments and give support will just decrease.
I honestly believe there are more professional ways of discussing issues. Sometimes it is just enough to shortly ping about problems on mattermost and get hints of how to resolve them.

Part of your challenges come from the fact that you are not using the collaboration rules and the official workflow of using alidist for software distribution, which defines a tested and curated software portfolio. In there, O2DPG is fully up to date and guaranteed to run the example workflow that I gave to you (because we test it nightly).

I suggested running the above workflow, because when I took time to review your developments:
a) your PR did not compile
b) even after fixing your PR privately, the AOD conversion step crashed

@cholmcc
Copy link
Contributor Author

cholmcc commented Jan 18, 2024

@cholmcc : Do you believe that insulting people in commit messages and PR discussions is helpful?

I don't think I insulted anyone - if someone felt that way, well I'd say they have rather thin skin. I certainly didn't point anyone out in particular, only that the code is truly horrible.

What I do think is helpful is to point out that some code is really poorly written and it would make sense, not least for maintenance purposes, fix it.

If it continues like this, my motivation to review your developments and give support will just decrease.

Should that be considered a threat? Because that is clearly not OK.

Look, I did not do this work for my sake - I did it because I was tasked by the collaboration to do a job - namely integrate Rivet into O2. This MR is just one piece of that puzzle. So you are not doing me, but the collaboration, wrong if you chose to stop doing your job.

I honestly believe there are more professional ways of discussing issues.

You mean like issuing veiled threats?

Sometimes it is just enough to shortly ping about problems on mattermost and get hints of how to resolve them.

I really don't want to solve the problems of the DPG scripts - those scripts are so obtuse that it would take a considerable effort to do so - in fact, I think it would most likely be more efficient to start over again on those scripts.

Trust, me I've ping many people on Mattermost and else where how to do a full MC->AOD test. It wasn't until you kindly gave the instruction that I knew how to do it. When it then turns out that that didn't really work because of problems in the DPG scripts - well - then I got even more frustrated than before.

Part of your challenges come from the fact that you are not using the collaboration rules

Which rules? There's no hard and fast rule that you need to use alidist.

... and the official workflow of using alidist for software distribution, ...

An no, the problems does not stem from me not using alidist at all, but rather that the tools don't actually work as intended. If you think that the problems arise from not using alidst, then I'm afraid there's something you have misunderstood.

.. which defines a tested and curated software portfolio.

I'm aware that not using alidist puts a burden on me to solve some issues on my own, and I routinely do so (much more than I care for - it is no fun digging through thousands of lines of code only to find that a certain assumption was made about the something in the environment which will only ever be true in very specific circumstances).

As for "tested" - it seems the tests are basically - does it run on LXPLUS, OK, we're good. That's not how one does testing of software. Testing software involves meticulous attention to corner cases and presumptions. Try to deploy alidist on some other platform than CentOS and you will find it rather burdensome.

In there, O2DPG is fully up to date and guaranteed to run the example workflow that I gave to you (because we test it nightly).

OK, so there seems to be something going on in your test environment that does not comport with a vanilla environment. F.ex., without the option --no-require-ctpinputs-emc the code will crash - it's just a matter of looking at the sources to see that. This is what I mean by "testing" is done in a very special environment. It would be one thing if this environment was properly documented, but sadly it isn't.

BTW, there's no guaranty - even with alidist, and especially when doing development - that the various tools are in sync. It can be done somewhat in deployment targets, but even there it can break (vis the debacle over fastjet).

I suggested running the above workflow, because when I took time to review your developments:
a) your PR did not compile

This was due to some other change somewhere else in O2. That the o2::framework::Output dropped the Lifetime argument (commit #f6abfb04d5eedf284fd6913bf01820ed82af6a3b on the 6th of December by Guilio), so you cannot pin that on the PR. Of course, had you looked at the PR earlier - say in November, it would not have been a problem,

b) even after fixing your PR privately, the AOD conversion step crashed

It would have been helpful if you could have reported the error you saw when you saw it. I have no way of gauging what the problem might have been for you without that input (or rather output 😄) - could be because of missing --no-require-ctpinputs-emc or something entirely different. After I fixed the various merge conflicts (not that many) between this MR and the latest dev, I did not see any more problem in the code but only had trouble with the DPG scripts.

Anyways, I think we're at a point where this PR is fairly well tested, and I think it will make sense to merge it at this point. Then, I will get out of your hair so you can get around to something that is more fun for you. Thanks for all your help.

Yours,

Christian

@pzhristov
Copy link
Contributor

@cholmcc I am afraid you misunderstood what was the job description of Sandro. He is helping you not because this is his task, but because we want to move the things on. So this kind of discussion is not welcome in the PRs. If you have any question on this, I am happy to chat with you privately. At the end we have been working together for about 20 years :)

@pzhristov
Copy link
Contributor

@cholmcc I propose to run the original code, then run again the AOD creator with these modifications on the same input and see what is the difference between the original AODs and the newly created ones. If we are happy with the result, we can merge.

@cholmcc
Copy link
Contributor Author

cholmcc commented Jan 18, 2024

@cholmcc I propose to run the original code, then run again the AOD creator with these modifications on the same input and see what is the difference between the original AODs and the newly created ones. If we are happy with the result, we can merge.

That's a non-trivial thing to do, as you know. I means switch to the dev branch and running with that. See what I can do.

@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI for 9a2cea6 at 2024-01-18 18:15:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
/sw/SOURCES/O2/12039-slc8_x86-64/0/Generators/src/AODToHepMC.cxx:78:19: error: statement should be inside braces [readability-braces-around-statements]
++ [[ 0 == 0 ]]
++ exit 1
--

Full log here.

The MC part of the AOD producer workflow
`o2::aodproducer::AODProducerWorkflowDPL` and
`o2::aodmcproducer::AODMcProducerWorkflowDPL` is refactored to use
functions from namespace `o2::aodmchelpers`.   The helpers are

- `updateMCCollisions` which takes in the `MCEventHeader` and writes to
  the `o2::aod::McCollisions` table.
- `updateHepMCXSection` which takes the `MCEventHeader` and writes to
  the `o2::aodHepMCSections` table.  This uses the predefined key
  constants as defined in `o2::dataformats::MCInfoKeys`
- `updateHepMCPdfInfo` similar to `updateHepMCXSection` above
- `updateHepMCHeavyIon` similar to `updateHepMCXSection` above
- `updateParticle` uses information from an `o2::MCTrack` and writes it
  to the `o2::aod::McParticles` table
- `updateParticles` loops over `o2::MCTrack` objects and calls
  `updateParticle`.

These functions, in particular `updateHepMC...` uses the functions
- `hasKeys` which checks if the `MCEventHeader` has any or all of the
  keys queried.
- `getEventInfo` gets auxiliary information from the `MCEventHeader` or
  a default value.

For the `o2::aod::HepMC...` tables: Depending on the policy parameter
passed to the `updateHepMC...` functons, these tables may or may not be
updated.
- If the policy is `HepMCUpdate::never` then the tables are never
  updated.
- If the policy is `HepMCUpdate::always` then the tables are _always_
  updated, possibly with default values.
- If the policy is `HepMCUpdate::anyKey` (default) or `HepMCUpdate::allKeys`, then
  the decision of what to do is taken on the first event seen.
  - If the policy is `HepMCUpdate::anyKey`, then if _any_ of the needed
    keys are present, then updating will be enabled for this and _all_
    subsequent events.
  - If the policy is `HepMCUpdate::allKeys`, then if _all_ of the needed
    keys are present, then updating will be enabled for this and _all_
    subsequent events.
  Note that the availability of keys is _not_ checked after the first
  event.

  That means, if the criteria isn't met on the first event, then
  the tables will _never_ be update (as if the policy was
  `HepMCUpdate::never`).

  On the other hand, if the criteria was met, than the tables _will_ be
  update an all events (as if the policy was `HepMCUpdate::always`).

Note the slightly tricky template `TableCursor` which allows us to
define a type that correponds to a table curser (which is really a
lambda).   This template could be moved to `AnalysisDataFormats.h` or
the like.

The applications `o2-aod-producer-workflow` and
`o2-aod-mc-producer-workflow` have been updated (via their respective
implementation classes) to use these tools, thus unifying how the MC
information is propagated to AODs.

The utility `o2-sim-mctracks-to-aod` (`run/o2sim_mctracks_to_aod.cxx`)
has _also_ been updated to use these common tools.

Both `o2-aod-mc-producer-workflow` and `o2-sim-mctracks-to-aod` has been
tested extensively.  `o2-aod-producer-workflow` has _not_ been tested
since it is not clear to me how to set-up such a test with limited
resources.  However, since the changes _only_ effect the MC part, and
that part is now common between the two `o2-aod-{,mc-}producer-workflow`
applications, I believe there is no reason to think that it wouldn't
work.

**Some important (late) bug fixes**

Since the commits are squashed into one, I give some more details here
on some later commits.  For future reference.

**HepMC aux tables**

I found two problems with the HepMC aux tables after looking at the
data a little deeper

- I used the BC identifier as the collision index field in the HepMC
  aux tables.  This happened because I took my clues from the MC
  producer.  The MC producer does not generate a BC ID - even though
  it sort of looked like it - the BC ID in the MC producer is
  essentially an index.  I've fixed this by passing the collision
  index (called `collisionID` most places) to the relevant member
  functions and the table updates.

- The producers were set up so that HepMC aux tables would _only_ be
  written if the input had the corresponding data.  If a model gave
  the Cross-section, but not PDF nor Heavy-Ion information, then only
  the cross-section table would be populated.  Pythia, for example,
  gives the cross-section and PDF information, but no Heavy-Ion
  information.

  All three tables would be produced, but some may not have any
  entries.

  Later on, when we want to process the events (by Rivet f.ex.), we
  like to access as much HepMC aux information as possible so we may
  build the most complete HepMC event possible.  Thus, we would like
  to read in

  - MC Collisions
  - MC particles
  - 3 HepMC aux

  However, if one or more of the AOD trees of the 3 HepMC aux tables
  had no entries, it will cause the producer to crash

  ```
  libO2FrameworkAnalysisSupport.so: std::function<void (o2::framework::TreeToTable&)>::operator()(o2::framework::TreeToTable&) const
  libO2FrameworkAnalysisSupport.so: o2::framework::LifetimeHolder<o2::framework::TreeToTable>::release()
  libO2FrameworkAnalysisSupport.so: o2::framework::LifetimeHolder<o2::framework::TreeToTable>::~LifetimeHolder()
  libO2FrameworkAnalysisSupport.so: o2::framework::DataInputDescriptor::readTree(o2::framework::DataAllocator&, o2::header::DataHeader, int, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, unsigned long&, unsigned long&)
  ```

  I cannot quite figure out why that happens, but I guess it's a
  problem triggered by `TreeToTable` or the call back on that object.

  The (temporary) solution to the above problem is to set the update
  policy on the HepMC aux tables to be `always`.  That means we will
  always write the tables and give them entries.  The real solution to
  the problem will be to fix `TreeToTable` or what ever is causing the
  above `SIGSEGV`.

**MC track lables**

To get MC labels correct, the member
`AODProducerWorkflowSpec::mToKeep` needs to be updated with the actual
index positions in the output table.  This was easily fixed by passing
in the relevant mapping by reference instead of by const reference.
Again, this is a case where I did not see this problem initially
because I was dealing solely with MC data.  Thanks to Sandro for
providing the instructions for how to run a full MC->AOD chain.

Also, to reproduce results from `dev`, I had to implement a (faulty)
track selection into `AODMcProducerHelpers::updateParticles`.  It is
important to keep in mind that
`AODProducerWorkflowSpec::mToKeep[source][event]` is a mapping from
particle number to storage flag.  A zero or negative storage flag
means that the track is stored.

In `dev`, the algorithm is

    if      particle from EG:  store it
    else if particle is physical primary: store it
    else if particle is physics: store it

    if particle found in mapping:
         mark mothers and daughters for storage

The important part is the last thing: `if particle found in mapping`.
The particle _may_ be stored in the mapping with a negative flag -
meaning it should not be stored - which means that we may end up
triggering storage of mothers and daughters of a particle that isn't
it self stored.  In my test of 100 pp events with roughly 100k
particles stored in total, this happend 25 times.

The correct algorithm is

    if particle not previously marked for storage and
       particle is not from EG and
       particle is not physical primary and
       particle is not physics:
           do not store particle
           go on to next particle

    store particle and its mothers and daughters

In this way, mothers and daughters will _only_ be marked for storage
if the particle it self is in fact stored.

Currently, the selection implements the `dev` algorithm only so that
we can test that `dev` and this MR gives the same results.  Once this
MR is accepted, the select upgraded to correct algorithm.
@cholmcc cholmcc force-pushed the cholmcc_aod_producers_refactor branch from 9a2cea6 to f87bad4 Compare January 19, 2024 09:17
@cholmcc
Copy link
Contributor Author

cholmcc commented Jan 19, 2024

Hi all,

I pushed a new singe commit to this MR.

I found a few bugs after comparing to output from dev. I won't go in to all the details - you can find them in the commit message.

Here, I will show you the results of my comparisons between dev and this MR.

To be transparent about what I did:

  • I ran script posted above (including the fix to DPG to pass ----no-require-ctpinputs-emc when digitising) for this MR.

  • I copied the generated AO2D.root to storage mine/A02D.root

  • I did

    rm -f tf?/aod_?.log_done aodmerge.log_done 
    
  • Then I switch to the dev branch and ran the posted script again (again, thanks Sandro for providing this script).

  • I copied the generated AO2D.root to storage dev/A02D.root

  • Then I ran
    this script (remove .txt from the file name) to get
    this report

What I see is that all comes out the same between dev and this MR - except trackqa.

I spent quite a bit of time on the trackqa discrepancies. The important thing to keep in mind is, as far as I can tell, that the filling of the trackqa tables does not, in any way, depend on MC data. This MR solely effects the MC part (mc...), and the data (reco) part is entire unchanged. My best guess is that the QA part has some randomness built into it. I believe I say different differences between runs, which would comport with that. If you have some other ideas, please let me know. Thanks.

I also did a closure test of the event before and after AOD generation. That is, I took the kinematics tree from one timeframe of the simulation and generated a kine_AO2D.root file from that

o2-sim-kine-publisher --kineFileName sgn_1 | o2-sim-mctracks-to-aod 
mv AnalysisResults_trees.root kine_AO2D.root 

Then, from both this kine_AO2D.root and the AO2D.root files above, I generated HepMC files

for part in dev mine kine : do 
    o2-aod-mc-to-hepmc --hepmc-dump ${part}/AO2D.kine --aod-file ${part}/AO2D.root ; done 

and then illustrated a few events with

$O2_PHYSICS_SOURCE/PWGMM/Rivet/Tools/hepmc.py 

from my O2Physics Rivet branch. I attach them here in a zip archive. Note that the kine illustrations will have more particles than the AOD ones, because kine contains the full event, including particles created during simulation transport, while the AOD ones only has those stored in the AODs. What we can learn from this is that the event structure is preserved from simulation output to the AOD output - not a bad thing 😄

Of course, I would be useful to also illustrate the event from the output of the EG. However, that data isn't explicitly stored by the DPG script because Pythia is run as an internal thing in the simulations. However, I've done that comparison before (EG and post simulation) and that checked out too.

So the chain from EG->SIM->DIGIT->RECO->AOD seems to preserve the event structure - also with this MR 😄

Yours,

Christian

@sawenzel
Copy link
Collaborator

@cholmcc : Thanks for doing this validation. As you say, the important part is the agreement of the MC tables. Track tables may indeed vary a bit since some digitisation algos and TPC reconstruction has random number usage with multi-threading that is not completely reproducible between runs.

Your script may be useful for doing this validation in more automatic ways in the future.

To me all looks good now and if there are no further objections, the PR will be merged end of the day.

@cholmcc
Copy link
Contributor Author

cholmcc commented Jan 19, 2024

Hi Sandro,

@cholmcc : Thanks for doing this validation.

It was not (edit) as much work as I first thought it to be 😄

Track tables may indeed vary a bit since some digitisation algos and TPC reconstruction has random number usage
with multi-threading that is not completely reproducible between runs.

OK, so I guess its likely that randomness is part of the reason why the trackqa tables do not match exactly. Great.

Your script may be useful for doing this validation in more automatic ways in the future.

By all means take it and put it somewhere accessible.

To me all looks good now and if there are no further objections, the PR will be merged end of the day.

That's great. Looking forward to that.

Could you have a peek at the issue I raised in the commit log - namely with particle selection I believe is currently broken in dev and in this MR. I think, if we agree that there's a problem in dev, that the best approach is to take this MR as-is, and then I do a simple PR to fix that selection. I think it creates the best most-tractable history.

Thanks again.

Yours,

Christian

@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI for f87bad4 at 2024-01-19 14:39:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
/sw/SOURCES/O2/12039-slc8_x86-64/0/Generators/src/AODToHepMC.cxx:78:19: error: statement should be inside braces [readability-braces-around-statements]
++ [[ 0 == 0 ]]
++ exit 1
--

Full log here.

@cholmcc
Copy link
Contributor Author

cholmcc commented Jan 19, 2024

Formatting / PR formatting / clang-format (pull_request_target) seems to fail for some weird reason - it cannot find the dev branch, whatever that means. I did check all the code for formatting, and all was good. I would classify it as a false negative.

@sawenzel sawenzel merged commit d3bcb9d into AliceO2Group:dev Jan 19, 2024
@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI for 636ef7a at 2024-01-19 20:02:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/O2-full-system-test-latest/log
task timeout reached .. killing all processes


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
++ GRERR=1
++ [[ 1 == 0 ]]
++ mkdir -p /sw/INSTALLROOT/f3e1fa7431faa72a76d5bfd4f41f049650fe0487/slc8_x86-64/o2checkcode/1.0-local36/etc/modulefiles
++ cat
--

Full log here.

@cholmcc
Copy link
Contributor Author

cholmcc commented Jan 20, 2024

Thank you for merging this. We are now one step closer to getting Rivet fully integrated into $O^2$. With this in $O^2$ an analyser can now use official MC productions to test out their implementation of Rivet analyses, and use those analyses for model comparison when preparing manuscripts for publication. The other steps missing are getting Rivet properly deployed in the various computation environments and getting the Rivet wrapper accepted into $O^2\mathrm{Physics}$.

Thank you.

Yours,

Christian

andreasmolander pushed a commit to andreasmolander/AliceO2 that referenced this pull request Apr 12, 2024
* Refactor MC part of AOD producers

The MC part of the AOD producer workflow
`o2::aodproducer::AODProducerWorkflowDPL` and
`o2::aodmcproducer::AODMcProducerWorkflowDPL` is refactored to use
functions from namespace `o2::aodmchelpers`.   The helpers are

- `updateMCCollisions` which takes in the `MCEventHeader` and writes to
  the `o2::aod::McCollisions` table.
- `updateHepMCXSection` which takes the `MCEventHeader` and writes to
  the `o2::aodHepMCSections` table.  This uses the predefined key
  constants as defined in `o2::dataformats::MCInfoKeys`
- `updateHepMCPdfInfo` similar to `updateHepMCXSection` above
- `updateHepMCHeavyIon` similar to `updateHepMCXSection` above
- `updateParticle` uses information from an `o2::MCTrack` and writes it
  to the `o2::aod::McParticles` table
- `updateParticles` loops over `o2::MCTrack` objects and calls
  `updateParticle`.

These functions, in particular `updateHepMC...` uses the functions
- `hasKeys` which checks if the `MCEventHeader` has any or all of the
  keys queried.
- `getEventInfo` gets auxiliary information from the `MCEventHeader` or
  a default value.

For the `o2::aod::HepMC...` tables: Depending on the policy parameter
passed to the `updateHepMC...` functons, these tables may or may not be
updated.
- If the policy is `HepMCUpdate::never` then the tables are never
  updated.
- If the policy is `HepMCUpdate::always` then the tables are _always_
  updated, possibly with default values.
- If the policy is `HepMCUpdate::anyKey` (default) or `HepMCUpdate::allKeys`, then
  the decision of what to do is taken on the first event seen.
  - If the policy is `HepMCUpdate::anyKey`, then if _any_ of the needed
    keys are present, then updating will be enabled for this and _all_
    subsequent events.
  - If the policy is `HepMCUpdate::allKeys`, then if _all_ of the needed
    keys are present, then updating will be enabled for this and _all_
    subsequent events.
  Note that the availability of keys is _not_ checked after the first
  event.

  That means, if the criteria isn't met on the first event, then
  the tables will _never_ be update (as if the policy was
  `HepMCUpdate::never`).

  On the other hand, if the criteria was met, than the tables _will_ be
  update an all events (as if the policy was `HepMCUpdate::always`).

Note the slightly tricky template `TableCursor` which allows us to
define a type that correponds to a table curser (which is really a
lambda).   This template could be moved to `AnalysisDataFormats.h` or
the like.

The applications `o2-aod-producer-workflow` and
`o2-aod-mc-producer-workflow` have been updated (via their respective
implementation classes) to use these tools, thus unifying how the MC
information is propagated to AODs.

The utility `o2-sim-mctracks-to-aod` (`run/o2sim_mctracks_to_aod.cxx`)
has _also_ been updated to use these common tools.

Both `o2-aod-mc-producer-workflow` and `o2-sim-mctracks-to-aod` has been
tested extensively.  `o2-aod-producer-workflow` has _not_ been tested
since it is not clear to me how to set-up such a test with limited
resources.  However, since the changes _only_ effect the MC part, and
that part is now common between the two `o2-aod-{,mc-}producer-workflow`
applications, I believe there is no reason to think that it wouldn't
work.

**Some important (late) bug fixes**

Since the commits are squashed into one, I give some more details here
on some later commits.  For future reference.

**HepMC aux tables**

I found two problems with the HepMC aux tables after looking at the
data a little deeper

- I used the BC identifier as the collision index field in the HepMC
  aux tables.  This happened because I took my clues from the MC
  producer.  The MC producer does not generate a BC ID - even though
  it sort of looked like it - the BC ID in the MC producer is
  essentially an index.  I've fixed this by passing the collision
  index (called `collisionID` most places) to the relevant member
  functions and the table updates.

- The producers were set up so that HepMC aux tables would _only_ be
  written if the input had the corresponding data.  If a model gave
  the Cross-section, but not PDF nor Heavy-Ion information, then only
  the cross-section table would be populated.  Pythia, for example,
  gives the cross-section and PDF information, but no Heavy-Ion
  information.

  All three tables would be produced, but some may not have any
  entries.

  Later on, when we want to process the events (by Rivet f.ex.), we
  like to access as much HepMC aux information as possible so we may
  build the most complete HepMC event possible.  Thus, we would like
  to read in

  - MC Collisions
  - MC particles
  - 3 HepMC aux

  However, if one or more of the AOD trees of the 3 HepMC aux tables
  had no entries, it will cause the producer to crash

  ```
  libO2FrameworkAnalysisSupport.so: std::function<void (o2::framework::TreeToTable&)>::operator()(o2::framework::TreeToTable&) const
  libO2FrameworkAnalysisSupport.so: o2::framework::LifetimeHolder<o2::framework::TreeToTable>::release()
  libO2FrameworkAnalysisSupport.so: o2::framework::LifetimeHolder<o2::framework::TreeToTable>::~LifetimeHolder()
  libO2FrameworkAnalysisSupport.so: o2::framework::DataInputDescriptor::readTree(o2::framework::DataAllocator&, o2::header::DataHeader, int, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, unsigned long&, unsigned long&)
  ```

  I cannot quite figure out why that happens, but I guess it's a
  problem triggered by `TreeToTable` or the call back on that object.

  The (temporary) solution to the above problem is to set the update
  policy on the HepMC aux tables to be `always`.  That means we will
  always write the tables and give them entries.  The real solution to
  the problem will be to fix `TreeToTable` or what ever is causing the
  above `SIGSEGV`.

**MC track lables**

To get MC labels correct, the member
`AODProducerWorkflowSpec::mToKeep` needs to be updated with the actual
index positions in the output table.  This was easily fixed by passing
in the relevant mapping by reference instead of by const reference.
Again, this is a case where I did not see this problem initially
because I was dealing solely with MC data.  Thanks to Sandro for
providing the instructions for how to run a full MC->AOD chain.

Also, to reproduce results from `dev`, I had to implement a (faulty)
track selection into `AODMcProducerHelpers::updateParticles`.  It is
important to keep in mind that
`AODProducerWorkflowSpec::mToKeep[source][event]` is a mapping from
particle number to storage flag.  A zero or negative storage flag
means that the track is stored.

In `dev`, the algorithm is

    if      particle from EG:  store it
    else if particle is physical primary: store it
    else if particle is physics: store it

    if particle found in mapping:
         mark mothers and daughters for storage

The important part is the last thing: `if particle found in mapping`.
The particle _may_ be stored in the mapping with a negative flag -
meaning it should not be stored - which means that we may end up
triggering storage of mothers and daughters of a particle that isn't
it self stored.  In my test of 100 pp events with roughly 100k
particles stored in total, this happend 25 times.

The correct algorithm is

    if particle not previously marked for storage and
       particle is not from EG and
       particle is not physical primary and
       particle is not physics:
           do not store particle
           go on to next particle

    store particle and its mothers and daughters

In this way, mothers and daughters will _only_ be marked for storage
if the particle it self is in fact stored.

Currently, the selection implements the `dev` algorithm only so that
we can test that `dev` and this MR gives the same results.  Once this
MR is accepted, the select upgraded to correct algorithm.
andreasmolander pushed a commit to andreasmolander/AliceO2 that referenced this pull request Apr 12, 2024
* Refactor MC part of AOD producers

The MC part of the AOD producer workflow
`o2::aodproducer::AODProducerWorkflowDPL` and
`o2::aodmcproducer::AODMcProducerWorkflowDPL` is refactored to use
functions from namespace `o2::aodmchelpers`.   The helpers are

- `updateMCCollisions` which takes in the `MCEventHeader` and writes to
  the `o2::aod::McCollisions` table.
- `updateHepMCXSection` which takes the `MCEventHeader` and writes to
  the `o2::aodHepMCSections` table.  This uses the predefined key
  constants as defined in `o2::dataformats::MCInfoKeys`
- `updateHepMCPdfInfo` similar to `updateHepMCXSection` above
- `updateHepMCHeavyIon` similar to `updateHepMCXSection` above
- `updateParticle` uses information from an `o2::MCTrack` and writes it
  to the `o2::aod::McParticles` table
- `updateParticles` loops over `o2::MCTrack` objects and calls
  `updateParticle`.

These functions, in particular `updateHepMC...` uses the functions
- `hasKeys` which checks if the `MCEventHeader` has any or all of the
  keys queried.
- `getEventInfo` gets auxiliary information from the `MCEventHeader` or
  a default value.

For the `o2::aod::HepMC...` tables: Depending on the policy parameter
passed to the `updateHepMC...` functons, these tables may or may not be
updated.
- If the policy is `HepMCUpdate::never` then the tables are never
  updated.
- If the policy is `HepMCUpdate::always` then the tables are _always_
  updated, possibly with default values.
- If the policy is `HepMCUpdate::anyKey` (default) or `HepMCUpdate::allKeys`, then
  the decision of what to do is taken on the first event seen.
  - If the policy is `HepMCUpdate::anyKey`, then if _any_ of the needed
    keys are present, then updating will be enabled for this and _all_
    subsequent events.
  - If the policy is `HepMCUpdate::allKeys`, then if _all_ of the needed
    keys are present, then updating will be enabled for this and _all_
    subsequent events.
  Note that the availability of keys is _not_ checked after the first
  event.

  That means, if the criteria isn't met on the first event, then
  the tables will _never_ be update (as if the policy was
  `HepMCUpdate::never`).

  On the other hand, if the criteria was met, than the tables _will_ be
  update an all events (as if the policy was `HepMCUpdate::always`).

Note the slightly tricky template `TableCursor` which allows us to
define a type that correponds to a table curser (which is really a
lambda).   This template could be moved to `AnalysisDataFormats.h` or
the like.

The applications `o2-aod-producer-workflow` and
`o2-aod-mc-producer-workflow` have been updated (via their respective
implementation classes) to use these tools, thus unifying how the MC
information is propagated to AODs.

The utility `o2-sim-mctracks-to-aod` (`run/o2sim_mctracks_to_aod.cxx`)
has _also_ been updated to use these common tools.

Both `o2-aod-mc-producer-workflow` and `o2-sim-mctracks-to-aod` has been
tested extensively.  `o2-aod-producer-workflow` has _not_ been tested
since it is not clear to me how to set-up such a test with limited
resources.  However, since the changes _only_ effect the MC part, and
that part is now common between the two `o2-aod-{,mc-}producer-workflow`
applications, I believe there is no reason to think that it wouldn't
work.

**Some important (late) bug fixes**

Since the commits are squashed into one, I give some more details here
on some later commits.  For future reference.

**HepMC aux tables**

I found two problems with the HepMC aux tables after looking at the
data a little deeper

- I used the BC identifier as the collision index field in the HepMC
  aux tables.  This happened because I took my clues from the MC
  producer.  The MC producer does not generate a BC ID - even though
  it sort of looked like it - the BC ID in the MC producer is
  essentially an index.  I've fixed this by passing the collision
  index (called `collisionID` most places) to the relevant member
  functions and the table updates.

- The producers were set up so that HepMC aux tables would _only_ be
  written if the input had the corresponding data.  If a model gave
  the Cross-section, but not PDF nor Heavy-Ion information, then only
  the cross-section table would be populated.  Pythia, for example,
  gives the cross-section and PDF information, but no Heavy-Ion
  information.

  All three tables would be produced, but some may not have any
  entries.

  Later on, when we want to process the events (by Rivet f.ex.), we
  like to access as much HepMC aux information as possible so we may
  build the most complete HepMC event possible.  Thus, we would like
  to read in

  - MC Collisions
  - MC particles
  - 3 HepMC aux

  However, if one or more of the AOD trees of the 3 HepMC aux tables
  had no entries, it will cause the producer to crash

  ```
  libO2FrameworkAnalysisSupport.so: std::function<void (o2::framework::TreeToTable&)>::operator()(o2::framework::TreeToTable&) const
  libO2FrameworkAnalysisSupport.so: o2::framework::LifetimeHolder<o2::framework::TreeToTable>::release()
  libO2FrameworkAnalysisSupport.so: o2::framework::LifetimeHolder<o2::framework::TreeToTable>::~LifetimeHolder()
  libO2FrameworkAnalysisSupport.so: o2::framework::DataInputDescriptor::readTree(o2::framework::DataAllocator&, o2::header::DataHeader, int, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, unsigned long&, unsigned long&)
  ```

  I cannot quite figure out why that happens, but I guess it's a
  problem triggered by `TreeToTable` or the call back on that object.

  The (temporary) solution to the above problem is to set the update
  policy on the HepMC aux tables to be `always`.  That means we will
  always write the tables and give them entries.  The real solution to
  the problem will be to fix `TreeToTable` or what ever is causing the
  above `SIGSEGV`.

**MC track lables**

To get MC labels correct, the member
`AODProducerWorkflowSpec::mToKeep` needs to be updated with the actual
index positions in the output table.  This was easily fixed by passing
in the relevant mapping by reference instead of by const reference.
Again, this is a case where I did not see this problem initially
because I was dealing solely with MC data.  Thanks to Sandro for
providing the instructions for how to run a full MC->AOD chain.

Also, to reproduce results from `dev`, I had to implement a (faulty)
track selection into `AODMcProducerHelpers::updateParticles`.  It is
important to keep in mind that
`AODProducerWorkflowSpec::mToKeep[source][event]` is a mapping from
particle number to storage flag.  A zero or negative storage flag
means that the track is stored.

In `dev`, the algorithm is

    if      particle from EG:  store it
    else if particle is physical primary: store it
    else if particle is physics: store it

    if particle found in mapping:
         mark mothers and daughters for storage

The important part is the last thing: `if particle found in mapping`.
The particle _may_ be stored in the mapping with a negative flag -
meaning it should not be stored - which means that we may end up
triggering storage of mothers and daughters of a particle that isn't
it self stored.  In my test of 100 pp events with roughly 100k
particles stored in total, this happend 25 times.

The correct algorithm is

    if particle not previously marked for storage and
       particle is not from EG and
       particle is not physical primary and
       particle is not physics:
           do not store particle
           go on to next particle

    store particle and its mothers and daughters

In this way, mothers and daughters will _only_ be marked for storage
if the particle it self is in fact stored.

Currently, the selection implements the `dev` algorithm only so that
we can test that `dev` and this MR gives the same results.  Once this
MR is accepted, the select upgraded to correct algorithm.
mwinn2 pushed a commit to mwinn2/AliceO2 that referenced this pull request Apr 25, 2024
* Refactor MC part of AOD producers

The MC part of the AOD producer workflow
`o2::aodproducer::AODProducerWorkflowDPL` and
`o2::aodmcproducer::AODMcProducerWorkflowDPL` is refactored to use
functions from namespace `o2::aodmchelpers`.   The helpers are

- `updateMCCollisions` which takes in the `MCEventHeader` and writes to
  the `o2::aod::McCollisions` table.
- `updateHepMCXSection` which takes the `MCEventHeader` and writes to
  the `o2::aodHepMCSections` table.  This uses the predefined key
  constants as defined in `o2::dataformats::MCInfoKeys`
- `updateHepMCPdfInfo` similar to `updateHepMCXSection` above
- `updateHepMCHeavyIon` similar to `updateHepMCXSection` above
- `updateParticle` uses information from an `o2::MCTrack` and writes it
  to the `o2::aod::McParticles` table
- `updateParticles` loops over `o2::MCTrack` objects and calls
  `updateParticle`.

These functions, in particular `updateHepMC...` uses the functions
- `hasKeys` which checks if the `MCEventHeader` has any or all of the
  keys queried.
- `getEventInfo` gets auxiliary information from the `MCEventHeader` or
  a default value.

For the `o2::aod::HepMC...` tables: Depending on the policy parameter
passed to the `updateHepMC...` functons, these tables may or may not be
updated.
- If the policy is `HepMCUpdate::never` then the tables are never
  updated.
- If the policy is `HepMCUpdate::always` then the tables are _always_
  updated, possibly with default values.
- If the policy is `HepMCUpdate::anyKey` (default) or `HepMCUpdate::allKeys`, then
  the decision of what to do is taken on the first event seen.
  - If the policy is `HepMCUpdate::anyKey`, then if _any_ of the needed
    keys are present, then updating will be enabled for this and _all_
    subsequent events.
  - If the policy is `HepMCUpdate::allKeys`, then if _all_ of the needed
    keys are present, then updating will be enabled for this and _all_
    subsequent events.
  Note that the availability of keys is _not_ checked after the first
  event.

  That means, if the criteria isn't met on the first event, then
  the tables will _never_ be update (as if the policy was
  `HepMCUpdate::never`).

  On the other hand, if the criteria was met, than the tables _will_ be
  update an all events (as if the policy was `HepMCUpdate::always`).

Note the slightly tricky template `TableCursor` which allows us to
define a type that correponds to a table curser (which is really a
lambda).   This template could be moved to `AnalysisDataFormats.h` or
the like.

The applications `o2-aod-producer-workflow` and
`o2-aod-mc-producer-workflow` have been updated (via their respective
implementation classes) to use these tools, thus unifying how the MC
information is propagated to AODs.

The utility `o2-sim-mctracks-to-aod` (`run/o2sim_mctracks_to_aod.cxx`)
has _also_ been updated to use these common tools.

Both `o2-aod-mc-producer-workflow` and `o2-sim-mctracks-to-aod` has been
tested extensively.  `o2-aod-producer-workflow` has _not_ been tested
since it is not clear to me how to set-up such a test with limited
resources.  However, since the changes _only_ effect the MC part, and
that part is now common between the two `o2-aod-{,mc-}producer-workflow`
applications, I believe there is no reason to think that it wouldn't
work.

**Some important (late) bug fixes**

Since the commits are squashed into one, I give some more details here
on some later commits.  For future reference.

**HepMC aux tables**

I found two problems with the HepMC aux tables after looking at the
data a little deeper

- I used the BC identifier as the collision index field in the HepMC
  aux tables.  This happened because I took my clues from the MC
  producer.  The MC producer does not generate a BC ID - even though
  it sort of looked like it - the BC ID in the MC producer is
  essentially an index.  I've fixed this by passing the collision
  index (called `collisionID` most places) to the relevant member
  functions and the table updates.

- The producers were set up so that HepMC aux tables would _only_ be
  written if the input had the corresponding data.  If a model gave
  the Cross-section, but not PDF nor Heavy-Ion information, then only
  the cross-section table would be populated.  Pythia, for example,
  gives the cross-section and PDF information, but no Heavy-Ion
  information.

  All three tables would be produced, but some may not have any
  entries.

  Later on, when we want to process the events (by Rivet f.ex.), we
  like to access as much HepMC aux information as possible so we may
  build the most complete HepMC event possible.  Thus, we would like
  to read in

  - MC Collisions
  - MC particles
  - 3 HepMC aux

  However, if one or more of the AOD trees of the 3 HepMC aux tables
  had no entries, it will cause the producer to crash

  ```
  libO2FrameworkAnalysisSupport.so: std::function<void (o2::framework::TreeToTable&)>::operator()(o2::framework::TreeToTable&) const
  libO2FrameworkAnalysisSupport.so: o2::framework::LifetimeHolder<o2::framework::TreeToTable>::release()
  libO2FrameworkAnalysisSupport.so: o2::framework::LifetimeHolder<o2::framework::TreeToTable>::~LifetimeHolder()
  libO2FrameworkAnalysisSupport.so: o2::framework::DataInputDescriptor::readTree(o2::framework::DataAllocator&, o2::header::DataHeader, int, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, unsigned long&, unsigned long&)
  ```

  I cannot quite figure out why that happens, but I guess it's a
  problem triggered by `TreeToTable` or the call back on that object.

  The (temporary) solution to the above problem is to set the update
  policy on the HepMC aux tables to be `always`.  That means we will
  always write the tables and give them entries.  The real solution to
  the problem will be to fix `TreeToTable` or what ever is causing the
  above `SIGSEGV`.

**MC track lables**

To get MC labels correct, the member
`AODProducerWorkflowSpec::mToKeep` needs to be updated with the actual
index positions in the output table.  This was easily fixed by passing
in the relevant mapping by reference instead of by const reference.
Again, this is a case where I did not see this problem initially
because I was dealing solely with MC data.  Thanks to Sandro for
providing the instructions for how to run a full MC->AOD chain.

Also, to reproduce results from `dev`, I had to implement a (faulty)
track selection into `AODMcProducerHelpers::updateParticles`.  It is
important to keep in mind that
`AODProducerWorkflowSpec::mToKeep[source][event]` is a mapping from
particle number to storage flag.  A zero or negative storage flag
means that the track is stored.

In `dev`, the algorithm is

    if      particle from EG:  store it
    else if particle is physical primary: store it
    else if particle is physics: store it

    if particle found in mapping:
         mark mothers and daughters for storage

The important part is the last thing: `if particle found in mapping`.
The particle _may_ be stored in the mapping with a negative flag -
meaning it should not be stored - which means that we may end up
triggering storage of mothers and daughters of a particle that isn't
it self stored.  In my test of 100 pp events with roughly 100k
particles stored in total, this happend 25 times.

The correct algorithm is

    if particle not previously marked for storage and
       particle is not from EG and
       particle is not physical primary and
       particle is not physics:
           do not store particle
           go on to next particle

    store particle and its mothers and daughters

In this way, mothers and daughters will _only_ be marked for storage
if the particle it self is in fact stored.

Currently, the selection implements the `dev` algorithm only so that
we can test that `dev` and this MR gives the same results.  Once this
MR is accepted, the select upgraded to correct algorithm.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

5 participants