
Conversation

@cholmcc
Contributor

@cholmcc cholmcc commented Oct 6, 2023

This supersedes this merge request

The text of that merge request:

Please see the commit logs of the individual commits.

TL;DR:

  • MCEventHeader: Pre-defined keys for information
  • GeneratorPythia8: Export all available information to MCEventHeader
  • GeneratorHepMC:
    • Export all available information to MCEventHeader
    • Read events from external, spawned background program via FIFO
  • GeneratorTParticle:
    • Reads from TChain with TClonesArray branch with TParticle objects
      • Can read from existing files, or
      • from file produced by external, spawned background program

The changes to GeneratorHepMC allow us to use any event generator that can write HepMC events.

Examples of use are given in log messages and code comments.

Yours,

Christian

cholmcc and others added 30 commits September 19, 2023 13:53
A number of keys into the MC event header information
mapping are defined.   This is to ensure that code will
use the same keys whenever information is set.

Additional, non-predefined keys, are still possible.

This makes it much more robust when we ask for specific
MC information from the event header, such as

- cross-section(s)
- weight(s)
- Heavy-ion "geometry" parameters
  - Npart in projectile and target
  - Ncoll in various views
    - Overall
    - Hard
    - wounded-nucleon on nucleon
    - nucleon on wounded-nucleon
    - wounded on wounded
- Parton distribution function parameters

This is crucial for building a HepMC event structure which
can be passed on to, for example, Rivet analyses.
The generator has been changed so that it exports
_all_ relevant and available information from Pythia to
the MC event header, including heavy-ion "geometry"
parameters.  In particular, the information is stored
in a HepMC-compatible way for later use by, e.g.,
Rivet.

Note, the current code counts up the number of collisions
by itself.  However, the authors of Pythia have another
way of doing that.   The code is now there to do it the
same way as the Pythia authors, but is currently disabled.

We should decide which is the appropriate way to count
Ncoll.  I would recommend to follow how the Pythia
authors do it.
This change does two things:

**Full header**

_All_ information available in the HepMC event header is
propagated to the MC event header information map.  This
includes

- Heavy-ion "geometry" parameters (b,Ncoll,Npart,...)
- Cross-section(s)
- Weight(s)
- PDF information
- and other attributes defined

This is so that we can build a full HepMC event structure later -
for example to pass to Rivet analyses

**External program**

The functionality of the generator is expanded so that it may
spawn an event generator program, say `eg`.

- The generator opens a FIFO
- The generator then executes the program `eg` in the background
  - The `eg` program is assumed to write HepMC event records on
    standard output, which is then redirected to the FIFO
- The generator reads events from the FIFO
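
The steps above can be sketched in plain shell.  This is only an
illustration of the FIFO hand-off, not the code in `GeneratorHepMC`:
the function `fake_eg` is a hypothetical stand-in for a real `eg`
program and writes dummy lines rather than HepMC records.

```sh
#!/bin/sh
# Hypothetical stand-in for an event generator: honours `-n NEVENTS` and
# writes one dummy line per "event" to standard output (not real HepMC).
fake_eg() {
    nev=1
    while test $# -gt 0 ; do
        case x$1 in
        x-n) nev=$2 ; shift ;;
        esac
        shift
    done
    i=0
    while test "$i" -lt "$nev" ; do
        echo "E $i"
        i=$((i + 1))
    done
}

tmpdir=$(mktemp -d)          # private directory to hold the FIFO
fifo="$tmpdir/events.fifo"
mkfifo "$fifo"               # 1. open a FIFO

fake_eg -n 3 > "$fifo" &     # 2. run the program in the background with
                             #    standard output redirected to the FIFO

nread=0
while read -r line ; do      # 3. read the events from the FIFO
    nread=$((nread + 1))
done < "$fifo"

wait                         # reap the background process
rm -rf "$tmpdir"             # remove the FIFO again
echo "read $nread events"
```

The writer blocks until the reader has opened the FIFO, and again
whenever the pipe buffer is full, so the two processes stay in step
without an intermediate file on disk.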

For this to work, a number of conditions _must_ be met by the
`eg` program:

- It _must_ write events in the HepMC event format
- It _must_ write the HepMC event records to standard output
- It _cannot_ write anything else but the HepMC event record to
  standard output
- It _must_ accept the command line option `-n NEVENTS` to
  set the number of events to generate.

If a particular `eg` program does not meet these requirements, then
a simple shell script can be defined to wrap the `eg` appropriately.
For example, the CRMC program `crmc` _can_ write HepMC events to
standard output, but it will also dump other stuff there.  Thus,
we can provide the script

    #!/bin/sh

    # Quote "$@" so that arguments containing spaces survive intact
    crmc "$@" -o hepmc3 -f /dev/stdout | \
       sed -n 's/^\(HepMC::\|[EAUWVP] \)/\1/p'

which simply filters the output of `crmc`.  Another EG program
may not accept the `-n NEVENTS` command line option, but rather has
the command line option `--nevents`, so then we would do something
like

    #!/bin/sh
    cmdline="eg-program -o /dev/stdout "

    while test $# -gt 0 ; do
       case x$1 in
       # Translate `-n NEVENTS` into `--nevents NEVENTS`
       x-n) cmdline="$cmdline --nevents $2"; shift ;;
       *)   cmdline="$cmdline $1" ;;
       esac

       shift
    done

    $cmdline
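
A note on the `sed` filter in the `crmc` wrapper above: the `\|`
alternation is not part of POSIX basic regular expressions, so it may
not work with every `sed` implementation.  A possibly more portable
variant, sketched here with `grep -E` and the same line prefixes, is
shown below.  The function name `filter_hepmc` and the sample input are
made up for illustration; the sample is not a valid HepMC event.

```sh
#!/bin/sh
# Keep only lines that look like HepMC ASCII records: the `HepMC::` header
# and footer lines, and records starting with E, A, U, W, V or P and a space.
filter_hepmc() {
    grep -E '^(HepMC::|[EAUWVP] )'
}

# Made-up sample: two lines of chatter mixed with record-like lines
kept=$(printf '%s\n' \
    'Loading model tables ...' \
    'HepMC::Version 3.02.05' \
    'HepMC::Asciiv3-START_EVENT_LISTING' \
    'E 0 1 2' \
    'U GEV MM' \
    'seed = 42' \
    'P 1 0 2212 0 0 100 100 0.938 4' \
    | filter_hepmc)

echo "$kept"
```

In the wrapper, `filter_hepmc` would take the place of the `sed` stage
after the pipe.  Like the `sed` version, it cannot reject chatter that
happens to start with one of the listed prefixes.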

The command line to run is specified as

    --configKeyValues "HepMC.progCmd=<program and options>"

and can include not only the program name but also other
options to the program.  For example

    --configKeyValues "HepMC.progCmd=crmc -m 5 -i 20800820 -I 20800820"

for Pb-Pb collisions with Hijing.

With this change, we can use _any_ event generator which is capable of
writing out its event records in the HepMC format.
The generator `GeneratorTParticle` will read in particles
from a `TChain` containing a branch with a `TClonesArray` of
`TParticle` objects.

The generator can operate in two modes

- Data is read from a file(s)
- Data is read from a file being generated by a child
  program

The first mode is selected by

    -g tparticle --configKeyValues "TParticle.fileNames=foo.root,bar.root"

The second mode is selected by

    -g tparticle --configKeyValues "TParticle.progCmd=<program and options>"

For this latter mode, see also recent commit to `GeneratorHepMC`

Above, `<program and options>` specifies a program to spawn in the
background which will write to a specified file (temporary file).
Suppose the program is called `eg`, then the following _must_ be
possible

    eg -n NEVENTS -o OUTPUT_FILENAME

That is, `eg` _must_ accept the option `-n` to set the number of
events to produce, and the option `-o` to set the output file name
(a ROOT file).

The name of the `TTree` object in the file(s) can be set with

    --configKeyValues "TParticle.treeName=<name>"

(defaults to `T`), and similar for the branch that contains the
`TClonesArray` of `TParticle`

    --configKeyValues "TParticle.branchName=<name>"

(defaults to `Particles`).

The generator `GeneratorTParticle` _does not_ import any header
information into the simulation event record.   Some proper
convention could be decided upon, e.g., one that tracks the
HepMC event record format.
Please consider the following formatting changes to AliceO2Group#11913
The classes `GeneratorHepMC` and `GeneratorTParticle` are refactored to
derive from the (second) base class `GeneratorFileOrCmd`.

`GeneratorFileOrCmd` provides common infrastructure to specify

- File(s) to read events from (ROOT files in case of
  `GeneratorTParticle`, and HepMC files in case of `GeneratorHepMC`),
  _or_
- Which command to execute and with which options.

It also provides infrastructure to make unique temporary names, a FIFO
to read from (child program writes to), and so on.

These are all configured through configuration keys prefixed by
`FileOrCmd.`.

Other changes include

- `GeneratorHepMC` will open _any_ file that HepMC supports - ASCII,
  compressed ASCII, HEPEVT, etc.
- Through the use of `GeneratorFileOrCmd` the command line option flags
  for specifying seed (default: `-s`), number of events (default: `-n`),
  largest impact parameter (default: `-b`), output (default: `>`), and
  so on can be configured via configuration key values
- `GeneratorHepMC` and `GeneratorTParticle` are passed the
  `GeneratorFileOrCmdParam` object as well as the specific
  `GeneratorHepMCParam` and `GeneratorTParticleParam` objects,
  respectively, by `GeneratorFactory` and set their internal parameters
  accordingly. This hides the specifics of the parameters from
  `GeneratorFactory`.
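
How configurable option flags turn into a command line can be
illustrated with a small shell sketch.  The helpers `add_opt` and
`build_cmd` and the `*_SWITCH` variables are hypothetical and do not
correspond to the actual configuration keys or C++ code; the only
convention taken from this merge request is that a switch set to `none`
(as with `bMaxSwitch=none` in the CRMC example) is not passed at all.

```sh
#!/bin/sh
# Hypothetical helper: append "<switch> <value>" to a command line, unless
# the switch is "none", in which case the option is left out.
add_opt() {
    # $1: command line so far, $2: switch, $3: value
    if test "x$2" = "xnone" ; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$1 $2 $3"
    fi
}

# Hypothetical stand-ins for the configurable switches, set to the defaults
SEED_SWITCH='-s' ; NEVENTS_SWITCH='-n' ; BMAX_SWITCH='-b' ; OUTPUT_SWITCH='>'

build_cmd() {
    # $1: program, $2: seed, $3: events, $4: max impact parameter, $5: output
    cmd=$1
    cmd=$(add_opt "$cmd" "$SEED_SWITCH"    "$2")
    cmd=$(add_opt "$cmd" "$NEVENTS_SWITCH" "$3")
    cmd=$(add_opt "$cmd" "$BMAX_SWITCH"    "$4")
    cmd=$(add_opt "$cmd" "$OUTPUT_SWITCH"  "$5")
    printf '%s\n' "$cmd"
}

build_cmd eg 42 100 10 /tmp/fifo        # all switches at their defaults
BMAX_SWITCH=none                        # program has no impact-parameter flag
build_cmd crmc.sh 42 100 10 /tmp/fifo   # the impact parameter is dropped
```

With the defaults the first call prints `eg -s 42 -n 100 -b 10 > /tmp/fifo`;
with the impact-parameter switch disabled the second prints
`crmc.sh -s 42 -n 100 > /tmp/fifo`.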
…nd makes life so much more difficult than it needs to be for very little gain
@alibuild
Collaborator

alibuild commented Oct 9, 2023

Error while checking build/O2/fullCI for fba7f8c at 2023-10-09 13:27:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/QualityControl-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/O2Physics-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
[ERROR] Function RecoDecay::getMassPDG is deprecated and will be removed soon.
[ERROR] Please use the Mass function in the O2DatabasePDG service instead.
[ERROR] See the example of usage in Tutorials/src/usingPDGService.cxx.
[ERROR] Function RecoDecay::getMassPDG is deprecated and will be removed soon.
[ERROR] Please use the Mass function in the O2DatabasePDG service instead.
[ERROR] See the example of usage in Tutorials/src/usingPDGService.cxx.
[ERROR] Function RecoDecay::getMassPDG is deprecated and will be removed soon.
[ERROR] Please use the Mass function in the O2DatabasePDG service instead.
[ERROR] See the example of usage in Tutorials/src/usingPDGService.cxx.
Error in cling::AutoLoadingVisitor::InsertIntoAutoLoadingState:
Error in cling::AutoLoadingVisitor::InsertIntoAutoLoadingState:
Error in cling::AutoLoadingVisitor::InsertIntoAutoLoadingState:
Error in cling::AutoLoadingVisitor::InsertIntoAutoLoadingState:
Error in cling::AutoLoadingVisitor::InsertIntoAutoLoadingState:
Error in cling::AutoLoadingVisitor::InsertIntoAutoLoadingState:
[ERROR] Function RecoDecay::getMassPDG is deprecated and will be removed soon.
[ERROR] Please use the Mass function in the O2DatabasePDG service instead.
[ERROR] See the example of usage in Tutorials/src/usingPDGService.cxx.
[ERROR] Function RecoDecay::getMassPDG is deprecated and will be removed soon.
[ERROR] Please use the Mass function in the O2DatabasePDG service instead.
[ERROR] See the example of usage in Tutorials/src/usingPDGService.cxx.
[ERROR] Function RecoDecay::getMassPDG is deprecated and will be removed soon.
[ERROR] Please use the Mass function in the O2DatabasePDG service instead.
[ERROR] See the example of usage in Tutorials/src/usingPDGService.cxx.
[ERROR] Function RecoDecay::getMassPDG is deprecated and will be removed soon.
[ERROR] Please use the Mass function in the O2DatabasePDG service instead.
[ERROR] See the example of usage in Tutorials/src/usingPDGService.cxx.
[ERROR] Function RecoDecay::getMassPDG is deprecated and will be removed soon.
[ERROR] Please use the Mass function in the O2DatabasePDG service instead.
[ERROR] See the example of usage in Tutorials/src/usingPDGService.cxx.
[0 more errors; see full log]

Full log here.

Collaborator

@sawenzel sawenzel left a comment


see comment about HepMC.filename

- The test case `o2sim-hepmc` fixed to use proper config key
- `GeneratorHepMC` will give error if old config key used
- `GeneratorHepMC` will give warning if version key is set - the code
  deduces the version on its own now.
- Added superfluous `{...}` around single statement `if`, `while`, ... -
  boy those checks are silly and counterproductive.
@cholmcc
Contributor Author

cholmcc commented Oct 9, 2023

see comment about HepMC.filename

Fixed - please remove change request.

@cholmcc cholmcc requested a review from sawenzel October 10, 2023 06:18
@AliceO2Group AliceO2Group deleted a comment from cholmcc Oct 11, 2023
@AliceO2Group AliceO2Group deleted a comment from cholmcc Oct 11, 2023
@AliceO2Group AliceO2Group deleted a comment from cholmcc Oct 11, 2023
Collaborator

@sawenzel sawenzel left a comment


I had a more thorough look/review into this. Overall the PR is very valuable and a good way forward. However, a couple of things still need attention or careful discussion:

a) My first and foremost worry concerns the introduction of a backward-incompatible restructuring of how HepMC filenames need to be given by users. HepMC.filename is no longer supported. This will need follow-up changes for instance in O2DPG where several PWGs already use this. Is there no way to keep the current HepMC.filename key? (you may forward this key to some common internal structure such as FileOrCmd)

b) My second comment concerns the newly introduced configurable value FileOrCmd. As previously argued with key TParticle, I do not find the choice of name appropriate. The key doesn't tell the user what this is for. I understand that this configures how certain external generators (such as HepMC or others) are read/connected. So I would suggest to use something like GenFileOrCmd, GeneratorFileOrCmd or similar (a name that relates to generator). I can then more easily understand GenFileOrCmd.filenames, GenFileOrCmd.cmd when I read them (like a namespace).

c) I tried to run the examples in SimExamples/HepMC folder. At the moment I was not successful:

  • crmc does not have an option -b
  • the CRMC that we ship in ALICE with alidist is out of date and does not know -o hepmc3
  • even fixing these 2 ... the child.sh example does not run for me. It tells WARNING::ReaderAscii: found unsupported expression in header. Will close the input. HepMC::As EPOS used with FUSION option. This may depend on the precise crmc commit of course or just need a slightly modified sed filter.

It would be good to make sure that the example actually runs fine. I am also trying to setup a second example for STARlight. We should in particular make sure that the mechanism works for HepMC2 and for HepMC3.

@alibuild
Collaborator

Error while checking build/O2/fullCI for 330feea at 2023-10-14 02:47:

No log files found

Full log here.

- `GeneratorHepMC` accepts the old `HepMC.fileName` configuration
  option (with a deprecation warning)
- `HepMC.version` is honoured when spawning a child process.
  That is, if it is set to 2, then use compatibility reader
  `HepMC3::ReaderAsciiVersion2`, otherwise use the version 3
  reader.

  If option is given with file names, then give warning that
  it is not used.

- Configuration namespace `FileOrCmd` is renamed to
  `GeneratorFileOrCmd`

- Documentation has been updated to reflect these changes
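
For regular files, the HepMC flavour can usually be read off the first
few lines.  The sketch below assumes the usual start-of-listing markers
(`HepMC::Asciiv3-START_EVENT_LISTING` for HepMC3 ASCII and
`HepMC::IO_GenEvent-START_EVENT_LISTING` for HepMC2); the function name
`hepmc_ascii_version` and the two-line sample headers are made up for
illustration and are not part of this merge request.

```sh
#!/bin/sh
# Guess the HepMC ASCII flavour of a regular file from its first lines.
# Prints 3, 2 or "unknown".
hepmc_ascii_version() {
    header=$(head -n 5 "$1")
    case $header in
    *HepMC::Asciiv3-START_EVENT_LISTING*)     echo 3 ;;
    *HepMC::IO_GenEvent-START_EVENT_LISTING*) echo 2 ;;
    *)                                        echo unknown ;;
    esac
}

tmp=$(mktemp)

# Made-up HepMC3-style header
printf '%s\n' 'HepMC::Version 3.02.05' \
              'HepMC::Asciiv3-START_EVENT_LISTING' > "$tmp"
v3=$(hepmc_ascii_version "$tmp")

# Made-up HepMC2-style header
printf '%s\n' 'HepMC::Version 2.06.09' \
              'HepMC::IO_GenEvent-START_EVENT_LISTING' > "$tmp"
v2=$(hepmc_ascii_version "$tmp")

rm -f "$tmp"
echo "detected: $v3 and $v2"
```

Peeking like this consumes the lines it reads, so it only suits regular
files.  On a FIFO the header would be gone by the time the real reader
starts, which is one reason to state the version explicitly with
`HepMC.version` when a child process is spawned.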
@cholmcc
Contributor Author

cholmcc commented Oct 15, 2023

Error while checking build/O2/fullCI for 330feea at 2023-10-14 02:47:

No log files found

Full log here.

Not sure if this failure has anything to do with this MR. It seems some git clones failed

Git command for package 'MLModels' failed.
Command: git fetch -f --tags https://github.com/alisw/MLModels.git +refs/heads/*:refs/heads/*
In directory: /build/nomad/alloc/78727433-50b3-e91c-7f54-7ec8b17f3bdc/ci/local/o2-fullci/sw/MIRROR/mlmodels
Exit code: 128
Git command for package 'capstone' failed.
Command: git fetch -f --tags https://github.com/aquynh/capstone +refs/heads/*:refs/heads/*
In directory: /build/nomad/alloc/78727433-50b3-e91c-7f54-7ec8b17f3bdc/ci/local/o2-fullci/sw/MIRROR/capstone
Exit code: 128

I would chalk it up as a false negative

@cholmcc
Contributor Author

cholmcc commented Oct 15, 2023

I had a more thorough look/review into this. Overall the PR is very valuable and a good way forward. However, a couple of things still need attention or careful discussion:

OK

a) My first and foremost worry concerns the introduction of a backward-incompatible restructuring of how HepMC filenames need to be given by users. HepMC.filename is no longer supported. This will need follow-up changes for instance in O2DPG where several PWGs already use this. Is there no way to keep the current HepMC.filename key? (you may forward this key to some common internal structure such as FileOrCmd)

OK, so as you see below, I've made the changes backward compatible - albeit with a deprecation warning. Hopefully users will upgrade their code and we can remove this "hack" later on down the road.

b) My second comment concerns the newly introduced configurable value FileOrCmd. As previously argued with key TParticle, I do not find the choice of name appropriate. The key doesn't tell the user what this is for. I understand that this configures how certain external generators (such as HepMC or others) are read/connected. So I would suggest to use something like GenFileOrCmd, GeneratorFileOrCmd or similar (a name that relates to generator). I can then more easily understand GenFileOrCmd.filenames, GenFileOrCmd.cmd when I read them (like a namespace).

The namespaces for the Generator... parameters are all over the place. Some prefix with Gen, some with Generator, others with nothing at all. To spell it out, I changed it to GeneratorFileOrCmd. I don't care too much about this, only that the names may get so long that users may find it a hassle.

c) I tried to run the examples in SimExamples/HepMC folder. At the moment I was not successful:

  • crmc does not have an option -b

In the example, I explicitly set the bMaxSwitch=none so that the option isn't passed. Actually a good way of illustrating that feature. Note, one can specify a max $b$ for CRMC, but I think it is a setting in the crmc.params file. Of course, the crmc.sh script could "fix that up", but I didn't want to complicate the example too much.

  • the CRMC that we ship in ALICE with alidist is out of date and does not know -o hepmc3

Well, hopefully that will get upgraded soon :-)

With the latest change, we could do -o hepmc and then pass HepMC.version=2 as a configuration key-value

If CRMC installed with alidist only uses HepMC2, then I'm afraid that it is terribly out of date, and the HepMC events written are not correct. I believe so because I was the one making the changes for HepMC3 in CRMC and fixed other issues at the same time. See here, here, and here

  • even fixing these 2 ... the child.sh example does not run for me. It tells WARNING::ReaderAscii: found unsupported expression in header. Will close the input. HepMC::As EPOS used with FUSION option. This may depend on the precise crmc commit of course or just need a slightly modified sed filter.

That would be fixed by passing HepMC.version=2.

Unfortunately the HepMC3.deduceReader (or something like that) will not work with child command, since it is rather aggressive at opening the input file.

It would be good to make sure that the example actually runs fine. I am also trying to setup a second example for STARlight. We should in particular make sure that the mechanism works for HepMC2 and for HepMC3.

OK, should work with the latest commit.

Yours,
Christian

- Forgot to change key for test in `run/CMakeLists.txt`
- `crmc.sh` uses `-o hepmc` instead of `-o hepmc3` to accommodate
  older installation of CRMC with `aliBuild`.
@cholmcc
Contributor Author

cholmcc commented Oct 17, 2023

Hi all,

The last commit, I believe, fixes up the requested changes. All tests but build/AliceO2/O2/o2/macOS succeed, and that test fails due to a deeper macOS problem launching curl.

Yours,
Christian

@martenole
Contributor

Hi Christian, Sandro is away this week, but should be back on Monday. The macOS check is broken at the moment, so as far as the CI is concerned your PR is good to go.

@alibuild
Collaborator

alibuild commented Oct 18, 2023

Error while checking build/O2/fullCI for 353068e at 2023-10-19 05:11:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
/sw/SOURCES/O2/12032-slc8_x86-64/0/Generators/src/GeneratorPythia8.cxx:383:14: error: statement should be inside braces [readability-braces-around-statements]
++ [[ 0 == 0 ]]
++ exit 1
--

Full log here.

@cholmcc cholmcc requested a review from sawenzel October 19, 2023 11:52
@sawenzel sawenzel merged commit 8fe191e into AliceO2Group:dev Oct 20, 2023
christianreckziegel pushed a commit to LucasFerrandi/AliceO2 that referenced this pull request Nov 9, 2023
Several improvements for event generators:

- standardization of info keys in MC event headers
- new GeneratorTParticle generator, being able to read simple TParticle events from file or process 
- ability to read HepMC events from process via FIFO and other improvements for HepMC


Summary of individual commit messages 
-----------
* Introduce pre-defined keys for MC event header info

A number of keys into the MC event header information
mapping are defined.   This is to ensure that code will
use the same keys whenever information is set.

Additional, non-predefined keys, are still possible.

This makes it much more robust when we ask for specific
MC information from the event header, such as

- cross-section(s)
- weight(s)
- Heavy-ion "geometry" parameters
  - Npart in projectile and target
  - Ncoll in various views
    - Overall
    - Hard
    - wounded-nucleon on nucleon
    - nucleon on wounded-nucleon
    - wounded on wounded
- Parton distribution function parameters

This is crucial for building a HepMC event structure which
can be passed on to, for example, Rivet analyses.

* Export _full_ header information to MC event header

The generator has been changed so that it exports
_all_ relevant and available information from Pythia to
the MC event header, including heavy-ion "geometry"
parameters.  In particular, the information is stored
in a HepMC-compatible way for later use by, e.g.,
Rivet.

Note, the current code counts up the number of collisions
by itself.  However, the authors of Pythia have another
way of doing that.   The code is now there to do it the
same way as the Pythia authors, but is currently disabled.

We should decide which is the appropriate way to count
Ncoll.  I would recommend to follow how the Pythia
authors do it.

* Full header read-in and external program

This change does two things:

**Full header**

_All_ information available in the HepMC event header is
propagated to the MC event header information map.  This
includes

- Heavy-ion "geometry" parameters (b,Ncoll,Npart,...)
- Cross-section(s)
- Weight(s)
- PDF information
- and other attributes defined

This is so that we can build a full HepMC event structure later -
for example to pass to Rivet analyses

**External program**

The functionality of the generator is expanded so that it may
spawn an event generator program, say `eg`.

- The generator opens a FIFO
- The generator then executes the program `eg` in the background
  - The `eg` program is assumed to write HepMC event records on
    standard output, which is then redirected to the FIFO
- The generator reads events from the FIFO

For this to work, a number of conditions _must_ be met by the
`eg` program:

- It _must_ write events in the HepMC event format
- It _must_ write the HepMC event records to standard output
- It _cannot_ write anything else but the HepMC event record to
  standard output
- It _must_ accept the command line option `-n NEVENTS` to
  set the number of events to generate.

If a particular `eg` program does not meet these requirements, then
a simple shell script can be defined to wrap the `eg` appropriately.
For example, the CRMC program `crmc` _can_ write HepMC events to
standard output, but it will also dump other stuff there.  Thus,
we can provide the script

    #!/bin/sh

    # Quote "$@" so that arguments containing spaces survive intact
    crmc "$@" -o hepmc3 -f /dev/stdout | \
       sed -n 's/^\(HepMC::\|[EAUWVP] \)/\1/p'

which simply filters the output of `crmc`.  Another EG program
may not accept the `-n NEVENTS` command line option, but rather has
the command line option `--nevents`, so then we would do something
like

    #!/bin/sh
    cmdline="eg-program -o /dev/stdout "

    while test $# -gt 0 ; do
       case x$1 in
       # Translate `-n NEVENTS` into `--nevents NEVENTS`
       x-n) cmdline="$cmdline --nevents $2"; shift ;;
       *)   cmdline="$cmdline $1" ;;
       esac

       shift
    done

    $cmdline

The command line to run is specified as

    --configKeyValues "HepMC.progCmd=<program and options>"

and can include not only the program name but also other
options to the program.  For example

    --configKeyValues "HepMC.progCmd=crmc -m 5 -i 20800820 -I 20800820"

for Pb-Pb collisions with Hijing.

With this change, we can use _any_ event generator which is capable of
writing out its event records in the HepMC format.

* New generator GeneratorTParticle

The generator `GeneratorTParticle` will read in particles
from a `TChain` containing a branch with a `TClonesArray` of
`TParticle` objects.

The generator can operate in two modes

- Data is read from a file(s)
- Data is read from a file being generated by a child
  program

The first mode is selected by

    -g tparticle --configKeyValues "TParticle.fileNames=foo.root,bar.root"

The second mode is selected by

    -g tparticle --configKeyValues "TParticle.progCmd=<program and options>"

For this latter mode, see also recent commit to `GeneratorHepMC`

Above, `<program and options>` specifies a program to spawn in the
background which will write to a specified file (temporary file).
Suppose the program is called `eg`, then the following _must_ be
possible

    eg -n NEVENTS -o OUTPUT_FILENAME

That is, `eg` _must_ accept the option `-n` to set the number of
events to produce, and the option `-o` to set the output file name
(a ROOT file).

The name of the `TTree` object in the file(s) can be set with

    --configKeyValues "TParticle.treeName=<name>"

(defaults to `T`), and similar for the branch that contains the
`TClonesArray` of `TParticle`

    --configKeyValues "TParticle.branchName=<name>"

(defaults to `Particles`).

The generator `GeneratorTParticle` _does not_ import any header
information into the simulation event record.   Some proper
convention could be decided upon, e.g., one that tracks the
HepMC event record format.

* Refactoring to HepMC and TParticle Generators

The classes `GeneratorHepMC` and `GeneratorTParticle` are refactored to
derive from the (second) base class `GeneratorFileOrCmd`.

`GeneratorFileOrCmd` provides common infrastructure to specify

- File(s) to read events from (ROOT files in case of
  `GeneratorTParticle`, and HepMC files in case of `GeneratorHepMC`),
  _or_
- Which command to execute and with which options.

It also provides infrastructure to make unique temporary names, a FIFO
to read from (child program writes to), and so on.

These are all configured through configuration keys prefixed by
`GeneratorFileOrCmd.`.

Other changes include

- `GeneratorHepMC` will open _any_ file that HepMC supports - ASCII,
  compressed ASCII, HEPEVT, etc.
- Through the use of `GeneratorFileOrCmd` the command line option flags
  for specifying seed (default: `-s`), number of events (default: `-n`),
  largest impact parameter (default: `-b`), output (default: `>`), and
  so on can be configured via configuration key values
- `GeneratorHepMC` and `GeneratorTParticle` are passed the
  `GeneratorFileOrCmdParam` object as well as the specific
  `GeneratorHepMCParam` and `GeneratorTParticleParam` objects,
  respectively, by `GeneratorFactory` and set their internal parameters
  accordingly. This hides the specifics of the parameters from
  `GeneratorFactory`.

- `GeneratorHepMC` accepts the old `HepMC.fileName` configuration
  option (with a deprecation warning)

- `HepMC.version` is honoured when spawning a child process.
  That is, if it is set to 2, then use compatibility reader
  `HepMC3::ReaderAsciiVersion2`, otherwise use the version 3
  reader.

  If option is given with file names, then give warning that
  it is not used.
leo-barreto pushed a commit to leo-barreto/AliceO2 that referenced this pull request Nov 16, 2023