Upgrades for Generators #12032
A number of keys into the MC event header information
mapping are defined. This is to ensure that code will
use the same keys whenever information is set.
Additional, non-predefined keys, are still possible.
This makes it much more robust when we ask for specific
MC information from the event header, such as
- cross-section(s)
- weight(s)
- Heavy-ion "geometry" parameters
- Npart in projectile and target
- Ncoll in various views
- Overall
- Hard
- wounded-nucleon on nucleon
- nucleon on wounded-nucleon
- wounded on wounded
- Parton distribution function parameters
This is crucial for building a HepMC event structure which
can be passed on to, for example, Rivet analyses.
The generator has been changed so that it exports _all_ relevant and available information from Pythia to the MC event header, including heavy-ion "geometry" parameters. In particular, the information is stored in a HepMC-compatible way for later use by, e.g., Rivet.
Note that the current code counts the number of collisions by itself. However, the authors of Pythia have another way of doing that. The code to do it the same way as the Pythia authors is now in place, but is currently disabled. We should decide which is the appropriate way to count Ncoll. I would recommend following how the Pythia authors do it.
This change does two things:
**Full header**
_All_ information available in the HepMC event header is
propagated to the MC event header information map. This
includes
- Heavy-ion "geometry" parameters (b,Ncoll,Npart,...)
- Cross-section(s)
- Weight(s)
- PDF information
- and other attributes defined
This is so that we can build a full HepMC event structure later,
for example to pass to Rivet analyses.
**External program**
The functionality of the generator is expanded so that it may
spawn an event generator program, say `eg`.
- The generator opens a FIFO
- The generator then executes the program `eg` in the background
- The `eg` program is assumed to write HepMC event records on
standard output, which is then redirected to the FIFO
- The generator reads events from the FIFO
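The four steps above can be tried out in plain shell. The following is only a sketch of the hand-off, not the code in `GeneratorHepMC`: `fake_eg` is a hypothetical stand-in for a real event generator, and the lines it prints are placeholders rather than valid HepMC records.

```sh
#!/bin/sh
# Hypothetical stand-in generator: accepts `-n NEVENTS` and prints one
# placeholder line per event on standard output.
fake_eg() {
  n=1
  while [ $# -gt 0 ]; do
    case "$1" in
      -n) n="$2"; shift ;;
    esac
    shift
  done
  i=0
  while [ "$i" -lt "$n" ]; do
    echo "E $i 0 0"
    i=$((i + 1))
  done
}

# Read NEVENTS events from the generator through a FIFO and print them.
read_via_fifo() {
  nev="$1"
  dir=$(mktemp -d)
  fifo="$dir/events.fifo"
  mkfifo "$fifo"                 # 1. create the FIFO
  fake_eg -n "$nev" > "$fifo" &  # 2. run the generator in the background,
                                 #    standard output redirected to the FIFO
  cat "$fifo"                    # 3. the reader consumes events from the FIFO
  wait                           # reap the background child
  rm -rf "$dir"                  # remove the temporary FIFO
}

read_via_fifo 3
```

The writer blocks until the reader opens the FIFO, so no event data has to be stored on disk in between.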
For this to work, a number of conditions _must_ be met by the
`eg` program:
- It _must_ write events in the HepMC event format
- It _must_ write the HepMC event records to standard output
- It _cannot_ write anything else but the HepMC event record to
standard output
- It _must_ accept the command line option `-n NEVENTS` to
set the number of events to generate.
If a particular `eg` program does not meet these requirements, then
a simple shell script can be defined to wrap the `eg` appropriately.
For example, the CRMC program `crmc` _can_ write HepMC events to
standard output, but it will also dump other stuff there. Thus,
we can provide the script
    #!/bin/sh
    crmc "$@" -o hepmc3 -f /dev/stdout | \
      sed -n 's/^\(HepMC::\|[EAUWVP] \)/\1/p'
which simply filters the output of `crmc`. Another EG program
may not accept the `-n NEVENTS` command line option, but rather has
the command line option `--nevents`, so then we would do something
like
    #!/bin/sh
    cmdline="eg-program -o /dev/stdout "
    while test $# -gt 0 ; do
      case x$1 in
        x-n) cmdline="$cmdline --nevents $2"; shift ;;
        *)   cmdline="$cmdline $1" ;;
      esac
      shift
    done
    $cmdline
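Before pointing the generator at such a wrapper, it can help to check the option mapping without running anything. The sketch below is a dry-run variant of the wrapper idea: it prints the command line it would execute. `eg-program` and `--nevents` are the illustrative names from the text, not a real program.

```sh
#!/bin/sh
# Translate `-n NEVENTS` into `--nevents NEVENTS`; pass everything else through.
# Prints the resulting command line instead of executing it.
translate_opts() {
  out=""
  while [ $# -gt 0 ]; do
    case "$1" in
      -n) out="$out --nevents $2"; shift ;;
      *)  out="$out $1" ;;
    esac
    shift
  done
  echo "eg-program -o /dev/stdout$out"
}

translate_opts -n 100 -s 42
```

Like the wrapper in the text, this simple string concatenation does not preserve arguments that contain whitespace.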
The command line to run is specified as
--configKeyValues "HepMC.progCmd=<program and options>"
and can include not only the program name but also other
options to the program. For example
--configKeyValues "HepMC.progCmd=crmc -m 5 -i 20800820 -I 20800820"
for Pb-Pb collisions with Hijing.
With this change, we can use _any_ event generator which is capable of
writing out its event records in the HepMC format.
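The filtering wrapper for `crmc` described above can be checked without CRMC installed. The sketch below applies the same prefix rule to made-up output; it uses `grep -E` as an equivalent of the `sed` expression, and the sample lines are hypothetical and not a valid HepMC event.

```sh
#!/bin/sh
# Keep only lines that look like HepMC3 ASCII records: `HepMC::...` lines
# and lines starting with one of the tags E, A, U, W, V, P plus a space.
hepmc_filter() {
  grep -E '^(HepMC::|[EAUWVP] )'
}

# Hypothetical generator chatter mixed with HepMC-like lines.
sample_output() {
  cat <<'EOF'
Welcome to some generator
HepMC::Version 3.02.05
HepMC::Asciiv3-START_EVENT_LISTING
E 0 1 2
U GEV MM
reading parameter file ...
P 1 0 2212 0 0 100 100 0.938 4
HepMC::Asciiv3-END_EVENT_LISTING
EOF
}

sample_output | hepmc_filter
```

A prefix filter of this kind only works when every stray message sits on a line of its own and does not happen to start with one of the accepted prefixes.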
The generator `GeneratorTParticle` will read in particles
from a `TChain` containing a branch with a `TClonesArray` of
`TParticle` objects.
The generator can operate in two modes
- Data is read from one or more files
- Data is read from a file being generated by a child
program
The first mode is selected by
-g tparticle --configKeyValues "TParticle.fileNames=foo.root,bar.root"
The second mode is selected by
-g tparticle --configKeyValues "TParticle.progCmd=<program and options>"
For this latter mode, see also the recent commit to `GeneratorHepMC`.
Above, `<program and options>` specifies a program to spawn in the
background which will write to a specified file (temporary file).
Suppose the program is called `eg`, then the following _must_ be
possible
eg -n NEVENTS -o OUTPUT_FILENAME
That is, `eg` _must_ accept the option `-n` to set the number of
events to produce, and the option `-o` to set the output file name
(a ROOT file).
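A wrapper script that needs to honour this `-n`/`-o` contract can parse the two options with `getopts`. The following is a sketch only: it parses and reports the options, leaves out writing the ROOT file, and its default values are assumptions made for illustration.

```sh
#!/bin/sh
# Parse `-n NEVENTS -o OUTPUT_FILENAME` and report what was understood.
# The defaults below are hypothetical.
parse_eg_args() {
  nevents=1
  output="events.root"
  OPTIND=1
  while getopts "n:o:" opt; do
    case "$opt" in
      n) nevents="$OPTARG" ;;
      o) output="$OPTARG" ;;
      *) echo "usage: eg -n NEVENTS -o OUTPUT_FILENAME" >&2; return 1 ;;
    esac
  done
  echo "nevents=$nevents output=$output"
}

parse_eg_args -n 10 -o /tmp/eg_out.root
```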
The name of the `TTree` object in the file(s) can be set with
--configKeyValues "TParticle.treeName=<name>"
(defaults to `T`), and similar for the branch that contains the
`TClonesArray` of `TParticle`
--configKeyValues "TParticle.branchName=<name>"
(defaults to `Particles`).
The generator `GeneratorTParticle` _does not_ import any header
information into the simulation event record. Some proper
convention could be decided upon, e.g., one that tracks the
HepMC event record format.
Please consider the following formatting changes to AliceO2Group#11913
The classes `GeneratorHepMC` and `GeneratorTParticle` are refactored to derive from the (second) base class `GeneratorFileOrCmd`.
`GeneratorFileOrCmd` provides common infrastructure to specify
- File(s) to read events from (ROOT files in case of `GeneratorTParticle`, and HepMC files in case of `GeneratorHepMC`), _or_
- Which command to execute and with which options.
It also provides infrastructure to make unique temporary names, a FIFO to read from (child program writes to), and so on.
These are all configured through configuration keys prefixed by `FileOrCmd.`.
Other changes include
- `GeneratorHepMC` will open _any_ file that HepMC supports - ASCII, compressed ASCII, HEPEVT, etc.
- Through the use of `GeneratorFileOrCmd` the command line option flags for specifying seed (default: `-s`), number of events (default: `-n`), largest impact parameter (default: `-b`), output (default: `>`), and so on can be configured via configuration key values.
- `GeneratorHepMC` and `GeneratorTParticle` are passed the `GeneratorFileOrCmdParam` object as well as the specific `GeneratorHepMCParam` and `GeneratorTParticleParam` objects, respectively, by `GeneratorFactory`, and set their internal parameters accordingly. This hides the specifics of the parameters from `GeneratorFactory`.
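To illustrate why configurable option flags are useful, the sketch below assembles a child command line from a base command and four flags. It is a hypothetical helper, not the `GeneratorFileOrCmd` implementation: the shell variable names are invented for this example, and only the default flag values (`-s`, `-n`, `-b`, `>`) come from the text.

```sh
#!/bin/sh
# Default flags, mirroring the defaults quoted in the text.
SEED_FLAG="-s"
NEVENTS_FLAG="-n"
BMAX_FLAG="-b"
OUTPUT_FLAG=">"

# $1 base command, $2 seed, $3 number of events,
# $4 largest impact parameter, $5 output (file or FIFO)
build_cmd() {
  echo "$1 $SEED_FLAG $2 $NEVENTS_FLAG $3 $BMAX_FLAG $4 $OUTPUT_FLAG $5"
}

build_cmd "eg" 42 100 15 /tmp/events.fifo

# A generator with different option names only needs different flag values;
# the subshell keeps the overrides local.
( NEVENTS_FLAG="--nevents"; OUTPUT_FLAG="-o"; build_cmd "eg" 42 100 15 out.hepmc )
```

With the defaults the helper yields a command that writes to standard output and is redirected, while the overridden flags yield one that is told its output file directly.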
…nd makes life so much more difficult than it needs to be for very little gain
Error while checking build/O2/fullCI for fba7f8c at 2023-10-09 13:27: Full log here.
sawenzel left a comment
see comment about HepMC.filename
- The test case `o2sim-hepmc` fixed to use proper config key
- `GeneratorHepMC` will give error if old config key used
- `GeneratorHepMC` will give warning if version key is set - the code
deduces the version on its own now.
- Added superfluous `{...}` around single statement `if`, `while`, ... -
boy those checks are silly and counterproductive.
Fixed - please remove change request.
I had a more thorough look at this. Overall the PR is very valuable and a good way forward. However, a couple of things still need attention or careful discussion:
a) My first and foremost worry concerns the introduction of a backward-incompatible restructuring of how HepMC filenames need to be given by users. HepMC.filename is no longer supported. This will need follow-up changes, for instance in O2DPG, where several PWGs already use this. Is there no way to keep the current HepMC.filename key? (You may forward this key to some common internal structure such as FileOrCmd.)
b) My second comment concerns the newly introduced configurable value FileOrCmd. As previously argued with key TParticle, I do not find the choice of name appropriate. The key doesn't tell the user what this is for. I understand that this configures how certain external generators (such as HepMC or others) are read/connected. So I would suggest to use something like GenFileOrCmd, GeneratorFileOrCmd or similar (a name that relates to generator). I can then more easily understand GenFileOrCmd.filenames, GenFileOrCmd.cmd when I read them (like a namespace).
c) I tried to run the examples in SimExamples/HepMC folder. At the moment I was not successful:
- `crmc` does not have an option `-b`
- the CRMC that we ship in ALICE with alidist is out of date and does not know `-o hepmc3`
- even fixing these 2 ... the `child.sh` example does not run for me. It tells `WARNING::ReaderAscii: found unsupported expression in header. Will close the input. HepMC::As EPOS used with FUSION option`. This may depend on the precise crmc commit of course or just need a slightly modified `sed` filter.
It would be good to make sure that the example actually runs fine. I am also trying to setup a second example for STARlight. We should in particular make sure that the mechanism works for HepMC2 and for HepMC3.
- `GeneratorHepMC` accepts the old `HepMC.fileName` configuration option (with a deprecation warning)
- `HepMC.version` is honoured when spawning a child process. That is, if it is set to 2, the compatibility reader `HepMC3::ReaderAsciiVersion2` is used; otherwise the version 3 reader is used. If the option is given together with file names, a warning is given that it is not used.
- Configuration namespace `FileOrCmd` is renamed to `GeneratorFileOrCmd`
- Documentation has been updated to reflect these changes
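The two ASCII flavours behind the reader choice above differ already in their header lines, which can be shown with a small sketch. This is an illustration only and not how `GeneratorHepMC` selects its reader; `hepmc_flavour` is a hypothetical helper and the version numbers in the sample headers are examples.

```sh
#!/bin/sh
# Guess the ASCII flavour of a HepMC stream from its first few lines.
hepmc_flavour() {
  hdr=$(head -n 5)
  case "$hdr" in
    *HepMC::Asciiv3-START_EVENT_LISTING*)    echo 3 ;;
    *HepMC::IO_GenEvent-START_EVENT_LISTING*) echo 2 ;;
    *) echo unknown ;;
  esac
}

printf 'HepMC::Version 3.02.05\nHepMC::Asciiv3-START_EVENT_LISTING\n' | hepmc_flavour
printf '\nHepMC::Version 2.06.09\nHepMC::IO_GenEvent-START_EVENT_LISTING\n' | hepmc_flavour
```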
Not sure if this failure has anything to do with this MR. It seems some I would chalk it up as a false negative
OK
OK, so as you see below, I've made the changes backward compatible - albeit with a deprecation warning. Hopefully users will upgrade their code and we can remove this "hack" later on down the road.
The namespaces for the
In the example, I explicitly set the
Well, hopefully that will get upgraded soon :-) With the latest change, we could do If CRMC installed with
That would be fixed by passing Unfortunately the
OK, should work with the latest commit. Yours,
- Forgot to change key for test in `run/CMakeLists.txt`
- `crmc.sh` uses `-o hepmc` instead of `-o hepmc3` to accommodate older installations of CRMC with `aliBuild`.
Hi all, The last commit, I believe, fixes up the requested changes. All tests but Yours,
Hi Christian, Sandro is away this week, but should be back on Monday. The macOS check is broken at the moment, so as far as the CI is concerned your PR is good to go.
Error while checking build/O2/fullCI for 353068e at 2023-10-19 05:11: Full log here. |
Several improvements for event generators:
- standardization of info keys in MC event headers
- new GeneratorTParticle generator, being able to read simple TParticle events from file or process
- ability to read HepMC events from process via FIFO and other improvements for HepMC
Summary of individual commit messages
-----------
* Introduce pre-defined keys for MC event header info
A number of keys into the MC event header information
mapping are defined. This is to ensure that code will
use the same keys whenever information is set.
Additional, non-predefined keys, are still possible.
This makes it much more robust when we ask for specific
MC information from the event header, such as
- cross-section(s)
- weight(s)
- Heavy-ion "geometry" parameters
- Npart in projectile and target
- Ncoll in various views
- Overall
- Hard
- wounded-nucleon on nucleon
- nucleon on wounded-nucleon
- wounded on wounded
- Parton distribution function parameters
This is crucial for building a HepMC event structure which
can be passed on to, for example, Rivet analyses.
* Export _full_ header information to MC event header
The generator has been changed so that it exports
_all_ relevant and available information from Pythia to
the MC event header, including heavy-ion "geometry"
parameters. In particular, the information is stored
in a HepMC-compatible way for later use by, e.g.,
Rivet.
Note that the current code counts the number of collisions
by itself. However, the authors of Pythia have another
way of doing that. The code to do it the same way as the
Pythia authors is now in place, but is currently disabled.
We should decide which is the appropriate way to count
Ncoll. I would recommend to follow how the Pythia
authors do it.
* Full header read-in and external program
This change does two things:
**Full header**
_All_ information available in the HepMC event header is
propagated to the MC event header information map. This
includes
- Heavy-ion "geometry" parameters (b,Ncoll,Npart,...)
- Cross-section(s)
- Weight(s)
- PDF information
- and other attributes defined
This is so that we can build a full HepMC event structure later,
for example to pass to Rivet analyses.
**External program**
The functionality of the generator is expanded so that it may
spawn an event generator program, say `eg`.
- The generator opens a FIFO
- The generator then executes the program `eg` in the background
- The `eg` program is assumed to write HepMC event records on
standard output, which is then redirected to the FIFO
- The generator reads events from the FIFO
For this to work, a number of conditions _must_ be met by the
`eg` program:
- It _must_ write events in the HepMC event format
- It _must_ write the HepMC event records to standard output
- It _cannot_ write anything else but the HepMC event record to
standard output
- It _must_ accept the command line option `-n NEVENTS` to
set the number of events to generate.
If a particular `eg` program does not meet these requirements, then
a simple shell script can be defined to wrap the `eg` appropriately.
For example, the CRMC program `crmc` _can_ write HepMC events to
standard output, but it will also dump other stuff there. Thus,
we can provide the script
    #!/bin/sh
    crmc "$@" -o hepmc3 -f /dev/stdout | \
      sed -n 's/^\(HepMC::\|[EAUWVP] \)/\1/p'
which simply filters the output of `crmc`. Another EG program
may not accept the `-n NEVENTS` command line option, but rather has
the command line option `--nevents`, so then we would do something
like
    #!/bin/sh
    cmdline="eg-program -o /dev/stdout "
    while test $# -gt 0 ; do
      case x$1 in
        x-n) cmdline="$cmdline --nevents $2"; shift ;;
        *)   cmdline="$cmdline $1" ;;
      esac
      shift
    done
    $cmdline
The command line to run is specified as
--configKeyValues "HepMC.progCmd=<program and options>"
and can include not only the program name but also other
options to the program. For example
--configKeyValues "HepMC.progCmd=crmc -m 5 -i 20800820 -I 20800820"
for Pb-Pb collisions with Hijing.
With this change, we can use _any_ event generator which is capable of
writing out its event records in the HepMC format.
* New generator GeneratorTParticle
The generator `GeneratorTParticle` will read in particles
from a `TChain` containing a branch with a `TClonesArray` of
`TParticle` objects.
The generator can operate in two modes
- Data is read from one or more files
- Data is read from a file being generated by a child
program
The first mode is selected by
-g tparticle --configKeyValues "TParticle.fileNames=foo.root,bar.root"
The second mode is selected by
-g tparticle --configKeyValues "TParticle.progCmd=<program and options>"
For this latter mode, see also the recent commit to `GeneratorHepMC`.
Above, `<program and options>` specifies a program to spawn in the
background which will write to a specified file (temporary file).
Suppose the program is called `eg`, then the following _must_ be
possible
eg -n NEVENTS -o OUTPUT_FILENAME
That is, `eg` _must_ accept the option `-n` to set the number of
events to produce, and the option `-o` to set the output file name
(a ROOT file).
The name of the `TTree` object in the file(s) can be set with
--configKeyValues "TParticle.treeName=<name>"
(defaults to `T`), and similar for the branch that contains the
`TClonesArray` of `TParticle`
--configKeyValues "TParticle.branchName=<name>"
(defaults to `Particles`).
The generator `GeneratorTParticle` _does not_ import any header
information into the simulation event record. Some proper
convention could be decided upon, e.g., one that tracks the
HepMC event record format.
* Refactoring to HepMC and TParticle Generators
The classes `GeneratorHepMC` and `GeneratorTParticle` are refactored to
derive from the (second) base class `GeneratorFileOrCmd`.
`GeneratorFileOrCmd` provides common infrastructure to specify
- File(s) to read events from (ROOT files in case of
`GeneratorTParticle`, and HepMC files in case of `GeneratorHepMC`),
_or_
- Which command to execute and with which options.
It also provides infrastructure to make unique temporary names, a FIFO
to read from (child program writes to), and so on.
These are all configured through configuration keys prefixed by
`GeneratorFileOrCmd.`.
Other changes include
- `GeneratorHepMC` will open _any_ file that HepMC supports - ASCII,
compressed ASCII, HEPEVT, etc.
- Through the use of `GeneratorFileOrCmd` the command line option flags
for specifying seed (default: `-s`), number of events (default: `-n`),
largest impact parameter (default: `-b`), output (default: `>`), and
so on can be configured via configuration key values
- `GeneratorHepMC` and `GeneratorTParticle` are passed the
`GeneratorFileOrCmdParam` object as well as the specific
`GeneratorHepMCParam` and `GeneratorTParticleParam` objects,
respectively, by `GeneratorFactory`, and set their internal parameters
accordingly. This hides the specifics of the parameters from `GeneratorFactory`.
- `GeneratorHepMC` accepts the old `HepMC.fileName` configuration
option (with a deprecation warning)
- `HepMC.version` is honoured when spawning a child process.
That is, if it is set to 2, then use compatibility reader
`HepMC3::ReaderAsciiVersion2`, otherwise use the version 3
reader.
If option is given with file names, then give warning that
it is not used.
This supersedes this merge request
The text of that merge request:
Please see the commit logs of the individual commits.
TL;DR:
- `MCEventHeader`: Pre-defined keys for information
- `GeneratorPythia8`: Export all available information to `MCEventHeader`
- `GeneratorHepMC`: `MCEventHeader`
- `GeneratorTParticle`: `TChain` with `TClonesArray` branch with `TParticle` objects
The changes to `GeneratorHepMC` allow us to use any event generator that can write HepMC events.
Examples of use are given in log messages and code comments.
Yours,
Christian