@cholmcc cholmcc commented Sep 19, 2023

Please see the commit logs of the individual commits.

TL;DR:

  • MCEventHeader: Pre-defined keys for information
  • GeneratorPythia8: Export all available information to MCEventHeader
  • GeneratorHepMC:
    • Export all available information to MCEventHeader
    • Read events from external, spawned background program via FIFO
  • GeneratorTParticle:
    • Reads from TChain with TClonesArray branch with TParticle objects
      • Can read from existing files, or
      • from file produced by external, spawned background program

The changes to GeneratorHepMC allow us to use any event generator that can write HepMC events.

Examples of use are given in log messages and code comments.

Yours,

Christian

A number of keys into the MC event header information
mapping are defined.  This ensures that code uses
the same keys whenever information is set.

Additional, non-predefined keys are still possible.

This makes it much more robust when we ask for specific
MC information from the event header, such as

- cross-section(s)
- weight(s)
- Heavy-ion "geometry" parameters
  - Npart in projectile and target
  - Ncoll in various views
    - Overall
    - Hard
    - wounded-nucleon on nucleon
    - nucleon on wounded-nucleon
    - wounded on wounded
- Parton distribution function parameters

This is crucial for building a HepMC event structure which
can be passed on to, for example, Rivet analyses.

The Pythia8 generator has been changed so that it exports
_all_ relevant and available information from Pythia to
the MC event header, including heavy-ion "geometry"
parameters.  In particular, the information is stored
in a HepMC-compatible way for later use by, for example,
Rivet.

Note, the current code counts up the number of collisions
by itself.  However, the authors of Pythia have another
way of doing that.  The code to do it the same way as the
Pythia authors is now there, but is currently disabled.

We should decide which is the appropriate way to count
Ncoll.  I would recommend following how the Pythia
authors do it.

This change to `GeneratorHepMC` does two things:

**Full header**

_All_ information available in the HepMC event header is
propagated to the MC event header information map.  This
includes

- Heavy-ion "geometry" parameters (b,Ncoll,Npart,...)
- Cross-section(s)
- Weight(s)
- PDF information
- and other attributes defined

This is so that we can build a full HepMC event structure later,
for example to pass to Rivet analyses.

**External program**

The functionality of the generator is expanded so that it may
spawn an event generator program, say `eg`.

- The generator opens a FIFO
- The generator then executes the program `eg` in the background
  - The `eg` program is assumed to write HepMC event records on
    standard output, which is then redirected to the FIFO
- The generator reads events from the FIFO
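
The steps above can be sketched in plain shell. This only illustrates the FIFO mechanism and is not the code in `GeneratorHepMC`; `fake_hepmc_eg` and `read_via_fifo` are hypothetical names, and the stand-in generator emits placeholder lines rather than valid HepMC records.

```sh
#!/bin/sh
# Hypothetical stand-in for an event generator: prints one placeholder
# line per event on standard output (NOT a valid HepMC record).
fake_hepmc_eg() {
    n=1
    while test $# -gt 0; do
        case $1 in
            -n) n=$2; shift ;;
        esac
        shift
    done
    i=0
    while test "$i" -lt "$n"; do
        echo "E $i 0 0"
        i=$((i + 1))
    done
}

# Open a FIFO, spawn the generator in the background with its standard
# output redirected to the FIFO, then read the records back.
read_via_fifo() {
    dir=$(mktemp -d) || return 1
    fifo=$dir/events.fifo
    mkfifo "$fifo" || return 1
    fake_hepmc_eg -n "$1" > "$fifo" &   # writer blocks until a reader opens
    pid=$!
    count=0
    while IFS= read -r line; do         # reader: consume lines until EOF
        count=$((count + 1))
    done < "$fifo"
    wait "$pid"
    rm -rf "$dir"
    echo "$count"                       # number of lines received
}

read_via_fifo 3
```

Because the writer blocks until the reader opens the FIFO, events are streamed rather than stored, so no intermediate file is needed.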

For this to work, a number of conditions _must_ be met by the
`eg` program:

- It _must_ write events in the HepMC event format
- It _must_ write the HepMC event records to standard output
- It _must not_ write anything other than the HepMC event records to
  standard output
- It _must_ accept the command line option `-n NEVENTS` to
  set the number of events to generate.
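
One way to check the standard-output requirement before plugging a program in is to count the lines that do not look like HepMC records. This is a rough sketch: `count_foreign_lines` is a hypothetical helper, and the set of accepted line prefixes (`HepMC::`, or one of `E`, `A`, `U`, `W`, `V`, `P` followed by a space) is an assumption taken from the `sed` filter used for `crmc` in this message. Under that assumption, blank lines count as foreign too.

```sh
#!/bin/sh
# Count lines on standard input that do NOT look like HepMC ASCII
# records.  Assumption: a record line starts with "HepMC::" or with one
# of the tags E, A, U, W, V, P followed by a space.
count_foreign_lines() {
    # grep -c exits non-zero when the count is 0, hence the "|| true"
    grep -c -v -e '^HepMC::' -e '^[EAUWVP] ' || true
}

# Usage sketch (eg is a hypothetical program):
#   eg -n 10 | count_foreign_lines     # 0 means standard output is clean
printf 'HepMC::Version 3.02.06\nE 0 0 0\nbanner text\n' | count_foreign_lines
```

A non-zero count suggests the program needs a wrapper script that filters its output.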

If a particular `eg` program does not meet these requirements, then
a simple shell script can be defined to wrap the `eg` appropriately.
For example, the CRMC program `crmc` _can_ write HepMC events to
standard output, but it will also dump other stuff there.  Thus,
we can provide the script

    #!/bin/sh

    crmc "$@" -o hepmc3 -f /dev/stdout | \
       sed -n 's/^\(HepMC::\|[EAUWVP] \)/\1/p'

which simply filters the output of `crmc`.  Another EG program
may not accept the `-n NEVENTS` command line option, but instead
have the command line option `--nevents`, in which case we would
do something like

    #!/bin/sh
    cmdline="eg-program -o /dev/stdout "

    while test $# -gt 0 ; do
       case x$1 in
       # Translate -n NEVENTS into --nevents NEVENTS
       x-n) cmdline="$cmdline --nevents $2"; shift ;;
       *)   cmdline="$cmdline $1" ;;
       esac

       shift
    done

    $cmdline
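
A variant of the option translation, sketched under the assumption that some arguments may contain spaces: instead of concatenating a string, it rebuilds the positional parameters, so each argument stays a single word. `translate_args` is a hypothetical helper that only prints the translated arguments, one per line.

```sh
#!/bin/sh
# Translate "-n NEVENTS" into "--nevents NEVENTS" and pass everything
# else through unchanged.  Translated arguments are appended to the end
# of the positional parameters while the originals are shifted away.
translate_args() {
    n=$#                              # number of original arguments left
    while test "$n" -gt 0; do
        case $1 in
            -n) set -- "$@" --nevents "$2"; shift; n=$((n - 1)) ;;
            *)  set -- "$@" "$1" ;;
        esac
        shift
        n=$((n - 1))
    done
    printf '%s\n' "$@"
}

translate_args -n 100 -m 5
```

In a real wrapper the final `printf` would be replaced by something like `exec eg-program -o /dev/stdout "$@"`, with `eg-program` being the placeholder name used in the script above.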

The command line to run is specified as

    --configKeyValues "HepMC.progCmd=<program and options>"

and can include not only the program name but also other
options to the program.  For example

    --configKeyValues "HepMC.progCmd=crmc -m 5 -i 20800820 -I 20800820"

for Pb-Pb collisions with Hijing.

With this change, we can use _any_ event generator which is capable of
writing out its event records in the HepMC format.

The generator `GeneratorTParticle` will read in particles
from a `TChain` containing a branch with a `TClonesArray` of
`TParticle` objects.

The generator can operate in two modes

- Data is read from one or more existing files
- Data is read from a file being generated by a child
  program

The first mode is selected by

    -g tparticle --configKeyValues "TParticle.fileNames=foo.root,bar.root"

The second mode is selected by

    -g tparticle --configKeyValues "TParticle.progCmd=<program and options>"

For this latter mode, see also the recent commit to `GeneratorHepMC`.

Above, `<program and options>` specifies a program to spawn in the
background, which will write to a specified (temporary) file.
Suppose the program is called `eg`, then the following _must_ be
possible

    eg -n NEVENTS -o OUTPUT_FILENAME

That is, `eg` _must_ accept the option `-n` to set the number of
events to produce, and the option `-o` to set the output file name
(a ROOT file).
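
The calling convention can be illustrated with a small shell sketch. Everything here is hypothetical: `fake_root_eg` stands in for `eg` and writes a text file instead of a ROOT file, and polling for the output file is only an assumption about how a caller might wait for the child, not a description of what `GeneratorTParticle` does.

```sh
#!/bin/sh
# Hypothetical stand-in for `eg`: accepts -n and -o as required above,
# but writes a plain text file instead of a ROOT file.
fake_root_eg() {
    nev=1
    out=out.txt
    while test $# -gt 0; do
        case $1 in
            -n) nev=$2; shift ;;
            -o) out=$2; shift ;;
        esac
        shift
    done
    echo "events=$nev" > "$out"
}

# Spawn the program in the background with the two required options,
# then poll until the output file is non-empty (give up after 5 tries).
spawn_and_wait() {
    fake_root_eg -n "$1" -o "$2" &
    pid=$!
    tries=0
    while test ! -s "$2" && test "$tries" -lt 5; do
        sleep 1
        tries=$((tries + 1))
    done
    wait "$pid"
    test -s "$2"                      # succeed only if output was produced
}

tmp=$(mktemp -d)
spawn_and_wait 10 "$tmp/events.txt" && cat "$tmp/events.txt"
rm -rf "$tmp"
```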

The name of the `TTree` object in the file(s) can be set with

    --configKeyValues "TParticle.treeName=<name>"

(defaults to `T`), and similarly for the branch that contains the
`TClonesArray` of `TParticle`

    --configKeyValues "TParticle.branchName=<name>"

(defaults to `Particles`).
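
Several keys would normally be passed in a single `--configKeyValues` string. The sketch below assumes that key-value pairs are separated by `;`, which is not stated in this message; `join_keys` is a hypothetical helper.

```sh
#!/bin/sh
# Join key=value pairs into one string, separated by ";" (assumed to be
# the separator expected by --configKeyValues).
join_keys() {
    result=
    for kv in "$@"; do
        if test -z "$result"; then
            result=$kv
        else
            result="$result;$kv"
        fi
    done
    printf '%s\n' "$result"
}

# Usage sketch:
#   o2-sim -g tparticle --configKeyValues "$(join_keys ...)"
join_keys "TParticle.fileNames=foo.root,bar.root" \
          "TParticle.treeName=T" \
          "TParticle.branchName=Particles"
```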

The generator `GeneratorTParticle` _does not_ import any header
information into the simulation event record.   Some proper
convention could be decided upon, e.g., one that tracks the
HepMC event record format.
@cholmcc cholmcc marked this pull request as draft September 19, 2023 13:10
@cholmcc cholmcc marked this pull request as ready for review September 19, 2023 13:22
@sawenzel
Collaborator

Looks reasonable but would discuss this in the simulation meeting first of all.

@cholmcc
Contributor Author

cholmcc commented Sep 19, 2023

Hi,

> Looks reasonable ...

Great.

> ... but would discuss this in the simulation meeting first of all.

OK, a few issues

  • Can you advise as to when such a meeting will take place? Would you want my presence there?
  • This effort should be considered together with the effort to integrate Rivet into the $\mathrm{O}^{2}$ pipeline.
    • That effort is taking place in the project O2Rivet. Please see there for more on that project.
    • Rivet analyses may require more than just particles and vertices, namely also the auxiliary information typically present in the HepMC event header.
  • The idea of the changes to GeneratorHepMC is to allow integration of more event generators into the $\mathrm{O}^{2}$ pipeline. In particular, more and more EGs support HepMC. I've integrated a number of EGs myself, for example AMPT.

Yours,

Christian

///
/// --configKeyValues "TParticle.fileNames=foo.root,bar.root"
///
class GeneratorTParticle : public Generator
Collaborator

I am a bit unsure about this one: We already have a generator "GeneratorFromFile" which reads TParticles from a Run2-like ROOT file.

A quick check of whether we still need this one would be good.

Contributor Author

There's a difference between the GeneratorFromFile and GeneratorTParticle, mainly in that GeneratorTParticle

  • is more flexible
  • can read data from external program rather than from file
  • can read from a TChain of files
  • assumes all events are in the same TTree, unlike GeneratorFromFile which assumes there's one TTree for each event in the TDirectory named Event<_event-no_> in the single input file.

As such, the two classes GeneratorFromFile and GeneratorTParticle serve different needs:

  • GeneratorFromFile is designed to read from an AliROOT Kinematics.root file
  • GeneratorTParticle is designed to read from a general ROOT event tree, for example produced by a TGenerator.

Hope that clarifies the differences.

Contributor Author

Oh, and another difference. Using the key progCmd, one could execute a TGenerator as a background process. Say we have the script MyEG.C

#include <TClonesArray.h>
#include <TFile.h>
#include <TGenerator.h>
#include <TParticle.h>
#include <TRandom.h>
#include <TTree.h>

class MyGenerator : public TGenerator
{
   ...
};

void MyEG(const char* output, Int_t nEv, Int_t seed, Float_t bMax)
{
  gRandom->SetSeed(seed);

  MyGenerator generator(bMax);
  TFile* file = TFile::Open(output, "RECREATE");
  // Tree and branch names match the GeneratorTParticle defaults
  TTree* tree = new TTree("T", "T");
  TClonesArray* particles = new TClonesArray("TParticle");
  tree->Branch("Particles", &particles);
  tree->SetDirectory(file);

  for (Int_t iEv = 0; iEv < nEv; iEv++) {
    generator.GenerateEvent();
    generator.ImportParticles(particles);
    tree->Fill();
    // Save after every event so another process can read the tree while it is filled
    tree->AutoSave("SaveSelf FlushBaskets Overwrite");
  }
  file->Write();
  file->Close();
}

and a simple script like myeg.sh

#!/bin/sh

nev=1
out=myeg.root
seed=0
bmax=0

while test $# -gt 0; do
   case $1 in
   -n|--nevents) nev=$2 ; shift ;;
   -o|--output)  out=$2 ; shift ;;
   -b|--bMax)    bmax=$2; shift ;;  # Future proof
   -s|--seed)    seed=$2; shift ;;
   --help) echo "Usage: $0 [-n NEV] [-o OUTPUT] [-s SEED] [-b bmax]"; exit 0 ;;
   esac
   shift
done

# Compile and run the macro in batch mode, then quit
root -l -b -q "MyEG.C++(\"$out\",$nev,$seed,$bmax)"

then one can do

o2-sim -g tparticle --configKeyValues "TParticle.progCmd=myeg.sh"

which means all the old AliGenerators can be used more or less directly :-)

@benedikt-voelkel
Contributor

Would it make sense to factorise the 3 things in this PR?

  • GeneratorTParticle
  • aligning MCEventHeader fields
  • HepMC devs

In my opinion, your HepMC and MCEventHeader developments here might be worth their own PR and should already bring a lot of the flexibility you might need for your O2Rivet developments and tests.
In fact, it immediately gives users the possibility to play with other (very custom) generators.

Could you maybe remind me what the exact reason is for having the GeneratorTParticle? Maybe I missed that in your presentation yesterday. Is it a blocker for O2Rivet developments or could they go forward at this point without another generator type?

After that, another PR could introduce further O2 generator developments.

@cholmcc
Contributor Author

cholmcc commented Sep 28, 2023

Hi Sandro,

I missed your last comment. Perhaps the point raised above could be enough to convince you that GeneratorTParticle should also go in #11699

GeneratorTParticle is not a stumbling block - in any way - for me, just another flexible option for everyone to play with.

If you insist on taking out GeneratorTParticle, could you please advise how best to take it out of the MR? I suspect it means doing git rm on the files, making some other edits, and then adding a new commit to the MR branch, but if you know of a better way, please let me know. Thanks.

@cholmcc
Contributor Author

cholmcc commented Oct 2, 2023

Hi Sandro,

Any news on this? Please advise as to what you see as the way forward (vis-a-vis the discussion here and on MatterMost). Thanks.

@sawenzel
Collaborator

sawenzel commented Oct 4, 2023

I don't mind merging this as a whole and there is no problem introducing the GeneratorTParticle class (the existing one is indeed specific to AliRoot).

However, I'd like to insist that
(a) the configurable key is not just called "TParticle". It is just not clear from this name, that it is configuring a generator.

and

(b) while you are modifying the PR, please also add the usage examples (for HepMC, GeneratorTParticle) in /run/SimExamples.

@cholmcc cholmcc requested a review from a team as a code owner October 6, 2023 09:50
sawenzel previously approved these changes Oct 6, 2023
…nd makes life so much more difficult than it needs to be for very little gain
… that it is so pedantic that you cannot have two blank lines at the end of a simple shell script - this is stupid and _very_ counter productive. Please please _please_ change the policy on this
This is exactly what happens when you make the stupid arse, over zealous
code checkers.  I get frustrated and out of sheer oversight, commit some
stuff that shouldn't be committed and then I have to go back and fix it
again.  That is so annoying, counter-productive, and down right silly.

This could all be avoided if the code checker did _reasonable_ checks.
Who cares if there's an extra blank line in a shell script, or other
such annoying things.

In fact, if it is so important, then git hooks that _automatically_
fix these issues should be instated instead of relying on the
developers to do these f**ked up things over and over again.

Please, please, _please_ fix this.
@sawenzel sawenzel closed this Oct 9, 2023
@cholmcc cholmcc deleted the cholmcc_generatos branch November 21, 2023 15:50