
Add Gaudi Functional C++ Class Generator #372

Open
ianna wants to merge 20 commits into key4hep:main from ianna:ianna/gaudi_functional_generator
@ianna ianna commented Jan 16, 2026

BEGINRELEASENOTES

  • Gaudi Functional C++ Class Generator

ENDRELEASENOTES

Gaudi Functional C++ Class Generator

This script generates Gaudi Functional C++ classes with appropriate structure and boilerplate code based on user-defined specifications. It supports both Gaudi::Functional and k4FWCore frameworks.

Features

  • Framework support: Both Gaudi and k4FWCore (default: k4FWCore)
  • 4 functional types: Consumer, Producer, Transformer, FilterPredicate
  • Automatic template generation: Creates proper template signatures based on type
  • Smart type parsing: Handles complex types like podio::UserDataCollection<float>
  • Constructor scaffolding: Generates KeyValue/KeyValues initialization automatically
  • Type-safe operator(): Creates the correct signature and basic implementation template
  • Gaudi properties: Add configurable properties with descriptions
  • EDM4hep support: Automatic includes for edm4hep and podio types
  • Command tracking: Records the generation command in the output file
  • Consistent conventions: k4FWCore uses struct by default, Gaudi uses class by default
  • Help system: Run with -h to see all functional types and examples

Input Format

For inputs/outputs: TypeName:LocationName

  • TypeName: Required (supports templates like podio::UserDataCollection<float>)
  • LocationName: Optional (defaults to type name without namespace/Collection suffix)

For properties: Type:Name:DefaultValue:Description
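One wrinkle in both formats is that C++ type names contain ::, so a plain split on : would cut through the namespace. A minimal sketch of how such specs could be split, assuming a single colon is the field separator and a double colon always belongs to the type. The helper names parse_io_spec and parse_property_spec are hypothetical and not taken from gaudi_gen.py:

```python
import re

# A field separator is a single ':' that is not part of a '::' qualifier.
_SEPARATOR = re.compile(r"(?<!:):(?!:)")


def parse_io_spec(spec):
    """Split 'TypeName[:LocationName]' into (type_name, location)."""
    parts = _SEPARATOR.split(spec, maxsplit=1)
    type_name = parts[0]
    if len(parts) == 2 and parts[1]:
        return type_name, parts[1]
    # Default location: drop the namespace and the 'Collection' suffix.
    return type_name, type_name.split("::")[-1].replace("Collection", "")


def parse_property_spec(spec):
    """Split 'Type:Name:DefaultValue:Description' into its four fields."""
    # maxsplit=3 keeps any further colons inside the description.
    parts = _SEPARATOR.split(spec, maxsplit=3)
    if len(parts) != 4:
        raise ValueError(f"expected Type:Name:DefaultValue:Description, got {spec!r}")
    return tuple(parts)
```

With this split, 'edm4hep::TrackCollection' without a location falls back to the location 'Track'.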

Examples

k4FWCore Producer (reproduces ExampleFunctionalProducerMultiple)

python gaudi_gen.py MyProducer producer \
  -o 'podio::UserDataCollection<float>:VectorFloat' \
     'edm4hep::MCParticleCollection:MCParticles1' \
     'edm4hep::MCParticleCollection:MCParticles2' \
     'edm4hep::SimTrackerHitCollection:SimTrackerHits' \
     'edm4hep::TrackerHit3DCollection:TrackerHits' \
     'edm4hep::TrackCollection:Tracks' \
     'edm4hep::ReconstructedParticleCollection:RecoParticles' \
     'edm4hep::RecoMCParticleLinkCollection:Links' \
  -p 'int:ExampleInt:3:Example int that can be used in the algorithm' \
     'int:magicNumberOffset:0:Integer to add to the dummy values'

k4FWCore Consumer

python gaudi_gen.py MyConsumer consumer \
  -i 'edm4hep::MCParticleCollection:MCParticles'

k4FWCore Transformer

python gaudi_gen.py MyTransformer transformer \
  -i 'edm4hep::TrackCollection:InputTracks' \
  -o 'edm4hep::ReconstructedParticleCollection:RecoParticles'

Gaudi Framework Examples

# Simple transformer (like MySum example)
python gaudi_gen.py MySum transformer \
  -i "Input1:Input1Loc" "Input2:Input2Loc" \
  -o "OutputData:OutputLoc" \
  --framework gaudi

# Consumer
python gaudi_gen.py EventMonitor consumer \
  -i "EventData:EventLoc" \
  --framework gaudi

# Filter predicate
python gaudi_gen.py MyFilter filter \
  -i "Track:TrackLoc" \
  -o "bool:FilterResult" \
  --framework gaudi

# With namespace
python gaudi_gen.py MyAlgorithm transformer \
  -i "InputType:InputLoc" \
  -o "OutputType:OutputLoc" \
  -n "MyNamespace" \
  --framework gaudi

Command-Line Options

positional arguments:
  class_name            Name of the C++ class to generate
  functional_type       Type of functional (consumer, producer, transformer, filter)

optional arguments:
  -i, --inputs          Input data specifications
  -o, --outputs         Output data specifications
  -n, --namespace       Namespace for the class
  -f, --output-file     Output file name (default: <ClassName>.cpp)
  --framework           Target framework: gaudi or k4fwcore (default: k4fwcore)
  --class               Generate as class instead of struct (struct is default for k4fwcore)
  -p, --properties      Gaudi properties (format: "Type:Name:Default:Description")
  -h, --help            Show help message

ianna added 5 commits January 13, 2026 12:07
This script generates Gaudi Functional C++ classes with appropriate structure and boilerplate code based on user-defined specifications.
Updated the Gaudi Functional C++ Class Generator to support k4FWCore framework, improved input/output specifications, and added new functionalities for property parsing and class generation.
Added command_line parameter to generate_class function and updated its usage in main.
@ianna
Author

ianna commented Jan 16, 2026

@BrieucF - please, check. Thanks!

@jmcarcell
Member

Without any tests, inevitably this will not work in the future if something changes and we won't be able to know.

@ianna
Author

ianna commented Jan 19, 2026

> Without any tests, inevitably this will not work in the future if something changes and we won't be able to know.

Thanks for looking into it. A very good point! Shall I add the generation and compilation for it to CI tests? Thanks

@Zehvogel
Contributor

@tmadlener weren't you also working on something like this? :)

@tmadlener
Member

Yes, sort of. I have a semi-working thing for a static website. But that also doesn't have the tests it should have. In the end, I don't care too much which version lands as long as one does.

Member

@tmadlener tmadlener left a comment

Thanks a lot for this. I think the general idea of the script goes in exactly the right direction, i.e. we want something that generates the boilerplate for whatever the user wants. I didn't look in extreme detail, but I see a few general improvements (also partially highlighted in the inline comments):

  • Currently the user needs to know which functional type they want. But this is entirely specified by the number of inputs and outputs, so I would just determine that from there.
  • Generally the implementation could probably be improved by sprinkling a few classes into the whole thing for holding intermediate information instead of passing around tuples of strings of various lengths. This would probably also make it easier to simply do some of the parsing / processing up-front and then pass the information around instead of re-parsing it several times.
  • The current implementation does not handle the possibility of variable length inputs / outputs (I think). This is only possible in k4FWCore Functional algorithms though.
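The first two points could be sketched together as follows: a small dataclass holds each parsed input or output, and the functional type is derived from the counts. The names DataSpec and infer_functional_type are hypothetical. One caveat of this assumption: a FilterPredicate takes inputs and produces no data outputs, just like a Consumer, so it cannot be told apart by counting alone and would probably still need an explicit option.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSpec:
    """One input or output: its C++ type and its location in the event store."""
    type_name: str
    location: str


def infer_functional_type(inputs, outputs):
    """Pick the functional type from the number of inputs and outputs."""
    if not inputs and not outputs:
        raise ValueError("need at least one input or one output")
    if not inputs:
        return "producer"
    if not outputs:
        # A filter has the same shape; it would need an explicit request.
        return "consumer"
    return "transformer" if len(outputs) == 1 else "multitransformer"
```

Parsing each specification once into a DataSpec also means later stages can read spec.type_name and spec.location instead of unpacking tuples of strings.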

I think this is borderline complex enough to warrant the use of a template engine to handle all the string formatting. It introduces some overhead and would require writing some templates, but it would potentially simplify the Python parts of the script by quite a bit.
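To illustrate what moving the formatting into a template might look like, here is a minimal sketch that uses the standard library's string.Template as a stand-in for a full template engine. The template text and the render_consumer helper are hypothetical and only cover a k4FWCore consumer with a single input; the C++ layout is an assumption modelled on the k4FWCore examples, not the script's actual output.

```python
from string import Template

# Hypothetical template for a k4FWCore consumer with exactly one input.
CONSUMER_TEMPLATE = Template('''\
#include "${header}"
#include "k4FWCore/Consumer.h"

#include <string>

struct ${name} final : k4FWCore::Consumer<void(const ${in_type}& input)> {
  ${name}(const std::string& name, ISvcLocator* svcLoc)
      : Consumer(name, svcLoc, KeyValues("${key}", {"${location}"})) {}

  void operator()(const ${in_type}& input) const override {
    // TODO: implement the algorithm
  }
};

DECLARE_COMPONENT(${name})
''')


def render_consumer(name, in_type, key, location, header):
    """Fill the template; substitute() raises KeyError on a missing field."""
    return CONSUMER_TEMPLATE.substitute(
        name=name, in_type=in_type, key=key, location=location, header=header
    )
```

The Python side then only collects values, while the shape of the generated file lives in one readable block of text.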

As already mentioned we definitely need tests that ensure that the outputs compile. That would probably be the first thing I do, because once that is in place refactoring and extending the implementation can be done with some guard rails.

Comment on lines +132 to +133
if len(out_types) == 0:
    return "void()"
Member

This case can never happen as this is already ruled out in the parsing of the arguments. Also, I think even if this were to happen, it doesn't compile because a Producer needs at least one output.

# Remove Collection suffix and namespace
clean_name = typ.split('::')[-1].replace('Collection', '')
loc = clean_name
lines.append(f'KeyValues("{loc}", {{"{loc}"}})')
Member

I think from the current implementation there is no way we will ever need the KeyValues, but we always want the KeyValue. Unless the current version already handles inputs/outputs of type std::vector<const XYZCollection*> in which case there should be a differentiation here and only those get a KeyValues (of the appropriate length), single inputs should get a KeyValue.
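The differentiation described here could look roughly like the sketch below. It assumes that a variable-length input or output is recognisable by a type starting with std::vector< and that everything else is a single collection; the helper name key_initializer is hypothetical.

```python
def key_initializer(name, type_name, locations):
    """Return the C++ KeyValue or KeyValues initializer for one input/output."""
    if type_name.startswith("std::vector<"):
        # Variable-length case: one KeyValues listing all locations.
        locs = ", ".join(f'"{loc}"' for loc in locations)
        return f'KeyValues("{name}", {{{locs}}})'
    if len(locations) != 1:
        raise ValueError(f"{name}: a single collection needs exactly one location")
    return f'KeyValue("{name}", "{locations[0]}")'
```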

Comment on lines +293 to +297
if 'edm4hep::' in typ:
    # Extract all collection types (handle nested templates)
    # Match patterns like edm4hep::MCParticleCollection
    pattern = r'edm4hep::(\w+Collection)'
    matches = re.findall(pattern, typ)
Member

I don't think this differentiation is necessary at all. Algorithms will always take Collections as inputs/outputs. (I think trying to do anything different will not even compile).
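If every input and output is a collection, the includes can be derived with one rule and no special case. A minimal sketch, assuming headers follow the layout namespace/NameCollection.h, as they do for the edm4hep collections and podio::UserDataCollection; the helper name collection_includes is hypothetical.

```python
import re

# Matches e.g. edm4hep::MCParticleCollection, also inside std::vector<...>.
_COLLECTION = re.compile(r"\b(edm4hep|podio)::(\w+Collection)\b")


def collection_includes(type_names):
    """Return the sorted, de-duplicated #include lines for the given types."""
    found = set()
    for type_name in type_names:
        for namespace, collection in _COLLECTION.findall(type_name):
            found.add(f'#include "{namespace}/{collection}.h"')
    return sorted(found)
```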

Comment on lines +138 to +144
elif functional_type == 'transformer':
    in_sig = ', '.join([f"const {t}&" for t in in_types]) if in_types else ""
    if len(out_types) == 1:
        return f"{out_types[0]}({in_sig})"
    else:
        out_sig = ', '.join(out_types)
        return f"std::tuple<{out_sig}>({in_sig})"
Member

I don't think this compiles for multiple outputs. It's either a Transformer (single output) or a MultiTransformer (multiple outputs), so depending on that we also need to generate a different class here.
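One way to keep the base class and the signature in sync would be to return both from the same function. A minimal sketch under that assumption; functional_base is a hypothetical name and the function only covers the case with at least one input and one output.

```python
def functional_base(in_types, out_types):
    """Return (base_class, signature) for an algorithm with inputs and outputs."""
    if not in_types or not out_types:
        raise ValueError("a (Multi)Transformer needs at least one input and one output")
    in_sig = ", ".join(f"const {t}&" for t in in_types)
    if len(out_types) == 1:
        return "Transformer", f"{out_types[0]}({in_sig})"
    # Several outputs are returned as a tuple and need the MultiTransformer base.
    return "MultiTransformer", f"std::tuple<{', '.join(out_types)}>({in_sig})"
```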

Member

@tmadlener tmadlener left a comment

As already mentioned in today's meeting, it would be nice to have tests for this. Given that you also generate a minimal CMake script, this should be fairly straightforward: for various options (I would say each type of algorithm at least once) one would

  • Run the script (probably inside a "sandbox" directory) to generate source code and cmake
  • Run cmake
  • Run build

The main problem with that is that we would have to hook that into the existing test harness in the CMakeLists.txt if we just want to run it as part of the "standard" CI. An alternative option would be a custom github actions workflow where we could run this without that harness. (Opinions @jmcarcell?).

In any case, I think a first step would be to just write some bash scripts for each option with the three steps above; we can then decide later where and how to hook them up to CI.
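The same three steps can be sketched in Python rather than bash, to match the language of the generator. This sketch assumes the script writes the generated source and its CMakeLists.txt into the current working directory, which is not confirmed here; build_steps and run_steps are hypothetical helper names.

```python
import subprocess
import sys
from pathlib import Path


def build_steps(script, script_args, sandbox):
    """Return the commands for: generate, configure, build."""
    sandbox = Path(sandbox).resolve()
    build_dir = sandbox / "build"
    return [
        [sys.executable, str(Path(script).resolve()), *script_args],
        ["cmake", "-S", str(sandbox), "-B", str(build_dir)],
        ["cmake", "--build", str(build_dir)],
    ]


def run_steps(steps, sandbox):
    """Run each command inside the sandbox; stop at the first failure."""
    sandbox = Path(sandbox).resolve()
    sandbox.mkdir(parents=True, exist_ok=True)
    for command in steps:
        subprocess.run(command, cwd=sandbox, check=True)
```

Calling build_steps once per algorithm type and passing the result to run_steps would give one sandboxed generate-configure-build cycle per option.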


Some other general comments:

  • gaudi_gen.py is probably not descriptive enough. In the end people would like to use this from within a Key4hep environment, and I think something like generateFunctional.py or generateAlgorithm.py (or even without the .py suffix) would make that a bit more explicit.
  • This needs to be installed for it to be truly useful, otherwise people will have to clone k4FWCore and get the script from there.

For that we could do something similar to what we do for k4run:

gaudi_install(SCRIPTS)

In this case it would probably have to be something like

gaudi_install(SCRIPTS helpers)

or alternatively via an explicit call to cmake install.

@jmcarcell
Member

I tried this and there are several options that are not explained; when I tried them, they did not seem to do anything, for example --all-keyvalues or --type-aliases.

> The main problem with that is that we would have to hook that into the existing test harness in the CMakeLists.txt if we just want to run it as part of the "standard" CI. An alternative option would be a custom github actions workflow where we could run this without that harness. (Opinions @jmcarcell?).

Hmm if it doesn't run in ctest then it's not tested regularly, but things don't change that much here so that it would stop working. If it's going to use the cmake it creates then I think it has to be outside because it calls find_package(k4FWCore) that won't be there until k4FWCore is installed, and probably is more similar to how people will use it in the future. In that case we can have a different workflow in .github/workflows that is unrelated to the build one, that may also build k4FWCore.

I wonder at which point, in a repo that already has algorithms, it's just easier to copy and paste an existing algorithm than to get the command right to run the script. If there are many different types, one has to copy and paste all of them with all the edm4hep:: prefixes. Maybe edm4hep:: could be dropped, assuming it's an EDM4hep collection (true for now, but probably not in the future). Also, one has to remember the colon separators:

python test.py MyProducer -o 'edm4hep::MCParticleCollection:MCParticles' 'edm4hep::TrackCollection:Tracks' -i "edm4hep::MCParticleCollection" -p 'int:ExampleInt:3:An example integer property'
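Dropping the prefix on the command line could be handled by a small normalisation step. A minimal sketch, assuming that any type written without a namespace is meant to be an EDM4hep collection; the helper name qualify is hypothetical, and types from other namespaces would still have to be spelled out in full.

```python
def qualify(type_name, default_namespace="edm4hep"):
    """Prepend the default namespace to types given without one."""
    if "::" in type_name:
        return type_name  # already qualified, e.g. podio::UserDataCollection<float>
    return f"{default_namespace}::{type_name}"
```

With this, -o 'MCParticleCollection:MCParticles' and -o 'edm4hep::MCParticleCollection:MCParticles' would describe the same output.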

ianna and others added 8 commits April 15, 2026 23:05
Co-authored-by: Thomas Madlener <thomas.madlener@desy.de>
Co-authored-by: Thomas Madlener <thomas.madlener@desy.de>
Add documentation for gaudi_gen.py script, including usage, requirements, arguments, and examples.
Refactor gaudi_gen.py to address reviewers comments, improve code organization and readability. Changes include removing unused imports, updating function signatures, and enhancing argument help descriptions.
@ianna ianna requested review from andresailer and tmadlener April 29, 2026 15:50
ianna added 2 commits April 29, 2026 17:54
Updated shebang to use 'uv run' for script execution and added metadata for dependencies.
Clarified requirements and usage instructions for the script, specifying the need for Jinja2 only in certain contexts and providing detailed invocation methods.

5 participants