Add support for parsing decay descriptors by admorris · Pull Request #573 · scikit-hep/decaylanguage

admorris · 2026-04-27T18:31:44Z

Implements the rest of #200

Added DecayChain.from_string method

Decay descriptors are parsed with Lark. A Transformer class converts them into DecayChainDict objects, which are then used to initialise DecayChain objects.
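
As a rough illustration of the descriptor-to-dictionary step described above, here is a minimal sketch. It deliberately does not use Lark or the PR's Transformer: it is a hand-written recursive-descent stand-in, and the names `tokenize`, `parse_decay` and `descriptor_to_dict` are hypothetical. The `bf`, `fs`, `model` and `model_params` keys are assumed to follow the layout produced by `DecayChain.to_dict`, and the values filled in here are placeholders only. Tokens must be separated by whitespace around the arrow.

```python
import re

# Tokens: arrow, parentheses, or a particle name (simplified character set,
# an assumption for this sketch and not the grammar shipped in the PR).
TOKEN_RE = re.compile(r"\s*(->|\(|\)|[A-Za-z0-9_][A-Za-z0-9_+*'~-]*)")
STRUCTURAL = ("->", "(", ")")


def tokenize(descriptor):
    """Split a descriptor string into arrow, parenthesis and particle tokens."""
    descriptor = descriptor.strip()
    tokens, pos = [], 0
    while pos < len(descriptor):
        match = TOKEN_RE.match(descriptor, pos)
        if match is None:
            raise ValueError(f"Unexpected text: {descriptor[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_decay(tokens, pos=0):
    """Parse 'mother -> daughters...' from tokens[pos]; return (dict, next_pos)."""
    if pos >= len(tokens) or tokens[pos] in STRUCTURAL:
        raise ValueError("Expected a mother particle name")
    mother = tokens[pos]
    if pos + 1 >= len(tokens) or tokens[pos + 1] != "->":
        raise ValueError(f"Expected '->' after {mother!r}")
    pos += 2
    final_state = []
    while pos < len(tokens) and tokens[pos] != ")":
        if tokens[pos] == "(":
            # A parenthesised sub-decay becomes a nested dictionary.
            sub, pos = parse_decay(tokens, pos + 1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("Unbalanced parentheses")
            pos += 1
            final_state.append(sub)
        elif tokens[pos] == "->":
            raise ValueError("Unexpected '->'")
        else:
            final_state.append(tokens[pos])
            pos += 1
    if not final_state:
        raise ValueError(f"No daughters given for {mother!r}")
    # Placeholder values; only the nesting of "fs" matters for this sketch.
    mode = {"bf": 1.0, "fs": final_state, "model": "", "model_params": ""}
    return {mother: [mode]}, pos


def descriptor_to_dict(descriptor):
    """Convert a whole descriptor into one nested dictionary."""
    tokens = tokenize(descriptor)
    result, pos = parse_decay(tokens)
    if pos != len(tokens):
        raise ValueError("Unexpected trailing tokens")
    return result


print(descriptor_to_dict("D*+ -> (D0 -> K+ pi-) pi+"))
```

Anything that does not fit the expected shape raises a `ValueError` instead of being silently accepted.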

Custom descriptor formats can be used by passing the path to another .lark file as an argument of DecayChain.from_string. In practice, a custom grammar can only redefine ARROW, LPAR and RPAR; the rest of the structure is assumed by the Transformer. For example, I did not find a way to support sub-decays written with the mother outside the brackets, such as A -> B (-> C D) E

One glaring limitation (which is inherent to DecayChain/_build_decay_modes) is that sub-decays of identically named particles are not supported, e.g. "B_s0 -> (phi -> K+ K-) (phi -> K+ K-)" will result in an exception. This could possibly be handled by adding internal/hidden uniqueness when duplicates are encountered.
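
The "internal/hidden uniqueness" idea could look roughly like the sketch below. This is not part of the PR: `uniquify_mothers`, `strip_tag` and the `#` suffix convention are hypothetical, and whether DecayChain could consume such tagged keys has not been checked. The input is assumed to be a nested dictionary with a single mother key and a list of modes, each holding an `fs` list.

```python
from collections import Counter


def uniquify_mothers(chain, seen=None):
    """Return a copy of a nested decay dict where repeated mothers get a hidden tag.

    The first occurrence of a mother name is left untouched; later ones
    become "name#1", "name#2", and so on (hypothetical convention).
    """
    if seen is None:
        seen = Counter()
    ((mother, modes),) = chain.items()  # exactly one mother per dict is assumed
    index = seen[mother]
    seen[mother] += 1
    key = mother if index == 0 else f"{mother}#{index}"
    new_modes = []
    for mode in modes:
        new_fs = [
            uniquify_mothers(item, seen) if isinstance(item, dict) else item
            for item in mode["fs"]
        ]
        new_modes.append({**mode, "fs": new_fs})
    return {key: new_modes}


def strip_tag(name):
    """Recover the visible particle name from a tagged key."""
    return name.split("#", 1)[0]


chain = {
    "B_s0": [
        {
            "fs": [
                {"phi": [{"fs": ["K+", "K-"]}]},
                {"phi": [{"fs": ["K+", "K-"]}]},
            ]
        }
    ]
}
tagged = uniquify_mothers(chain)
print([next(iter(d)) for d in tagged["B_s0"][0]["fs"]])  # ['phi', 'phi#1']
```

The tag would have to be stripped again (as `strip_tag` does) wherever names are shown to the user or looked up.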

eduardo-rodrigues · 2026-04-30T09:48:36Z

Hi @admorris, I have not forgotten to check this. It's just that I have been working on urgent and important stuff. Will get back to you very soon.

eduardo-rodrigues · 2026-05-01T13:05:51Z

+
+// Particle names start with alphanumeric/underscore and can then include
+// common descriptor suffix symbols such as +, -, *, ', and ~.
+PARTICLE: /[A-Za-z0-9_][A-Za-z0-9_+*'~̄-]*/


This definition does not match that in https://github.com/scikit-hep/decaylanguage/blob/main/src/decaylanguage/data/decfile.lark. Some names may be overlooked, such as those with parentheses?

Allowing parentheses is causing a headache where it matches the parentheses from a sub-decay as part of the first or last particle name. I am trying to debug it without coming up with grotesque regex patterns.

Ah, indeed that's not an easy one!
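
The parenthesis problem discussed above can be reproduced with Python's `re` module alone. The patterns below are simplified stand-ins and not the terminals from either .lark file, and `BALANCED` is only one possible compromise, untested against the real grammar: it accepts parentheses solely as a balanced group of digits inside a name.

```python
import re

# No parentheses allowed in names (close to the PARTICLE terminal quoted above).
STRICT = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_+*'~-]*")
# Parentheses added naively to the character class.
PERMISSIVE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_+*'~()-]*")
# Hypothetical compromise: at most one balanced "(digits)" group inside a name.
BALANCED = re.compile(
    r"[A-Za-z0-9_][A-Za-z0-9_+*'~-]*(?:\([0-9]+\)[A-Za-z0-9_+*'~-]*)?"
)

tail = "pi-) pi+"    # end of the sub-decay in "(D0 -> K+ pi-) pi+"
name = "K*(892)0"    # a particle name that contains parentheses

print(STRICT.match(tail).group())      # pi-
print(PERMISSIVE.match(tail).group())  # pi-)   swallows the closing parenthesis
print(BALANCED.match(tail).group())    # pi-

print(STRICT.match(name).group())      # K*     truncates the name
print(PERMISSIVE.match(name).group())  # K*(892)0
print(BALANCED.match(name).group())    # K*(892)0
```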

eduardo-rodrigues · 2026-05-01T13:12:24Z

+    def sub_decay(self, items: list[Any]) -> DecayChainDict:
+        # sub_decay: LPAR decay RPAR
+        for item in items:
+            if isinstance(item, dict):


Just for my understanding - this code here would only ever return the first found dict in the list. Is there some subtlety I'm missing for things to work overall, likely thanks to the way the Lark Transformer works?

The items parameter is always a list. In this case we should always expect a list of length 1. Maybe I could raise an exception if it's a different length.

Seems better IMO. Basically anything that is not in the expected format should get an exception.
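
A stricter version of that check might look like the sketch below. It is written as a standalone function so it can run without Lark; `extract_single_decay` is a hypothetical name, and whether `items` also contains the LPAR and RPAR tokens depends on the grammar, so the sketch simply ignores anything that is not a dict.

```python
from typing import Any


def extract_single_decay(items: list[Any]) -> dict:
    """Return the one decay dict in items, or raise if there is not exactly one."""
    decays = [item for item in items if isinstance(item, dict)]
    if len(decays) != 1:
        raise ValueError(
            f"Expected exactly one decay in a sub-decay, found {len(decays)}"
        )
    return decays[0]


# Works whether or not the parenthesis tokens are passed along.
print(extract_single_decay(["(", {"D0": []}, ")"]))
print(extract_single_decay([{"D0": []}]))
```

Compared with returning the first dict found, this fails loudly on both an empty list and a list with several decays.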

eduardo-rodrigues · 2026-05-01T13:23:03Z

        return cls(mother, decay_modes)

+    @classmethod
+    def from_string(


At some point it would make sense to "synchronise" this from_string function with the existing to_string one, since they should effectively be the "mirror of each other". Otherwise, this function could be named from_descriptor. WDYT?

eduardo-rodrigues · 2026-05-01T13:23:36Z

+        descriptor : str
+            The decay descriptor string, e.g.
+            ``"D*+ -> (D0 -> K+ pi-) pi+"``.
+        grammar_file : str or Path, optional


This could be a place to state where the default grammar is available? Seems useful and relevant. This would be similar to the docs of edit_model_name_terminals in https://github.com/scikit-hep/decaylanguage/blob/main/src/decaylanguage/dec/dec.py. WDYT?

eduardo-rodrigues · 2026-05-01T13:28:38Z

+        cls,
+        descriptor: str,
+        *,
+        grammar_file: str | Path | None = None,


You could do something similar to https://github.com/scikit-hep/decaylanguage/blob/main/src/decaylanguage/dec/dec.py#L300, meaning provide the name of the default grammar, stating where it is located (I realise the code docstring I am referring to could be improved a bit). Similarly, you could use https://github.com/scikit-hep/decaylanguage/blob/main/src/decaylanguage/dec/dec.py#L313 below?

eduardo-rodrigues · 2026-05-01T13:31:26Z

+
+// Particle names start with alphanumeric/underscore and can then include
+// common descriptor suffix symbols such as +, -, *, ', and ~.
+PARTICLE: /[A-Za-z0-9_][A-Za-z0-9_+*'~̄-]*/


In any case, why do you need this copy? If it is needed, it is worth adding a comment about it in the file, I reckon.

I see that you test with a double arrow. Yeah, just add a comment about it so that the reason is obvious to any reader :).

eduardo-rodrigues

Thank you so much for this, @admorris! It's a really nice enhancement 👍.

I left a few little suggestions but this is looking excellent anyway.

I am well aware of the limitation you point out. It does annoy me.

Co-authored-by: Eduardo Rodrigues <eduardo.rodrigues@cern.ch>

admorris added 2 commits April 27, 2026 18:53

Add support for parsing decay descriptors

2ef31a0

add test for alternative .lark file

d536b82

eduardo-rodrigues added the enhancement New feature or request label May 1, 2026

eduardo-rodrigues reviewed May 1, 2026

Comment thread src/decaylanguage/data/descriptor.lark

eduardo-rodrigues reviewed May 1, 2026

Comment thread src/decaylanguage/decay/decay.py

eduardo-rodrigues reviewed May 1, 2026

Comment thread src/decaylanguage/decay/decay.py Outdated

eduardo-rodrigues reviewed May 1, 2026

admorris and others added 3 commits May 7, 2026 17:18

Apply suggestion

187e64a

Co-authored-by: Eduardo Rodrigues <eduardo.rodrigues@cern.ch>

describe descriptor.lark in the local README

0c82435

use lexer="auto"

2ce2166
