PyOMeta is an implementation of OMeta2, an object-oriented pattern-matching language developed by Alessandro Warth. PyOMeta provides a compact syntax based on Parsing Expression Grammars for common lexing, parsing and tree-transforming activities in a way that's easy to reason about for Python programmers.
PyOMeta compiles a grammar to a Python class, with the rules as methods. The rules specify parsing expressions, which consume input and return values if they succeed in matching.
foo = ....
: Define a rule named foo.
expr1 expr2
: Match expr1 and then match expr2 if it succeeds, returning the value of
expr2. Like Python's and.
expr1 | expr2
: Try to match expr1 --- if it fails, match expr2 instead. Like Python's or.
expr*
: Match expr zero or more times, returning a list of matches.
expr+
: Match expr one or more times, returning a list of matches.
expr?
: Try to match expr. Returns None if it fails to match.
~expr
: Fail if the next item in the input matches expr.
ruleName
: Call the rule ruleName.
'x'
: Match the literal character 'x'.
expr:name
: Bind the result of expr to the local variable name.
-> pythonExpression
: Evaluate the given Python expression and return its result.
!(pythonExpression)
: Evaluate the given Python expression and return its result (this is used in the rule
definition part).
# this is a comment
: Comments work like Python comments: they start with # and extend to the end of the line.
<expr>
: The consumed-by operator returns the sub-sequence of the input that contains the elements
matched by the enclosed expression expr.
@<expr>
: The index-consumed-by operator returns an array with the start and end indices of the
elements consumed by the enclosed expression expr.
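
To make the operator semantics above concrete, here is a minimal plain-Python sketch of a few of them as parser combinators. It does not use PyOMeta and is not how PyOMeta is implemented; every name in it (`ParseError`, `char`, `seq`, `choice`, `many`, `optional`, `consumed_by`, `index_consumed_by`) is hypothetical. It also assumes the end index returned by the index-consumed-by operator is exclusive, which the text above does not specify.

```python
class ParseError(Exception):
    """Raised when a parsing expression fails to match."""


def char(c):
    """'x': match one literal character."""
    def parse(s, i):
        if i < len(s) and s[i] == c:
            return c, i + 1
        raise ParseError("expected %r at %d" % (c, i))
    return parse


def seq(*exprs):
    """expr1 expr2: match each in turn, returning the last value."""
    def parse(s, i):
        value = None
        for e in exprs:
            value, i = e(s, i)
        return value, i
    return parse


def choice(*exprs):
    """expr1 | expr2: ordered choice, the first alternative that matches wins."""
    def parse(s, i):
        for e in exprs:
            try:
                return e(s, i)
            except ParseError:
                pass  # try the next alternative from the same position
        raise ParseError("no alternative matched at %d" % i)
    return parse


def many(expr, at_least=0):
    """expr* (at_least=0) or expr+ (at_least=1): collect matches in a list."""
    def parse(s, i):
        values = []
        while True:
            try:
                value, i = expr(s, i)
            except ParseError:
                break
            values.append(value)
        if len(values) < at_least:
            raise ParseError("expected at least %d match(es)" % at_least)
        return values, i
    return parse


def optional(expr):
    """expr?: the match if there is one, otherwise None (nothing consumed)."""
    def parse(s, i):
        try:
            return expr(s, i)
        except ParseError:
            return None, i
    return parse


def consumed_by(expr):
    """<expr>: return the slice of the input that expr consumed."""
    def parse(s, i):
        _, j = expr(s, i)
        return s[i:j], j
    return parse


def index_consumed_by(expr):
    """@<expr>: return [start, end] of what expr consumed (end exclusive here)."""
    def parse(s, i):
        _, j = expr(s, i)
        return [i, j], j
    return parse


ab = seq(char("a"), char("b"))
print(ab("abc", 0))                                         # ('b', 2)
print(many(choice(char("a"), char("b")))("abba!", 0))       # (['a', 'b', 'b', 'a'], 4)
print(optional(char("x"))("abc", 0))                        # (None, 0)
print(consumed_by(many(char("a"), at_least=1))("aaab", 0))  # ('aaa', 3)
print(index_consumed_by(ab)("abc", 0))                      # ([0, 2], 2)
```

Each combinator returns a `(value, new_position)` pair, so a failed alternative leaves the position untouched and the next alternative starts from the same place.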
| "kind of thing" | PyOMeta | Note |
|---|---|---|
| boolean | true | |
| number | 123 | |
| character | 'x' | |
| string | "foo" | |
| rule application | expr | |
| | r(x, y) | 1 |
| | ^digit | 4 |
| list | ['x' 1] | |
| grouping | (foo bar) | |
| negation | ~'x' | |
| look-ahead | ~~'x' | |
| semantic predicate | ?(x > y) | 3 |
| semantic action | -> (x + y) | 3 |
| | !(x + y) | 3 |
| binding | expr:x | |
| | :x | |
Note 1: the arguments do not necessarily have to be statement expressions - they can be any Python expression.
Note 2: not yet in the grammar, only via Python subclassing.
Note 3: semantic predicates and actions are written in Python. More specifically, they are either primary expressions, e.g., `123`, `x`, `foo.bar()`, or something called "statement expressions", which have the form `"{" <statement>* <expr> "}"`. For example, `{ x += 2; y = "foo"; f(x) }`. The value of a statement expression is equal to that of its last expression.
Note 4: "super" is just like any other rule (not a special form), so you have to
quote the rule name that you pass in as an argument, e.g., both ^r(1, 2)
and super("r", 1, 2) are valid super-sends.
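
The negation, look-ahead and semantic predicate rows of the table can be illustrated with a small plain-Python sketch. It is not PyOMeta code; all names (`ParseError`, `char`, `negate`, `lookahead`, `predicate`) are hypothetical. It shows why look-ahead is written as a double negation: negating twice succeeds exactly when the expression matches, but never consumes input. In this sketch the matched value is discarded; PyOMeta's own look-ahead may behave differently in that respect.

```python
class ParseError(Exception):
    """Raised when a parsing expression fails to match."""


def char(c):
    """'x': match one literal character."""
    def parse(s, i):
        if i < len(s) and s[i] == c:
            return c, i + 1
        raise ParseError("expected %r at %d" % (c, i))
    return parse


def negate(expr):
    """~expr: succeed, consuming nothing, only if expr fails here."""
    def parse(s, i):
        try:
            expr(s, i)
        except ParseError:
            return None, i
        raise ParseError("unexpected match at %d" % i)
    return parse


def lookahead(expr):
    """~~expr: succeed, consuming nothing, only if expr matches here."""
    return negate(negate(expr))


def predicate(test):
    """?(pythonExpression): fail the match when the expression is false."""
    def parse(s, i):
        if test():
            return True, i
        raise ParseError("predicate failed at %d" % i)
    return parse


print(negate(char("x"))("yz", 0))      # (None, 0)
print(lookahead(char("x"))("xy", 0))   # (None, 0)
x, y = 3, 2
print(predicate(lambda: x > y)("", 0)) # (True, 0)
```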
The starting point for defining a new grammar is pyometa.grammar.OMeta.makeGrammar,
which takes a grammar definition and a dict of variable bindings for its embedded expressions
and produces a Python class.
Grammars can be subclassed as usual, and makeGrammar can be called on these classes to override
rules and provide new ones. To invoke a grammar rule, call grammarObject.apply() with its name.
>>> from pyometa.grammar import OMeta
>>> exampleGrammar = """
ones = '1' '1' -> 1 # comment
twos = '2' '2' -> 2
stuff = (ones | twos)+
"""
>>> Example = OMeta.makeGrammar(exampleGrammar, {})
>>> g = Example("11221111")
>>> result, error = g.apply("stuff")
>>> result
[1, 2, 1, 1]
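
To see why the result is `[1, 2, 1, 1]`, it can help to write the same three rules by hand as methods of a class. The sketch below is only an illustration of the "rules as methods" idea, not the code PyOMeta generates: the class name, the `rule_` prefix, the `exactly` helper and the `None` placeholder returned as the error value are all assumptions.

```python
class ParseError(Exception):
    """Raised when a rule fails to match."""


class HandWrittenExample:
    """Hypothetical hand translation of the ones/twos/stuff grammar."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def exactly(self, c):
        # 'x': consume one literal character or fail
        if self.pos < len(self.data) and self.data[self.pos] == c:
            self.pos += 1
            return c
        raise ParseError("expected %r at position %d" % (c, self.pos))

    def rule_ones(self):
        # ones = '1' '1' -> 1
        self.exactly("1")
        self.exactly("1")
        return 1

    def rule_twos(self):
        # twos = '2' '2' -> 2
        self.exactly("2")
        self.exactly("2")
        return 2

    def rule_stuff(self):
        # stuff = (ones | twos)+
        results = []
        while True:
            start = self.pos
            for alternative in (self.rule_ones, self.rule_twos):
                try:
                    results.append(alternative())
                    break
                except ParseError:
                    self.pos = start  # backtrack before trying the next alternative
            else:
                break  # neither alternative matched: the repetition ends
        if not results:
            raise ParseError("expected at least one match at position %d" % self.pos)
        return results

    def apply(self, name):
        # Look the rule up by name; None stands in for the error value.
        return getattr(self, "rule_" + name)(), None


result, error = HandWrittenExample("11221111").apply("stuff")
print(result)  # [1, 2, 1, 1]
```

The input is consumed two characters at a time: `11` matches `ones`, `22` matches `twos`, and the remaining `1111` matches `ones` twice.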
Say you want to add the consumed-by operator to the basic OMeta grammar (it is already in the grammar, by the way).
The steps you would need to follow are:
- change `grammar.py` and add

      | token('<') expr:e token('>') -> self.builder.consumed_by(e)

  to the `expr1` definition
- extend `nullOptimizationGrammar` with the new node in `grammar.py`:

      | ['ConsumedBy' opt:expr] -> self.builder.consumed_by(expr)

- add a new method to the `TreeBuilder` class in `builder.py`:

      def consumed_by(self, exprs):
          return ["ConsumedBy", exprs]

- add a `generate_ConsumedBy` method to the `PythonWriter` class in `builder.py`:

      def generate_ConsumedBy(self, expr):
          fname = self._newThunkFor("consumed_by", expr)
          return self._expr("consumed_by", "self.consumed_by(%s)" % (fname,))

- write a test for the new extension
- generate a boot grammar:

      $ export PYTHONPATH=$PWD/src:$PYTHONPATH
      $ mv src/pyometa/boot.py src/pyometa/boot.orig.py
      $ python src/pyometa/bootgenerator.py
      $ mv src/pyometa/boot_generated.py src/pyometa/boot.py

- run the tests and make sure that everything runs successfully
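
The builder and writer steps above follow a common pattern: the builder turns a match into a tagged list node, and the writer picks a `generate_<Tag>` method based on the node's tag. The sketch below shows that pattern in isolation. It is an assumption about the general shape, not a copy of `builder.py`: `TreeBuilderSketch`, `WriterSketch`, the `generate` dispatcher and the `Exactly` node are hypothetical, and the emitted strings are simplified (the real writer goes through `_newThunkFor` and `_expr`).

```python
class TreeBuilderSketch:
    """Hypothetical stand-in for TreeBuilder: builds tagged list nodes."""

    def exactly(self, literal):
        # Hypothetical node for a literal match, used only to have a child node.
        return ["Exactly", literal]

    def consumed_by(self, exprs):
        # Same node shape as the consumed_by method shown in the steps.
        return ["ConsumedBy", exprs]


class WriterSketch:
    """Hypothetical stand-in for PythonWriter: dispatches on the node tag."""

    def generate(self, node):
        tag, args = node[0], node[1:]
        # A node tagged "ConsumedBy" is handled by generate_ConsumedBy, and so on.
        method = getattr(self, "generate_" + tag)
        return method(*args)

    def generate_Exactly(self, literal):
        return "self.exactly(%r)" % (literal,)

    def generate_ConsumedBy(self, expr):
        # Wrap the inner expression so it can be run later by consumed_by.
        inner = self.generate(expr)
        return "self.consumed_by(lambda: %s)" % (inner,)


builder = TreeBuilderSketch()
tree = builder.consumed_by(builder.exactly("x"))
print(tree)                           # ['ConsumedBy', ['Exactly', 'x']]
print(WriterSketch().generate(tree))  # self.consumed_by(lambda: self.exactly('x'))
```

With this shape, supporting a new operator means adding one builder method that creates the node and one `generate_` method that knows how to emit code for it, which matches the two `builder.py` steps listed above.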
This fork would not have been possible without the (real hard) work of Allen Short who first implemented a Python version of OMeta. The work of Waldemar Kornewald has further pushed Allen's implementation towards OMeta2 syntax and behaviour.
I, Enrico Spinielli, have
- improved and updated README
- included `consumed-by` and `index-consumed-by` from Benjamin Dauvergne's code
- added tests for `consumed-by` and `index-consumed-by`
- added `Makefile`
- document grammar extension
- add more examples/tests
- improve debugging/error reporting