TatSu

At least for the people who send me mail about a new language that they’re designing, the general advice is: do it to learn about how to write a compiler. Don’t have any expectations that anyone will use it, unless you hook up with some sort of organization in a position to push it hard. It’s a lottery, and some can buy a lot of the tickets. There are plenty of beautiful languages (more beautiful than C) that didn’t catch on. But someone does win the lottery, and doing a language at least teaches you something.

Dennis Ritchie (1941-2011) Creator of the C programming language and of Unix

TatSu is a tool that takes grammars in a variation of EBNF as input, and outputs memoizing (Packrat) PEG parsers in Python.

Why use a PEG parser? Because `regular languages`_ (those parsable with Python’s re package) “cannot count”. Any language with nested structures or with balancing of demarcatiors requires more than regular expressions to be parsed.

TatSu can compile a grammar stored in a string into a tatsu.grammars.Grammar object that can be used to parse any given input, much like the re module does with regular expressions, or it can generate a Python module that implements the parser.

TatSu supports left-recursive rules in PEG grammars, and it honors left-associativity in the resulting parse trees.