Introduction

竜 TatSu is different from other PEG parser generators:

Generated parsers use Python’s efficient exception-handling system to backtrack. 竜 TatSu generated parsers simply assert what must be parsed. No complicated if-then-else sequences for decision making or backtracking are present. Memoization allows going over the same input sequence several times in linear time.
Positive and negative lookaheads, and the cut element (with its cleaning of the memoization cache), allow for additional, hand-crafted optimizations at the grammar level.
Delegation to Python’s re module for lexemes allows for powerful and efficient (Perl-like) lexical analysis.
The use of Python’s context managers considerably reduces the size of the generated parsers, which improves code clarity and CPU-cache hit rates.
Include files, rule inheritance, and rule inclusion give 竜 TatSu grammars considerable expressive power.
Automatic generation of Abstract Syntax Trees and Object Models, along with Model Walkers and Code Generators, makes analysis and translation approachable.
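
The ideas of backtracking through exceptions, memoization, regular-expression lexemes, and context managers can be pictured with a toy parser. What follows is a minimal sketch written for this introduction, not the code that 竜 TatSu generates: the names `ToyParser`, `FailedParse`, `memoized`, `token`, and `option` are hypothetical, and the grammar (sums of integers with parentheses) is invented for the illustration. Rules simply assert what must come next; a failed assertion raises an exception, and the `option` context manager rewinds the input so that the next alternative can be tried.

```python
import functools
import re
from contextlib import contextmanager


class FailedParse(Exception):
    """Raised when the input does not match what a rule asserts."""


def memoized(rule):
    """Cache each rule's outcome per input position (hypothetical helper)."""
    @functools.wraps(rule)
    def wrapper(self):
        key = (rule.__name__, self.pos)
        if key in self.memo:
            self.memo_hits += 1
        else:
            try:
                self.memo[key] = (rule(self), self.pos)
            except FailedParse as failure:
                self.memo[key] = (failure, key[1])
        outcome, end = self.memo[key]
        if isinstance(outcome, FailedParse):
            raise outcome
        self.pos = end
        return outcome
    return wrapper


class ToyParser:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.memo = {}       # (rule name, start) -> (result or failure, end)
        self.memo_hits = 0

    def token(self, pattern):
        # Lexemes are delegated to the re module; leading whitespace is skipped.
        match = re.compile(r"\s*(" + pattern + ")").match(self.text, self.pos)
        if match is None:
            raise FailedParse(f"expected /{pattern}/ at position {self.pos}")
        self.pos = match.end()
        return match.group(1)

    @contextmanager
    def option(self):
        # One alternative of a choice: on failure, rewind and move on.
        start = self.pos
        try:
            yield
        except FailedParse:
            self.pos = start

    @memoized
    def expr(self):
        # expr = term '+' expr | term
        with self.option():
            left = self.term()
            self.token(r"\+")
            return left + self.expr()
        with self.option():
            return self.term()
        raise FailedParse(f"expected an expression at position {self.pos}")

    @memoized
    def term(self):
        # term = /\d+/ | '(' expr ')'
        with self.option():
            return int(self.token(r"\d+"))
        with self.option():
            self.token(r"\(")
            value = self.expr()
            self.token(r"\)")
            return value
        raise FailedParse(f"expected a number or '(' at position {self.pos}")


def parse(text):
    parser = ToyParser(text)
    result = parser.expr()
    parser.token(r"$")  # assert that all of the input was consumed
    return result
```

Calling `parse("1 + (2 + 3)")` evaluates the sum, while incomplete input such as `"1 + "` raises `FailedParse`. When the first alternative of `expr` fails after a `term` has already been parsed, the second alternative finds that `term` in the cache instead of parsing it again, which is the role memoization plays in the first point above. The cut element and lookaheads are not modeled in this sketch.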

The parser generator, the run-time support, and the generated parsers have measurably low cyclomatic complexity. At around 5 KLOC of Python, the entire code base can be studied in a single session.

The only dependencies are on the Python standard library. Graphviz is additionally required for producing diagrams of the grammars.

竜 TatSu is feature-complete and currently being used with complex grammars to parse, analyze, and translate hundreds of thousands of lines of input text. That includes source code in several programming languages.