Introduction

TatSu is different from other PEG parser generators:

  • Generated parsers use Python’s efficient exception-handling system to backtrack. 竜 TatSu generated parsers simply assert what must be parsed. No complicated if-then-else sequences for decision making or backtracking are present. Memoization allows going over the same input sequence several times in linear time.

  • Positive and negative lookaheads, and the cut element (with its cleaning of the memoization cache) allow for additional, hand-crafted optimizations at the grammar level.

  • Delegation to Python’s re module for lexemes allows for (Perl-like) powerful and efficient lexical analysis.

  • The use of Python’s context managers considerably reduces the size of the generated parsers, which improves code clarity and CPU-cache hit rates.

  • Include files, rule inheritance, and rule inclusion give 竜 TatSu grammars considerable expressive power.

  • Automatic generation of Abstract Syntax Trees and Object Models, along with Model Walkers and Code Generators, makes analysis and translation approachable.
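
The first and fourth points above can be illustrated with a minimal sketch. It is not the code TatSu generates, and every name in it (MiniParser, FailedParse, choice, option, memoized, value) is hypothetical and chosen only for illustration. It assumes a toy rule that accepts either a number or a lowercase name, and shows how a rule can simply assert what must be parsed while context managers and exceptions take care of backtracking, with lexemes delegated to the re module and results memoized by rule and position.

```python
import functools
import re
from contextlib import contextmanager


class FailedParse(Exception):
    """Raised when the input does not match what a rule asserts."""


class _OptionSucceeded(Exception):
    """Internal signal: an option matched, so leave the enclosing choice."""


def memoized(rule):
    """Cache each rule outcome (result or failure) by rule name and position."""
    @functools.wraps(rule)
    def wrapper(self):
        key = (rule.__name__, self.pos)
        if key not in self.memo:
            try:
                self.memo[key] = (rule(self), self.pos)
            except FailedParse as failure:
                self.memo[key] = failure
        cached = self.memo[key]
        if isinstance(cached, FailedParse):
            raise cached
        result, self.pos = cached  # restore the result and the end position
        return result
    return wrapper


class MiniParser:
    """Hypothetical toy parser; an illustration, not TatSu's API."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.memo = {}

    def pattern(self, regex):
        # Lexemes are delegated to the standard re module.
        match = re.compile(regex).match(self.text, self.pos)
        if match is None:
            raise FailedParse(f"expected /{regex}/ at position {self.pos}")
        self.pos = match.end()
        return match.group()

    @contextmanager
    def choice(self):
        try:
            yield
        except _OptionSucceeded:
            pass  # one option matched; skip the remaining ones

    @contextmanager
    def option(self):
        start = self.pos
        try:
            yield
        except FailedParse:
            self.pos = start  # backtrack and fall through to the next option
        else:
            raise _OptionSucceeded()

    @memoized
    def value(self):
        # The rule only asserts what must be parsed; there is no if-then-else.
        result = None
        with self.choice():
            with self.option():
                result = ("number", self.pattern(r"\d+"))
            with self.option():
                result = ("name", self.pattern(r"[a-z]+"))
            raise FailedParse(f"no option matched at position {self.pos}")
        return result


print(MiniParser("42").value())   # ('number', '42')
print(MiniParser("abc").value())  # ('name', 'abc')
```

In this sketch a failed option restores the input position and lets control fall through to the next option, so the rule body stays a flat list of assertions. Calling value() again at the same position returns the cached outcome instead of re-reading the input.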

The parser generator, the run-time support, and the generated parsers have measurably low cyclomatic complexity. At around 5 KLOC of Python, the entire source code can be studied in a single session.

The only required dependencies are on the Python standard library. Graphviz is needed only for producing diagrams of the grammars.

TatSu is feature-complete and is currently being used with complex grammars to parse, analyze, and translate hundreds of thousands of lines of input text, including source code in several programming languages.