Parser Configuration

TatSu has many configuration options. They are all defined in tatsu.config.ParserConfig. With the introduction of ParserConfig there’s no need to declare every configuration parameter as an optional named argument in entry points and internal methods.

The defaults set in ParserConfig are suitable for most cases, and they are easy to override.

Entry points still accept configuration options as named keyword arguments, but those are gathered into a **settings (aka **kwargs) argument, which a ParserConfig validates when the entry point is called.

@dataclass
class ParserConfig:
    name: str | None = 'Test'
    filename: str = ''

    start: str | None = None

    semantics: type | None = None

    comment_recovery: bool = False   # warning: not implemented

    memoization: bool = True
    perlinememos: float = DEFAULT_MEMOS_PER_LINE

    colorize: bool = True  # INFO: requires the colorama library
    trace: bool = False
    trace_filename: bool = False
    trace_length: int = 72
    trace_separator: str = C_DERIVE

    grammar: str | None = None
    left_recursion: bool = True

    comments: str | None = None
    eol_comments: str | None = None
    keywords: set[str] = field(default_factory=set)

    ignorecase: bool | None = False
    namechars: str = ''
    nameguard: bool | None = None  # implied by namechars
    whitespace: str | None = undefined

    parseinfo: bool = False

Entry points and internal methods in 竜 TatSu have an optional config: ParserConfig | None = None argument.

def parse(
    grammar,
    input,
    start=None,
    name=None,
    semantics=None,
    asmodel=False,
    config: ParserConfig | None = None,
    **settings,
):

If no ParserConfig is passed, a default one is created. Configuration attributes may be overridden by relevant arguments in **settings.

These are different ways to apply a configuration setting:

config = tatsu.config.ParserConfig()
config.left_recursion = False
ast = tatsu.parse(grammar, text, config=config)

config = tatsu.config.ParserConfig(left_recursion=False)
ast = tatsu.parse(grammar, text, config=config)

ast = tatsu.parse(grammar, text, left_recursion=False)
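
As a rough model of the precedence described above (an explicit config first, then **settings applied on top), the following sketch uses a small stand-in dataclass. MiniConfig, its merged method, and resolve are hypothetical names used only for illustration and are not part of TatSu. Skipping unknown setting names is also an assumption of the sketch; the library may handle them differently.

```python
from __future__ import annotations

from dataclasses import dataclass, fields, replace


@dataclass
class MiniConfig:
    """Hypothetical stand-in holding a few of the fields listed above."""

    name: str | None = 'Test'
    start: str | None = None
    left_recursion: bool = True
    trace: bool = False

    def merged(self, **settings) -> MiniConfig:
        # Keep only the settings that name a known field (assumption).
        known = {f.name for f in fields(self)}
        relevant = {k: v for k, v in settings.items() if k in known}
        # dataclasses.replace returns a new object; self is left untouched.
        return replace(self, **relevant)


def resolve(config: MiniConfig | None = None, **settings) -> MiniConfig:
    """Create a default config if none is passed, then apply overrides."""
    if config is None:
        config = MiniConfig()
    return config.merged(**settings)


if __name__ == '__main__':
    base = MiniConfig(trace=True)
    print(resolve(base, left_recursion=False))
```

Because the merge returns a new object, a shared base configuration can be reused across calls without one call's overrides leaking into the next.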

name

name: str | None = 'Test'

The name of the grammar. It’s used in generated Python parsers and may be used in error reporting.

filename

filename: str = ''

The file name from which the grammar was read. It may be used in error reporting.

start

start: str | None = None

The name of the rule on which to start parsing. It may be used to invoke only a specific part of the parser.

ast = parse(grammar, '(2+2)*2', start='expression')

semantics

semantics: type | None = None

The class implementing parser semantics. See the other sections of the documentation for the meaning and implementation of semantics, and for the default and generated semantic classes and objects.

memoization

memoization: bool = True

Enable or disable memoization in the parser. Only specific input languages require this to be False. Note that parsing times cease to be linear when memoization is disabled.
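
To see why parsing time stops being linear without memoization, consider a toy backtracking recognizer. This is a minimal sketch and not TatSu's implementation: the grammar, the count_rule_calls function, and the rule decorator are all hypothetical. It counts how many times a rule body is actually evaluated, with and without a memo table keyed by (rule, position).

```python
def count_rule_calls(text, memoize):
    """Recognize text with a toy PEG and return (matched, rule_calls).

    Grammar (hypothetical, for illustration only):
        expr <- term '+' expr / term
        term <- '(' expr ')' / 'a'
    """
    calls = 0
    memo = {}

    def rule(fn):
        def wrapper(pos):
            nonlocal calls
            key = (fn.__name__, pos)
            if memoize and key in memo:
                return memo[key]  # reuse the earlier result
            calls += 1  # count only real evaluations of a rule body
            result = fn(pos)
            if memoize:
                memo[key] = result
            return result
        return wrapper

    @rule
    def term(pos):
        if text.startswith('(', pos):
            end = expr(pos + 1)
            if end is not None and text.startswith(')', end):
                return end + 1
            return None
        return pos + 1 if text.startswith('a', pos) else None

    @rule
    def expr(pos):
        end = term(pos)
        if end is not None and text.startswith('+', end):
            rest = expr(end + 1)
            if rest is not None:
                return rest
        # Second alternative: term is tried again at the same position.
        return term(pos)

    return expr(0) == len(text), calls


if __name__ == '__main__':
    nested = '(' * 10 + 'a' + ')' * 10
    print(count_rule_calls(nested, memoize=True))
    print(count_rule_calls(nested, memoize=False))
```

On nested parentheses the unmemoized count roughly doubles with each extra level of nesting, because the second alternative of expr re-parses everything the first one already matched. The memoized count grows by a constant per level.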

perlinememos

perlinememos: float = DEFAULT_MEMOS_PER_LINE

Sets an upper bound of perlinememos * linecount on the total number of memoization entries allowed.
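
The bound can be pictured with a small size-limited cache. This is only a sketch: BoundedMemo is a hypothetical class, and evicting the oldest entries first is an assumption, since the eviction policy is not described here.

```python
from collections import OrderedDict


class BoundedMemo:
    """Hypothetical memo table capped at perlinememos * linecount entries."""

    def __init__(self, text, perlinememos):
        linecount = max(1, len(text.splitlines()))
        self.limit = max(1, int(perlinememos * linecount))
        self._entries = OrderedDict()

    def __len__(self):
        return len(self._entries)

    def get(self, key, default=None):
        return self._entries.get(key, default)

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        # Assumption: drop the oldest entries once the bound is exceeded.
        while len(self._entries) > self.limit:
            self._entries.popitem(last=False)


if __name__ == '__main__':
    memo = BoundedMemo('a\nb\nc', perlinememos=2.0)
    for pos in range(10):
        memo.put(('rule', pos), pos)
    print(memo.limit, len(memo))
```

A larger perlinememos trades memory for fewer repeated rule evaluations on inputs that backtrack heavily.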

colorize

colorize: bool = True

Colorize trace output. Colorization requires that the colorama library is available.

trace

trace: bool = False

Produce a trace of the parsing process. See the Traces section for more information.

trace_filename

trace_filename: bool = False

Include the input text’s filename in trace output.

trace_length

trace_length: int = 72

The maximum width of a line in a trace.

trace_separator

trace_separator: str = C_DERIVE

The separator to use between lines in a trace.

grammar

grammar: str | None = None

An alias for the name option.

left_recursion

left_recursion: bool = True

Enable or disable left recursion in analysis and parsing.

comments

comments: str | None = None

A regular expression describing comments in the input. Comments are skipped during parsing.

eol_comments

eol_comments: str | None = None

A regular expression describing end-of-line comments in the input. Comments are skipped during parsing.
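
As an illustration of what such regular expressions might look like, the sketch below uses Python's re module directly rather than TatSu. The two patterns describe a hypothetical language with (* ... *) block comments and # end-of-line comments, and skip_ignorable is a hypothetical helper that advances past whitespace and comments the way a tokenizer would.

```python
import re

# Hypothetical patterns for an imaginary input language.
COMMENTS = r'\(\*(?:.|\n)*?\*\)'  # (* block comment, may span lines *)
EOL_COMMENTS = r'#[^\n]*'         # from '#' up to the end of the line
WHITESPACE = r'\s+'               # assumed whitespace pattern


def skip_ignorable(text, pos=0):
    """Return the first position at or after pos that is not ignorable."""
    ignorable = re.compile(
        '|'.join(f'(?:{p})' for p in (WHITESPACE, COMMENTS, EOL_COMMENTS))
    )
    while True:
        match = ignorable.match(text, pos)
        if match is None or match.end() == pos:
            return pos
        pos = match.end()


if __name__ == '__main__':
    source = '  (* block\ncomment *) # trailing\n  x = 1'
    print(repr(source[skip_ignorable(source):]))
```

Patterns like these would be supplied through the comments and eol_comments settings, either on a ParserConfig or as keyword arguments to an entry point.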

keywords

keywords: set[str] = field(default_factory=set)

The set of keywords in the input language. See Reserved Words and Keywords for more information.

ignorecase

ignorecase: bool | None = False

When set to True, tokens are matched without regard to case.

namechars

namechars: str = ''

Additional characters that can be part of an identifier (for example, namechars='$@').

nameguard

nameguard: bool = False  # implied by namechars

When set to True, a token is not matched if the next character in the input is alphanumeric or one of the characters in @@namechars. Defaults to False. See token expression for an explanation.
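
The effect of nameguard and namechars on token matching can be sketched as follows. match_token is a hypothetical function, not TatSu's API, and applying the guard only to alphanumeric tokens is an assumption of the sketch (so that punctuation such as '(' is never blocked by a following letter).

```python
def match_token(text, pos, token, nameguard=False, namechars=''):
    """Return the position after token, or None if it does not match."""
    if not text.startswith(token, pos):
        return None
    end = pos + len(token)
    # Assumption: the guard only applies to alphanumeric tokens.
    if nameguard and token.isalnum() and end < len(text):
        following = text[end]
        if following.isalnum() or following in namechars:
            return None  # the token is only a prefix of a longer name
    return end


if __name__ == '__main__':
    print(match_token('iffy', 0, 'if'))
    print(match_token('iffy', 0, 'if', nameguard=True))
    print(match_token('if$x', 0, 'if', nameguard=True, namechars='$'))
```

With the guard on, the keyword if is not mistaken for the start of the identifier iffy, and adding '$' to namechars extends the same protection to names such as if$x.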

whitespace

whitespace: str | None = undefined

Provides a regular expression for the whitespace to be ignored by the parser. See the @@whitespace section for more information.

parseinfo

parseinfo: bool = False

When parseinfo is True, a parseinfo entry is added to AST nodes that are dict-like. The entry provides information about what was parsed and where. See Abstract Syntax Trees for more information.

class ParseInfo(NamedTuple):
    cursor: Cursor
    rule: str
    pos: int
    endpos: int
    line: int
    endline: int
    alerts: list[Alert] = []  # noqa: RUF012
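
The positional fields can be related back to the input text along these lines. This is a sketch under stated assumptions: SpanInfo is a hypothetical, reduced stand-in for ParseInfo, and it assumes that pos and endpos are character offsets with endpos exclusive and that line numbers are zero-based. None of these conventions are confirmed by this section.

```python
from typing import NamedTuple


class SpanInfo(NamedTuple):
    """Hypothetical subset of the ParseInfo fields shown above."""

    rule: str
    pos: int
    endpos: int
    line: int
    endline: int


def span_info(text, rule, pos, endpos):
    """Build a SpanInfo, deriving line numbers from character offsets."""
    # Assumption: zero-based lines, counted as newlines before the offset.
    line = text.count('\n', 0, pos)
    endline = text.count('\n', 0, endpos)
    return SpanInfo(rule, pos, endpos, line, endline)


def source_of(text, info):
    """Return the slice of the input covered by the node."""
    return text[info.pos:info.endpos]


if __name__ == '__main__':
    text = 'x = 1\ny = (2 +\n     3)\n'
    info = span_info(text, 'expression', text.index('('), text.index(')') + 1)
    print(info)
    print(repr(source_of(text, info)))
```

Information of this kind is what makes it possible to report errors or warnings against the exact span of input that produced a node.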