Parser Configuration#
竜 TatSu has many configuration options. They are all defined in
tatsu.config.ParserConfig. With the introduction of ParserConfig
there’s no need to declare every configuration parameter as an optional named
argument in entry points and internal methods.
The defaults set in ParserConfig are suitable for most cases, and they are
easy to override.
Entry points still accept configuration options as named keyword arguments, but
those are gathered in a **settings (aka **kwargs) argument for a ParserConfig
to validate when called.
@dataclass
class ParserConfig:
    name: str | None = 'Test'
    filename: str = ''
    start: str | None = None
    semantics: type | None = None
    comment_recovery: bool = False  # warning: not implemented
    memoization: bool = True
    perlinememos: float = DEFAULT_MEMOS_PER_LINE
    colorize: bool = True  # INFO: requires the colorama library
    trace: bool = False
    trace_filename: bool = False
    trace_length: int = 72
    trace_separator: str = C_DERIVE
    grammar: str | None = None
    left_recursion: bool = True
    comments: str | None = None
    eol_comments: str | None = None
    keywords: set[str] = field(default_factory=set)
    ignorecase: bool | None = False
    namechars: str = ''
    nameguard: bool | None = None  # implied by namechars
    whitespace: str | None = undefined
    parseinfo: bool = False
Entry points and internal methods in 竜 TatSu have an optional
config: ParserConfig | None = None argument.
def parse(
    grammar,
    input,
    start=None,
    name=None,
    semantics=None,
    asmodel=False,
    config: ParserConfig | None = None,
    **settings,
):
If no ParserConfig is passed, a default one is created. Configuration
attributes may be overridden by relevant arguments in **settings.
These are different ways to apply a configuration setting:
config = tatsu.config.ParserConfig()
config.left_recursion = False
ast = tatsu.parse(grammar, text, config=config)

config = tatsu.config.ParserConfig(left_recursion=False)
ast = tatsu.parse(grammar, text, config=config)

ast = tatsu.parse(grammar, text, left_recursion=False)
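The three forms above end with the same effective setting. As a rough illustration of how an entry point might combine an optional config object with **settings, here is a minimal standard-library sketch. MiniConfig and resolve_config are hypothetical names used only for this example; they are not part of 竜 TatSu, and the library's own merging and validation logic may differ.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass
class MiniConfig:
    # Hypothetical stand-in for ParserConfig with only two options.
    left_recursion: bool = True
    trace: bool = False


def resolve_config(config: Optional[MiniConfig] = None, **settings) -> MiniConfig:
    """Return a config with **settings applied on top of `config`."""
    # Fall back to a default config when none is passed.
    base = config if config is not None else MiniConfig()
    # Reject keyword arguments that are not configuration attributes.
    known = {f.name for f in fields(MiniConfig)}
    unknown = set(settings) - known
    if unknown:
        raise TypeError(f"unknown settings: {sorted(unknown)}")
    # replace() builds a new instance, so the caller's config is not mutated.
    return replace(base, **settings)
```

In this sketch the keyword arguments win over the values stored in the config object, and the object passed by the caller is left unchanged.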
name#
name: str | None = 'Test'
The name of the grammar. It’s used in generated Python parsers and may be used in error reporting.
filename#
filename: str = ''
The file name from which the grammar was read. It may be used in error reporting.
start#
start: str | None = None
The name of the rule on which to start parsing. It may be used to invoke only a specific part of the parser.
ast = parse(grammar, '(2+2)*2', start='expression')
semantics#
semantics: type | None = None
The class implementing parser semantics. See the other sections of the documentation for what semantics are, how to implement them, and the default and generated semantic classes and objects.
memoization#
memoization: bool = True
Enable or disable memoization in the parser. Only specific input languages
require this to be False. Note that parsing times cease to be linear when
memoization is disabled.
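To see why parsing time stops being linear without memoization, consider a toy backtracking parser, independent of 竜 TatSu, for the hypothetical grammar s = a 'x' | a 'y' ; a = '(' s ')' | 'a' ;. The sketch below counts rule invocations with and without a memo table keyed by (rule, position). The names count_calls and nested are made up for this example.

```python
def nested(depth: int) -> str:
    """Build an input such as '((ay)y)y' with `depth` levels of parentheses."""
    return "(" * depth + "ay" + ")y" * depth


def count_calls(text: str, memoize: bool) -> int:
    """Parse `text` with the toy grammar and return the number of rule calls."""
    memo = {}
    calls = 0

    def rule(name, pos):
        nonlocal calls
        key = (name, pos)
        if memoize and key in memo:
            return memo[key]  # reuse the earlier result at this position
        calls += 1
        result = parse_s(pos) if name == "s" else parse_a(pos)
        if memoize:
            memo[key] = result
        return result

    def parse_s(pos):
        # Both alternatives start with rule a, so a failed first choice
        # makes the parser re-enter a at the same position.
        for suffix in "xy":
            end = rule("a", pos)
            if end is not None and text.startswith(suffix, end):
                return end + 1
        return None

    def parse_a(pos):
        if text.startswith("(", pos):
            end = rule("s", pos + 1)
            if end is not None and text.startswith(")", end):
                return end + 1
            return None
        return pos + 1 if text.startswith("a", pos) else None

    if rule("s", 0) != len(text):
        raise ValueError("input does not match the toy grammar")
    return calls
```

With memoization the call count grows linearly with nesting depth; without it, every level repeats the work of the levels below, so the count roughly doubles per level.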
perlinememos#
perlinememos: float = DEFAULT_MEMOS_PER_LINE
Sets a (perlinememos * linecount) bound on the total number of memoization
entries that are allowed.
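As an illustration of such a bound, the following standard-library sketch caps a memo table at perlinememos * linecount entries. BoundedMemo is a hypothetical class, not the library's implementation. Two things are assumptions made for this example: the way lines are counted, and the behavior once the bound is reached (new results are simply not cached).

```python
class BoundedMemo:
    """Memo table that holds at most perlinememos * linecount entries."""

    def __init__(self, text: str, perlinememos: float):
        # Assumption: an input without newlines counts as one line.
        linecount = text.count("\n") + 1
        self.max_entries = int(perlinememos * linecount)
        self.entries = {}

    def store(self, key, value) -> bool:
        """Cache `value` under `key`; return False if the bound is reached."""
        # Assumption: once the table is full, new results are not cached.
        if key in self.entries or len(self.entries) < self.max_entries:
            self.entries[key] = value
            return True
        return False
```

For a three-line input and perlinememos=2, this table accepts six entries and refuses a seventh.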
colorize#
colorize: bool = True
Colorize trace output. Colorization requires that the colorama library
is available.
trace#
trace: bool = False
Produce a trace of the parsing process. See the Traces section for more information.
trace_filename#
trace_filename: bool = False
Include the input text’s filename in trace output.
trace_length#
trace_length: int = 72
The max width of a line in a trace.
trace_separator#
trace_separator: str = C_DERIVE
The separator to use between lines in a trace.
grammar#
grammar: str | None = None
An alias for the name option.
left_recursion#
left_recursion: bool = True
Enable or disable left recursion in analysis and parsing.
eol_comments#
eol_comments: str | None = None
A regular expression describing end-of-line comments in the input. Comments are skipped during parsing.
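As an example of what such a pattern can look like, the sketch below uses Python's re module with a hypothetical pattern for comments that run from # to the end of the line. It only shows which text the pattern covers; how the parser applies the expression internally (anchoring, flags) is not specified here.

```python
import re

# Hypothetical pattern: a '#' followed by everything up to the end of the line.
EOL_COMMENTS = r"#[^\n]*"


def strip_eol_comments(text: str, pattern: str = EOL_COMMENTS) -> str:
    """Remove every span matched by `pattern`, keeping the newlines."""
    return re.sub(pattern, "", text)
```

The same string could then be passed as the eol_comments setting, for example eol_comments=r"#[^\n]*".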
keywords#
keywords: set[str] = field(default_factory=set)
The set of keywords in the input language. See Reserved Words and Keywords for more information.
ignorecase#
ignorecase: bool | None = False
When set to True, letter case is ignored when matching tokens in the input.
namechars#
namechars: str = ''
Additional characters that can be part of an identifier
(for example namechars='$@').
nameguard#
nameguard: bool = False # implied by namechars
When set to True, avoids matching tokens when the next character in the input sequence is
alphanumeric or a @@namechar. Defaults to False.
See token expression for an explanation.
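The rule described above can be sketched in a few lines of plain Python. match_token is a hypothetical helper written for this illustration and is not a 竜 TatSu function; the library's actual check may apply further conditions, for example only to tokens that themselves look like identifiers.

```python
from typing import Optional


def match_token(
    text: str,
    pos: int,
    token: str,
    nameguard: bool = False,
    namechars: str = "",
) -> Optional[int]:
    """Return the position after `token`, or None if it does not match."""
    if not text.startswith(token, pos):
        return None
    end = pos + len(token)
    if nameguard and end < len(text):
        following = text[end]
        # Refuse the match when the token runs straight into more name characters.
        if following.isalnum() or following in namechars:
            return None
    return end
```

With nameguard enabled, the token if no longer matches at the start of iffy, and adding $ to namechars also blocks a match against if$.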
whitespace#
whitespace: str | None = undefined
Provides a regular expression for the whitespace to be ignored by the parser. See the @@whitespace section for more information.
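A short standard-library sketch shows the effect of choosing one whitespace expression over another. skip_whitespace is a hypothetical helper written for this example; it shows only how far a given pattern would advance over the input, not how the parser applies it.

```python
import re


def skip_whitespace(text: str, pos: int, whitespace: str = r"\s+") -> int:
    """Return the first position at or after `pos` not covered by `whitespace`."""
    match = re.compile(whitespace).match(text, pos)
    return match.end() if match else pos
```

With the pattern r"\s+" newlines are skipped along with spaces, while a pattern such as r"[ \t]+" stops at the newline, which is useful for languages in which line breaks are significant.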
parseinfo#
parseinfo: bool = False
When parseinfo==True, a parseinfo entry is added to AST nodes
that are dict-like. The entry provides information about what was parsed and
where. See Abstract Syntax Trees for more information.
class ParseInfo(NamedTuple):
    cursor: Cursor
    rule: str
    pos: int
    endpos: int
    line: int
    endline: int
    alerts: list[Alert] = []  # noqa: RUF012
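As a sketch of how such an entry might be used, the example below defines a simplified, hypothetical MiniParseInfo with only the plain fields and recovers the parsed text from the input. It assumes that pos and endpos are character offsets into the input with endpos exclusive; the values in the usage note are made up for illustration.

```python
from typing import NamedTuple


class MiniParseInfo(NamedTuple):
    # Hypothetical, reduced version of ParseInfo without cursor and alerts.
    rule: str
    pos: int
    endpos: int
    line: int
    endline: int


def source_of(text: str, info: MiniParseInfo) -> str:
    """Return the slice of `text` covered by `info` (endpos assumed exclusive)."""
    return text[info.pos:info.endpos]
```

For the input x = (2+2)*2 and an entry with pos=4 and endpos=11, source_of returns the text of the parenthesized expression and what follows it, (2+2)*2.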
comments#
comments: str | None = None
A regular expression describing comments in the input. Comments are skipped during parsing.