3
<title>PLY Internals</title>
5
<body bgcolor="#ffffff">
15
<b>PLY Version: 3.0</b>
19
<div class="sectiontoc">
<li><a href="#internal_nn1">Introduction</a>
<li><a href="#internal_nn2">Grammar Class</a>
<li><a href="#internal_nn3">Productions</a>
<li><a href="#internal_nn4">LRItems</a>
<li><a href="#internal_nn5">LRTable</a>
<li><a href="#internal_nn6">LRGeneratedTable</a>
<li><a href="#internal_nn7">LRParser</a>
<li><a href="#internal_nn8">ParserReflect</a>
<li><a href="#internal_nn9">High-level operation</a>
35
<H2><a name="internal_nn1"></a>1. Introduction</H2>
38
This document describes classes and functions that make up the internal
39
operation of PLY. Using this programming interface, it is possible to
40
manually build a parser using a different interface specification
than what PLY normally uses. For example, you could build a grammar
42
from information parsed in a completely different input format. Some of
43
these objects may be useful for building more advanced parsing engines
47
It should be stressed that using PLY at this level is not for the
48
faint of heart. Generally, it's assumed that you know a bit of
49
the underlying compiler theory and how an LR parser is put together.
51
<H2><a name="internal_nn2"></a>2. Grammar Class</H2>
54
The file <tt>ply.yacc</tt> defines a class <tt>Grammar</tt> that
55
is used to hold and manipulate information about a grammar
56
specification. It encapsulates the same basic information
57
about a grammar that is put into a YACC file including
58
the list of tokens, precedence rules, and grammar rules.
59
Various operations are provided to perform different validations
60
on the grammar. In addition, there are operations to compute
61
the first and follow sets that are needed by the various table
62
generation algorithms.
65
<tt><b>Grammar(terminals)</b></tt>
68
Creates a new grammar object. <tt>terminals</tt> is a list of strings
69
specifying the terminals for the grammar. An instance <tt>g</tt> of
70
<tt>Grammar</tt> has the following methods:
74
<b><tt>g.set_precedence(term,assoc,level)</tt></b>
76
Sets the precedence level and associativity for a given terminal <tt>term</tt>.
77
<tt>assoc</tt> is one of <tt>'right'</tt>,
78
<tt>'left'</tt>, or <tt>'nonassoc'</tt> and <tt>level</tt> is a positive integer. The higher
79
the value of <tt>level</tt>, the higher the precedence. Here is an example of typical
g.set_precedence('PLUS',  'left',1)
g.set_precedence('MINUS', 'left',1)
g.set_precedence('TIMES', 'left',2)
g.set_precedence('DIVIDE','left',2)
g.set_precedence('UMINUS','left',3)
90
This method must be called prior to adding any productions to the
91
grammar with <tt>g.add_production()</tt>. The precedence of individual grammar
92
rules is determined by the precedence of the right-most terminal.
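<p>
The right-most-terminal rule can be sketched in a few lines of plain Python.
This is an illustration only, not the code in <tt>ply.yacc</tt>: the
<tt>rule_precedence()</tt> helper, the <tt>PRECEDENCE</tt> and <tt>TERMINALS</tt>
tables, and the <tt>('right', 0)</tt> default for rules without any
precedence are all assumptions made for the example.
</p>

```python
# Hypothetical precedence table in (assoc, level) form.
PRECEDENCE = {
    'PLUS':   ('left', 1),
    'MINUS':  ('left', 1),
    'TIMES':  ('left', 2),
    'UMINUS': ('left', 3),
}
TERMINALS = {'PLUS', 'MINUS', 'TIMES', 'NUMBER'}

# Assumed fallback for rules whose terminals carry no precedence.
DEFAULT = ('right', 0)

def rule_precedence(syms, terminals=TERMINALS, precedence=PRECEDENCE):
    """Return (assoc, level) for the right-hand side `syms`."""
    # An explicit %prec marker names the terminal whose precedence to use.
    if '%prec' in syms:
        return precedence[syms[syms.index('%prec') + 1]]
    # Otherwise scan from the right for the first terminal symbol.
    for sym in reversed(syms):
        if sym in terminals:
            return precedence.get(sym, DEFAULT)
    return DEFAULT

plus_rule = rule_precedence(['expr', 'PLUS', 'term'])
uminus_rule = rule_precedence(['MINUS', 'expr', '%prec', 'UMINUS'])
```

<p>
Here <tt>plus_rule</tt> takes its precedence from <tt>PLUS</tt>, while
<tt>uminus_rule</tt> is overridden by the <tt>%prec</tt> specifier.
</p>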
96
<b><tt>g.add_production(name,syms,func=None,file='',line=0)</tt></b>
98
Adds a new grammar rule. <tt>name</tt> is the name of the rule,
99
<tt>syms</tt> is a list of symbols making up the right hand
100
side of the rule, <tt>func</tt> is the function to call when
101
reducing the rule. <tt>file</tt> and <tt>line</tt> specify
102
the filename and line number of the rule and are used for
103
generating error messages.
106
The list of symbols in <tt>syms</tt> may include character
107
literals and <tt>%prec</tt> specifiers. Here are some
g.add_production('expr',['expr','PLUS','term'],func,file,line)
g.add_production('expr',['expr','"+"','term'],func,file,line)
g.add_production('expr',['MINUS','expr','%prec','UMINUS'],func,file,line)
117
If any kind of error is detected, a <tt>GrammarError</tt> exception
118
is raised with a message indicating the reason for the failure.
122
<b><tt>g.set_start(start=None)</tt></b>
124
Sets the starting rule for the grammar. <tt>start</tt> is a string
125
specifying the name of the start rule. If <tt>start</tt> is omitted,
126
the first grammar rule added with <tt>add_production()</tt> is taken to be
127
the starting rule. This method must always be called after all
128
productions have been added.
132
<b><tt>g.find_unreachable()</tt></b>
134
Diagnostic function. Returns a list of all unreachable non-terminals
135
defined in the grammar. This is used to identify inactive parts of
136
the grammar specification.
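<p>
As a rough sketch of the idea, unreachable non-terminals can be found by walking
the grammar from the start symbol and reporting whatever was never visited.
The <tt>unreachable()</tt> function and the dictionary layout below are
hypothetical stand-ins, not the data structures used inside <tt>Grammar</tt>.
</p>

```python
def unreachable(rules, start):
    """rules maps each non-terminal to a list of right-hand sides (tuples)."""
    reachable = set()
    pending = [start]
    while pending:
        sym = pending.pop()
        if sym in reachable:
            continue
        reachable.add(sym)
        # Terminals have no entry in `rules`, so they contribute nothing.
        for rhs in rules.get(sym, []):
            pending.extend(rhs)
    return [name for name in rules if name not in reachable]

rules = {
    'statement': [('ID', 'EQUALS', 'expr')],
    'expr':      [('NUMBER',), ('expr', 'PLUS', 'expr')],
    'orphan':    [('expr', 'TIMES', 'expr')],   # never referenced from 'statement'
}
dead = unreachable(rules, 'statement')
```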
140
<b><tt>g.infinite_cycle()</tt></b>
142
Diagnostic function. Returns a list of all non-terminals in the
143
grammar that result in an infinite cycle. This condition occurs if
144
there is no way for a grammar rule to expand to a string containing
145
only terminal symbols.
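<p>
One way to picture this check is as a fixed-point computation: start with the
terminals, repeatedly mark any non-terminal that has at least one rule made up
entirely of marked symbols, and report what is left. The sketch below uses a
hypothetical <tt>non_terminating()</tt> helper on plain dictionaries and is not
taken from <tt>ply.yacc</tt>.
</p>

```python
def non_terminating(rules, terminals):
    """Return non-terminals that can never expand to terminals only."""
    terminates = set(terminals)
    changed = True
    while changed:
        changed = False
        for name, alternatives in rules.items():
            if name in terminates:
                continue
            # An empty right-hand side trivially terminates (all() of nothing).
            if any(all(sym in terminates for sym in rhs) for rhs in alternatives):
                terminates.add(name)
                changed = True
    return [name for name in rules if name not in terminates]

rules = {
    'expr': [('expr', 'PLUS', 'expr'), ('NUMBER',)],
    'loop': [('loop', 'NUMBER')],          # every alternative mentions 'loop' again
}
cycles = non_terminating(rules, {'PLUS', 'NUMBER'})
```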
149
<b><tt>g.undefined_symbols()</tt></b>
151
Diagnostic function. Returns a list of tuples <tt>(name, prod)</tt>
152
corresponding to undefined symbols in the grammar. <tt>name</tt> is the
153
name of the undefined symbol and <tt>prod</tt> is an instance of
154
<tt>Production</tt> which has information about the production rule
155
where the undefined symbol was used.
159
<b><tt>g.unused_terminals()</tt></b>
161
Diagnostic function. Returns a list of terminals that were defined,
162
but never used in the grammar.
166
<b><tt>g.unused_rules()</tt></b>
168
Diagnostic function. Returns a list of <tt>Production</tt> instances
169
corresponding to production rules that were defined in the grammar,
170
but never used anywhere. This is slightly different
171
than <tt>find_unreachable()</tt>.
175
<b><tt>g.unused_precedence()</tt></b>
177
Diagnostic function. Returns a list of tuples <tt>(term, assoc)</tt>
178
corresponding to precedence rules that were set, but never used in the
grammar. <tt>term</tt> is the terminal name and <tt>assoc</tt> is the
precedence associativity (e.g., <tt>'left'</tt>, <tt>'right'</tt>,
or <tt>'nonassoc'</tt>).
185
<b><tt>g.compute_first()</tt></b>
187
Compute all of the first sets for all symbols in the grammar. Returns a dictionary
188
mapping symbol names to a list of all first symbols.
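<p>
For readers who want to see the shape of the computation, here is a minimal
first-set sketch written against plain dictionaries. The function names, the
<tt>'&lt;empty&gt;'</tt> marker string, and the sample grammar are assumptions
for illustration; the real method works on the <tt>Grammar</tt> object's own tables.
</p>

```python
EMPTY = '<empty>'   # assumed marker for "can derive the empty string"

def first_of_sequence(syms, first):
    """First set of a sequence of symbols, given the first sets so far."""
    result = []
    for sym in syms:
        nullable = False
        for f in first[sym]:
            if f == EMPTY:
                nullable = True
            elif f not in result:
                result.append(f)
        if not nullable:
            return result
    # Every symbol could vanish, so the whole sequence can be empty.
    result.append(EMPTY)
    return result

def first_sets(rules, terminals):
    first = {t: [t] for t in terminals}
    first.update({name: [] for name in rules})
    changed = True
    while changed:                       # iterate until nothing new is added
        changed = False
        for name, alternatives in rules.items():
            for rhs in alternatives:
                for f in first_of_sequence(rhs, first):
                    if f not in first[name]:
                        first[name].append(f)
                        changed = True
    return first

rules = {
    'expr': [('term', 'tail')],
    'tail': [('PLUS', 'term', 'tail'), ()],
    'term': [('NUMBER',), ('LPAREN', 'expr', 'RPAREN')],
}
first = first_sets(rules, ['NUMBER', 'PLUS', 'LPAREN', 'RPAREN'])
```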
192
<b><tt>g.compute_follow()</tt></b>
194
Compute all of the follow sets for all non-terminals in the grammar.
195
The follow set is the set of all possible symbols that might follow a
196
given non-terminal. Returns a dictionary mapping non-terminal names
197
to a list of symbols.
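<p>
The follow-set computation can be sketched the same way. The example below
assumes the first sets are already available as a dictionary and uses
<tt>'$end'</tt> and <tt>'&lt;empty&gt;'</tt> as marker strings; these names, like
the <tt>follow_sets()</tt> function itself, are illustrative assumptions.
</p>

```python
EMPTY = '<empty>'   # assumed "empty string" marker
END = '$end'        # assumed end-of-input marker

def follow_sets(rules, first, start):
    follow = {name: [] for name in rules}
    follow[start] = [END]
    changed = True
    while changed:
        changed = False
        for name, alternatives in rules.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in rules:          # only non-terminals have follow sets
                        continue
                    new = []
                    rest_nullable = True
                    for nxt in rhs[i + 1:]:
                        new.extend(f for f in first[nxt] if f != EMPTY)
                        if EMPTY not in first[nxt]:
                            rest_nullable = False
                            break
                    if rest_nullable:
                        # Nothing (or only nullable symbols) follows sym in this
                        # rule, so whatever follows `name` may follow sym too.
                        new.extend(follow[name])
                    for f in new:
                        if f not in follow[sym]:
                            follow[sym].append(f)
                            changed = True
    return follow

rules = {
    'expr': [('term', 'tail')],
    'tail': [('PLUS', 'term', 'tail'), ()],
    'term': [('NUMBER',), ('LPAREN', 'expr', 'RPAREN')],
}
first = {
    'NUMBER': ['NUMBER'], 'PLUS': ['PLUS'],
    'LPAREN': ['LPAREN'], 'RPAREN': ['RPAREN'],
    'expr': ['NUMBER', 'LPAREN'], 'term': ['NUMBER', 'LPAREN'],
    'tail': ['PLUS', EMPTY],
}
follow = follow_sets(rules, first, 'expr')
```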
201
<b><tt>g.build_lritems()</tt></b>
203
Calculates all of the LR items for all productions in the grammar. This
204
step is required before using the grammar for any kind of table generation.
205
See the section on LR items below.
209
The following attributes are set by the above methods and may be useful
210
in code that works with the grammar. All of these attributes should be
211
assumed to be read-only. Changing their values directly will likely
215
<b><tt>g.Productions</tt></b>
217
A list of all productions added. The first entry is reserved for
218
a production representing the starting rule. The objects in this list
219
are instances of the <tt>Production</tt> class, described shortly.
223
<b><tt>g.Prodnames</tt></b>
225
A dictionary mapping the names of nonterminals to a list of all
226
productions of that nonterminal.
230
<b><tt>g.Terminals</tt></b>
232
A dictionary mapping the names of terminals to a list of the
233
production numbers where they are used.
237
<b><tt>g.Nonterminals</tt></b>
239
A dictionary mapping the names of nonterminals to a list of the
240
production numbers where they are used.
244
<b><tt>g.First</tt></b>
246
A dictionary representing the first sets for all grammar symbols. This is
247
computed and returned by the <tt>compute_first()</tt> method.
251
<b><tt>g.Follow</tt></b>
253
A dictionary representing the follow sets for all grammar rules. This is
254
computed and returned by the <tt>compute_follow()</tt> method.
258
<b><tt>g.Start</tt></b>
260
Starting symbol for the grammar. Set by the <tt>set_start()</tt> method.
263
For the purposes of debugging, a <tt>Grammar</tt> object supports the <tt>__len__()</tt> and
264
<tt>__getitem__()</tt> special methods. Accessing <tt>g[n]</tt> returns the nth production
268
<H2><a name="internal_nn3"></a>3. Productions</H2>
271
<tt>Grammar</tt> objects store grammar rules as instances of a <tt>Production</tt> class. This
272
class has no public constructor--you should only create productions by calling <tt>Grammar.add_production()</tt>.
273
The following attributes are available on a <tt>Production</tt> instance <tt>p</tt>.
276
<b><tt>p.name</tt></b>
278
The name of the production. For a grammar rule such as <tt>A : B C D</tt>, this is <tt>'A'</tt>.
282
<b><tt>p.prod</tt></b>
284
A tuple of symbols making up the right-hand side of the production. For a grammar rule such as <tt>A : B C D</tt>, this is <tt>('B','C','D')</tt>.
288
<b><tt>p.number</tt></b>
290
Production number. An integer containing the index of the production in the grammar's <tt>Productions</tt> list.
294
<b><tt>p.func</tt></b>
296
The name of the reduction function associated with the production.
297
This is the function that will execute when reducing the entire
298
grammar rule during parsing.
302
<b><tt>p.callable</tt></b>
304
The callable object associated with the name in <tt>p.func</tt>. This is <tt>None</tt>
305
unless the production has been bound using <tt>bind()</tt>.
309
<b><tt>p.file</tt></b>
311
Filename associated with the production. Typically this is the file where the production was defined. Used for error messages.
315
<b><tt>p.lineno</tt></b>
317
Line number associated with the production. Typically this is the line number in <tt>p.file</tt> where the production was defined. Used for error messages.
321
<b><tt>p.prec</tt></b>
323
Precedence and associativity associated with the production. This is a tuple <tt>(assoc,level)</tt> where
324
<tt>assoc</tt> is one of <tt>'left'</tt>,<tt>'right'</tt>, or <tt>'nonassoc'</tt> and <tt>level</tt> is
325
an integer. This value is determined by the precedence of the right-most terminal symbol in the production
326
or by use of the <tt>%prec</tt> specifier when adding the production.
330
<b><tt>p.usyms</tt></b>
332
A list of all unique symbols found in the production.
336
<b><tt>p.lr_items</tt></b>
338
A list of all LR items for this production. This attribute only has a meaningful value if the
339
<tt>Grammar.build_lritems()</tt> method has been called. The items in this list are
340
instances of <tt>LRItem</tt> described below.
344
<b><tt>p.lr_next</tt></b>
346
The head of a linked-list representation of the LR items in <tt>p.lr_items</tt>.
347
This attribute only has a meaningful value if the <tt>Grammar.build_lritems()</tt>
348
method has been called. Each <tt>LRItem</tt> instance has a <tt>lr_next</tt> attribute
349
to move to the next item. The list is terminated by <tt>None</tt>.
353
<b><tt>p.bind(dict)</tt></b>
355
Binds the production function name in <tt>p.func</tt> to a callable object in
356
<tt>dict</tt>. This operation is typically carried out in the last step
357
prior to running the parsing engine and is needed since parsing tables are typically
358
read from files which only include the function names, not the functions themselves.
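<p>
The binding step is essentially a dictionary lookup. The small <tt>Rule</tt>
class below is a hypothetical stand-in for <tt>Production</tt>, written only to
show how a stored function name might be turned into a callable.
</p>

```python
class Rule:
    """Stand-in for a production that stores only the *name* of its function."""
    def __init__(self, name, func):
        self.name = name
        self.func = func          # a string such as 'p_expr', or None
        self.callable = None      # filled in by bind()

    def bind(self, pdict):
        # Look the stored name up in the supplied dictionary.
        if self.func:
            self.callable = pdict[self.func]

def p_expr(p):
    'expr : NUMBER'

rule = Rule('expr', 'p_expr')
rule.bind({'p_expr': p_expr})
```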
362
<tt>Production</tt> objects support
363
the <tt>__len__()</tt>, <tt>__getitem__()</tt>, and <tt>__str__()</tt>
365
<tt>len(p)</tt> returns the number of symbols in <tt>p.prod</tt>
366
and <tt>p[n]</tt> is the same as <tt>p.prod[n]</tt>.
368
<H2><a name="internal_nn4"></a>4. LRItems</H2>
371
The construction of parsing tables in an LR-based parser generator is primarily
372
done over a set of "LR Items". An LR item represents a stage of parsing one
373
of the grammar rules. To compute the LR items, it is first necessary to
374
call <tt>Grammar.build_lritems()</tt>. Once this step is complete, all of the productions
375
in the grammar will have their LR items attached to them.
378
Here is an interactive example that shows what LR items look like.
In this example, <tt>g</tt> is a <tt>Grammar</tt>
384
>>> <b>g.build_lritems()</b>
387
Production(statement -> ID = expr)
392
In the above code, <tt>p</tt> represents the first grammar rule. In
393
this case, a rule <tt>'statement -> ID = expr'</tt>.
396
Now, let's look at the LR items for <tt>p</tt>.
400
>>> <b>p.lr_items</b>
[LRItem(statement -> . ID = expr),
 LRItem(statement -> ID . = expr),
 LRItem(statement -> ID = . expr),
 LRItem(statement -> ID = expr .)]
409
In each LR item, the dot (.) marks how much of the rule has been parsed so far.
Each successive LR item advances the dot by one symbol. It is only when the dot
reaches the very end that a production is successfully parsed.
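<p>
The way the dot moves through a rule can be reproduced with a short sketch.
The <tt>make_items()</tt> function and its tuple layout are assumptions made for
illustration; they mimic, but are not, the <tt>LRItem</tt> objects that
<tt>build_lritems()</tt> creates.
</p>

```python
def make_items(name, prod):
    """Return (name, rhs_with_dot, dot_index) for every dot position."""
    items = []
    for dot in range(len(prod) + 1):
        rhs = list(prod)
        rhs.insert(dot, '.')            # place the dot before symbol number `dot`
        items.append((name, tuple(rhs), dot))
    return items

items = make_items('statement', ('ID', '=', 'expr'))
for name, rhs, dot in items:
    print('%s -> %s' % (name, ' '.join(rhs)))
```

<p>
A rule with three symbols on its right-hand side yields four items, one for each
position the dot can occupy.
</p>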
414
An instance <tt>lr</tt> of <tt>LRItem</tt> has the following
415
attributes that hold information related to that specific stage of
419
<b><tt>lr.name</tt></b>
421
The name of the grammar rule. For example, <tt>'statement'</tt> in the above example.
425
<b><tt>lr.prod</tt></b>
427
A tuple of symbols representing the right-hand side of the production, including the
428
special <tt>'.'</tt> character. For example, <tt>('ID','.','=','expr')</tt>.
432
<b><tt>lr.number</tt></b>
434
An integer representing the production number in the grammar.
438
<b><tt>lr.usyms</tt></b>
440
A set of unique symbols in the production. Inherited from the original <tt>Production</tt> instance.
444
<b><tt>lr.lr_index</tt></b>
446
An integer representing the position of the dot (.). You should never use <tt>lr.prod.index()</tt>
447
to search for it--the result will be wrong if the grammar happens to also use (.) as a character
452
<b><tt>lr.lr_after</tt></b>
454
A list of all productions that can legally appear immediately to the right of the
455
dot (.). This list contains <tt>Production</tt> instances. This attribute
456
represents all of the possible branches a parse can take from the current position.
457
For example, suppose that <tt>lr</tt> represents a stage immediately before
458
an expression like this:
462
LRItem(statement -> ID = . expr)
466
Then, the value of <tt>lr.lr_after</tt> might look like this, showing all productions that
467
can legally appear next:
470
>>> <b>lr.lr_after</b>
[Production(expr -> expr PLUS expr),
 Production(expr -> expr MINUS expr),
 Production(expr -> expr TIMES expr),
 Production(expr -> expr DIVIDE expr),
 Production(expr -> MINUS expr),
 Production(expr -> LPAREN expr RPAREN),
 Production(expr -> NUMBER),
 Production(expr -> ID)]
485
<b><tt>lr.lr_before</tt></b>
487
The grammar symbol that appears immediately before the dot (.) or <tt>None</tt> if
488
at the beginning of the parse.
492
<b><tt>lr.lr_next</tt></b>
494
A link to the next LR item, representing the next stage of the parse. <tt>None</tt> if <tt>lr</tt>
498
<tt>LRItem</tt> instances also support the <tt>__len__()</tt> and <tt>__getitem__()</tt> special methods.
499
<tt>len(lr)</tt> returns the number of items in <tt>lr.prod</tt> including the dot (.). <tt>lr[n]</tt>
500
returns <tt>lr.prod[n]</tt>.
503
It goes without saying that all of the attributes associated with LR
504
items should be assumed to be read-only. Modifications will very
505
likely create a small black-hole that will consume you and your code.
507
<H2><a name="internal_nn5"></a>5. LRTable</H2>
510
The <tt>LRTable</tt> class is used to represent LR parsing table data. This
511
minimally includes the production list, action table, and goto table.
514
<b><tt>LRTable()</tt></b>
516
Create an empty LRTable object. This object contains only the information needed to
520
An instance <tt>lrtab</tt> of <tt>LRTable</tt> has the following methods:
523
<b><tt>lrtab.read_table(module)</tt></b>
525
Populates the LR table with information from the module specified in <tt>module</tt>.
526
<tt>module</tt> is either a module object already loaded with <tt>import</tt> or
527
the name of a Python module. If it's a string containing a module name, it is
528
loaded and parsing data is extracted. Returns the signature value that was used
529
when initially writing the tables. Raises a <tt>VersionError</tt> exception if
530
the module was created using an incompatible version of PLY.
534
<b><tt>lrtab.bind_callables(dict)</tt></b>
536
This binds all of the function names used in productions to callable objects
537
found in the dictionary <tt>dict</tt>. During table generation and when reading
538
LR tables from files, PLY only uses the names of action functions such as <tt>'p_expr'</tt>,
539
<tt>'p_statement'</tt>, etc. In order to actually run the parser, these names
540
have to be bound to callable objects. This method is always called prior to
544
After <tt>lrtab</tt> has been populated, the following attributes are defined.
547
<b><tt>lrtab.lr_method</tt></b>
549
The LR parsing method used (e.g., <tt>'LALR'</tt>)
554
<b><tt>lrtab.lr_productions</tt></b>
556
The production list. If the parsing tables have been newly
557
constructed, this will be a list of <tt>Production</tt> instances. If
558
the parsing tables have been read from a file, it's a list
559
of <tt>MiniProduction</tt> instances. This list, together
with <tt>lr_action</tt> and <tt>lr_goto</tt>, contains all of the
information needed by the LR parsing engine.
565
<b><tt>lrtab.lr_action</tt></b>
567
The LR action dictionary that implements the underlying state machine.
568
The keys of this dictionary are the LR states.
572
<b><tt>lrtab.lr_goto</tt></b>
574
The LR goto table that contains information about grammar rule reductions.
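<p>
To make the roles of the production list, action table, and goto table more
concrete, here is a toy table-driven parser for the grammar
<tt>E : E PLUS NUM | NUM</tt>. The tables were written by hand for this example,
and the encoding (positive entry = shift to that state, negative entry = reduce
by that production number, zero = accept) is an assumption of the sketch rather
than a statement about the contents of a real <tt>LRTable</tt>.
</p>

```python
# (name, length of right-hand side); entry 0 is the augmented start rule.
PRODUCTIONS = [
    ("S'", 1),    # 0: S' -> E
    ('E', 3),     # 1: E  -> E PLUS NUM
    ('E', 1),     # 2: E  -> NUM
]

# ACTION[state][token]: >0 shift, <0 reduce by production -n, 0 accept.
ACTION = {
    0: {'NUM': 2},
    1: {'PLUS': 3, '$end': 0},
    2: {'PLUS': -2, '$end': -2},
    3: {'NUM': 4},
    4: {'PLUS': -1, '$end': -1},
}

# GOTO[state][nonterminal]: state to enter after a reduction.
GOTO = {0: {'E': 1}}

def parse(tokens):
    """Return the list of production numbers reduced, in order."""
    stream = list(tokens) + ['$end']
    stack = [0]                     # stack of LR states
    reductions = []
    pos = 0
    while True:
        act = ACTION[stack[-1]].get(stream[pos])
        if act is None:
            raise SyntaxError('unexpected %s' % stream[pos])
        if act > 0:                 # shift: push the new state, consume the token
            stack.append(act)
            pos += 1
        elif act < 0:               # reduce: pop one state per right-hand-side symbol
            name, length = PRODUCTIONS[-act]
            del stack[len(stack) - length:]
            stack.append(GOTO[stack[-1]][name])
            reductions.append(-act)
        else:                       # accept
            return reductions

trace = parse(['NUM', 'PLUS', 'NUM'])
```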
578
<H2><a name="internal_nn6"></a>6. LRGeneratedTable</H2>
581
The <tt>LRGeneratedTable</tt> class represents LR parsing tables constructed from a
grammar. It is a subclass of <tt>LRTable</tt>.
585
<b><tt>LRGeneratedTable(grammar, method='LALR',log=None)</tt></b>
587
Create the LR parsing tables on a grammar. <tt>grammar</tt> is an instance of <tt>Grammar</tt>,
588
<tt>method</tt> is a string with the parsing method (<tt>'SLR'</tt> or <tt>'LALR'</tt>), and
589
<tt>log</tt> is a logger object used to write debugging information. The debugging information
590
written to <tt>log</tt> is the same as what appears in the <tt>parser.out</tt> file created
591
by yacc. By supplying a custom logger with a different message format, it is possible to get
592
more information (e.g., the line number in <tt>yacc.py</tt> used for issuing each line of
593
output in the log). The result is an instance of <tt>LRGeneratedTable</tt>.
597
An instance <tt>lr</tt> of <tt>LRGeneratedTable</tt> has the following attributes.
600
<b><tt>lr.grammar</tt></b>
602
A link to the Grammar object used to construct the parsing tables.
606
<b><tt>lr.lr_method</tt></b>
608
The LR parsing method used (e.g., <tt>'LALR'</tt>)
613
<b><tt>lr.lr_productions</tt></b>
615
A reference to <tt>grammar.Productions</tt>. This list, together with <tt>lr_action</tt> and <tt>lr_goto</tt>,
contains all of the information needed by the LR parsing engine.
620
<b><tt>lr.lr_action</tt></b>
622
The LR action dictionary that implements the underlying state machine. The keys of this dictionary are
627
<b><tt>lr.lr_goto</tt></b>
629
The LR goto table that contains information about grammar rule reductions.
633
<b><tt>lr.sr_conflicts</tt></b>
635
A list of tuples <tt>(state,token,resolution)</tt> identifying all shift/reduce conflicts. <tt>state</tt> is the LR state
636
number where the conflict occurred, <tt>token</tt> is the token causing the conflict, and <tt>resolution</tt> is
637
a string describing the resolution taken. <tt>resolution</tt> is either <tt>'shift'</tt> or <tt>'reduce'</tt>.
641
<b><tt>lr.rr_conflicts</tt></b>
643
A list of tuples <tt>(state,rule,rejected)</tt> identifying all reduce/reduce conflicts. <tt>state</tt> is the
644
LR state number where the conflict occurred, <tt>rule</tt> is the production rule that was selected
645
and <tt>rejected</tt> is the production rule that was rejected. Both <tt>rule</tt> and <tt>rejected</tt> are
646
instances of <tt>Production</tt>. They can be inspected to provide the user with more information.
650
There are two public methods of <tt>LRGeneratedTable</tt>.
653
<b><tt>lr.write_table(modulename,outputdir="",signature="")</tt></b>
655
Writes the LR parsing table information to a Python module. <tt>modulename</tt> is a string
656
specifying the name of a module such as <tt>"parsetab"</tt>. <tt>outputdir</tt> is the name of a
657
directory where the module should be created. <tt>signature</tt> is a string representing a
658
grammar signature that's written into the output file. This can be used to detect when
659
the data stored in a module file is out-of-sync with the grammar specification (and that
660
the tables need to be regenerated). If <tt>modulename</tt> is a string <tt>"parsetab"</tt>,
661
this function creates a file called <tt>parsetab.py</tt>. If the module name represents a
662
package such as <tt>"foo.bar.parsetab"</tt>, then only the last component, <tt>"parsetab"</tt> is
667
<H2><a name="internal_nn7"></a>7. LRParser</H2>
670
The <tt>LRParser</tt> class implements the low-level LR parsing engine.
674
<b><tt>LRParser(lrtab, error_func)</tt></b>
676
Create an LRParser. <tt>lrtab</tt> is an instance of <tt>LRTable</tt>
677
containing the LR production and state tables. <tt>error_func</tt> is the
678
error function to invoke in the event of a parsing error.
681
An instance <tt>p</tt> of <tt>LRParser</tt> has the following methods:
684
<b><tt>p.parse(input=None,lexer=None,debug=0,tracking=0,tokenfunc=None)</tt></b>
686
Run the parser. <tt>input</tt> is a string, which if supplied is fed into the
687
lexer using its <tt>input()</tt> method. <tt>lexer</tt> is an instance of the
688
<tt>Lexer</tt> class to use for tokenizing. If not supplied, the last lexer
689
created with the <tt>lex</tt> module is used. <tt>debug</tt> is a boolean flag
690
that enables debugging. <tt>tracking</tt> is a boolean flag that tells the
691
parser to perform additional line number tracking. <tt>tokenfunc</tt> is a callable
692
function that returns the next token. If supplied, the parser will use it to get
697
<b><tt>p.restart()</tt></b>
699
Resets the parser state for a parse already in progress.
702
<H2><a name="internal_nn8"></a>8. ParserReflect</H2>
706
The <tt>ParserReflect</tt> class is used to collect parser specification data
707
from a Python module or object. This class is what collects all of the
708
<tt>p_rule()</tt> functions in a PLY file, performs basic error checking,
709
and collects all of the needed information to build a grammar. Most of the
710
high-level PLY interface as used by the <tt>yacc()</tt> function is actually
711
implemented by this class.
714
<b><tt>ParserReflect(pdict, log=None)</tt></b>
716
Creates a <tt>ParserReflect</tt> instance. <tt>pdict</tt> is a dictionary
717
containing parser specification data. This dictionary typically corresponds
718
to the module or class dictionary of code that implements a PLY parser.
719
<tt>log</tt> is a logger instance that will be used to report error
723
An instance <tt>p</tt> of <tt>ParserReflect</tt> has the following methods:
726
<b><tt>p.get_all()</tt></b>
728
Collect and store all required parsing information.
732
<b><tt>p.validate_all()</tt></b>
734
Validate all of the collected parsing information. This is a separate step
from <tt>p.get_all()</tt> as a performance optimization. In order to
reduce parser start-up time, a parser can elect to only validate the
737
parsing data when regenerating the parsing tables. The validation
738
step tries to collect as much information as possible rather than
739
raising an exception at the first sign of trouble. The attribute
740
<tt>p.error</tt> is set if there are any validation errors. The
741
value of this attribute is also returned.
745
<b><tt>p.signature()</tt></b>
747
Compute a signature representing the contents of the collected parsing
748
data. The signature value should change if anything in the parser
749
specification has changed in a way that would justify parser table
750
regeneration. This method can be called after <tt>p.get_all()</tt>,
751
but before <tt>p.validate_all()</tt>.
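<p>
A signature of this kind can be approximated by hashing the pieces of the
specification that affect the tables. The sketch below is a guess at the general
approach, not the algorithm in <tt>ply.yacc</tt>: the <tt>spec_signature()</tt>
function, its arguments, and the choice of SHA-256 are all assumptions.
</p>

```python
import hashlib

def spec_signature(start, prec, tokens, pfuncs):
    """Hash the start symbol, precedence, tokens, and rule docstrings."""
    parts = [repr(start), repr(prec), ' '.join(tokens)]
    # pfuncs entries are (line, file, name, doc); only the grammar text matters.
    parts.extend(doc for (_line, _file, _name, doc) in pfuncs if doc)
    digest = hashlib.sha256()
    for part in parts:
        digest.update(part.encode('utf-8'))
        digest.update(b'\0')        # separator so adjacent parts cannot run together
    return digest.hexdigest()

pfuncs = [(10, 'calc.py', 'p_expr', 'expr : expr PLUS expr')]
sig = spec_signature('expr', [('left', 'PLUS')], ['PLUS', 'NUMBER'], pfuncs)
```

<p>
Comparing a stored value against a freshly computed one is then enough to decide
whether the tables need to be regenerated.
</p>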
754
The following attributes are set in the process of collecting data:
757
<b><tt>p.start</tt></b>
759
The grammar start symbol, if any. Taken from <tt>pdict['start']</tt>.
763
<b><tt>p.error_func</tt></b>
765
The error handling function or <tt>None</tt>. Taken from <tt>pdict['p_error']</tt>.
769
<b><tt>p.tokens</tt></b>
771
The token list. Taken from <tt>pdict['tokens']</tt>.
775
<b><tt>p.prec</tt></b>
777
The precedence specifier. Taken from <tt>pdict['precedence']</tt>.
781
<b><tt>p.preclist</tt></b>
783
A parsed version of the precedence specifier. A list of tuples of the form
784
<tt>(token,assoc,level)</tt> where <tt>token</tt> is the terminal symbol,
785
<tt>assoc</tt> is the associativity (e.g., <tt>'left'</tt>) and <tt>level</tt>
786
is a numeric precedence level.
790
<b><tt>p.grammar</tt></b>
792
A list of tuples <tt>(name, rules)</tt> representing the grammar rules. <tt>name</tt> is the
793
name of a Python function or method in <tt>pdict</tt> that starts with <tt>"p_"</tt>.
794
<tt>rules</tt> is a list of tuples <tt>(filename,line,prodname,syms)</tt> representing
795
the grammar rules found in the documentation string of that function. <tt>filename</tt> and <tt>line</tt> contain location
796
information that can be used for debugging. <tt>prodname</tt> is the name of the
797
production. <tt>syms</tt> is the right-hand side of the production. If you have a
802
'''expr : expr PLUS expr
805
| expr DIVIDE expr'''
808
then the corresponding entry in <tt>p.grammar</tt> might look like this:
('p_expr', [ ('calc.py',10,'expr', ['expr','PLUS','expr']),
             ('calc.py',11,'expr', ['expr','MINUS','expr']),
             ('calc.py',12,'expr', ['expr','TIMES','expr']),
             ('calc.py',13,'expr', ['expr','DIVIDE','expr'])
           ])
820
<b><tt>p.pfuncs</tt></b>
822
A sorted list of tuples <tt>(line, file, name, doc)</tt> representing all of
823
the <tt>p_</tt> functions found. <tt>line</tt> and <tt>file</tt> give location
824
information. <tt>name</tt> is the name of the function. <tt>doc</tt> is the
825
documentation string. This list is sorted in ascending order by line number.
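<p>
Building such a list from a module-style dictionary might look like the
following. The <tt>collect_pfuncs()</tt> helper is hypothetical and handles only
plain functions; it is meant to show where the line, file, and docstring values
can come from, not to reproduce what <tt>ParserReflect</tt> does.
</p>

```python
import types

def collect_pfuncs(pdict):
    """Return (line, file, name, doc) tuples for p_ functions, sorted by line."""
    found = []
    for name, obj in pdict.items():
        if not name.startswith('p_') or name == 'p_error':
            continue
        if isinstance(obj, types.FunctionType):
            code = obj.__code__
            found.append((code.co_firstlineno, code.co_filename, name, obj.__doc__))
    # Sort on line, file, and name only, so a missing docstring is never compared.
    found.sort(key=lambda entry: entry[:3])
    return found

def p_statement(p):
    'statement : ID EQUALS expr'

def p_expr(p):
    'expr : NUMBER'

def p_error(p):
    pass

pfuncs = collect_pfuncs({'p_expr': p_expr,
                         'p_statement': p_statement,
                         'p_error': p_error})
```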
829
<b><tt>p.files</tt></b>
831
A dictionary holding all of the source filenames that were encountered
832
while collecting parser information. Only the keys of this dictionary have
837
<b><tt>p.error</tt></b>
839
An attribute that indicates whether or not any critical errors
840
occurred in validation. If this is set, it means that some kind
841
of problem was detected and that no further processing should be
846
<H2><a name="internal_nn9"></a>9. High-level operation</H2>
849
Using all of the above classes requires some attention to detail. The <tt>yacc()</tt>
850
function carries out a very specific sequence of operations to create a grammar.
851
This same sequence should be emulated if you build an alternative PLY interface.
854
<li>A <tt>ParserReflect</tt> object is created and raw grammar specification data is
856
<li>A <tt>Grammar</tt> object is created and populated with information
857
from the specification data.
858
<li>An <tt>LRGeneratedTable</tt> object is created to run the LALR algorithm over
the <tt>Grammar</tt> object.
<li>Productions in the <tt>LRGeneratedTable</tt> are bound to callables using the <tt>bind_callables()</tt> method.
<li>An <tt>LRParser</tt> object is created from the information in the
<tt>LRGeneratedTable</tt> object.