Bug fixes
dabeaz committed May 28, 2008
1 parent 241e3c7 commit d2151e3
Showing 17 changed files with 58 additions and 71 deletions.
13 changes: 3 additions & 10 deletions ANNOUNCE
@@ -1,11 +1,11 @@
May 17, 2008
May 28, 2008

Announcing : PLY-2.4 (Python Lex-Yacc)
Announcing : PLY-2.5 (Python Lex-Yacc)

http://www.dabeaz.com/ply

I'm pleased to announce a significant new update to PLY---a 100% Python
implementation of the common parsing tools lex and yacc. PLY-2.4 fixes
implementation of the common parsing tools lex and yacc. PLY-2.5 fixes
some bugs in error handling and provides some performance improvements.

If you are new to PLY, here are a few highlights:
@@ -29,13 +29,6 @@ If you are new to PLY, here are a few highlights:
problems. Currently, PLY can build its parsing tables using
either SLR or LALR(1) algorithms.

- PLY can be used to build parsers for large programming languages.
Although it is not ultra-fast due to its Python implementation,
PLY can be used to parse grammars consisting of several hundred
rules (as might be found for a language like C). The lexer and LR
parser are also reasonably efficient when parsing normal
sized programs.

More information about PLY can be obtained on the PLY webpage at:

http://www.dabeaz.com/ply
4 changes: 2 additions & 2 deletions README
@@ -1,8 +1,8 @@
PLY (Python Lex-Yacc) Version 2.4 (May, 2008)
PLY (Python Lex-Yacc) Version 2.5 (May 28, 2008)

David M. Beazley ([email protected])

Copyright (C) 2001-2007 David M. Beazley
Copyright (C) 2001-2008 David M. Beazley

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
27 changes: 16 additions & 11 deletions doc/ply.html
@@ -12,7 +12,7 @@ <h1>PLY (Python Lex-Yacc)</h1>
</b>

<p>
<b>PLY Version: 2.4</b>
<b>PLY Version: 2.5</b>
<p>

<!-- INDEX -->
@@ -472,7 +472,8 @@ <H3><a name="ply_nn7"></a>3.4 Token values</H3>
</blockquote>

It is important to note that storing data in other attribute names is <em>not</em> recommended. The <tt>yacc.py</tt> module only exposes the
contents of the <tt>value</tt> attribute. Thus, accessing other attributes may be unnecessarily awkward.
contents of the <tt>value</tt> attribute. Thus, accessing other attributes may be unnecessarily awkward. If you
need to store multiple values on a token, assign a tuple, dictionary, or instance to <tt>value</tt>.
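
A minimal sketch of the suggestion above, using a hypothetical identifier rule that is not part of the PLY documentation, might bundle several pieces of data into the value attribute:

def t_ID(t):
    r'[A-Za-z_][A-Za-z0-9_]*'
    # yacc.py only exposes t.value, so pack the extra details into it
    t.value = {'name': t.value, 'pos': t.lexpos}
    return t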

<H3><a name="ply_nn8"></a>3.5 Discarded tokens</H3>

@@ -894,14 +895,18 @@ <H3><a name="ply_nn17"></a>3.14 Alternative specification of lexers</H3>
</pre>
</blockquote>

For reasons that are subtle, you should <em>NOT</em> invoke <tt>lex.lex()</tt> inside the <tt>__init__()</tt> method of your class. If you
do, it may cause bizarre behavior if someone tries to duplicate a lexer object. Keep reading.
When building a lexer from class, you should construct the lexer from
an instance of the class, not the class object itself. Also, for
reasons that are subtle, you should <em>NOT</em>
invoke <tt>lex.lex()</tt> inside the <tt>__init__()</tt> method of
your class. If you do, it may cause bizarre behavior if someone tries
to duplicate a lexer object.
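
A minimal sketch of the pattern described above, assuming an illustrative two-token lexer (the class and token names are not from the commit):

import ply.lex as lex

class MyLexer(object):
    tokens = ('NUMBER', 'PLUS')
    t_PLUS   = r'\+'
    t_ignore = ' \t'

    def t_NUMBER(self, t):
        r'\d+'
        t.value = int(t.value)
        return t

    def t_error(self, t):
        t.lexer.skip(1)

    def build(self, **kwargs):
        # Build from the instance, outside __init__, as advised above
        self.lexer = lex.lex(object=self, **kwargs)

m = MyLexer()
m.build()                # construct from an instance, not the class object
m.lexer.input("3 + 4")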

<H3><a name="ply_nn18"></a>3.15 Maintaining state</H3>


In your lexer, you may want to maintain a variety of state information. This might include mode settings, symbol tables, and other details. There are a few
different ways to handle this situation. First, you could just keep some global variables:
different ways to handle this situation. One way to do this is to keep a set of global variables in the module
where you created the lexer. For example:

<blockquote>
<pre>
@@ -940,9 +945,9 @@ <H3><a name="ply_nn18"></a>3.15 Maintaining state</H3>
</blockquote>

This latter approach has the advantage of storing information inside
the lexer itself---something that may be useful if multiple instances
the lexer object itself---something that may be useful if multiple instances
of the same lexer have been created. However, it may also feel kind
of "hacky" to the purists. Just to put their mind at some ease, all
of "hacky" to the OO purists. Just to put their mind at some ease, all
internal attributes of the lexer (with the exception of <tt>lineno</tt>) have names that are prefixed
by <tt>lex</tt> (e.g., <tt>lexdata</tt>,<tt>lexpos</tt>, etc.). Thus,
it should be perfectly safe to store attributes in the lexer that
@@ -977,12 +982,12 @@ <H3><a name="ply_nn18"></a>3.15 Maintaining state</H3>
</pre>
</blockquote>

The class approach may be the easiest to manage if your application is going to be creating multiple instances of the same lexer and
you need to manage a lot of state.
The class approach may be the easiest to manage if your application is
going to be creating multiple instances of the same lexer and you need
to manage a lot of state.
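
As a rough illustration of the multiple-instance case (the class below is hypothetical, not part of this commit), per-lexer state can live on each instance:

import ply.lex as lex

class CountingLexer(object):
    tokens = ('NUMBER',)
    t_ignore = ' \t'

    def __init__(self):
        self.num_count = 0          # per-instance state

    def t_NUMBER(self, t):
        r'\d+'
        self.num_count += 1         # state stored on the instance, not a global
        t.value = int(t.value)
        return t

    def t_error(self, t):
        t.lexer.skip(1)

    def build(self, **kwargs):
        self.lexer = lex.lex(object=self, **kwargs)

a = CountingLexer()
a.build()
b = CountingLexer()
b.build()                           # a.num_count and b.num_count are independent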

<H3><a name="ply_nn19"></a>3.16 Lexer cloning</H3>


<p>
If necessary, a lexer object can be quickly duplicated by invoking its <tt>clone()</tt> method. For example:

4 changes: 2 additions & 2 deletions example/ansic/clex.py
@@ -143,12 +143,12 @@ def t_ID(t):
# Comments
def t_comment(t):
r'/\*(.|\n)*?\*/'
t.lineno += t.value.count('\n')
t.lexer.lineno += t.value.count('\n')

# Preprocessor directive (ignored)
def t_preprocessor(t):
r'\#(.)*?\n'
t.lineno += 1
t.lexer.lineno += 1

def t_error(t):
print "Illegal character %s" % repr(t.value[0])
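
The hunks in this file, and the matching ones in example/yply/ylex.py below, move line tracking from t.lineno to t.lexer.lineno. A minimal newline rule written against the updated attribute might look like this (the rule itself is illustrative, not part of the commit):

def t_newline(t):
    r'\n+'
    # line numbers now live on the lexer object rather than on the token
    t.lexer.lineno += len(t.value)
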
4 changes: 2 additions & 2 deletions example/yply/ylex.py
@@ -42,7 +42,7 @@ def t_SECTION(t):
# Comments
def t_ccomment(t):
r'/\*(.|\n)*?\*/'
t.lineno += t.value.count('\n')
t.lexer.lineno += t.value.count('\n')

t_ignore_cppcomment = r'//.*'

@@ -95,7 +95,7 @@ def t_code_error(t):
raise RuntimeError

def t_error(t):
print "%d: Illegal character '%s'" % (t.lineno, t.value[0])
print "%d: Illegal character '%s'" % (t.lexer.lineno, t.value[0])
print t.value
t.lexer.skip(1)

45 changes: 22 additions & 23 deletions ply/lex.py
@@ -22,7 +22,7 @@
# See the file COPYING for a complete copy of the LGPL.
# -----------------------------------------------------------------------------

__version__ = "2.4"
__version__ = "2.5"
__tabversion__ = "2.4" # Version of table file used

import re, sys, types, copy, os
@@ -89,6 +89,7 @@ def __init__(self):
self.lexretext = None # Current regular expression strings
self.lexstatere = {} # Dictionary mapping lexer states to master regexs
self.lexstateretext = {} # Dictionary mapping lexer states to regex strings
self.lexstaterenames = {} # Dictionary mapping lexer states to symbol names
self.lexstate = "INITIAL" # Current lexer state
self.lexstatestack = [] # Stack of lexer states
self.lexstateinfo = None # State information
@@ -161,7 +162,7 @@ def writetab(self,tabfile,outputdir=""):
for key, lre in self.lexstatere.items():
titem = []
for i in range(len(lre)):
titem.append((self.lexstateretext[key][i],_funcs_to_names(lre[i][1],key,initialfuncs)))
titem.append((self.lexstateretext[key][i],_funcs_to_names(lre[i][1],self.lexstaterenames[key][i])))
tabre[key] = titem

tf.write("_lexstatere = %s\n" % repr(tabre))
@@ -409,20 +410,11 @@ def _validate_file(filename):
# suitable for output to a table file
# -----------------------------------------------------------------------------

def _funcs_to_names(funclist,state,initial):
# If this is the initial state, we clear the state and initial list
if state == 'INITIAL':
state = ""
initial = []
def _funcs_to_names(funclist,namelist):
result = []
for f in funclist:
for f,name in zip(funclist,namelist):
if f and f[0]:
# If a function is defined, make sure it's name corresponds to the correct state
if not initial or f in initial:
statestr = "t_"
else:
statestr = "t_"+state+"_"
result.append((statestr+ f[1],f[1]))
result.append((name, f[1]))
else:
result.append(f)
return result
@@ -459,25 +451,27 @@ def _form_master_re(relist,reflags,ldict,toknames):

# Build the index to function map for the matching engine
lexindexfunc = [ None ] * (max(lexre.groupindex.values())+1)
lexindexnames = lexindexfunc[:]

for f,i in lexre.groupindex.items():
handle = ldict.get(f,None)
if type(handle) in (types.FunctionType, types.MethodType):
lexindexfunc[i] = (handle,toknames[f])
lexindexnames[i] = f
elif handle is not None:
# If rule was specified as a string, we build an anonymous
# callback function to carry out the action
lexindexnames[i] = f
if f.find("ignore_") > 0:
lexindexfunc[i] = (None,None)
else:
lexindexfunc[i] = (None, toknames[f])

return [(lexre,lexindexfunc)],[regex]
return [(lexre,lexindexfunc)],[regex],[lexindexnames]
except Exception,e:
m = int(len(relist)/2)
if m == 0: m = 1
llist, lre = _form_master_re(relist[:m],reflags,ldict,toknames)
rlist, rre = _form_master_re(relist[m:],reflags,ldict,toknames)
return llist+rlist, lre+rre
llist, lre, lnames = _form_master_re(relist[:m],reflags,ldict,toknames)
rlist, rre, rnames = _form_master_re(relist[m:],reflags,ldict,toknames)
return llist+rlist, lre+rre, lnames+rnames

# -----------------------------------------------------------------------------
# def _statetoken(s,names)
@@ -794,9 +788,10 @@ def lex(module=None,object=None,debug=0,optimize=0,lextab="lextab",reflags=0,now
# Build the master regular expressions

for state in regexs.keys():
lexre, re_text = _form_master_re(regexs[state],reflags,ldict,toknames)
lexre, re_text, re_names = _form_master_re(regexs[state],reflags,ldict,toknames)
lexobj.lexstatere[state] = lexre
lexobj.lexstateretext[state] = re_text
lexobj.lexstaterenames[state] = re_names
if debug:
for i in range(len(re_text)):
print "lex: state '%s'. regex[%d] = '%s'" % (state, i, re_text[i])
@@ -806,6 +801,7 @@
if state != "INITIAL" and type == 'inclusive':
lexobj.lexstatere[state].extend(lexobj.lexstatere['INITIAL'])
lexobj.lexstateretext[state].extend(lexobj.lexstateretext['INITIAL'])
lexobj.lexstaterenames[state].extend(lexobj.lexstaterenames['INITIAL'])

lexobj.lexstateinfo = stateinfo
lexobj.lexre = lexobj.lexstatere["INITIAL"]
@@ -888,7 +884,10 @@ def runmain(lexer=None,data=None):

def TOKEN(r):
def set_doc(f):
f.__doc__ = r
if callable(r):
f.__doc__ = r.__doc__
else:
f.__doc__ = r
return f
return set_doc

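
The last hunk above lets the TOKEN decorator accept a callable as well as a string, copying the callable's docstring onto the decorated rule. A rough sketch of how that might be used (the identifier helper and t_ID rule are illustrative, not from this commit):

from ply.lex import TOKEN

def identifier(t):
    r'[A-Za-z_][A-Za-z0-9_]*'

@TOKEN(identifier)        # a callable now works: its __doc__ supplies the regex
def t_ID(t):
    return t
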
2 changes: 1 addition & 1 deletion ply/yacc.py
@@ -50,7 +50,7 @@
# own risk!
# ----------------------------------------------------------------------------

__version__ = "2.4"
__version__ = "2.5"
__tabversion__ = "2.4" # Table version

#-----------------------------------------------------------------------------
3 changes: 1 addition & 2 deletions test/lex_ignore.exp
@@ -2,6 +2,5 @@
Traceback (most recent call last):
File "./lex_ignore.py", line 29, in <module>
lex.lex()
File "../ply/lex.py", line 772, in lex
raise SyntaxError,"lex: Unable to build lexer."
File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
3 changes: 1 addition & 2 deletions test/lex_re1.exp
@@ -2,6 +2,5 @@ lex: Invalid regular expression for rule 't_NUMBER'. unbalanced parenthesis
Traceback (most recent call last):
File "./lex_re1.py", line 25, in <module>
lex.lex()
File "../ply/lex.py", line 772, in lex
raise SyntaxError,"lex: Unable to build lexer."
File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
3 changes: 1 addition & 2 deletions test/lex_re2.exp
@@ -2,6 +2,5 @@ lex: Regular expression for rule 't_PLUS' matches empty string.
Traceback (most recent call last):
File "./lex_re2.py", line 25, in <module>
lex.lex()
File "../ply/lex.py", line 772, in lex
raise SyntaxError,"lex: Unable to build lexer."
File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
3 changes: 1 addition & 2 deletions test/lex_re3.exp
@@ -3,6 +3,5 @@ lex: Make sure '#' in rule 't_POUND' is escaped with '\#'.
Traceback (most recent call last):
File "./lex_re3.py", line 27, in <module>
lex.lex()
File "../ply/lex.py", line 772, in lex
raise SyntaxError,"lex: Unable to build lexer."
File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
3 changes: 1 addition & 2 deletions test/lex_state1.exp
@@ -2,6 +2,5 @@ lex: states must be defined as a tuple or list.
Traceback (most recent call last):
File "./lex_state1.py", line 38, in <module>
lex.lex()
File "../ply/lex.py", line 772, in lex
raise SyntaxError,"lex: Unable to build lexer."
File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
3 changes: 1 addition & 2 deletions test/lex_state2.exp
@@ -3,6 +3,5 @@ lex: invalid state specifier 'example'. Must be a tuple (statename,'exclusive|in
Traceback (most recent call last):
File "./lex_state2.py", line 38, in <module>
lex.lex()
File "../ply/lex.py", line 772, in lex
raise SyntaxError,"lex: Unable to build lexer."
File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
3 changes: 1 addition & 2 deletions test/lex_state3.exp
@@ -3,6 +3,5 @@ lex: No rules defined for state 'example'
Traceback (most recent call last):
File "./lex_state3.py", line 40, in <module>
lex.lex()
File "../ply/lex.py", line 772, in lex
raise SyntaxError,"lex: Unable to build lexer."
File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
3 changes: 1 addition & 2 deletions test/lex_state4.exp
@@ -2,6 +2,5 @@ lex: state type for state comment must be 'inclusive' or 'exclusive'
Traceback (most recent call last):
File "./lex_state4.py", line 39, in <module>
lex.lex()
File "../ply/lex.py", line 772, in lex
raise SyntaxError,"lex: Unable to build lexer."
File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
3 changes: 1 addition & 2 deletions test/lex_state5.exp
@@ -2,6 +2,5 @@ lex: state 'comment' already defined.
Traceback (most recent call last):
File "./lex_state5.py", line 40, in <module>
lex.lex()
File "../ply/lex.py", line 772, in lex
raise SyntaxError,"lex: Unable to build lexer."
File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
3 changes: 1 addition & 2 deletions test/lex_state_norule.exp
@@ -2,6 +2,5 @@ lex: No rules defined for state 'example'
Traceback (most recent call last):
File "./lex_state_norule.py", line 40, in <module>
lex.lex()
File "../ply/lex.py", line 772, in lex
raise SyntaxError,"lex: Unable to build lexer."
File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
