-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
StackedLexer #18
base: master
Are you sure you want to change the base?
StackedLexer #18
Conversation
Apparently propeties are not RPython
I've added a test to ztranslation to ensure StackedLexer is RPython. I used a property which apparently isn't RPython and I fixed that, however now I'm getting a problem due to the .match method of compiled regular expressions not being RPython. How would I go about fixing this? |
I've fixed the issue by simply merging Lexer and StackedLexer into one Lexer class. Lexer now has all the functionality of StackedLexer but is completely backwards compatible. Another advantage of this is, that it resolves the code duplication issue which means that this could be merged as-is. The only thing that's missing is documentation but I would have to merge in master to do that, which on some projects is discouraged so I don't want to do that without feedback first. |
I don't have a problem with merging master in. On Mon, Jan 27, 2014 at 10:10 AM, Daniel Neuhäuser <[email protected]
"I disapprove of what you say, but I will defend to the death your right to |
Conflicts: rply/lexergenerator.py
Ok, I've merged master in and added documentation. |
Would it be possible to pull this into master? I was looking for this exact feature and am happy to see it's already been implemented! |
+1 |
I've found that the basic LexerGenerator provided by rply while convenient is too simple for some uses cases. So I've implemented this somewhat more powerful StackedLexer, inspired by pygments.
The basic idea is that instead of having rules directly associated with a lexer, the lexer instead has a set of states and each state has a set of rules. Further rules are extended with information about transitions.
During lexing the lexer maintains a stack of states and uses the rules of the state at the top of the stack. Once a rule matches, the lexer performs a transition, if so defined by that rule.
Possible transititions are pushing a new state to the stack or popping the current state of the stack. While I haven't implemented that, a switch to another state without modifying the stack would also be trivial to implement.
This can be very useful for lexing nested comments (see the test for an example) or strings as a sequence of character and escape sequence tokens.
I've made this pull request not as a request to pull the code as-is but rather to get feedback on the idea, API and implementation. Is there interest in this idea? Is the API as I've designed it acceptable or should it be changed? The StackedLexer represents a superset of Lexer and could also replace it entirely without breaking backwards compatibility, what are your thoughts on that?
At the moment quite a bit of code is duplicated between the two lexers, obviously this is not acceptable and needs improvement but I wanted to get feedback before putting too much time into this.