nitcc is a simple LR parser and lexer generator for Nit programs. It provides a small subset of the functionality of SableCC 3 and 4.
Have a valid compiler in `bin/`.
Just run `make` in the `contrib/nitcc/` directory.
Usage:
nitcc file.sablecc
nitcc generates a bunch of control files, a lexer, a parser, and a tester.
To compile and run the tester:
nitc file_test_parser.nit
./file_test_parser an_input_file_to_parse
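
The generate, compile, and run cycle can be scripted. The sketch below only builds the three command lines from a grammar path. It assumes, as the usage lines above suggest, that the generated tester is named after the base name of the grammar file (`file.sablecc` gives `file_test_parser.nit`). The helper name `nitcc_commands` is hypothetical.

```python
from pathlib import Path

def nitcc_commands(grammar, input_file):
    """Build the command lines of the generate/compile/run cycle.

    Assumption: the tester is named <base>_test_parser, where <base>
    is the grammar file name without its .sablecc extension.
    """
    base = Path(grammar).stem
    tester = f"{base}_test_parser"
    return [
        ["nitcc", grammar],            # generate lexer, parser and tester
        ["nitc", f"{tester}.nit"],     # compile the generated tester
        [f"./{tester}", input_file],   # parse an input file
    ]

for command in nitcc_commands("file.sablecc", "an_input_file_to_parse"):
    print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` to actually execute the step.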
The sub-directory `examples/` contains simple grammars and interpreters.
The sub-directory `tests/` contains regression tests.
- command line tool (`nitcc`)
- Grammar syntax of SableCC4 (with pieces of SableCC3)
- Generates a Lexer
- Generates a SLR parser
- Generates a LALR parser
- Generates classes for the AST and utils
For the tool (and the code)
- usable
- bootstrap itself (see `nitcc.sablecc`)
For the lexer (and regexp, NFA, and DFA)
- Any
- interval of characters and subtraction of characters
- implicit priorities (by inclusion of languages)
- Except and And
- Shortest and Longest (but dummy semantic without lookahead)
- efficient implementation of intervals
- DFA minimization
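
Subtraction of character intervals, listed above, can be illustrated in a few lines. This is a generic Python sketch, not nitcc's implementation. The function name `subtract` and the representation of an interval as an inclusive pair of characters are assumptions made for the example.

```python
def subtract(a, b):
    """Return the parts of interval a not covered by interval b.

    Intervals are inclusive (first, last) pairs of single characters.
    """
    a_lo, a_hi = ord(a[0]), ord(a[1])
    b_lo, b_hi = ord(b[0]), ord(b[1])
    if b_lo > a_hi or b_hi < a_lo:
        return [a]  # no overlap: a is unchanged
    result = []
    if a_lo < b_lo:
        result.append((chr(a_lo), chr(b_lo - 1)))  # part left of b
    if b_hi < a_hi:
        result.append((chr(b_hi + 1), chr(a_hi)))  # part right of b
    return result

print(subtract(("a", "z"), ("m", "p")))  # [('a', 'l'), ('q', 'z')]
```

Keeping the result as a short list of intervals, rather than a set of individual characters, is what makes large character classes cheap to represent.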
For the parser (and grammar and LR)
- Modifiers (`?`, `*`, `+`)
- Ignored
- Rejected
- Empty (but not mandatory)
- Opportunistic
- Precedence
- Separator
- Dangling (automatic, which mitigates the SLR limitations)
- simple transformation (unchecked)
- simple inlining (non automatic, except for `?` and `*`)
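
One conventional way to support the `?`, `*`, and `+` modifiers in an LR generator is to expand each modified symbol into a helper production. The Python sketch below shows that textbook expansion. It is not taken from nitcc's source, and the helper names (`_opt`, `_plus`, `_star`) are made up for the example. Since the list above says that `?` and `*` are inlined automatically, nitcc's actual output may differ.

```python
def desugar(symbol):
    """Expand one modified symbol into a helper production.

    Returns (name, alternatives), where each alternative is a list of
    symbols and [] stands for the empty alternative.
    """
    base, modifier = symbol[:-1], symbol[-1]
    if modifier == "?":
        return base + "_opt", [[base], []]
    if modifier == "+":
        name = base + "_plus"
        return name, [[name, base], [base]]  # left recursion suits LR
    if modifier == "*":
        name = base + "_star"
        return name, [[name, base], []]
    raise ValueError(f"no modifier on {symbol!r}")

for s in ("arg?", "stmt+", "item*"):
    print(desugar(s))
```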
For the AST (generated classes, utils and their API)
- Common runtime-library `nitcc_runtime.nit`
- Terminal nodes; see `NToken`.
- Heterogeneous non-terminal nodes with named fields; see `NProd`.
- Homogeneous non-terminal nodes for lists (`+` and `*`); see `Nodes`.
- Visitor design pattern; see `Visitor`.
- Syntactic and lexical errors; see `NError`.
- positions of tokens in the input stream; see `Position`.
- positions of non-terminal nodes.
- API for the input source
- sane API to invoke/initialize the parser (and the lexer)
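
The split between terminal nodes, non-terminal nodes with named fields, and a visitor can be pictured with a small Python analog. This is only an illustration of the pattern. The classes `Token`, `Prod`, `Visitor`, and `TokenCollector` and their methods are hypothetical and do not reproduce the API of `nitcc_runtime.nit`.

```python
class Node:
    def children(self):
        return []

    def accept(self, visitor):
        visitor.visit(self)

class Token(Node):
    """Terminal node: carries the matched text."""
    def __init__(self, text):
        self.text = text

class Prod(Node):
    """Non-terminal node with named fields."""
    def __init__(self, **fields):
        self.fields = fields

    def children(self):
        return list(self.fields.values())

class Visitor:
    def visit(self, node):
        # Default behavior: recurse into the children.
        for child in node.children():
            child.accept(self)

class TokenCollector(Visitor):
    """Collect the text of every token, in tree order."""
    def __init__(self):
        self.texts = []

    def visit(self, node):
        if isinstance(node, Token):
            self.texts.append(node.text)
        super().visit(node)

tree = Prod(left=Token("1"), op=Token("+"), right=Prod(value=Token("2")))
collector = TokenCollector()
tree.accept(collector)
print(collector.texts)  # ['1', '+', '2']
```

A concrete visitor overrides `visit` only for the node kinds it cares about and delegates the traversal to the base class.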
Issues:
- Limited error checking; bad grammars can produce uncompilable or, worse, buggy Nit code.
- The SLR automaton is not very expressive; do not expect to parse big and complex languages like Nit or Java.
- The generated Nit code is inefficient and large; even with an acceptable grammar, do not expect to efficiently parse big and complex languages like Nit or Java.
- No real Unicode support.
- Advanced features of SableCC4 are not planned.