Added picture of tokenizer state machine.
happi committed Mar 26, 2017
1 parent 1cbb0d6 commit 2571929
Showing 3 changed files with 67 additions and 29 deletions.
Binary file added code/compiler_chapter/json_tokens.png
38 changes: 38 additions & 0 deletions code/compiler_chapter/src/yecc_json_parser.yrl
@@ -0,0 +1,38 @@

Nonterminals value values object array pair pairs.

Terminals number string true false null '[' ']' '{' '}' ',' ':'.

Rootsymbol value.

value -> object : '$1'.
value -> array : '$1'.
value -> number : get_val('$1').
value -> string : get_val('$1').
value -> 'true' : get_val('$1').
value -> 'null' : get_val('$1').
value -> 'false' : get_val('$1').

object -> '{' '}' : #{}.
object -> '{' pairs '}' : '$2'.

pairs -> pair : '$1'.
pairs -> pair ',' pairs : maps:merge('$1', '$3').

pair -> string ':' value : #{ get_val('$1') => '$3' }.

array -> '[' ']' : {}.
array -> '[' values ']' : list_to_tuple('$2').

values -> value : [ '$1' ].
values -> value ',' values : [ '$1' | '$3' ].



Erlang code.

get_val({_,_,Val}) -> Val;
get_val({Val, _}) -> Val.



58 changes: 29 additions & 29 deletions compiler.asciidoc
@@ -760,44 +760,44 @@ We can try our tokenizer on an example json file (test.json).

include::code/compiler_chapter/src/test.json[]

// First we need to compile our tokenizer, then we read the file
// and convert it to a string. Finally we can use
// the string/1 function that leex generates to tokenize the test file.
First we need to compile our tokenizer. Then we read the file
and convert it to a string. Finally, we can use
the string/1 function that leex generates to tokenize the test file.

// {:language="erlang"}
// ~~~
// 2> c(json_tokens).
// {ok,json_tokens}.
// 3> f(File), f(L), {ok, File} = file:read_file("test.json"), L = binary_to_list(File), ok.
// ok
// 4> f(Tokens), {ok, Tokens,_} = json_tokens:string(L), hd(Tokens).
// {'{',1}
// 5>
// ~~~
[source, erlang]
----
2> c(json_tokens).
{ok,json_tokens}
3> f(File), f(L), {ok, File} = file:read_file("test.json"), L = binary_to_list(File), ok.
ok
4> f(Tokens), {ok, Tokens,_} = json_tokens:string(L), hd(Tokens).
{'{',1}
5>
----

// The shell function f/1 tells the shell to forget a variable
// binding. This is useful if you want to try a command that binds a
// variable multiple times, for example as you are writing the lexer and
// want to try it out after each rewrite. We will look at the shell
// commands in detail in the a later chapter.
The shell function f/1 tells the shell to forget a variable
binding. This is useful when you want to rerun a command that binds
the same variable, for example when you are writing the lexer and
want to try it out after each rewrite. We will look at the shell
commands in detail in a later chapter.
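
As a small illustration of why f/1 matters, the shell session below
rebinds a variable. The variable name X and its values are made up for
this sketch and do not appear in the chapter's code.

```erlang
1> X = 1.
1
2> X = 2.
** exception error: no match of right hand side value 2
3> f(X).
ok
4> X = 2.
2
```

Without the call to f/1 the second match fails, because X is already
bound to 1 and Erlang variables are single assignment.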

// Armed with a tokenizer for Json we can now write a json parser
// using the parser generator Yecc.
Armed with a tokenizer for JSON we can now write a JSON parser
using the parser generator Yecc.


// === Yecc
=== Yecc

// Yecc is a parser generator for Erlang. The name comes from Yacc
// (Yet another compiler compiler) the canonical parser generator for C.
Yecc is a parser generator for Erlang. The name comes from Yacc
(Yet Another Compiler Compiler), the canonical parser generator for C.

// Now that we have a lexer for JSON terms we can write a parser using
// yecc.
Now that we have a lexer for JSON terms we can write a parser using
Yecc.

// <embed file="code/yecc_json_parser.yrl" verbatim="yes"/>
include::code/compiler_chapter/src/yecc_json_parser.yrl[]
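
To see which Erlang terms the semantic actions in the grammar build, the
following sketch applies the same expressions by hand: objects become
maps and arrays become tuples. The module name json_actions_sketch and
the sample tokens are hypothetical. The token shapes ({Category, Line,
Value} for values and {Category, Line} for keywords) are an assumption
based on the clauses of get_val/1, since the tokenizer's exact output
for strings and numbers is not shown here.

```erlang
-module(json_actions_sketch).
-export([demo/0]).

%% Same helper as in the "Erlang code." section of the grammar file.
get_val({_, _, Val}) -> Val;
get_val({Val, _}) -> Val.

%% Hand-applies the semantic actions to made-up tokens (assumed shapes).
demo() ->
    %% pair -> string ':' value
    Pair1 = #{ get_val({string, 1, "a"}) => get_val({number, 1, 1}) },
    Pair2 = #{ get_val({string, 1, "b"}) => get_val({true, 1}) },
    %% pairs -> pair ',' pairs
    Object = maps:merge(Pair1, Pair2),
    %% array -> '[' values ']'
    Array = list_to_tuple([get_val({number, 2, 1}), get_val({null, 2})]),
    {Object, Array}.

%% demo() returns {#{"a" => 1,"b" => true},{1,null}}
```

Note that maps:merge/2 keeps the value from its second argument when a
key occurs in both maps, so with this grammar the last occurrence of a
duplicated key in a JSON object wins.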

// Then we can use yecc to generate an Erlang program that implements
// the parser, and call the parse/1 function provided with the tokens
// generated by the tokenizer as argument.
Then we can use Yecc to generate an Erlang module that implements
the parser, and call the generated parse/1 function with the tokens
produced by the tokenizer as its argument.

// {:language="erlang"}
// ~~~