Code for generating synthetic PCFGs for testing grammatical inference algorithms.
alexc17/syntheticpcfg
Code for generating synthetic PCFGs with desirable properties that can be useful for learning experiments. The generated PCFGs will be in Chomsky normal form (CNF) and will be consistent (i.e. the probabilities of all strings sum to 1). There are two main types that we generate: the first has a nontrivial CFG backbone, and the second has a trivial CFG that includes all possible productions.

The default settings will give grammars with 10 nonterminals, about 10,000 terminals, a length distribution that is zero-truncated Poisson with expected length about 5, and a fat-tailed (Zipfian) lexical distribution generated using a lognormal distribution.

    python sample_grammar.py /tmp/grammar1.pcfg
    python sample_fullgrammar.py /tmp/grammar1.pcfg
    python sample_corpus.py /tmp/grammar1.pcfg /tmp/grammar1.samples
    python sample_corpus.py --yieldonly --omitprobs /tmp/grammar1.pcfg /tmp/grammar1.samples
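
To make the two default distributions mentioned above more concrete, here is a minimal standalone sketch. It is not taken from this repository's code. It assumes that "expected length about 5" means the mean of a Poisson variable conditioned on being at least 1, and that lexical probabilities can be obtained by normalising independent lognormal draws. All function names and the `sigma` parameter are hypothetical and chosen only for illustration.

```python
import math
import random


def truncated_poisson_mean(lam):
    """Mean of a Poisson(lam) variable conditioned on being at least 1."""
    # E[X | X > 0] = lam / (1 - exp(-lam)); expm1 keeps this accurate for small lam.
    return lam / -math.expm1(-lam)


def solve_lambda(target_mean, tol=1e-12):
    """Find lam whose zero-truncated Poisson mean equals target_mean (bisection)."""
    if target_mean <= 1.0:
        raise ValueError("target_mean must exceed 1")
    # The truncated mean is increasing in lam and always larger than lam,
    # so the solution lies strictly between 0 and target_mean.
    lo, hi = 1e-9, target_mean
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if truncated_poisson_mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sample_poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_length(lam, rng):
    """Draw a string length from the zero-truncated Poisson by rejecting zeros."""
    while True:
        n = sample_poisson(lam, rng)
        if n > 0:
            return n


def lognormal_lexical_probs(n_terminals, sigma, rng):
    """Normalise lognormal weights into a fat-tailed distribution over terminals.

    sigma is a hypothetical spread parameter, not a documented option of this repository.
    """
    weights = [rng.lognormvariate(0.0, sigma) for _ in range(n_terminals)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    rng = random.Random(0)
    lam = solve_lambda(5.0)
    lengths = [sample_length(lam, rng) for _ in range(10000)]
    print("lambda:", lam, "empirical mean length:", sum(lengths) / len(lengths))
    probs = lognormal_lexical_probs(10000, 1.0, rng)
    print("total probability:", sum(probs), "largest word probability:", max(probs))
```

The rejection step in `sample_length` is cheap here because, for a rate close to 5, a plain Poisson draw is zero less than one percent of the time.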