- Download and install the software needed
  - Download and install Senna, http://ronan.collobert.com/senna/
  - Download and install CoreNLP 3.6, http://stanfordnlp.github.io/CoreNLP/history.html (needs Java 8)
  - Download and install pyFIM, http://www.borgelt.net/pyfim.html
  - Download `fim.so` and place it in Python's `dist-packages` directory
  - Run `pip install scipy numpy python-Levenshtein`
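To confirm that the Python side of the installation worked, a short check like the one below can help. It is only a sketch and assumes Python 3.4 or later; run it with the same interpreter you intend to use for the project. The import names `fim` (from pyFIM's `fim.so`) and `Levenshtein` (from `python-Levenshtein`) are what those packages install; the helper names `missing_modules` and `package_directory` are made up for this example. It does not check Senna, CoreNLP, or Java.

```python
import importlib.util
import sysconfig

# Import names for the Python dependencies listed above. "fim" is the module
# provided by pyFIM's fim.so; "Levenshtein" is installed by python-Levenshtein.
REQUIRED_MODULES = ["fim", "scipy", "numpy", "Levenshtein"]


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def package_directory():
    """Return the directory where this interpreter installs compiled packages."""
    return sysconfig.get_paths()["platlib"]


if __name__ == "__main__":
    print("Package directory:", package_directory())
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

On Debian or Ubuntu system Pythons the printed directory is typically a `dist-packages` folder; elsewhere it is usually `site-packages`. If `fim` is reported as missing, `fim.so` is probably not in a directory on the interpreter's import path.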
- Download data and create environment
  - Make `create_data_folder.sh` executable: `chmod +x create_data_folder.sh`
  - Run `./create_data_folder.sh path/to/data/folder` (no trailing slash!)
  - Modify `./enlp/settings.py` accordingly
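Because the data folder path must not end with a slash, it may be convenient to normalize it before passing it to `create_data_folder.sh` or copying it into `./enlp/settings.py`. The following is a minimal sketch; `normalize_data_folder` is a hypothetical helper and not part of this repository.

```python
import os


def normalize_data_folder(path):
    """Return an absolute version of path with no trailing separator."""
    # expanduser resolves a leading "~"; abspath normalizes the path, which
    # also drops a trailing slash (the filesystem root is the only exception).
    return os.path.abspath(os.path.expanduser(path))


if __name__ == "__main__":
    print(normalize_data_folder("path/to/data/folder/"))
```

The printed value can be used as the argument to `./create_data_folder.sh`.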
- Pre-process datasets: `python process_corpus.py`
- Run (this will take several hours, depending on the number of cores available): `python run.py path/to/store/output/json/files`
Check `process_corpus.py --help` and `run.py --help` for more details on how to run them.
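The pre-processing and run steps above can also be chained from a small driver script, so that the long run only starts if pre-processing succeeded. This is a sketch under assumptions: `build_commands` and `run_pipeline` are hypothetical names, the script is expected to be started from the repository root, and creating the output folder in advance is a precaution (`run.py` may already do this itself).

```python
import os
import subprocess


def build_commands(output_dir):
    """Return the two commands from the steps above, in execution order."""
    return [
        ["python", "process_corpus.py"],
        ["python", "run.py", output_dir],
    ]


def run_pipeline(output_dir):
    """Create output_dir if needed, then run both steps, stopping on failure."""
    # Assumption: creating the folder up front is harmless for run.py.
    os.makedirs(output_dir, exist_ok=True)
    # The run takes hours depending on core count, so report it first.
    print("Cores available:", os.cpu_count())
    for command in build_commands(output_dir):
        # check_call raises CalledProcessError on a non-zero exit status.
        subprocess.check_call(command)
```

Calling `run_pipeline("path/to/store/output/json/files")` runs both steps in order.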