Polite

Polite is a lite version of Polo. Its main function is to convert MALLET's topic model output into tables that can be used in a unified data model.

Instructions

Install MALLET.

Make sure Java is installed on your computer.
Download MALLET from the MALLET website and follow the installation instructions.

Get a corpus file to model.

See the MALLET website for instructions on the format of this file. Essentially, it's a CSV with two or three items per row: a document ID, an optional label, and the document itself. In other words, it roughly conforms to the F1 form of text data.
There is a sample corpus to start with, in the /corpus directory. You can add your own.
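As a rough illustration of the row layout described above, the following Python sketch formats documents as one row each: ID, label, then text. The helpers `corpus_line` and `write_corpus` are hypothetical and not part of Polite. The sketch assumes IDs and labels contain no whitespace or commas and that each document must sit on a single line; check the MALLET documentation and the sample corpus for the exact format your import config expects.

```python
import re
from pathlib import Path


def corpus_line(doc_id, label, text):
    """Format one document as a single 'ID,label,text' row."""
    # Assumption: IDs and labels must be single tokens without separators.
    for field in (doc_id, label):
        if re.search(r"[\s,]", field):
            raise ValueError(f"field contains whitespace or a comma: {field!r}")
    # Collapse newlines and runs of whitespace so the document stays on one row.
    flat = re.sub(r"\s+", " ", text).strip()
    return f"{doc_id},{label},{flat}"


def write_corpus(path, docs):
    """Write (doc_id, label, text) tuples to `path`, one row per document."""
    lines = [corpus_line(doc_id, label, text) for doc_id, label, text in docs]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines
```

For example, `write_corpus("corpus/my-corpus.csv", [("doc1", "news", "Some text.")])` would write one row to a hypothetical file in the /corpus directory.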

Create the config files. Samples are included; these can be used as templates to work with other corpora and to generate other models.

config-import-file.txt
config-train-topics.txt

Create any new directories that are referenced in the config files, such as the output directory for the files MALLET generates.
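Creating the referenced directories can be scripted. The sketch below is only an illustration: it assumes the config files consist of simple `key = value` lines, and that every option whose name starts with `output` holds a file path whose parent directory must exist. Both are assumptions, and the helper names are hypothetical, so compare against the sample config files before relying on it.

```python
from pathlib import Path


def parse_config(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


def output_dirs(options, prefix="output"):
    """Return the parent directories of path-valued output options."""
    # Assumption: options named 'output...' hold file paths.
    return sorted(
        {Path(value).parent.as_posix()
         for key, value in options.items()
         if key.startswith(prefix)}
    )


def make_dirs(config_path):
    """Create every output directory referenced in one config file."""
    options = parse_config(Path(config_path).read_text(encoding="utf-8"))
    for directory in output_dirs(options):
        Path(directory).mkdir(parents=True, exist_ok=True)
```

Calling `make_dirs("config-train-topics.txt")` before running MALLET would then create any missing output directories for that model.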
Run MALLET to import the corpus, that is, to convert the CSV into the internal file format that MALLET works with. You need to do this only once per corpus. For example, using the example config:

mallet import-file --config config-import-file.txt

Run MALLET to train a topic model. Do this for as many models as you want to create, using a different config file for each. Be sure that each config file uses a different output directory. For example, using the example config:

mallet train-topics --config config-train-topics.txt

Run mktables.py to create tables, with arguments for the configuration file and the output directory for the tables.

python mktables.py <CONFIGFILE> <TABLEDIR>
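The import, training, and table-building steps can be chained from a small script. The sketch below builds the command lines shown above and runs them in order. It is not taken from the repository's own scripts; the function names are hypothetical, and it assumes that the `<CONFIGFILE>` passed to mktables.py is the train-topics config for the model, which is not stated explicitly here.

```python
import subprocess


def build_commands(import_config, models):
    """Build the command lines for one corpus and any number of models.

    `models` is a list of (train_config, table_dir) pairs.
    """
    # The corpus is imported only once.
    commands = [["mallet", "import-file", "--config", import_config]]
    for train_config, table_dir in models:
        commands.append(["mallet", "train-topics", "--config", train_config])
        # Assumption: mktables.py takes the train-topics config as CONFIGFILE.
        commands.append(["python", "mktables.py", train_config, table_dir])
    return commands


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

For the sample configs, `run_all(build_commands("config-import-file.txt", [("config-train-topics.txt", "tables")]))` would run the three steps in sequence, provided `mallet` is on the PATH.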

