This is a project for Chinese tokenization (word segmentation).
-
Usage of segment_sentences:
python segment_sentences.py [options] [arg]
-
Options:
-h, --help            show this help message and exit
-d, --debug           print debug information during segmentation (off by default)
-f FILE, --file=FILE  segment sentences from the specified file
-i, --interactive     enter interactive mode
-o OUT, --out=OUT     write the segmentation result to the specified file
-s SEPARATOR, --separator=SEPARATOR
                      specify the separator used in the segmentation result
-t TRAIN, --train=TRAIN
                      train the algorithm on the specified training set
-v, --version         output version information and exit
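Putting the options above together, a typical invocation might look like the sketch below. The file names (train.txt, input.txt, segmented.txt) are placeholders, not files shipped with the project:

```shell
# Train on a corpus, then segment the sentences in input.txt and
# write the result to segmented.txt, joining tokens with "/".
# File names are placeholders for illustration only.
python segment_sentences.py -t train.txt -f input.txt -o segmented.txt -s "/"

# Or enter interactive mode with debug output enabled:
python segment_sentences.py -i -d
```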