lang-uk/choppa

choppa

Partial Python port of the Java SRX segmenter originally written by Jarek Lipski (aka loomchild).

In a nutshell, it lets you split text into sentences. Because it is rule-based, you can use it to chop up any kind of text, not just sentences.

It ships with segment.srx, a set of segmentation rules for different languages crafted by the great LanguageTool team.
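
To give a feel for what "rule-based" means here, below is a minimal, self-contained sketch of SRX-style segmentation using only the standard library. It is not choppa's API and does not read segment.srx. The `Rule` type, the `RULES` list, and the `segment` function are hypothetical names made up for this illustration. The idea it shows: each rule has a "before break" and an "after break" regex plus a break/no-break flag, and at every position the first matching rule decides, so exceptions (such as abbreviations) are listed ahead of the general break rules.

```python
import re
from typing import List, NamedTuple


class Rule(NamedTuple):
    """Hypothetical stand-in for one SRX rule (not choppa's own class)."""
    is_break: bool  # True: split here; False: exception, never split here
    before: str     # regex that must match the text ending at the position
    after: str      # regex that must match the text starting at the position


# Illustrative rules only; real rule sets such as segment.srx are far richer.
RULES = [
    Rule(False, r"\b(?:Mr|Mrs|Dr)\.", r"\s"),  # no break after common titles
    Rule(True, r"[.!?]", r"\s"),               # break after sentence punctuation
]


def segment(text: str, rules: List[Rule] = RULES) -> List[str]:
    """Split text at every position where the first matching rule is a break rule."""
    segments = []
    start = 0
    for pos in range(1, len(text)):
        for rule in rules:
            before_ok = re.search(f"(?:{rule.before})\\Z", text[:pos])
            after_ok = re.match(rule.after, text[pos:])
            if before_ok and after_ok:
                if rule.is_break:
                    segments.append(text[start:pos])
                    start = pos
                break  # first matching rule wins, whether break or exception
    segments.append(text[start:])
    return [s.strip() for s in segments if s.strip()]


if __name__ == "__main__":
    for sentence in segment("Dr. Smith arrived. He was late! Why?"):
        print(sentence)
```

This sketch rescans the text at every position, so it is only suitable for short inputs; it is meant to show the rule cascade, not to be efficient.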

Copyrights and kudos

  • Python port: Dmytro Chaplynskyi
  • Original Java implementation: Jarek Lipski
  • Segmentation rules: Daniel Naber, Jaume Ortolà et al. (153 contributors!)
  • Special thanks to Andriy Rysin
