Add python module; supports LM querying
vchahun committed Jul 3, 2012
1 parent e3b5c55 commit 907da9b
Showing 7 changed files with 3,329 additions and 0 deletions.
3 changes: 3 additions & 0 deletions README.md
@@ -0,0 +1,3 @@
# python-kenlm

A python interface to [kenlm](http://kheafield.com/code/kenlm/)
25 changes: 25 additions & 0 deletions examples/example.py
@@ -0,0 +1,25 @@
import os
import kenlm

LM = os.path.join(os.path.dirname(__file__), 'mini.klm')
model = kenlm.LanguageModel(LM)
print '%d-gram model' % model.order

sentence = u'language modeling is fun .'
print sentence, model.score(sentence)

# Check that total full score = direct score
def score(s):
    return sum(prob for prob, _ in model.full_scores(s))

assert (abs(score(sentence) - model.score(sentence)) < 1e-3)

# Show scores and n-gram matches
words = ['<s>'] + sentence.split() + ['</s>']
for i, (prob, length) in enumerate(model.full_scores(sentence)):
    print prob, length, ':', ' '.join(words[i+2-length:i+2])

# Find out-of-vocabulary words
for w in words:
    if not w in model:
        print '"%s" is an OOV' % w
Binary file added examples/mini.klm
Binary file not shown.
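
The slice `words[i+2-length:i+2]` in `examples/example.py` is compact, so here is a minimal standalone sketch of how it appears to recover each matched n-gram. It does not import `kenlm`. The helper `matched_ngrams` and the `hypothetical_scores` list are illustrative assumptions and not part of this commit. The numbers are made up and only mimic the `(prob, length)` pairs that the example reads from `model.full_scores`.

```python
def matched_ngrams(sentence, scores):
    """Pair each (prob, length) entry with the n-gram it matched.

    Hypothetical helper: `scores` stands in for the output of
    model.full_scores(sentence) as used in examples/example.py.
    """
    words = ['<s>'] + sentence.split() + ['</s>']
    result = []
    for i, (prob, length) in enumerate(scores):
        # Entry i scores words[i + 1] (index 0 is '<s>'), so the matched
        # n-gram is the `length` words ending at that position.
        result.append((prob, ' '.join(words[i + 2 - length:i + 2])))
    return result


# Made-up scores: one entry per word of the sentence, plus one for '</s>'.
hypothetical_scores = [(-1.5, 2), (-2.0, 1), (-0.5, 2),
                       (-0.7, 3), (-0.3, 1), (-0.1, 2)]

for prob, ngram in matched_ngrams('language modeling is fun .',
                                  hypothetical_scores):
    print('%s : %s' % (prob, ngram))
```

With these made-up lengths, the first entry maps to `<s> language` and the last to `. </s>`, which is why the example pads the word list with both sentence markers before slicing.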