Commit

corrected typos in docstr and remove non :param style input/output description.
alvations committed Aug 14, 2014
1 parent 6ff15b8 commit 598b936
Showing 1 changed file with 2 additions and 17 deletions.
19 changes: 2 additions & 17 deletions nltk/align/phrase_based.py
@@ -13,9 +13,9 @@ def phrase_extraction(srctext, trgtext, alignment):
a word-aligned sentence pair.
The idea is to loop over all possible source language (e) phrases and find
-the minimal foregin phrase (f) that matches each of them. Matching is done
+the minimal foreign phrase (f) that matches each of them. Matching is done
by identifying all alignment points for the source phrase and finding the
-shortest foreign phrase that includes all hte foreign counterparts for the
+shortest foreign phrase that includes all the foreign counterparts for the
source words.
In short, a phrase alignment has to
@@ -36,21 +36,6 @@ def phrase_extraction(srctext, trgtext, alignment):
∃e i ∈ e ̄ , f j ∈ f ̄ s.t. (e i , f j ) ∈ A
-[in]:
-*srctext* is the tokenized source sentence string.
-*trgtext* is the tokenized target sentence string.
-*alignment* is the word alignment outputs in pharaoh format
-[out]:
-*bp* is the phrases extracted from the algorithm, it's made up of a tuple
-that stores:
-( (src_from, src_to), (trg_from, trg_to), src_phrase, target_phrase )
-(i) the position of the source phrase
-(ii) the position of the target phrase
-(iii) the source phrase
-(iv) the target phrase
>>> srctext = "michael assumes that he will stay in the house"
>>> trgtext = "michael geht davon aus , dass er im haus bleibt"
>>> alignment = [(0,0), (1,1), (1,2), (1,3), (2,5), (3,6), (4,9),
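
For readers who want to see the matching rule from the docstring in action, here is a minimal sketch. It is not the code in nltk/align/phrase_based.py: the function name `extract_phrases`, the `max_len` parameter, and the half-open `(from, to)` position convention are assumptions made for illustration. It also keeps only the minimal target span for each source span and does not extend over unaligned target words.

```python
def extract_phrases(srctext, trgtext, alignment, max_len=3):
    """Illustrative consistency-based phrase extraction (hypothetical helper).

    srctext, trgtext: whitespace-tokenized sentence strings.
    alignment: list of (source_index, target_index) tuples.
    Returns a set of ((src_from, src_to), (trg_from, trg_to),
    src_phrase, trg_phrase) tuples; the end positions are exclusive
    (an assumption, the commit does not state the convention).
    """
    src = srctext.split()
    trg = trgtext.split()
    phrases = set()
    for e_start in range(len(src)):
        for e_end in range(e_start, min(e_start + max_len, len(src))):
            # Target positions aligned to any word in the source span.
            linked = [f for (e, f) in alignment if e_start <= e <= e_end]
            if not linked:
                continue  # a phrase pair needs at least one alignment point
            # Minimal target span covering all aligned counterparts.
            f_start, f_end = min(linked), max(linked)
            # Consistency check: no word inside the target span may be
            # aligned to a source word outside the source span.
            inconsistent = any(
                f_start <= f <= f_end and not (e_start <= e <= e_end)
                for (e, f) in alignment
            )
            if inconsistent:
                continue
            phrases.add((
                (e_start, e_end + 1),
                (f_start, f_end + 1),
                " ".join(src[e_start:e_end + 1]),
                " ".join(trg[f_start:f_end + 1]),
            ))
    return phrases


# Toy input (made up for this sketch, not the doctest from the commit).
toy = extract_phrases("he will stay", "er bleibt", [(0, 0), (1, 1), (2, 1)])
for item in sorted(toy):
    print(item)
```

On the toy input, "will" and "stay" both align to "bleibt", so neither source word can be extracted alone. Only "he", "will stay", and the full sentence yield consistent phrase pairs.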
