Stopped fixed-case tagging of "K" and "N" (acl-org#960)
Stopped fixed-case tagging of "K" and "N" because it caused expressions like "N-gram" and "K-means" to be marked as fixed-case.

Co-authored-by: Marc Schulder <>
marcschulder authored Aug 27, 2020
1 parent 9fb3f10 commit e4a1eff
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion bin/fixedcase/README.md
@@ -18,7 +18,7 @@ Fixed-caseness is determined by this decision list:
 4. Any word with a capital letter in a non-initial position (e.g.,
 "TextTiling", "QA") is fixed-case.
 5. The French contracted forms "L’" and "D’" are not fixed-case.
-6. Any tokenized word consisting of a single uppercase letter other than "A",
+6. Any tokenized word consisting of a single uppercase letter other than "A", "K" or "N",
 or a single uppercase letter plus ".", is fixed-case.
 7. If one of a short list of adjectival modifiers including "North" and "Modern"
 precedes a fixed-case word, optionally separated by a hyphen,
2 changes: 1 addition & 1 deletion bin/fixedcase/common.py
@@ -49,7 +49,7 @@ def fixedcase_word(w, truelist=None):
     if any(c.isupper() for c in w[1:]):
         # tokenized word with noninitial uppercase
         return True
-    if len(w) == 1 and w.isupper() and w != 'A':
+    if len(w) == 1 and w.isupper() and w not in {'A', 'K', 'N'}:
         # single uppercase letter
         return True
     if len(w) == 2 and w[1] == '.' and w[0].isupper():
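For reviewers who want to check the effect of the change in isolation, here is a minimal sketch of the three checks visible in this hunk. It is not the full `fixedcase_word()` from `bin/fixedcase/common.py`: the truelist lookup and the other decision-list rules are left out, and the names `is_fixedcase_token` and `EXEMPT_SINGLE_LETTERS` are hypothetical, introduced only for illustration.

```python
# Illustrative sketch only: a reduced stand-in for the checks touched by this
# commit, NOT the real fixedcase_word() (no truelist, no other rules).

# Hypothetical constant name; the commit inlines this set in the condition.
EXEMPT_SINGLE_LETTERS = {"A", "K", "N"}


def is_fixedcase_token(w):
    """Return True if token w is fixed-case under the three checks below."""
    if any(c.isupper() for c in w[1:]):
        # uppercase letter in a non-initial position, e.g. "QA"
        return True
    if len(w) == 1 and w.isupper() and w not in EXEMPT_SINGLE_LETTERS:
        # single uppercase letter other than "A", "K" or "N"
        return True
    if len(w) == 2 and w[1] == "." and w[0].isupper():
        # single uppercase letter followed by "."
        return True
    return False


if __name__ == "__main__":
    for token in ["K", "N", "A", "X", "K.", "QA", "means"]:
        print(token, is_fixedcase_token(token))
```

Under these checks, "K" and "N" on their own are no longer tagged, while "K." and "X" still are. Whether "K-means" reaches this function as a separate "K" token depends on the tokenizer, which is assumed here and not shown in the diff.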