Correct commonly misspelled English words... quickly.
This corrects commonly misspelled English words in computer source code and other text-based formats (.txt, .md, etc.).
It is designed to run quickly so it can be used as a pre-commit hook with minimal burden on the developer.
It does not work with binary formats (e.g. Word).
It is not a complete spell-checking program nor a grammar checker.
Some other misspelling correctors:
- https://github.com/vlajos/misspell_fixer
- https://github.com/lyda/misspell-check
- https://github.com/lucasdemarchi
They all work but had problems that prevented me from using them at scale:
- slow: all of the above check one misspelling at a time (i.e. linearly) using regexps
- not MIT/Apache2 licensed (or equivalent)
- have dependencies that don't work for me (python3, bash, linux sed, etc)
That said, they might be perfect for you, and many have more features than this project!
This project is easily 100x to 1000x faster than the tools above. You should be able to check and correct 1000 files in under 250ms.
You need golang 1.5 to compile this, but the resulting binary has no dependencies. If people want precompiled binaries for various platforms, let me know.
The word list is currently pulled from Wikipedia and then edited to remove false positives.
This uses the mighty power of golang's strings.Replacer, which is an implementation or variation of the Aho–Corasick algorithm. This makes multiple substring matches simultaneously.
In addition, this uses multiple CPU cores to work on multiple files.
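As a rough illustration of the two ideas above, here is a minimal sketch, not the project's actual code. The three-entry word list and the `correctAll` helper are made up for the example; only `strings.NewReplacer` and `sync.WaitGroup` are real standard-library APIs. It applies one replacer to several inputs concurrently.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// replacer holds a tiny, hypothetical word list as old/new pairs.
// All pairs are matched in a single pass over the input.
var replacer = strings.NewReplacer(
	"langauge", "language",
	"recieve", "receive",
	"seperate", "separate",
)

// correctAll corrects every input in its own goroutine.
// A *strings.Replacer is safe for concurrent use, and each goroutine
// writes to a distinct slice element, so no locking is needed.
func correctAll(inputs []string) []string {
	out := make([]string, len(inputs))
	var wg sync.WaitGroup
	for i, text := range inputs {
		wg.Add(1)
		go func(i int, text string) {
			defer wg.Done()
			out[i] = replacer.Replace(text)
		}(i, text)
	}
	wg.Wait()
	return out
}

func main() {
	docs := []string{"a seperate langauge", "please recieve this"}
	for _, s := range correctAll(docs) {
		fmt.Println(s)
	}
}
```

In this sketch the strings stand in for file contents; a real tool would read each file, and would probably bound the number of goroutines rather than start one per input.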
Unlike the other projects, this doesn't know what a "word" is. There may be more false positives and false negatives due to this. On the other hand, it sometimes catches things others don't.
Either way, please file bugs and we'll fix them!
Since it operates in parallel to make corrections, it can be non-obvious to determine exactly what word was corrected.
Run using the -debug flag on the file you want. It should then print what word it is trying to correct. Then file a bug describing the problem. Thanks!
Yes! If the file ends in .go, then misspell will only check spelling in comments.
If you want to force a file to be checked as Go source, use -source=go on the command line. Conversely, you can check a Go source file as if it were pure text by using -source=text.
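One way to restrict corrections to comments is to tokenize the file and touch only comment tokens. The sketch below does this with the standard `go/scanner` package. It is an assumption about how such a filter could be built, not a description of how misspell implements it, and the one-entry word list and the `correctComments` name are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"go/scanner"
	"go/token"
	"strings"
)

// Hypothetical one-entry word list for the example.
var replacer = strings.NewReplacer("teh", "the")

// correctComments applies the replacer to comment tokens only and copies
// every other byte of the source through unchanged.
// It assumes the source contains no carriage returns, because the scanner
// strips them from comment text and the offsets below would drift.
func correctComments(src []byte) []byte {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))

	var s scanner.Scanner
	// A nil error handler means scan errors are ignored in this sketch.
	s.Init(file, src, nil, scanner.ScanComments)

	var out bytes.Buffer
	last := 0 // offset of the first byte not yet copied to out
	for {
		pos, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		if tok != token.COMMENT {
			continue
		}
		start := file.Offset(pos)
		out.Write(src[last:start])             // code before the comment, untouched
		out.WriteString(replacer.Replace(lit)) // corrected comment text
		last = start + len(lit)
	}
	out.Write(src[last:]) // remaining code after the last comment
	return out.Bytes()
}

func main() {
	src := []byte("package main\n\n// teh main function\nvar teh = 1\n")
	fmt.Print(string(correctComments(src)))
}
```

Here the comment is corrected while the identifier `teh` in the code is left alone.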
The matching function is case-sensitive, so variable names that are multiple words in either all-upper or all-lower case can sometimes cause false positives. For instance, a variable named bodyreader could trigger a false positive, since yrea is in the middle and could be corrected to year. Other problems happen if the variable name uses an English contraction that should use an apostrophe. The best way of fixing this is to use the Effective Go naming conventions and use camelCase for variable names. You can check your code using golint.
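The bodyreader case can be reproduced with a small sketch. It assumes a word list with the single rule "yrea" to "year", as described above; the `correct` helper is made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical one-rule word list: the typo "yrea" is corrected to "year".
var r = strings.NewReplacer("yrea", "year")

// correct applies the rule to any substring, with no notion of word boundaries.
func correct(s string) string {
	return r.Replace(s)
}

func main() {
	// All-lowercase name: "yrea" occurs in the middle, so it is rewritten.
	fmt.Println(correct("bodyreader")) // prints "bodyearder"

	// camelCase name: "yRea" does not match the case-sensitive rule.
	fmt.Println(correct("bodyReader")) // prints "bodyReader"
}
```

Because matching is case-sensitive, the capital letter at the word boundary is what keeps the camelCase name from matching.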
gometalinter runs multiple golang linters, and it works well with misspell too.
After go get -u github.com/client9/misspell you need to add it, then enable it, like so:
gometalinter --disable-all \
--linter='misspell:misspell ./*.go:PATH:LINE:MESSAGE' --enable=misspell \
./...
Using the -f template flag you can pass in a golang text template to format the output.
The built-in template uses everything, including the js function to escape the original text.
{{ .Filename }}:{{ .Line }} corrected "{{ js .Original }}" to "{{ js .Corrected }}"
To just print probable misspellings:
-f '{{ .Original }}'
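To see how these templates behave, the sketch below renders both of them with Go's `text/template` package. The `diff` struct and the `render` helper are hypothetical stand-ins for whatever misspell passes to the template internally. Only the field names (Filename, Line, Original, Corrected) and the built-in `js` function come from the templates above.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"text/template"
)

// diff is a hypothetical record of one correction. Fields must be
// exported so the template can read them.
type diff struct {
	Filename  string
	Line      int
	Original  string
	Corrected string
}

// defaultFormat is the built-in template quoted in this README.
const defaultFormat = `{{ .Filename }}:{{ .Line }} corrected "{{ js .Original }}" to "{{ js .Corrected }}"`

// render parses the format string as a text template and executes it on d.
func render(format string, d diff) (string, error) {
	t, err := template.New("out").Parse(format)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, d); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	d := diff{Filename: "README.md", Line: 12, Original: "langauge", Corrected: "language"}
	for _, format := range []string{defaultFormat, "{{ .Original }}"} {
		s, err := render(format, d)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Println(s)
	}
}
```

With this sample record, the first line printed is `README.md:12 corrected "langauge" to "language"` and the second is just `langauge`.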
You can run misspell recursively using the following notation:
misspell directory/**/*
or
find . -type f | xargs misspell