Correct commonly misspelled English words in source files

client9/misspell

Correct commonly misspelled English words... quickly.

FAQ

What problem does this solve?

This corrects commonly misspelled English words in computer source code, and other text-based formats (.txt, .md, etc).

It is designed to run quickly so it can be used as a pre-commit hook with minimal burden on the developer.

It does not work with binary formats (e.g. Word documents).

It is not a complete spell-checking program nor a grammar checker.

What are other misspelling correctors and what's wrong with them?

Some other misspelling correctors:

They all work but had problems that prevented me from using them at scale:

  • slow: all of the above check one misspelling at a time (i.e. linear) using regexps
  • not MIT/Apache2 licensed (or equivalent)
  • have dependencies that don't work for me (python3, bash, linux sed, etc)

That said, they might be perfect for you, and many have more features than this project!

How much faster is this project?

Easily 100x to 1000x faster. You should be able to check and correct 1000 files in under 250ms.

What license is this?

MIT

What are the dependencies?

You need golang 1.5 to compile this, but the resulting binary has no dependencies. If people want precompiled binaries for various platforms, let me know.

Where do the word lists come from?

It's currently pulled from Wikipedia and then edited to remove false positives.

Why is this so fast?

This uses the mighty power of golang's strings.Replacer, which is an implementation or variation of the Aho–Corasick algorithm. This lets it match many substrings simultaneously in a single pass over the text.
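
As a rough illustration of the idea (not misspell's actual code), here is a minimal sketch that loads a few misspelling/correction pairs into one strings.Replacer and applies them all in one call. The three-word list and the newCorrector name are made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// newCorrector builds a single Replacer from misspelling/correction pairs.
// The word list is a tiny illustrative sample, not misspell's real list.
func newCorrector() *strings.Replacer {
	return strings.NewReplacer(
		"recieve", "receive",
		"seperate", "separate",
		"occured", "occurred",
	)
}

func main() {
	r := newCorrector()
	// All pairs are matched in one pass; there is no loop over the word list.
	fmt.Println(r.Replace("We recieve seperate reports when an error occured."))
}
```

Adding more pairs makes the Replacer bigger, but the text is still scanned once rather than once per misspelling.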

In addition, this uses multiple CPU cores to work on multiple files at the same time.

What problems does it have?

Unlike the other projects, this doesn't know what a "word" is. There may be more false positives and false negatives due to this. On the other hand, it sometimes catches things others don't.

Either way, please file bugs and we'll fix them!

Since it operates in parallel to make corrections, it can be non-obvious to determine exactly what word was corrected.

It's making mistakes. How can I debug?

Run it with the -debug flag on the file in question. It should then print which word it is trying to correct. Then file a bug describing the problem. Thanks!

Are there special rules for golang source files?

Yes! If the file ends in .go, then misspell will only check spelling in comments.

If you want to force a file to be checked as golang source, use -source=go on the command line. Conversely, you can check a golang source file as if it were plain text by using -source=text.

Why is it making mistakes or missing items in golang files?

The matching function is case-sensitive, so variable names made of multiple words written in all-upper or all-lower case can sometimes cause false positives. For instance, a variable named bodyreader could trigger a false positive, since the yrea in the middle could be corrected to year. Other problems happen if the variable name uses an English contraction that should have an apostrophe. The best way of fixing this is to follow the Effective Go naming conventions and use camelCase for variable names. You can check your code using golint.
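
The bodyreader case is easy to reproduce with a single-rule Replacer. The sketch below uses only the yrea to year pair mentioned above; the substringCorrect helper is made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// substringCorrect mimics a word-unaware, case-sensitive corrector that
// knows only the single rule "yrea" -> "year".
func substringCorrect(s string) string {
	return strings.NewReplacer("yrea", "year").Replace(s)
}

func main() {
	for _, name := range []string{"bodyreader", "bodyReader"} {
		fmt.Printf("%s -> %s\n", name, substringCorrect(name))
	}
	// Prints:
	// bodyreader -> bodyearder
	// bodyReader -> bodyReader
}
```

The camelCase spelling survives because yRea does not match yrea byte for byte.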

Does this work with gometalinter?

gometalinter runs multiple golang linters, and it works well with misspell too.

After go get -u github.com/client9/misspell you need to add it, then enable it, like so:

gometalinter --disable-all \
   --linter='misspell:misspell ./*.go:PATH:LINE:MESSAGE' --enable=misspell \
   ./...

How can I change the output format?

Using the -f template flag you can pass in a golang text template to format the output.

The built-in template uses everything, including the js function to escape the original text.

{{ .Filename }}:{{ .Line }} corrected "{{ js .Original }}" to "{{ js .Corrected }}"

To just print probable misspellings:

-f '{{ .Original }}'
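
If you want to try a template before passing it to -f, a small program using the standard text/template package can render it against sample data. This is only a sketch: the diff struct and render function are hypothetical, with field names copied from the templates above, and misspell's internal types may differ.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// diff is a hypothetical record whose field names mirror those used in
// the templates above.
type diff struct {
	Filename  string
	Line      int
	Original  string
	Corrected string
}

// render formats one correction using the given template text.
func render(format string, d diff) (string, error) {
	t, err := template.New("out").Parse(format)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	if err := t.Execute(&b, d); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	d := diff{Filename: "README.md", Line: 12, Original: "recieve", Corrected: "receive"}
	formats := []string{
		`{{ .Filename }}:{{ .Line }} corrected "{{ js .Original }}" to "{{ js .Corrected }}"`,
		`{{ .Original }}`,
	}
	for _, f := range formats {
		s, err := render(f, d)
		if err != nil {
			fmt.Println("template error:", err)
			continue
		}
		fmt.Println(s)
	}
}
```

The js function is one of text/template's predefined functions, so no extra setup is needed to use it.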

Check an entire folder recursively

You can run misspell recursively using the following notation:

misspell directory/**/*

or

find . -type f -print0 | xargs -0 misspell
