Skip to content

Commit

Permalink
Bug 1137544 - Update the en-US dictionary based on the SCOWL 2015.04.…
Browse files Browse the repository at this point in the history
…24 wordlist using the new scripts
  • Loading branch information
ehsan committed May 4, 2015
1 parent 74990d1 commit 71fdc7e
Show file tree
Hide file tree
Showing 8 changed files with 228 additions and 50 deletions.
13 changes: 9 additions & 4 deletions extensions/spellcheck/locales/en-US/hunspell/README_en_US.txt
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
en_US-mozilla Hunspell Dictionary
Generated from SCOWL Version 2015.02.15
Thu Apr 23 19:31:52 EDT 2015
Generated from SCOWL Version 2015.04.24
Wed Apr 29 20:51:35 EDT 2015

http://wordlist.sourceforge.net

Expand Down Expand Up @@ -57,12 +57,17 @@ [email protected] or to the wordlist-devel mailing lists
have specific issues with any of these dictionaries please file a bug
report at https://github.com/kevina/wordlist/issues.

IMPORTANT CHANGES FROM 2015.02.15:

The dictionaries are now in UTF-8 format instead of ISO-8859-1. This
was required to handle smart quotes correctly.

ADDITIONAL NOTES:

The NOSUGGEST flag was added to certain taboo words. While I made an
honest attempt to flag the strongest taboo words with the NOSUGGEST
flag, I MAKE NO GUARANTEE THAT I FLAGGED EVERY POSSIBLE TABOO WORD.
The list was originally derived from N�meth L�szl�, however I removed
The list was originally derived from Németh László, however I removed
some words which, while being considered taboo by some dictionaries,
are not really considered swear words in today's society.

Expand Down Expand Up @@ -306,4 +311,4 @@ from the Ispell distribution they are under the Ispell copyright:
ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.

Build Date: Thu Apr 23 19:31:52 EDT 2015
Build Date: Wed Apr 29 20:51:35 EDT 2015
Original file line number Diff line number Diff line change
Expand Up @@ -2148,7 +2148,6 @@ Chaddy
Chaddy's
Chaim
Chaim's
Chalmers
Chancey
Chancey's
Chanda
Expand Down Expand Up @@ -3151,8 +3150,6 @@ Der
Der's
Derk
Derk's
Dermot
Dermot's
Derrek
Derrek's
Derrik
Expand Down Expand Up @@ -5325,8 +5322,6 @@ Helsa's
Helvetica
Helyn
Helyn's
Hendrick
Hendrick's
Hendrik
Hendrik's
Hendrika
Expand All @@ -5339,8 +5334,6 @@ Henrieta
Henrieta's
Henriette
Henriette's
Henrik
Henrik's
Henryetta
Henryetta's
Hephzibah
Expand Down Expand Up @@ -5543,8 +5536,6 @@ Hyman's
Hymie
Hynda
Hynda's
IKEA
IKEA's
IMDb
IMDb's
IMDbPro
Expand Down Expand Up @@ -12097,7 +12088,6 @@ antivirus's
antiviruses
apatosaurus
apatosaurus's
apoptosis
arXiv
arXiv's
archaeoastronomy
Expand Down Expand Up @@ -12517,8 +12507,6 @@ intrench's
intrust's
invariant's
invest's
ischemia
ischemic
jewellery
judgement
judgement's
Expand Down Expand Up @@ -12558,7 +12546,6 @@ meetups
megadeathes
megajoule
megajoule's
men's
mesothelioma
mesothelioma's
metadata
Expand Down Expand Up @@ -12609,7 +12596,6 @@ newswires
octopi
offsite
oligo
oligonucleotide
onsite
opposable
opposer
Expand Down Expand Up @@ -12853,7 +12839,6 @@ rotatably
sabre
sabre's
sabres
sapiens
sativa
savoir
schnaps
Expand Down Expand Up @@ -12942,7 +12927,6 @@ textbox's
textboxes
thaliana
therebetween
theremin
theremin's
theremins
timeline
Expand Down Expand Up @@ -13071,15 +13055,11 @@ weaponizing
webdesign
webdesign's
webdesigns
weblog
weblog's
weblogs
whitepaper
whitepaper's
whitepapers
widescreen's
wildcard
wildcard's
wildcards
women's
wop's
Original file line number Diff line number Diff line change
@@ -1,6 +1,2 @@
ABCs
Ikea
Ikea's
Paypal
Paypal's
megadeaths
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
en_US Hunspell Dictionary
Generated from SCOWL Version 2015.02.15
Thu Apr 23 19:31:37 EDT 2015
Generated from SCOWL Version 2015.04.24
Wed Apr 29 20:51:21 EDT 2015

http://wordlist.sourceforge.net

Expand Down Expand Up @@ -33,7 +33,7 @@ only found in the large dictionary for American English:

Bermejo Freyr's Guenevere Hatshepsut Nottinghamshire arrestment
crassitudes crural dogwatches errorless fetial flaxseeds godroon
incretion jalapeño's kelpie kishkes neuroglias pietisms pullulation
incretion jalapeño's kelpie kishkes neuroglias pietisms pullulation
stemwinder stenoses syce thalassic zees

The en_US and en_CA are the official dictionaries for Hunspell. The
Expand All @@ -57,12 +57,17 @@ [email protected] or to the wordlist-devel mailing lists
have specific issues with any of these dictionaries please file a bug
report at https://github.com/kevina/wordlist/issues.

IMPORTANT CHANGES FROM 2015.02.15:

The dictionaries are now in UTF-8 format instead of ISO-8859-1. This
was required to handle smart quotes correctly.

ADDITIONAL NOTES:

The NOSUGGEST flag was added to certain taboo words. While I made an
honest attempt to flag the strongest taboo words with the NOSUGGEST
flag, I MAKE NO GUARANTEE THAT I FLAGGED EVERY POSSIBLE TABOO WORD.
The list was originally derived from Németh László, however I removed
The list was originally derived from Németh László, however I removed
some words which, while being considered taboo by some dictionaries,
are not really considered swear words in today's society.

Expand Down Expand Up @@ -306,5 +311,5 @@ from the Ispell distribution they are under the Ispell copyright:
ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.

Build Date: Thu Apr 23 19:31:37 EDT 2015
Build Date: Wed Apr 29 20:51:21 EDT 2015
Wordlist Command: mk-list --accents=strip en_US 60
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
SET ISO8859-1
SET UTF8
TRY esianrtolcdugmphbyfvkwzESIANRTOLCDUGMPHBYFVKWZ'
ICONV 1
ICONV ’ '
NOSUGGEST !

# ordinal numbers
Expand Down
Loading

0 comments on commit 71fdc7e

Please sign in to comment.