Unscramble provided character strings to form derived words.
A character string is encoded as a vector holding the number of occurrences of each character in the string, resulting in permutations of the same characters having identical vectors. Then the vector is mapped to a unique signature-integer using an injective function approximation. Distinct integers correspond to distinct vectors and vice versa.
A plain txt wordlist file is used to generate a json dictionary with integer-signatures as keys and the list of words from the initial wordlist that have the same integer-signature as value of that key. This action is performed only once for each plain wordlist and the resulting mapped wordlist can be used multiple times from now on.
Whenever a character string needs to be unscrambled, firstly it's integer-signature is calculated and then a search is performed in an already mapped wordlist for that integer-signature as dictionary key. The corresponding dictionary value is then returned as the results list, populated with derived words.
Two already mapped wordlists can be found in %project_dir%/wordlists/mapped/
directory:
english_wordlist_simple.json
(DEFAULT MAPPED WORDLIST)english_wordlist_complex.json
Run main.py
and use:
unscramble
option to unscramble character strings.
- Use
--chars
(or-c
) argument to provide a single character string. - Use
--file
(or-f
) argument to provide multiple character strings via a txt file (one string per line). The results are outputted inside%project_dir%/output/
directory as a txt file named%initial_filename%_unscrambled.txt
.
Optional: - Use
--subsets
(or-s
) argument to unscramble all subsets of provided character string as well. - Use
--wordlist
(or-w
) argument to provide a path to a custom mapped wordlist. If a custom mapped wordlist is not provided then default english_wordlist_simple.json will be used.
map
option to generate a custom mapped wordlist.
- Use
--wordlist
(or-w
) argument to specify the path to a plain txt wordlist file. - Use
--valid-chars
(or-v
) argument to provide a string of characters that are permitted to be present in words of the new custom mapped wordlist. During the creation of a mapped wordlist, a word from the plain wordlist is not registered if it contains invalid characters.
Examples:
python main.py unscramble -c "ogd"
ogd --> ['dog', 'god']
python main.py unscramble -c "ogd" -s
og --> ['go']
od --> ['do']
ogd --> ['dog', 'god']
python main.py unscramble -f "%path_to_input_txt_file%"
python main.py map -w "%path_to_plain_txt_wordlist%" -v "abcdefghijklmnopqrstuvwxyz-"