cofi/PDictCC

PDictCC

PDictCC is a Python port of Tassilo Horn’s great RDictCc. It tries to be compatible with it where it is sensible and possible.

See Compatibility with RDictCc for more information.

Requirements

  • Python 2.7 or Python 2.6 with argparse module
  • GDBM bindings for Python (these usually come with Python; if you see an ImportError, they are missing. On Debian GNU/Linux, install the python-gdbm package)
  • optional: progressbar module for a progressbar on import (on Debian GNU/Linux python-progressbar)

Importing a dict.cc dictionary

You can get a dict.cc dictionary by filing a request here.

Only one dictionary (e.g. DE => EN) is needed to look up words in both directions (i.e. DE => EN and EN => DE), as PDictCC constructs two databases from one dictionary.

To import that dictionary into PDictCC’s format, invoke pdictcc -i dictionary.txt.

At the moment only one dictionary is supported at a time.
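
The two-database idea can be sketched in a few lines of Python. This is an illustration only, not PDictCC’s actual import code. It assumes the dict.cc export is tab-separated (source phrase, target phrase, optional further columns) and that comment lines start with #, and it uses plain dicts in place of GDBM files. The names phrase_key and build_indexes are made up for this example, and the key heuristic (longest word, with annotations ignored) is an assumption based on how keys are described for regular expression queries.

```python
import re
from collections import defaultdict

# Assumption: annotations such as {m}, [ugs.] or (glass) do not count as key words.
ANNOTATION = re.compile(r"\{[^}]*\}|\[[^\]]*\]|\([^)]*\)")


def phrase_key(phrase):
    """Return the longest plain word of a phrase, lowercased (hypothetical heuristic)."""
    words = ANNOTATION.sub(" ", phrase).split()
    if not words:
        return None
    return max(words, key=len).lower()


def build_indexes(lines):
    """Build one index per lookup direction from tab-separated dictionary lines."""
    forward = defaultdict(list)   # source key -> [(source phrase, target phrase)]
    backward = defaultdict(list)  # target key -> [(target phrase, source phrase)]
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # not a translation pair
        source, target = fields[0], fields[1]
        for index, phrase, translation in ((forward, source, target),
                                           (backward, target, source)):
            key = phrase_key(phrase)
            if key is not None:
                index[key].append((phrase, translation))
    return forward, backward
```

Every input line is filed twice, once under the key of its source phrase and once under the key of its target phrase, which is why a single DE => EN file is enough for both directions.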

Transcriptions

If you deal with languages that contain “special” letters, you may want to query for words that contain them via transcriptions (e.g. query “kaese” and get the results you’d get for “käse”).

Here is an import with transcriptions that’s sensible for German dictionaries.

cofi@hitchhiker %> pdictcc --import dictionary.txt --transcription ä ae ö oe ß ss ü ue

Transcriptions are pairs consisting of the string to replace and the string that replaces it (e.g. one pair from the transcriptions above is ä ae).

And here is an example that utilizes transcriptions (note how we query for ‘ä’ with ‘ae’):

cofi@hitchhiker %> pdictcc kaese
=============== [ DE => EN ] ===============
Käse {m}:
    - cheese
Käse {m} [ugs.] [Unsinn]:
    - codswallop [Br.] [coll.]
    - garbage [esp. Am.]
Käse {m} [ugs.] [Blödsinn]:
    - crap [coll.] [nonsense]
    - rubbish [esp. Br.] [nonsense]
Käse {m} [ugs.] [dumme Angelegenheit]:
    - stupid business

You can only specify transcriptions on import, and they affect both languages. Furthermore, transcriptions are only considered for simple and regexp lookups, not for fulltext queries.
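
To make the pair semantics concrete, here is a minimal sketch of how a flat list such as ä ae ö oe ß ss ü ue could be grouped into pairs and applied to a word. It is not taken from PDictCC’s source: pair_up and transcribe are hypothetical names, and applying the pairs in the order given is an assumption. The sketch is written for Python 3, where strings are Unicode; on Python 2 you would need unicode literals.

```python
def pair_up(flat):
    """Group a flat argument list into (original, replacement) pairs."""
    if len(flat) % 2 != 0:
        raise ValueError("transcriptions must be given in pairs")
    return list(zip(flat[0::2], flat[1::2]))


def transcribe(word, pairs):
    """Replace every 'original' string in word by its 'replacement'."""
    for original, replacement in pairs:
        word = word.replace(original, replacement)
    return word


# The German pairs from the import example above.
GERMAN = pair_up(["ä", "ae", "ö", "oe", "ß", "ss", "ü", "ue"])
```

With these pairs, transcribe("käse", GERMAN) yields "kaese". An importer could, for instance, store each entry under both the original and the transcribed key, which would explain why transcriptions have to be known at import time.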

Querying from the commandline

You can query the database from the commandline as a simple lookup:

cofi@hitchhiker %> pdictcc flask
=============== [ EN => DE ] ===============
flask:
    - Fläschchen {n}
    - Flachmann {m}
    - Flasche {f}
    - Formkasten {m}
    - Kolben {m}
hip flask:
    - Flachmann {m}
(glass) flask:
    - Glaskolben {m} [Laborgerät]

as a regular expression (tested against the key, i.e. the longest word in the phrase to translate):

cofi@hitchhiker %> pdictcc -r '^B.*königin'
=============== [ DE => EN ] ===============
Bienenkönigin {f}:
    - queen bee
Die Bienenkönigin [Brüder Grimm]:
    - The Queen Bee [Grimm Brothers]
Ballkönigin {f}:
    - belle of the ball
    - prom queen [Am.] [Can.]

or as a fulltext search (currently broken):

cofi@hitchhiker %> pdictcc -f 'sag niemals nie'
...

Regular expression and fulltext queries both perform a full scan of the database, so they take longer than a simple lookup.
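
The cost difference between a simple lookup and a scanning query can be illustrated with a small sketch. A plain dict stands in for the GDBM database here, and simple_lookup and regexp_lookup are hypothetical names, not PDictCC functions. Case-insensitive matching is an assumption that is consistent with the examples above (^B.*königin matches Bienenkönigin).

```python
import re


def simple_lookup(db, word):
    """A single keyed access: no other entries are touched."""
    return db.get(word.lower())


def regexp_lookup(db, pattern):
    """Test the pattern against every key, so the cost grows with the database."""
    regex = re.compile(pattern, re.IGNORECASE)
    return dict((key, value) for key, value in db.items() if regex.search(key))
```

A fulltext query would have to go further and inspect the stored phrases as well as the keys, which is the same kind of scan over more text.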

For a more compact output you can use the --compact parameter:

cofi@hitchhiker %> pdictcc --compact flask
=============== [ EN => DE ] ===============
flask: Fläschchen {n} / Flachmann {m} / Flasche {f} / Formkasten {m} / Kolben
     {m}
hip flask: Flachmann {m}
(glass) flask: Glaskolben {m} [Laborgerät]
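
A layout like the compact one above can be approximated with Python’s textwrap module. This is a sketch, not PDictCC’s formatter: format_compact is a hypothetical name, and the default width of 80 and the five-space continuation indent are assumptions read off the example output.

```python
import textwrap


def format_compact(phrase, translations, width=80):
    """Join all translations on one logical line and wrap it at width."""
    line = "%s: %s" % (phrase, " / ".join(translations))
    return textwrap.fill(
        line,
        width=width,
        subsequent_indent=" " * 5,  # continuation lines are indented
        break_long_words=False,     # never split a word in the middle
    )
```

Calling format_compact("flask", [...]) with the five translations from the example wraps the last {m} onto an indented second line, much like the output shown above.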

Using the interactive mode

If you invoke pdictcc without a query you will enter the interactive mode. The interactive mode has readline support and is quite handy if you want to translate several words in a row or over time.

Every query you type will be translated as a simple lookup. If you prefix your query with :r: or :f:, you will execute a regular expression query or a fulltext query, respectively.

You can change some settings within the interactive mode with queries like :set key value.

These settings are currently supported:

  • compact for compact formatting (“boolish”[1], e.g. :set compact on or :set compact off)
  • width after how many characters output is wrapped (Integer, e.g. :set width 42)
  • limit how many phrases per entry are displayed (Integer, e.g. :set limit 23)
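
The prefix rules above amount to a small dispatch on the start of each input line. The following sketch shows one way to classify a line; classify is a hypothetical helper, not part of PDictCC, and it leaves the conversion of setting values (integer or “boolish”) to the caller.

```python
def classify(line):
    """Map one line of interactive input to a tuple describing what to do."""
    line = line.strip()
    if line.startswith(":r:"):
        return ("regexp", line[3:])
    if line.startswith(":f:"):
        return ("fulltext", line[3:])
    if line.startswith(":set "):
        parts = line.split(None, 2)  # ":set", key, value
        if len(parts) != 3:
            raise ValueError("usage: :set key value")
        return ("set", parts[1], parts[2])
    # Anything without a known prefix is a simple lookup.
    return ("simple", line)
```

For example, classify(":r:^a(p|ff)e$") gives ("regexp", "^a(p|ff)e$") and classify(":set width 42") gives ("set", "width", "42").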

Here is an example session (with a DE => EN dictionary):

cofi@hitchhiker %> pdictcc
Welcome to the interactive mode: You can type queries here.
Prefix your query with `:r:` to issue a regular expression query or with `:f:` for a fulltext query.
Enter C-d (Ctrl + d) to exit.
=> Schaf
=============== [ DE => EN ] ===============
Schaf {n}:
    - jumbuck [Aus.] [Aboriginal]
    - sheep [Ovis]
Schaf {n} [Begriff aus austral. Pionierzeit]:
    - jumbuck [Aus.] [coll.]
wie ein Schaf:
    - sheepish
=> jumbuck
=============== [ EN => DE ] ===============
jumbuck [Aus.] [Aboriginal]:
    - Schaf {n}
jumbuck [Aus.] [coll.]:
    - Schaf {n} [Begriff aus austral. Pionierzeit]
=> :r:^a(p|ff)e$
=============== [ DE => EN ] ===============
Affe {m} [ugs.]:
    - knapsack
Affe {m} [Menschenaffe]:
    - ape
Affe {m}:
    - monkey
wie ein Affe:
    - apelike
Der Affe:
    - The Monkey [Stephen King]
Ape {f} [dreirädiges Rollermobil, Kleintransporter]:
    - (Piaggio) Ape / Apecar ® [also used as an autorickshaw]
=============== [ EN => DE ] ===============
ape:
    - Affe {m} [Menschenaffe]
ape [coll.]:
    - Tollpatsch {m} [ugs.]
to ape sb.:
    - jdn. imitieren
    - jdn. nachäffen [ugs.]
    - jdn. nachmachen
to go ape [sl.]:
    - ausflippen [ugs.]
=>
Bye.

Using a non-default database directory

If you don’t want to use the PDictCC default directory ($HOME/.pdictcc) you can specify a different directory path with the --directory parameter:

pdictcc -d ~/.local/share/pdict/ -i dictionary.txt

But you have to specify the path every time you query the database:

pdictcc -d ~/.local/share/pdict/ dictionary

or with the interactive mode:

pdictcc -d ~/.local/share/pdict/
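
If typing the directory on every call becomes tedious, a small wrapper can add it for you. The sketch below is a suggestion rather than a PDictCC feature: it assumes pdictcc is on your PATH, uses only the -d option documented above, and the names PDICT_DIR, build_command and run are made up for this example.

```python
import os
import subprocess

# Assumption: this is the directory used for the import above.
PDICT_DIR = os.path.expanduser("~/.local/share/pdict/")


def build_command(args, directory=PDICT_DIR):
    """Prepend the database directory to any pdictcc arguments."""
    return ["pdictcc", "-d", directory] + list(args)


def run(args):
    """Invoke pdictcc with the fixed directory and return its exit status."""
    return subprocess.call(build_command(args))
```

run(["flask"]) would then behave like pdictcc -d ~/.local/share/pdict/ flask, and run([]) would start the interactive mode against the same directory.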

Integration with Emacs

PDictCC integrates with RDictCc’s Emacs package rdictcc.el.

If you set this in your Emacs config you should be good to go:

(setq rdictcc-program "path/to/pdictcc")

See the RDictCc website for more information.

Compatibility with RDictCc

I tried to keep the database format compatible with RDictCc, but differences between Python’s and Ruby’s (G)DBM modules (file name issues) make it difficult to use the same files. Once you map the files onto each other, you should be able to use PDictCC with RDictCc databases and vice versa, but I don’t see this as a priority.

PDictCC is fully compatible with RDictCc’s commandline arguments and mostly with the output formatting but provides a strict superset of features (and arguments).

Footnotes

[1] A “boolish” value is true if it is one of on, true, 1 or yes, and false if it is one of off, false, 0 or no.
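
As an illustration, the “boolish” rule can be written as a small parser. parse_boolish is a hypothetical name and not PDictCC’s own function; accepting upper-case spellings and rejecting unknown values with a ValueError are assumptions.

```python
TRUE_VALUES = frozenset(["on", "true", "1", "yes"])
FALSE_VALUES = frozenset(["off", "false", "0", "no"])


def parse_boolish(value):
    """Return True or False for a boolish string, or raise ValueError."""
    normalized = value.strip().lower()  # assumption: case does not matter
    if normalized in TRUE_VALUES:
        return True
    if normalized in FALSE_VALUES:
        return False
    raise ValueError("not a boolish value: %r" % value)
```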

About

Offline dictionary lookup based on dict.cc dictionaries.
