PDF processing tool to extract document data and save it in EDN format

jackrusher/pdftoedn

 
 

pdftoedn 0.36.8

A poppler-based PDF processing tool to extract document data and save it in EDN format. It supports:

  • Font and glyph remapping via user-defined font map configurations (in JSON format). These allow glyph substitutions for Type 1 or TrueType fonts with invalid or incorrect Unicode tables, and even for embedded CID fonts with missing tables.
  • Path data extraction.
  • Transformed image output, written directly to disk in PNG format.
  • Annotations.
  • PDF outlines.

Usage

Process a PDF document and write its output to output_file.edn:

pdftoedn -o output_file.edn input_file.pdf
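
To run the command above over a whole directory of PDFs, a small wrapper script can help. The following is a minimal sketch, not part of pdftoedn itself: the function names `build_command` and `convert_all` are hypothetical, it assumes a `pdftoedn` executable is available on the PATH, and it uses only the `-o` invocation shown above. Naming each output after its input with an `.edn` suffix is a convention chosen here for illustration.

```python
import subprocess
from pathlib import Path


def build_command(pdf_path, edn_path=None, executable="pdftoedn"):
    """Build the argument list for one run (hypothetical helper).

    Mirrors the documented form: pdftoedn -o output_file.edn input_file.pdf
    If no output path is given, the input name is reused with an .edn suffix.
    """
    pdf_path = Path(pdf_path)
    if edn_path is None:
        edn_path = pdf_path.with_suffix(".edn")
    return [executable, "-o", str(edn_path), str(pdf_path)]


def convert_all(directory, executable="pdftoedn"):
    """Run the tool on every *.pdf in a directory (hypothetical helper).

    Returns the list of output paths that were requested.
    Raises subprocess.CalledProcessError if a run exits with a non-zero status.
    """
    outputs = []
    for pdf in sorted(Path(directory).glob("*.pdf")):
        cmd = build_command(pdf, executable=executable)
        subprocess.run(cmd, check=True)
        outputs.append(Path(cmd[2]))
    return outputs
```

Calling `convert_all("docs")` would process each PDF in `docs` in sorted order and stop at the first failing run. If the executable cannot be found, `subprocess.run` raises `FileNotFoundError`.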

Further reading

Refer to the wiki for
