DyAnnotationExtractor is software for extracting annotations (highlighted text and comments) from e-documents like PDF.

dimi2/DyAnnotationExtractor

DyAnnotationExtractor

DyAnnotationExtractor is software for extracting annotations (highlighted text and comments) from e-documents like PDF. The extracted parts can be used to build summary/resume of the document.

Usage

Imagine you have an ebook (PDF) that is 100 pages long. While reading the book, you highlight the important parts in your favorite reader.

Then use the DyAnnotationExtractor tool to get just the highlighted parts.

On the command line, execute the following command.
For Windows:

DyAnnotationExtractor -input "Getting Started with Ubuntu 16.04.pdf"

For Linux:

./DyAnnotationExtractor.sh -input "Getting Started with Ubuntu 16.04.pdf"

This creates a file with the same name in the same directory, with an added '.md' suffix.
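
The naming rule above can be sketched in Java as follows. This is only an illustration, not the tool's actual code: the class name OutputPathSketch and the method outputFor are hypothetical, and the sketch assumes the '.md' suffix is appended to the full file name, so the '.pdf' part is kept.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch: derives the output file path from the input file path.
// Assumption: '.md' is appended to the complete input file name.
public class OutputPathSketch {

    // Returns a path in the same directory as the input, with ".md" appended to the name.
    static Path outputFor(Path input) {
        String outputName = input.getFileName().toString() + ".md";
        return input.resolveSibling(outputName);
    }

    public static void main(String[] args) {
        Path input = Paths.get("Getting Started with Ubuntu 16.04.pdf");
        // prints "Getting Started with Ubuntu 16.04.pdf.md"
        System.out.println(outputFor(input));
    }
}
```

Under that assumption, the output for the example above lands next to the PDF and can be opened with any Markdown viewer or text editor.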

Now you have an extract of the book that is 5-6 pages long instead of 100. You can skim just the exported text instead of re-reading the entire book.

Supported Input Formats

  • PDF (Portable Document Format)

Supported Output Formats

  • MD (Markdown)

Requirements

  • Java 8+.
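
If you are unsure whether the installed Java runtime meets this requirement, a small check like the following may help. It is a sketch and not part of DyAnnotationExtractor; the class name JavaVersionCheck and the method majorVersion are hypothetical. It relies on the standard system property java.specification.version, which reads "1.8" on Java 8 and "9", "11", "17" and so on for later releases.

```java
// Hypothetical sketch: checks whether the running JVM is Java 8 or newer.
public class JavaVersionCheck {

    // Converts a specification version such as "1.8" or "11" into a major version number.
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        int first = Integer.parseInt(parts[0]);
        // Releases up to Java 8 use the legacy "1.x" scheme.
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        int major = majorVersion(spec);
        if (major >= 8) {
            System.out.println("Java " + major + " detected: requirement met.");
        } else {
            System.out.println("Java " + major + " detected: Java 8 or newer is required.");
        }
    }
}
```

Running java -version on the command line gives the same information without compiling anything.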

Download

Get the latest release.

Separate files are provided for the distribution, the binary and the sources.
End users only need to download the distribution.

Installation

Extract the downloaded archive into a local directory.
Run the provided 'DyAnnotationExtractor' script to perform the extraction.

Build

To build the project from sources, you will need the Gradle build tool. Go to the project home directory (PROJ_HOME) and execute the command:

gradle

The result will appear in the directory PROJ_HOME/build/distributions.

Dependencies

  • iTextPdf 7.1.2+ (PDF handling library)
