This is the GitHub organization of the Docling open-source project.
Docling is our main open-source package. It is a powerful library that simplifies document processing, parsing diverse formats (including advanced PDF understanding) and providing seamless integrations with the gen AI ecosystem.
We support an amazing community that helps us drive the adoption of Docling forward. Give it a try and join the community!
The key repositories of Docling are:
- docling - The home of the main docling package.
- docling-core - The definition of types, transforms, serializers, etc. If it has to do with the DoclingDocument, you will find it here.
- docling-parse - The backend PDF parser used by Docling.
- docling-serve - The FastAPI wrappers for running Docling as a REST API and distributing large jobs.
- docling-ibm-models - The AI models powering Docling.
Docling is hosted as a project in the LF AI & Data Foundation.
The project was started by the AI for knowledge team at IBM Research Zurich.