Package profile
docling
- Summary: SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for powering downstream workflows such as gen AI applications.
- Author: Christoph Auer <cau@zurich.ibm.com>, Michele Dolfi <dol@zurich.ibm.com>, Maxim Lysak <mly@zurich.ibm.com>, Nikos Livathinos <nli@zurich.ibm.com>, Ahmed Nassar <ahn@zurich.ibm.com>, Panos Vagenas <pva@zurich.ibm.com>, Peter Staar <taa@zurich.ibm.com>
- Homepage: https://github.com/docling-project/docling
- Source: https://github.com/docling-project/docling
- Number of releases: 111
- First release: 0.1.0 on 2024-07-15
- Latest release: 2.41.0 on 2025-07-10
Dependencies
Docling has 33 dependencies, 8 of which are optional.

Dependency | Optional |
---|---|
beautifulsoup4 | false |
certifi | false |
docling-core | false |
docling-ibm-models | false |
docling-parse | false |
easyocr | false |
filetype | false |
huggingface_hub | false |
lxml | false |
marko | false |
openpyxl | false |
pandas | false |
pillow | false |
pluggy | false |
pydantic | false |
pydantic-settings | false |
pylatexenc | false |
pypdfium2 | false |
python-docx | false |
python-pptx | false |
requests | false |
rtree | false |
scipy | false |
tqdm | false |
typer | false |
accelerate | true |
mlx-vlm | true |
ocrmac | true |
onnxruntime | true |
openai-whisper | true |
rapidocr-onnxruntime | true |
tesserocr | true |
transformers | true |
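
The required/optional split in the table above can be checked against a local installation. The following is a minimal sketch, assuming that "optional" here corresponds to requirements gated behind an `extra == ...` environment marker in the package metadata; that mapping is an assumption, not something this page states. `split_requirements` and `dependency_summary` are hypothetical helper names, and only the standard library's `importlib.metadata.requires` is used to read the declared requirements.

```python
import re
from importlib.metadata import PackageNotFoundError, requires


def split_requirements(requirement_strings):
    """Split requirement strings into (required, optional) name lists.

    Heuristic (an assumption of this sketch): a requirement counts as
    optional when its environment marker tests `extra == ...`.
    """
    required, optional = [], []
    for req in requirement_strings:
        # The distribution name is the leading run of name characters.
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
        if match is None:
            continue
        name = match.group(1)
        # Everything after the first ';' is the environment marker.
        _, _, marker = req.partition(";")
        if re.search(r"\bextra\s*==", marker):
            optional.append(name)
        else:
            required.append(name)
    return sorted(set(required)), sorted(set(optional))


def dependency_summary(package="docling"):
    """Return (required, optional) for an installed package, or None."""
    try:
        declared = requires(package) or []
    except PackageNotFoundError:
        # The package is not installed in this environment.
        return None
    return split_requirements(declared)


if __name__ == "__main__":
    summary = dependency_summary("docling")
    if summary is None:
        print("docling is not installed in this environment")
    else:
        required, optional = summary
        print(f"{len(required)} required, {len(optional)} optional")
```

The counts printed for a given installation may differ from the 33 and 8 reported here, since they depend on the installed version and on how that release declares its extras.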