PySpect

© 2025 PySpect

Package profile

docling

  • Summary: SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for powering downstream workflows such as gen AI applications.
  • Author: Christoph Auer <cau@zurich.ibm.com>, Michele Dolfi <dol@zurich.ibm.com>, Maxim Lysak <mly@zurich.ibm.com>, Nikos Livathinos <nli@zurich.ibm.com>, Ahmed Nassar <ahn@zurich.ibm.com>, Panos Vagenas <pva@zurich.ibm.com>, Peter Staar <taa@zurich.ibm.com>
  • Homepage: https://github.com/docling-project/docling
  • Source: https://github.com/docling-project/docling
  • Number of releases: 111
  • First release: 0.1.0 on 2024-07-15
  • Latest release: 2.41.0 on 2025-07-10

Releases

[Figure: Dates and sizes of releases. X-axis: release date; y-axis: size in MB.]

PyPI Downloads

[Figure: Weekly downloads over the last 3 months. X-axis: date; y-axis: thousand downloads per week.]

Dependencies

Docling has 33 dependencies, 8 of which are optional.
Dependencies of docling (33):

Dependency              Optional
beautifulsoup4          false
certifi                 false
docling-core            false
docling-ibm-models      false
docling-parse           false
easyocr                 false
filetype                false
huggingface_hub         false
lxml                    false
marko                   false
openpyxl                false
pandas                  false
pillow                  false
pluggy                  false
pydantic                false
pydantic-settings       false
pylatexenc              false
pypdfium2               false
python-docx             false
python-pptx             false
requests                false
rtree                   false
scipy                   false
tqdm                    false
typer                   false
accelerate              true
mlx-vlm                 true
ocrmac                  true
onnxruntime             true
openai-whisper          true
rapidocr-onnxruntime    true
tesserocr               true
transformers            true
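
The required/optional split in the dependency list above can be approximated locally from a package's installed metadata. The sketch below is a heuristic, not how this page computes its figures: it assumes that a requirement counts as optional when its environment marker mentions an extra. The helper names `split_by_optionality` and `declared_dependencies` are hypothetical, and the requirement string in the docstring is illustrative only. Counts from a local install may differ from the table, depending on the installed version.

```python
from importlib.metadata import PackageNotFoundError, requires


def split_by_optionality(requirements):
    """Split requirement strings into (required, optional) lists.

    Assumption: a requirement is optional when its environment marker
    mentions an extra, e.g. 'somepkg>=1.0; extra == "ocr"' (illustrative).
    """
    required, optional = [], []
    for req in requirements:
        # Everything after the first ';' is the environment marker, if any.
        _, _, marker = req.partition(";")
        (optional if "extra" in marker else required).append(req)
    return required, optional


def declared_dependencies(package):
    """Return the requirement strings declared by a locally installed package."""
    try:
        # requires() returns None when the package declares no requirements.
        return requires(package) or []
    except PackageNotFoundError:
        # Package is not installed in the current environment.
        return []


if __name__ == "__main__":
    required, optional = split_by_optionality(declared_dependencies("docling"))
    print(f"{len(required)} required, {len(optional)} optional")
```

If docling is not installed in the current environment, `declared_dependencies` returns an empty list and both counts are zero.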
