PySpect

© 2025 PySpect

Package profile

unstructured

  • Summary: A library that prepares raw documents for downstream ML tasks.
  • Author: Unstructured Technologies
  • Homepage: https://github.com/Unstructured-IO/unstructured
  • Source: https://github.com/Unstructured-IO/unstructured (Repo profile)
  • Number of releases: 189
  • First release: 0.0.1.dev0 on 2022-09-06
  • Latest release: 0.18.9 on 2025-07-16

Releases

[Chart: Dates and sizes of releases. x-axis: Release Date; y-axis: Size in MB]

PyPI Downloads

[Chart: Weekly downloads over the last 3 months. x-axis: Date; y-axis: thousand downloads per week]

Dependencies

Unstructured has 47 dependencies, 26 of which are optional.
Dependencies of unstructured (47):

Dependency                  Optional
backoff                     false
beautifulsoup4              false
chardet                     false
dataclasses-json            false
emoji                       false
filetype                    false
html5lib                    false
langdetect                  false
lxml                        false
nltk                        false
numpy                       false
psutil                      false
python-iso639               false
python-magic                false
python-oxmsg                false
rapidfuzz                   false
requests                    false
tqdm                        false
typing-extensions           false
unstructured-client         false
wrapt                       false
effdet                      true
google-cloud-vision         true
markdown                    true
msoffcrypto-tool            true
networkx                    true
onnx                        true
onnxruntime                 true
openpyxl                    true
paddlepaddle                true
pandas                      true
pdf2image                   true
pdfminer.six                true
pi-heif                     true
pikepdf                     true
pypandoc                    true
pypdf                       true
python-docx                 true
python-pptx                 true
sacremoses                  true
sentencepiece               true
torch                       true
transformers                true
unstructured-inference      true
unstructured.paddleocr      true
unstructured.pytesseract    true
xlrd                        true
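
A required/optional split like the one above can be approximated locally from an installed package's metadata. The following is a minimal sketch, not the method this page uses: it assumes a dependency counts as optional when its requirement string carries an `extra == ...` environment marker. The helper names `split_requirements` and `dependency_summary` are hypothetical, and the counts it produces depend on the installed version, so they may differ from the figures listed here.

```python
import re
from importlib import metadata


def split_requirements(requirements):
    """Split requirement strings into (required, optional) sets of names.

    Assumption: a requirement is optional when its environment marker
    mentions an extra, e.g. 'pandas; extra == "csv"'.
    """
    required, optional = set(), set()
    for req in requirements:
        spec, _, marker = req.partition(";")
        # The distribution name is the leading run of name characters.
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", spec)
        if match is None:
            continue
        name = match.group(1).lower()
        if re.search(r"\bextra\s*==", marker):
            optional.add(name)
        else:
            required.add(name)
    # A name that is needed unconditionally is not optional,
    # even if some extra lists it again.
    optional -= required
    return required, optional


def dependency_summary(package):
    """Return (required, optional) name sets for an installed distribution."""
    requirements = metadata.requires(package) or []
    return split_requirements(requirements)
```

Calling `dependency_summary("unstructured")` requires the package to be installed; otherwise `importlib.metadata` raises `PackageNotFoundError`. Requirements gated only by other markers (for example a Python version) are counted as required in this sketch.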

Details