PySpect

© 2025 PySpect

Package profile

unstructured

  • Summary: A library that prepares raw documents for downstream ML tasks.
  • Author: Unstructured Technologies
  • Homepage: https://github.com/Unstructured-IO/unstructured
  • Source: https://github.com/Unstructured-IO/unstructured (Repo profile)
  • Number of releases: 192
  • First release: 0.0.1.dev0 on 2022-09-06
  • Latest release: 0.18.14 on 2025-08-26

Releases

[Chart: Dates and sizes of releases. x-axis: Release Date (2023 to 2025); y-axis: Size in MB (0.5 to 2.0).]

Dependencies

Unstructured has 47 dependencies, 26 of which are optional.
Dependencies of unstructured (47).
Dependency                  Optional
backoff                     false
beautifulsoup4              false
charset-normalizer          false
dataclasses-json            false
emoji                       false
filetype                    false
html5lib                    false
langdetect                  false
lxml                        false
nltk                        false
numpy                       false
psutil                      false
python-iso639               false
python-magic                false
python-oxmsg                false
rapidfuzz                   false
requests                    false
tqdm                        false
typing-extensions           false
unstructured-client         false
wrapt                       false
effdet                      true
google-cloud-vision         true
markdown                    true
msoffcrypto-tool            true
networkx                    true
onnx                        true
onnxruntime                 true
openpyxl                    true
paddlepaddle                true
pandas                      true
pdf2image                   true
pdfminer.six                true
pi-heif                     true
pikepdf                     true
pypandoc                    true
pypdf                       true
python-docx                 true
python-pptx                 true
sacremoses                  true
sentencepiece               true
torch                       true
transformers                true
unstructured-inference      true
unstructured.paddleocr      true
unstructured.pytesseract    true
xlrd                        true
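
The required/optional split listed above can likely be reproduced for a locally installed copy of the package. The sketch below is not PySpect's own method; it assumes that a dependency counts as optional when its requirement string carries an `extra == "..."` marker, and it reads the requirement strings with the standard library's `importlib.metadata.requires`. The helper names `split_requirements` and `dependency_split` are hypothetical, and the counts it reports may differ from this page depending on the installed version.

```python
import re
from importlib import metadata

# Leading distribution name of a requirement string, e.g. "lxml" in "lxml>=4.9".
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")
# Assumption: an "extra == ..." marker is what makes a dependency optional.
_EXTRA = re.compile(r"\bextra\s*==")


def split_requirements(requirements):
    """Split requirement strings into sorted (required, optional) name lists."""
    required, optional = set(), set()
    for req in requirements:
        # Everything after the first ";" is the environment marker.
        spec, _, marker = req.partition(";")
        match = _NAME.match(spec)
        if match is None:
            continue
        name = match.group(1).lower()
        if _EXTRA.search(marker):
            optional.add(name)
        else:
            required.add(name)
    # A name declared both ways is treated as required.
    optional -= required
    return sorted(required), sorted(optional)


def dependency_split(package):
    """Return (required, optional) for an installed package, or two empty lists."""
    try:
        requirements = metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return [], []
    return split_requirements(requirements)


if __name__ == "__main__":
    required, optional = dependency_split("unstructured")
    print(f"{len(required)} required, {len(optional)} optional")
    for name in required:
        print(f"{name:<28}false")
    for name in optional:
        print(f"{name:<28}true")
```

If the package is not installed, `dependency_split` returns two empty lists rather than raising, so the script prints zero counts.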

Details