PySpect

gptqmodel

  • Summary: Production-ready LLM compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
  • Author: ModelCloud
  • Homepage: https://github.com/ModelCloud/GPTQModel
  • Source: https://github.com/ModelCloud/GPTQModel
  • Number of releases: 35
  • First release: 1.0.1 on 2024-08-15T02:58:05
  • Latest release: 2.2.0 on 2025-04-03T05:28:13
Dependencies of gptqmodel (21):

Dependency                     Optional
auto_round                     true
bitblas                        true
clearml                        true
evalplus                       true
fastapi                        true
flashinfer-python              true
intel_extension_for_pytorch    true
isort                          true
lm_eval                        true
mlx_lm                         true
optimum                        true
parameterized                  true
plotly                         true
pydantic                       true
pytest                         true
random_word                    true
ruff                           true
sglang                         true
triton                         true
uvicorn                        true
vllm                           true