gptqmodel
- Summary: Production-ready LLM compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
- Author: ModelCloud
- Homepage: https://github.com/ModelCloud/GPTQModel
- Source: https://github.com/ModelCloud/GPTQModel
- Repo profile
- Number of releases: 35
- First release: 1.0.1 on 2024-08-15T02:58:05
- Latest release: 2.2.0 on 2025-04-03T05:28:13
Dependency | Optional |
---|---|
auto_round | true |
bitblas | true |
clearml | true |
evalplus | true |
fastapi | true |
flashinfer-python | true |
intel_extension_for_pytorch | true |
isort | true |
lm_eval | true |
mlx_lm | true |
optimum | true |
parameterized | true |
plotly | true |
pydantic | true |
pytest | true |
random_word | true |
ruff | true |
sglang | true |
triton | true |
uvicorn | true |
vllm | true |
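
Since every dependency listed above is marked optional, it can be useful to see which of them are actually present in a given environment. The following is a minimal sketch using only the Python standard library. The helper name `installed_versions` is hypothetical and is not part of gptqmodel, and the names are copied from the Dependency column on the assumption that they match the installed distribution names.

```python
from importlib.metadata import PackageNotFoundError, version

# Names copied from the Dependency column above (assumed to be the
# distribution names as installed; this is not verified here).
OPTIONAL_DEPENDENCIES = [
    "auto_round", "bitblas", "clearml", "evalplus", "fastapi",
    "flashinfer-python", "intel_extension_for_pytorch", "isort",
    "lm_eval", "mlx_lm", "optimum", "parameterized", "plotly",
    "pydantic", "pytest", "random_word", "ruff", "sglang",
    "triton", "uvicorn", "vllm",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration only.
    """
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(OPTIONAL_DEPENDENCIES).items():
        print(f"{name}: {ver or 'not installed'}")
```

The check reads installed distribution metadata rather than importing each package, so it avoids the import-time side effects and load cost of heavy libraries such as vllm or triton.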