Repository info
GPTQModel
- Description: Production-ready LLM compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
- Stars: 670
- Number of forks: 99