PySpect

Repository info

GPTQModel

  • Description: Production-ready LLM compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
  • Stars: 578
  • Number of forks: 82