PySpect

Repository info

LLMLingua

  • Summary: a summary of LLMLingua
  • Description: [EMNLP'23, ACL'24] To speed up LLM inference and enhance the LLM's perception of key information, LLMLingua compresses the prompt and KV-cache, achieving up to 20x compression with minimal performance loss.
  • Stars: 5272
  • Forks: 311