AI Term · 1 min read

Product Quantization

What it is: A compression technique that reduces vector storage size by breaking vectors into smaller pieces and using “codebooks.”

How it works:

  • Splits each vector into several smaller sub-vectors
  • Learns a “codebook” of common patterns for each sub-space (typically via k-means clustering)
  • Stores the index of the nearest codebook entry for each sub-vector instead of the actual values
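The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation; the helper names (`pq_train`, `pq_encode`, `pq_decode`) and the parameters (4 sub-vectors, 16 centroids each, a basic k-means) are illustrative choices, and real systems typically use a tuned library such as Faiss.

```python
import numpy as np

def kmeans(data, k, iters=20, seed=0):
    # Basic k-means: learns k centroids for one sub-space.
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), k, replace=False)].copy()
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(data[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = data[labels == j].mean(axis=0)
    return centroids

def pq_train(vectors, m, k):
    # Step 1 + 2: split vectors into m sub-vectors and learn a
    # k-entry codebook per sub-space.
    subs = np.split(vectors, m, axis=1)
    return [kmeans(s, k) for s in subs]

def pq_encode(vectors, codebooks):
    # Step 3: replace each sub-vector with the index of its
    # nearest codebook entry (one small integer per sub-space).
    subs = np.split(vectors, len(codebooks), axis=1)
    codes = [np.linalg.norm(s[:, None] - cb[None], axis=2).argmin(axis=1)
             for s, cb in zip(subs, codebooks)]
    return np.stack(codes, axis=1).astype(np.uint8)

def pq_decode(codes, codebooks):
    # Reconstruct approximate vectors by looking up codebook entries.
    return np.hstack([cb[codes[:, i]] for i, cb in enumerate(codebooks)])

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 32)).astype(np.float32)  # 1000 vectors, 32 dims
books = pq_train(X, m=4, k=16)   # 4 sub-vectors, 16 centroids each
codes = pq_encode(X, books)      # shape (1000, 4): 4 bytes per vector
approx = pq_decode(codes, books)
```

With these toy numbers, each 32-dimensional float32 vector (128 bytes) is stored as just 4 one-byte codes plus the small shared codebooks, a ~32x reduction, at the cost of some reconstruction error.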

Why it matters: Saves massive amounts of memory and storage while maintaining reasonable search quality. Essential for large-scale deployments.

Real-world analogy: Like using abbreviations in texting: “LOL” instead of “laugh out loud.” You lose some nuance but save space and can still communicate effectively.
