What it is: A lossy compression technique that shrinks vector storage by splitting each vector into smaller pieces and replacing each piece with a short code that points into a “codebook.”
How it works:
- Splits each vector into smaller sub-vectors
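- Learns a “codebook” of representative patterns for each piece, typically by clustering the training data (for example with k-means)
- Stores the index of the closest codebook entry for each piece instead of the actual values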
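The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration of codebook-based compression (commonly called product quantization), not a production implementation. The function names, the plain k-means loop, and the settings (4 sub-vectors, 16 codebook entries, random test data) are all assumptions chosen to keep the example small.

```python
import numpy as np

def train_codebooks(vectors, n_subvectors=4, n_centroids=16, n_iters=10, seed=0):
    """Learn one codebook per sub-vector position with a plain k-means loop."""
    rng = np.random.default_rng(seed)
    n, dim = vectors.shape
    assert dim % n_subvectors == 0, "dim must divide evenly into sub-vectors"
    sub_dim = dim // n_subvectors
    codebooks = np.empty((n_subvectors, n_centroids, sub_dim), dtype=vectors.dtype)
    for m in range(n_subvectors):
        # Step 1: take the m-th piece of every vector.
        chunk = vectors[:, m * sub_dim:(m + 1) * sub_dim]
        # Step 2: cluster the pieces; the centroids become the codebook.
        centroids = chunk[rng.choice(n, n_centroids, replace=False)].copy()
        for _ in range(n_iters):
            dists = ((chunk[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            assign = dists.argmin(axis=1)
            for k in range(n_centroids):
                members = chunk[assign == k]
                if len(members):  # keep the old centroid if a cluster is empty
                    centroids[k] = members.mean(axis=0)
        codebooks[m] = centroids
    return codebooks

def encode(vectors, codebooks):
    """Step 3: store the index of the nearest codebook entry for each piece."""
    n_subvectors, _, sub_dim = codebooks.shape
    # uint8 codes assume at most 256 entries per codebook.
    codes = np.empty((vectors.shape[0], n_subvectors), dtype=np.uint8)
    for m in range(n_subvectors):
        chunk = vectors[:, m * sub_dim:(m + 1) * sub_dim]
        dists = ((chunk[:, None, :] - codebooks[m][None, :, :]) ** 2).sum(axis=2)
        codes[:, m] = dists.argmin(axis=1)
    return codes

def decode(codes, codebooks):
    """Rebuild approximate vectors by looking up each code and joining the pieces."""
    pieces = [codebooks[m][codes[:, m]] for m in range(codebooks.shape[0])]
    return np.concatenate(pieces, axis=1)

# Illustrative data: 1,000 random 16-dimensional float32 vectors.
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 16)).astype(np.float32)

codebooks = train_codebooks(data)
codes = encode(data, codebooks)
approx = decode(codes, codebooks)
```

Here each vector shrinks from 16 float32 values (64 bytes) to 4 one-byte codes. The decoded vectors are only approximations of the originals, which is where the loss in search quality comes from.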
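Why it matters: Cuts memory and storage needs, often by an order of magnitude or more depending on the settings, while maintaining reasonable search quality. Essential for large-scale deployments where full-precision vectors may not fit in memory.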
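To make the savings concrete, here is a rough back-of-the-envelope calculation. The helper name and the example numbers (one million 768-dimensional float32 vectors, 96 pieces per vector, 256 entries per codebook) are hypothetical and only meant to show how the arithmetic works.

```python
def estimate_memory(n_vectors, dim, n_subvectors, n_centroids=256, float_bytes=4):
    """Return (raw_bytes, code_bytes, codebook_bytes) for codebook-based compression.

    Assumes at most 256 entries per codebook, so each code fits in one byte.
    """
    assert n_centroids <= 256, "one-byte codes need at most 256 codebook entries"
    assert dim % n_subvectors == 0, "dim must divide evenly into sub-vectors"
    raw_bytes = n_vectors * dim * float_bytes
    code_bytes = n_vectors * n_subvectors  # one byte per piece
    # Each of the n_subvectors codebooks holds n_centroids entries of
    # dim / n_subvectors floats, so the total is n_centroids * dim floats.
    codebook_bytes = n_centroids * dim * float_bytes
    return raw_bytes, code_bytes, codebook_bytes

raw, codes_size, books_size = estimate_memory(
    n_vectors=1_000_000, dim=768, n_subvectors=96
)
print(f"raw vectors: {raw / 1e6:.1f} MB")
print(f"codes:       {codes_size / 1e6:.1f} MB")
print(f"codebooks:   {books_size / 1e6:.2f} MB")
```

Under these assumed settings the raw vectors take about 3 GB while the codes take about 96 MB, and the codebooks add a fixed cost of under 1 MB that does not grow with the number of vectors.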
Real-world analogy: Like using abbreviations in texting - “LOL” instead of “laugh out loud.” You lose some nuance but save space and can still communicate effectively.