GigaVector supports multiple index types, each optimized for different scenarios:
Best for:
- Small to medium datasets (< 1M vectors)
- Exact nearest neighbor search
- Low-dimensional vectors (< 100 dimensions)
- Static or rarely updated datasets
- Memory-constrained environments
Characteristics:
- O(log n) search time for balanced trees
- Exact results (no approximation)
- Memory efficient (Structure-of-Arrays layout)
- Fast insertion for small datasets
- Tree rebalancing may be needed for large datasets
When to avoid:
- High-dimensional vectors (> 100 dimensions) - curse of dimensionality
- Very large datasets (> 1M vectors) - tree depth becomes problematic
- Frequent updates - tree rebalancing overhead
Best for:
- Large datasets (millions to billions of vectors)
- High-dimensional vectors (100+ dimensions)
- Approximate nearest neighbor search with high recall
- Dynamic datasets with frequent insertions
- Production systems requiring sub-millisecond search latency
Characteristics:
- O(log n) search time with high recall
- Hierarchical graph structure
- Excellent scalability
- Supports binary quantization for faster candidate selection
- ACORN-style exploration for filtered search
When to avoid:
- Very small datasets (< 10K vectors) - overhead not justified
- Exact search requirements - HNSW is approximate
- Memory-constrained environments - higher memory overhead
Best for:
- Very large datasets (millions+ vectors)
- High-dimensional vectors (64+ dimensions)
- Memory-constrained environments
- Batch search scenarios
- Production systems with trained models
Characteristics:
- Product Quantization for memory efficiency
- Inverted File index for fast candidate retrieval
- Requires training phase before use
- Supports scalar quantization for additional compression
- Configurable trade-off between accuracy and speed
When to avoid:
- Small datasets (< 100K vectors) - training overhead not justified
- Low-dimensional vectors (< 32 dimensions) - PQ overhead
- Exact search requirements - IVFPQ is approximate
- Frequently changing data - requires retraining
Best for:
- Sparse vectors (most dimensions are zero)
- Text embeddings and bag-of-words representations
- High-dimensional sparse data
Characteristics:
- Optimized for sparse vector operations
- Memory efficient for sparse data
- Fast search for sparse queries
- Default: 16
- Range: 4-64
- Higher values:
- Better recall and search quality
- Higher memory usage
- Slower insertion time
- Lower values:
- Faster insertion
- Lower memory usage
- Potentially lower recall
- Recommendation: Start with 16, increase to 32-48 for higher quality, decrease to 8-12 for faster insertion
- Default: 200
- Range: 50-500
- Higher values:
- Better index quality and recall
- Slower insertion time
- Higher memory during construction
- Lower values:
- Faster insertion
- Potentially lower recall
- Recommendation: Use 200-400 for high quality, 100-200 for faster insertion
- Default: 50
- Range: k to 500
- Higher values:
- Better recall
- Slower search time
- Lower values:
- Faster search
- Potentially lower recall
- Recommendation: Set to 2-4x your typical
kvalue. For k=10, use efSearch=20-40
use_binary_quant: Enable for 2-3x faster search with minimal recall lossquant_rerank: Number of candidates to rerank with exact distance- Set to 2-3x your
kvalue - 0 disables reranking (fastest, lower accuracy)
- Higher values improve accuracy at cost of speed
- Set to 2-3x your
use_acorn: Enable for better recall in filtered searchesacorn_hops: Exploration depth (1-2 recommended)- 1 hop: Fast, good recall
- 2 hops: Better recall, slower
- Default: 64
- Range: 16-4096
- Higher values:
- Better accuracy
- More memory for centroids
- Slower training
- Lower values:
- Faster training and search
- Lower memory
- Potentially lower accuracy
- Recommendation:
- Small datasets (< 1M): 64-256
- Medium datasets (1M-10M): 256-1024
- Large datasets (> 10M): 1024-4096
- Rule of thumb: sqrt(N) where N is dataset size
- Default: 8
- Range: Must divide dimension evenly
- Higher values:
- Better accuracy
- More memory for codebooks
- Slightly slower search
- Lower values:
- Faster search
- Lower memory
- Potentially lower accuracy
- Recommendation:
- Dimension 64: m=8 or m=16
- Dimension 128: m=8, m=16, or m=32
- Dimension 256: m=16, m=32, or m=64
- Must evenly divide dimension
- Default: 8
- Range: 4-8
- 8 bits: Standard, good accuracy
- 4 bits: 2x memory reduction, lower accuracy
- Recommendation: Use 8 bits unless memory is critical
- Default: 4
- Range: 1 to nlist
- Higher values:
- Better recall
- Slower search
- Lower values:
- Faster search
- Potentially lower recall
- Recommendation:
- Start with nlist/16 to nlist/8
- Increase for better recall
- Decrease for faster search
- Default: 32
- Range: 0 (disabled) to several hundred
- Higher values:
- Better accuracy
- Slower search
- 0 (disabled):
- Fastest search
- Lower accuracy
- Recommendation: Set to 2-4x your typical
kvalue
use_scalar_quant: Enable for additional 2-4x memory reductionscalar_quant_config.bits: 4-8 bits per component- 8 bits: Minimal accuracy loss
- 4 bits: Significant memory savings, noticeable accuracy loss
- Recommendation: Enable for very large datasets where memory is critical
oversampling_factor: Multiplier for candidate selection- Default: 1.0
- Range: 1.0-3.0
- Higher values: Better recall, slower search
- Recommendation: Use 1.5-2.0 for high-recall scenarios
GigaVector automatically detects and uses available CPU SIMD features for optimized distance calculations.
- SSE4.2: Basic vector operations
- AVX2: 256-bit vector operations (2x speedup)
- AVX-512F: 512-bit vector operations (4x speedup on supported CPUs)
- FMA: Fused multiply-add operations
The build system automatically enables SIMD optimizations:
Makefile:
make lib # Automatically uses -march=native -msse4.2 -mavx2 -mavx512f -mfmaCMake:
cmake -B build # Automatically detects and enables available SIMD featuresGigaVector detects CPU features at runtime using gv_cpu_detect_features(). The library automatically uses the best available SIMD implementation.
- SSE4.2: 1.5-2x speedup over scalar code
- AVX2: 2-3x speedup over scalar code
- AVX-512F: 3-4x speedup over scalar code (when available)
Check which SIMD features are available:
unsigned int features = gv_cpu_detect_features();
if (gv_cpu_has_feature(GV_CPU_FEATURE_AVX2)) {
// AVX2 is available
}GigaVector uses SoA storage for efficient memory access patterns:
- Better cache locality
- SIMD-friendly memory layout
- Reduced memory fragmentation
KD-Tree:
- Vectors:
N * dimension * sizeof(float)bytes - Tree nodes:
N * (2 * sizeof(void*) + sizeof(size_t) + sizeof(float))bytes - Total: ~
N * (dimension * 4 + 32)bytes
HNSW:
- Vectors:
N * dimension * sizeof(float)bytes - Graph structure:
N * M * sizeof(size_t)bytes (M = connections per node) - Total: ~
N * (dimension * 4 + M * 8)bytes
IVFPQ:
- Compressed vectors:
N * m * sizeof(uint8_t)bytes (m = subquantizers) - Centroids:
nlist * dimension * sizeof(float)bytes - Codebooks:
m * 256 * (dimension/m) * sizeof(float)bytes - Total: ~
N * m + nlist * dimension * 4 + m * 256 * (dimension/m) * 4bytes
- Use IVFPQ for large datasets: 10-100x memory reduction vs. exact storage
- Enable scalar quantization: Additional 2-4x memory reduction
- Reduce HNSW M parameter: Lower memory usage at cost of recall
- Use binary quantization: Minimal memory overhead, significant speedup
- Batch operations: Process vectors in batches to control peak memory
Monitor memory usage:
// Get current memory usage
size_t memory_bytes = gv_db_get_memory_usage(db);
printf("Memory usage: %zu bytes\n", memory_bytes);KD-Tree:
- Search latency: 0.1-1ms (1K-100K vectors)
- Insertion: 0.01-0.1ms per vector
- Memory: Low overhead
- Recall: 100% (exact)
HNSW:
- Search latency: 0.1-5ms (1M-1B vectors)
- Insertion: 0.1-1ms per vector
- Memory: Moderate overhead (M * 8 bytes per vector)
- Recall: 95-99% (with proper efSearch tuning)
IVFPQ:
- Search latency: 0.5-10ms (1M-1B vectors)
- Insertion: 0.01-0.1ms per vector
- Memory: Very low overhead (m bytes per vector)
- Recall: 85-95% (with proper nprobe tuning)
- Start with defaults: Use default parameters for your index type
- Measure baseline: Run benchmarks with your data
- Tune for accuracy: Increase quality parameters (efSearch, nprobe, rerank)
- Tune for speed: Decrease quality parameters if latency is critical
- Enable optimizations: Enable binary quantization, scalar quantization if applicable
- Iterate: Fine-tune based on your accuracy/speed requirements
Use the provided benchmark executables:
# Build benchmarks
make bench
# Run SIMD benchmarks
./build/bench/benchmark_simd
# Run IVFPQ benchmarks
./build/bench/benchmark_ivfpq
# Run recall benchmarks
./build/bench/benchmark_ivfpq_recallWith proper tuning and SIMD optimizations:
- 2-4x speedup from SIMD (AVX2/AVX-512)
- 2-3x speedup from binary quantization (HNSW)
- 10-100x memory reduction from IVFPQ + scalar quantization
- 10-50% recall improvement from proper parameter tuning
- Choose the right index: Match index type to your dataset size and requirements
- Tune incrementally: Start with defaults, adjust one parameter at a time
- Enable SIMD: Always build with SIMD optimizations enabled
- Use quantization: Enable binary/scalar quantization when appropriate
- Monitor metrics: Track recall, latency, and memory usage
- Profile your workload: Benchmark with your actual data and queries
- Balance trade-offs: Optimize for your specific accuracy/speed/memory requirements
For specific tuning questions, refer to the API documentation or run benchmarks with your data to find optimal parameters.