Add SQLite cache backend by JacksonKaunismaa · Pull Request #157 · safety-research/safety-tooling

JacksonKaunismaa · 2026-02-28T23:44:54Z

Summary

Adds SQLiteCacheManager as a new cache backend, selectable via CacheBackend.SQLITE enum
Per-model .sqlite files with WAL mode, connection pooling, zstd compression, schema versioning
Fixes pre-existing pyright errors in cache_manager.py (nullable responses, redis type annotations)
Replaces use_redis: bool with cache_backend: CacheBackend enum on InferenceAPI and BatchInferenceAPI

Motivation

FileBasedCacheManager reloads the entire bin file from disk on every cache miss, even if the bin is already in memory. With accumulated cache (e.g. 543MB across 20 bins from past runs), 10,000 concurrent lookups at a 65% miss rate trigger 6,500 reloads of ~28MB bins. That is ~182GB of JSON parsing serialized on the event loop, freezing it for 8+ minutes.
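The ~182GB figure is the number of misses multiplied by the bin size. A quick back-of-the-envelope check, assuming each miss re-parses exactly one full 28MB bin and using decimal units (both assumptions, not measurements):

```python
# Estimate how much JSON the file-based cache re-parses in the benchmark.
# Assumes every miss reloads exactly one full bin (figures from the PR text).
lookups = 10_000
miss_rate = 0.65
bin_size_mb = 28

misses = round(lookups * miss_rate)        # number of lookups that miss
parsed_gb = misses * bin_size_mb / 1000    # MB -> GB (decimal)

print(f"{misses} misses x {bin_size_mb} MB = {parsed_gb:.0f} GB parsed")
```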

Benchmark (3,500 entries, 10k lookups, 65% miss rate, 28MB/bin)

Metric	File-based	SQLite
10,000 lookups	484.9s	2.5s
Event loop blocked	8 min (frozen)	2.5s
Throughput	21 lookups/s	3,985 lookups/s
Cache on disk	559 MB	3 MB
Populate 3,500 entries	341s	2.1s
Speedup		193x
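The derived rows can be recomputed from the raw timings as a sanity check. The sketch below uses only numbers from the table. The SQLite time is shown rounded to 2.5s, so the unrounded time is inferred from the reported throughput, which is an assumption about how the table was produced.

```python
# Recompute the derived benchmark rows from the raw timings in the table.
lookups = 10_000
file_seconds = 484.9
sqlite_seconds_rounded = 2.5        # as printed in the table
sqlite_throughput_reported = 3_985  # lookups/s, as printed in the table

file_throughput = lookups / file_seconds  # about 20.6 lookups/s

# The reported throughput implies a slightly longer, unrounded SQLite time.
implied_sqlite_seconds = lookups / sqlite_throughput_reported
implied_speedup = file_seconds / implied_sqlite_seconds

print(f"file-based throughput: {file_throughput:.1f} lookups/s")
print(f"implied SQLite time:   {implied_sqlite_seconds:.2f} s")
print(f"implied speedup:       {implied_speedup:.1f}x")
```

With the rounded 2.5s the ratio comes out just under 194, while the time implied by 3,985 lookups/s gives a ratio consistent with the reported 193x.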

Usage

from safetytooling.apis import InferenceAPI, CacheBackend

api = InferenceAPI(cache_backend=CacheBackend.SQLITE)

Values: CacheBackend.FILE (default, existing behavior), CacheBackend.SQLITE, CacheBackend.REDIS.
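A rough sketch of why an enum fits better than a boolean once there are three backends. Only the member names FILE, SQLITE, and REDIS come from this PR. The string values and the helpers `legacy_flag_to_backend` and `describe_backend` are hypothetical illustrations, not part of safety-tooling.

```python
from enum import Enum


class CacheBackend(Enum):
    # Member names match the PR; the string values are assumed for illustration.
    FILE = "file"
    SQLITE = "sqlite"
    REDIS = "redis"


def legacy_flag_to_backend(use_redis: bool) -> CacheBackend:
    """Hypothetical shim: map the old boolean onto the new enum."""
    return CacheBackend.REDIS if use_redis else CacheBackend.FILE


def describe_backend(backend: CacheBackend = CacheBackend.FILE) -> str:
    """Hypothetical dispatch: one branch per backend, no flag combinations."""
    if backend is CacheBackend.SQLITE:
        return "per-model .sqlite files"
    if backend is CacheBackend.REDIS:
        return "redis server"
    return "JSON bin files"
```

A second boolean such as `use_sqlite` would allow the contradictory state `use_redis=True, use_sqlite=True`. A single enum parameter makes that state impossible to express.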

Test plan

21 new tests covering save/load, batch, compression, schema versioning, stats, moderation, embeddings, WAL mode
Existing 7 FileBasedCacheManager tests still pass
Stress-tested with realistic entry sizes (28MB/bin, 10k concurrent lookups)
Run existing test_api_cache.py integration tests with CacheBackend.SQLITE
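For illustration, a WAL-mode test could look roughly like the sketch below. It uses only the standard-library `sqlite3` module. The table layout, file name, and function names are hypothetical and are not taken from the PR's test suite.

```python
import os
import sqlite3
import tempfile


def journal_mode_after_reopen(db_path: str) -> str:
    """Enable WAL on a fresh database, reopen it, and report the journal mode."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value BLOB)"
        )
        conn.commit()
    finally:
        conn.close()

    # WAL is a persistent property of the database file, so a new
    # connection should report it without setting the pragma again.
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("PRAGMA journal_mode").fetchone()[0]
    finally:
        conn.close()


def test_wal_mode_persists() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        mode = journal_mode_after_reopen(os.path.join(tmp, "model.sqlite"))
        assert mode == "wal"
```

The test needs an on-disk file because an in-memory database reports `memory` as its journal mode, not `wal`.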

Replace JSON bin-file approach with per-model SQLite databases. Activated via SQLITE_CACHE=true env var or use_sqlite=True in get_cache_manager().

Key improvements:
- O(1) lookup by primary key (no loading entire 28MB bin files)
- WAL mode for concurrent readers without blocking
- Connection pooling (reuse across calls)
- zstd compression (~559MB JSON → 3MB SQLite)
- Schema versioning (stale entries = clean cache miss)
- Batch lookups via SQL IN clause
- Built-in hit/miss/cost statistics

Benchmark (3500 entries, 10k lookups with 65% miss rate, 28MB/bin):
File-based: 484.9s (event loop frozen 8 min)
SQLite: 2.5s (193x faster)

The pathology: FileBasedCacheManager reloads the ENTIRE bin from disk on every cache miss (to check if another process wrote the entry). With 6500 misses × 28MB bins = 182GB of JSON parsing serialized on the event loop. SQLite misses are a single B-tree lookup returning NULL.

Also fixes pre-existing pyright errors in cache_manager.py (nullable responses field on LLMCache, redis type annotations).
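Several of the improvements listed in the commit message can be shown together in a small standard-library sketch. This is not the PR's SQLiteCacheManager. The class name `TinySQLiteCache`, the table layout, and the `SCHEMA_VERSION` value are hypothetical. `zlib` stands in for zstd because zstd needs a third-party package, and a single long-lived connection stands in for connection pooling.

```python
import json
import sqlite3
import zlib
from typing import Optional

SCHEMA_VERSION = 1  # assumed value; bump to turn older entries into misses


class TinySQLiteCache:
    """Illustrative key-value cache; not the PR's SQLiteCacheManager."""

    def __init__(self, path: str) -> None:
        # One long-lived connection stands in for the PR's connection pooling.
        self.conn = sqlite3.connect(path)
        # WAL lets readers proceed while a writer is active (on-disk files only).
        self.conn.execute("PRAGMA journal_mode=WAL")
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "key TEXT PRIMARY KEY, version INTEGER NOT NULL, payload BLOB NOT NULL)"
        )
        self.conn.commit()

    def save(self, key: str, value: dict) -> None:
        # zlib is a stand-in here; the PR describes zstd compression.
        blob = zlib.compress(json.dumps(value).encode("utf-8"))
        self.conn.execute(
            "INSERT OR REPLACE INTO cache (key, version, payload) VALUES (?, ?, ?)",
            (key, SCHEMA_VERSION, blob),
        )
        self.conn.commit()

    def load(self, key: str) -> Optional[dict]:
        # Primary-key lookup: a miss is an index probe, not a full cache reload.
        # Rows written under another schema version are treated as misses.
        row = self.conn.execute(
            "SELECT payload FROM cache WHERE key = ? AND version = ?",
            (key, SCHEMA_VERSION),
        ).fetchone()
        return None if row is None else json.loads(zlib.decompress(row[0]))

    def load_many(self, keys: list[str]) -> dict[str, dict]:
        # Batch lookup with a parameterized IN clause; missing keys are omitted.
        if not keys:
            return {}
        placeholders = ",".join("?" for _ in keys)
        rows = self.conn.execute(
            "SELECT key, payload FROM cache "
            f"WHERE version = ? AND key IN ({placeholders})",
            (SCHEMA_VERSION, *keys),
        )
        return {k: json.loads(zlib.decompress(p)) for k, p in rows}
```

A usage example under the same assumptions: `cache = TinySQLiteCache("model.sqlite"); cache.save("prompt-hash", {"text": "hi"}); cache.load("prompt-hash")`. SQLite caps the number of bound parameters per statement, so a real batch lookup would likely need to split very large key lists into chunks.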

JacksonKaunismaa force-pushed the sqlite-cache-backend branch 6 times, most recently from 372c765 to 9de867e, on March 1, 2026 01:15

JacksonKaunismaa force-pushed the sqlite-cache-backend branch from 9de867e to ecea69d on March 1, 2026 01:57

JacksonKaunismaa changed the title from "Add SQLite cache backend (193x faster at scale)" to "Add SQLite cache backend" on Mar 1, 2026

JacksonKaunismaa wants to merge 1 commit into main from sqlite-cache-backend
