Vector Database Weekly — 2026-04, Week 17

Editor’s Note

This week’s material converges on two interrelated tensions in production vector and graph database deployments: the cost of retaining too much (bloated context, token waste, PII exposure) and the engineering trade-offs required to retain less or protect what is stored. Alongside those concerns, benchmark results from the graph database space and a reported 16× vector search speedup offer concrete performance reference points for practitioners evaluating architecture options.


Top Stories

Applying Ebbinghaus Forgetting Curves to RAG Memory Management

Standard RAG implementations accumulate every transient artifact indefinitely, which inflates context windows and drives up token costs in long-running agentic workloads. A community-built MCP server addresses this by layering a graph store over a DuckDB-backed vector store and applying Ebbinghaus forgetting-curve decay with spaced-repetition reinforcement — treating “what to forget” as an engineering concern on equal footing with “what to remember.” The implementation reports 52% Recall@5 on the LoCoMo dataset, roughly doubling stateless vector-store baselines, alongside an approximately 84% reduction in token waste. For teams running persistent agents at scale, the architectural pattern is worth examining closely (read more).
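The decay-plus-reinforcement pattern can be sketched in a few lines. This is an illustrative model, not the MCP server's actual code: the class and parameter names here (`MemoryItem`, `stability`, `prune`) are assumptions, and the standard Ebbinghaus form R = exp(-t/S) stands in for whatever decay function the project uses.

```python
import math
import time

class MemoryItem:
    """One stored memory with exponential retention decay (illustrative)."""

    def __init__(self, key, stability=1.0):
        self.key = key
        self.stability = stability        # days until retention falls to 1/e
        self.last_access = time.time()

    def retention(self, now=None):
        """Predicted recall probability R = exp(-t / S), t in days."""
        now = now if now is not None else time.time()
        elapsed_days = (now - self.last_access) / 86400
        return math.exp(-elapsed_days / self.stability)

    def reinforce(self, factor=2.0):
        """Spaced-repetition step: each recall strengthens the trace,
        flattening future decay."""
        self.stability *= factor
        self.last_access = time.time()

def prune(items, threshold=0.2):
    """Keep only items whose predicted retention clears the threshold;
    the rest are candidates for forgetting (or archival)."""
    return [it for it in items if it.retention() >= threshold]
```

Retrieval hits would call `reinforce()`, so frequently recalled memories decay ever more slowly, while untouched artifacts drop below the prune threshold and stop consuming context tokens.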

XTrace: Client-Side Homomorphic Encryption for Vector Search

A structural privacy gap exists in current vector database deployments: similarity search requires embeddings to be decrypted server-side, and published embedding-inversion research has demonstrated that source text can be recovered from those embeddings — a meaningful exposure risk for RAG pipelines handling medical, legal, or financial data. XTrace, released under Apache 2.0, mitigates this by encrypting vectors with Paillier homomorphic encryption and document text with AES-256 entirely client-side, so the server performs all operations on ciphertexts and never sees plaintext. The team explicitly acknowledges measurable latency overhead from the encryption layer as an active engineering trade-off rather than a solved problem (read more).
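The property that makes Paillier useful here is additive homomorphism: the server can sum encrypted values (for example, the per-dimension terms of a similarity score) without ever decrypting them. Below is a toy, textbook Paillier sketch with tiny fixed primes — purely to show the mechanism, not XTrace's implementation; real deployments use moduli of 2048 bits or more, and the constants here are illustrative only.

```python
import math
import random

# Toy keypair with tiny fixed primes (illustrative only).
p, q = 1019, 1021
n = p * q
n2 = n * n
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
mu = pow(lam, -1, n)                               # valid since g = n + 1

def encrypt(m: int) -> int:
    """Paillier encryption: c = g^m * r^n mod n^2, with g = n + 1."""
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    """Paillier decryption: m = L(c^lam mod n^2) * mu mod n."""
    x = pow(c, lam, n2)
    return ((x - 1) // n * mu) % n

# Additive homomorphism: multiplying ciphertexts adds the plaintexts,
# so a server holding only ciphertexts can accumulate sums blindly.
a, b = 42, 1000
c_sum = (encrypt(a) * encrypt(b)) % n2
assert decrypt(c_sum) == a + b
```

The latency trade-off the team acknowledges is visible even in this sketch: every homomorphic operation is a modular exponentiation or multiplication over n², orders of magnitude costlier than the plaintext float arithmetic it replaces.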

NeuG Graph Database: Dual-Mode Architecture and LDBC Benchmark Results

A persistent friction point with graph databases is the gap between zero-config embedded deployments and production network-service configurations, which typically require migrating data or managing separate engine instances. NeuG, built on the GraphScope Flex engine, resolves this with a single-line mode switch between embedded and service operation on the same data and query interface. Community benchmarks on the LDBC SNB SF1 dataset (roughly 3 million nodes, 17 million edges) report 617 QPS in service mode against Neo4j’s 12 QPS — roughly a 50× throughput difference — with P95 latency of 20ms versus 1,728ms. In single-threaded embedded mode, NeuG outperforms a Kuzu-based baseline on 8 of 9 LSQB queries, including a 287× improvement on triangle-pattern queries (read more).
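The dual-mode pattern itself is straightforward to picture. The sketch below is a hypothetical illustration of the idea — it is NOT NeuG's actual API; the class, the `mode` flag, and both query paths are invented for this example — showing one store class over one data directory, with a single constructor argument deciding whether queries execute in-process or are forwarded to a network service.

```python
class GraphStore:
    """Hypothetical dual-mode store: same data, same query interface,
    one flag switching between embedded and service execution."""

    def __init__(self, data_dir: str, mode: str = "embedded"):
        if mode not in ("embedded", "service"):
            raise ValueError("mode must be 'embedded' or 'service'")
        self.data_dir = data_dir
        self.mode = mode

    def query(self, text: str) -> str:
        if self.mode == "embedded":
            return self._run_local(text)    # in-process execution
        return self._run_remote(text)       # forward to the service endpoint

    def _run_local(self, text: str) -> str:
        return f"local:{text}"              # stand-in for engine call

    def _run_remote(self, text: str) -> str:
        return f"remote:{text}"             # stand-in for RPC to the service

# The "single-line switch": identical directory and query interface,
# only the mode argument differs.
embedded = GraphStore("/data/graph", mode="embedded")
service = GraphStore("/data/graph", mode="service")
```

The appeal of this shape is operational: prototypes built against the embedded mode graduate to a shared service without data migration or a second engine to administer.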

16× Vector Search Throughput via Hot-Path Profiling

An independent engineering writeup details achieving a 16× throughput improvement on a vector search engine by identifying and optimizing the computational hot path, with full methodology published for review. The result is notable less for the headline number than for the methodology: systematic profiling to isolate the dominant cost center before applying targeted optimization. For teams building or tuning their own vector search infrastructure, the detailed walkthrough offers a replicable diagnostic framework (read more).
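The diagnostic step generalizes well. The snippet below is a generic illustration of the approach — not the article's code — using Python's standard `cProfile` on a deliberately naive brute-force similarity scan, where profiling would show the scoring function dominating and therefore identify the hot path worth optimizing first.

```python
import cProfile
import io
import pstats
import random

def dot(a, b):
    """Naive per-pair similarity score; the expected hot path."""
    return sum(x * y for x, y in zip(a, b))

def brute_force_search(query, corpus, k=5):
    """Score every vector against the query and return top-k indices."""
    scored = [(dot(query, v), i) for i, v in enumerate(corpus)]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]

corpus = [[random.random() for _ in range(64)] for _ in range(2000)]
query = [random.random() for _ in range(64)]

prof = cProfile.Profile()
prof.enable()
top = brute_force_search(query, corpus)
prof.disable()

# Sort profile entries by cumulative time to surface the dominant cost
# center; stats.print_stats(5) would list `dot` at the top here.
stats = pstats.Stats(prof, stream=io.StringIO()).sort_stats("cumulative")
```

Only after the profile confirms where the time goes does targeted optimization (vectorizing the scoring loop, batching, SIMD) pay off — optimizing anything else first is wasted effort.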


Releases

SQLite vec1 Extension — The official SQLite vec1 extension is available, enabling native vector search within SQLite without requiring a separate vector database process or external dependency. Details and documentation are at sqlite.org/vec1.


Security and Compliance

PII Leakage via Embedding Inversion in RAG Pipelines — An independent analysis argues that PII exposure through vector embeddings is broadly underacknowledged relative to its severity, with RAG pipelines routinely persisting sensitive data inside vector stores without adequate encryption-at-rest or encryption-in-use controls. The piece reinforces the case made by the XTrace work: the vector store itself must now be treated as a sensitive data store (read more).

Plaintext Embedding Exposure on Shared Vector Infrastructure — All major vector databases currently require plaintext embeddings server-side, meaning published inversion techniques constitute an active data-recovery risk for sensitive-domain deployments. Homomorphic encryption approaches such as XTrace represent one mitigation path, though the latency trade-off requires evaluation against specific workload requirements (read more).


Worth Reading