Vector Database Weekly — 2026-03, Week 13
Editor’s Note
Retrieval-augmented generation architectures keep moving beyond plain embedding similarity. This week brings three distinct technical approaches: graph-based associative memory that preserves semantic structure when context windows fill, typed reasoning primitives designed for regulatory audit trails, and geometric pruning algorithms that beat BLAS-backed brute-force search in benchmarks at half-million-vector scale.
Top Stories
Graph-Based Context Injection Replaces Summarization Under Constraint
Breathe-Memory introduces an alternative to traditional RAG by extracting semantic anchors from user messages and traversing a concept graph via breadth-first search to inject relevant context in under 60 milliseconds. When context capacity is reached, the system extracts structured graphs representing topics, decisions, and artifacts rather than producing text summaries, preserving semantic relationships that conventional summarization destroys. The reference implementation uses PostgreSQL with pgvector and targets scenarios where maintaining structural fidelity matters more than compressing token counts. Read more
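The anchor-then-traverse step described above can be sketched in a few lines of Python. This is a minimal illustration of breadth-first context collection, not Breathe-Memory's code: the function name `collect_context`, the adjacency-dict graph, and the `max_depth` and `limit` parameters are all assumptions made for the example.

```python
from collections import deque

def collect_context(graph, anchors, max_depth=2, limit=10):
    """Breadth-first walk from anchor concepts; closer concepts come first.

    graph, max_depth and limit are hypothetical names for this sketch.
    """
    seen = set()
    order = []
    queue = deque()
    for anchor in anchors:
        if anchor in graph and anchor not in seen:
            seen.add(anchor)
            queue.append((anchor, 0))
    while queue and len(order) < limit:
        node, depth = queue.popleft()
        order.append(node)
        if depth == max_depth:
            continue  # do not expand past the depth budget
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return order

# Toy concept graph: each key points at directly related concepts.
concept_graph = {
    "billing": ["invoice", "refund"],
    "invoice": ["pdf-export"],
    "refund": ["policy"],
    "pdf-export": ["font-bug"],
    "policy": [],
    "font-bug": [],
}

context = collect_context(concept_graph, ["billing"], max_depth=2)
print(context)  # ['billing', 'invoice', 'refund', 'pdf-export', 'policy']
```

Bounding the walk by depth and by result count is one plausible way to keep injection latency predictable. The project may well bound traversal differently.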
Typed Reasoning Graphs Target EU AI Act Transparency Requirements
FlowScript presents a typed reasoning graph system using primitives including thoughts, questions, decisions, and blockers, with deterministic query operations such as tensions, blocked, why, whatIf, and alternatives. Unlike vector-based memory systems that treat contradictions as retrieval noise, FlowScript preserves contradictions as named relationships and provides hash-encoded audit trails. The design addresses transparency requirements in the EU AI Act, which mandate auditable decision pathways by August 2026. Read more
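To make the idea of typed nodes and deterministic queries concrete, here is a small Python sketch. It borrows the node types and the `tensions` and `blocked` query names from the description above, but everything else is an assumption for illustration: the `ReasoningGraph` class, the `contradicts` and `blocks` relation labels, and the method signatures are not FlowScript's actual API.

```python
from dataclasses import dataclass, field

# Node types taken from the story; relation labels below are hypothetical.
NODE_TYPES = {"thought", "question", "decision", "blocker"}

@dataclass
class ReasoningGraph:
    nodes: dict = field(default_factory=dict)  # node id -> (type, text)
    edges: list = field(default_factory=list)  # (source, relation, target)

    def add(self, node_id, node_type, text):
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        self.nodes[node_id] = (node_type, text)

    def relate(self, source, relation, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before relating them")
        self.edges.append((source, relation, target))

    def tensions(self):
        """Contradictions are kept as named edges rather than discarded."""
        return sorted((s, t) for s, rel, t in self.edges if rel == "contradicts")

    def blocked(self):
        """Ids of nodes that a blocker node points at."""
        return sorted(
            t for s, rel, t in self.edges
            if rel == "blocks" and self.nodes[s][0] == "blocker"
        )

g = ReasoningGraph()
g.add("t1", "thought", "Cache embeddings in Redis")
g.add("t2", "thought", "Embeddings change too often to cache")
g.add("d1", "decision", "Ship caching in v2")
g.add("b1", "blocker", "No Redis budget approved")
g.relate("t1", "contradicts", "t2")
g.relate("b1", "blocks", "d1")

print(g.tensions())  # [('t1', 't2')]
print(g.blocked())   # ['d1']
```

Because the queries are plain filters over typed edges, the same graph always yields the same answer, which is the property an audit trail needs.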
DuckDB Extension Brings Prefiltered ANN to Analytical Workflows
A community-contributed DuckDB extension implementing the ACORN-1 algorithm now enables approximate nearest neighbor search with WHERE clause prefiltering, addressing limitations in pgvector-style workflows that apply filters after vector retrieval. The implementation required modifications to vendored usearch dependencies and has been accepted into the official DuckDB community extensions repository, providing analytical pipelines with integrated vector operations that respect SQL predicates during index traversal. Read more
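The difference between filtering after retrieval and filtering before it is easy to see on a toy dataset. The Python sketch below uses brute-force search purely for illustration. It is not ACORN-1, which applies the predicate while traversing an HNSW graph, and the function names and row layout are assumptions.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def postfilter_search(rows, query, k, predicate):
    """Take the k nearest rows first, then drop those failing the predicate."""
    nearest = sorted(rows, key=lambda r: l2(r["vec"], query))[:k]
    return [r["id"] for r in nearest if predicate(r)]

def prefilter_search(rows, query, k, predicate):
    """Restrict to rows passing the predicate, then take the k nearest."""
    eligible = [r for r in rows if predicate(r)]
    nearest = sorted(eligible, key=lambda r: l2(r["vec"], query))[:k]
    return [r["id"] for r in nearest]

rows = [
    {"id": 1, "lang": "en", "vec": (0.0, 0.1)},
    {"id": 2, "lang": "en", "vec": (0.1, 0.0)},
    {"id": 3, "lang": "de", "vec": (0.5, 0.5)},
    {"id": 4, "lang": "de", "vec": (0.9, 0.9)},
    {"id": 5, "lang": "en", "vec": (0.2, 0.2)},
]
query = (0.0, 0.0)

def is_german(row):
    return row["lang"] == "de"

post = postfilter_search(rows, query, 2, is_german)
pre = prefilter_search(rows, query, 2, is_german)
print(post)  # []      both of the 2 nearest rows are English
print(pre)   # [3, 4]  the 2 nearest rows among those that match
```

With a selective predicate, postfiltering can return fewer than k rows or none at all, which is the gap that predicate-aware index traversal is meant to close.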
Triangle Inequality Pruning Outpaces BLAS at Half-Million Scale
PyNear benchmark results show exact L2 search with Vantage-Point Trees running in 2.2 milliseconds versus 85 milliseconds for Faiss IndexFlatL2 at 512 dimensions and 500,000 vectors, a 39× speedup. Multi-Index Hashing for binary descriptors runs in 0.037 milliseconds versus Faiss’s 9.5 milliseconds at one million vectors with 100% Recall@10, a 257× improvement. The benchmarks also surfaced an O(N²) complexity bug in Faiss IndexBinaryIVF that pushed index build time to 34 minutes at one million vectors. These results suggest that geometric pruning can outperform BLAS-optimized brute force at scales previously assumed to favor hardware acceleration. Read more and benchmarks
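For readers unfamiliar with the technique, the following Python sketch shows how a vantage-point tree uses the triangle inequality to skip whole subtrees during exact nearest-neighbor search. It is a textbook-style illustration, not PyNear's implementation. The `build` and `nearest` names and the tuple node layout are assumptions, and pure Python will not reproduce the timings quoted above.

```python
import math
import random

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build(points):
    """Build a node (vantage, radius, inside, outside) recursively."""
    if not points:
        return None
    vantage, rest = points[0], points[1:]
    if not rest:
        return (vantage, 0.0, None, None)
    dists = [l2(vantage, p) for p in rest]
    radius = sorted(dists)[len(dists) // 2]  # median distance splits the set
    inside = [p for p, d in zip(rest, dists) if d <= radius]
    outside = [p for p, d in zip(rest, dists) if d > radius]
    return (vantage, radius, build(inside), build(outside))

def nearest(node, query, best=(math.inf, None)):
    """Return (distance, point) of the exact nearest neighbor."""
    if node is None:
        return best
    vantage, radius, inside, outside = node
    d = l2(query, vantage)
    if d < best[0]:
        best = (d, vantage)
    # Search the side that contains the query first.
    first, second = (inside, outside) if d <= radius else (outside, inside)
    best = nearest(first, query, best)
    # Triangle inequality: every point on the other side is at least
    # |d - radius| away from the query, so skip it when that bound
    # already exceeds the best distance found so far.
    if abs(d - radius) <= best[0]:
        best = nearest(second, query, best)
    return best

random.seed(0)
points = [tuple(random.random() for _ in range(8)) for _ in range(300)]
tree = build(points)
query = tuple(random.random() for _ in range(8))
dist, point = nearest(tree, query)
```

The result is exact rather than approximate: a subtree is skipped only when the bound proves it cannot contain a closer point. How often that happens depends heavily on dimensionality and data distribution.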
Releases
Octopus ships as an open-source AI code reviewer for GitHub and Bitbucket, using RAG with Qdrant vector search to analyze full codebases rather than individual diffs and posting inline findings with severity ratings. More details
TurboQuant provides 2-4 bit compression for vector search workloads. More details
Pgsemantic enables vector search by pointing at existing Postgres databases without schema modification. More details
Altor-vec delivers a 54KB HNSW vector search engine compiled to WebAssembly for client-side deployment. More details
SentrySearch implements sub-second video search using Gemini Embedding 2’s native video projection into a 768-dimensional vector space, indexing into ChromaDB and answering natural-language queries. More details
DuoRAG introduces a dual-stack RAG system that discovers a metadata schema up front and updates that schema itself when it fails to answer a question, addressing top-k incompleteness for structured queries. More details
VellaVeto ships as a fail-closed gateway for MCP tool calls with three zero-config protection levels and includes MCPSEC, a benchmark for evaluating MCP security. More details
Worth Reading
Semantic Gating: Partitioning Filtered ANN — architectural patterns for prefiltered approximate nearest neighbor search.
PyNear performance discussion — video walkthrough of triangle inequality pruning benchmarks.