Memory-friendly Vector Database

LSM-Vec

A disk-oriented vector database for approximate nearest-neighbor search. It keeps the HNSW graph index on disk, backed by graph-oriented LSM-tree storage.


Why LSM-Vec?

Key differences from existing vector databases.

Minimal Memory Overhead

Unlike many vector databases that keep large index state in memory, LSM-Vec is fully disk-oriented. Its memory footprint remains small and predictable even at large data scale.

Graph-Oriented LSM-Tree

LSM-Vec stores the majority of the HNSW index within Aster, a RocksDB fork with a graph data model. This graph-oriented LSM-tree structure enables search and update performance comparable to in-memory vector databases.

Embeddable & Easy to Use

LSM-Vec is offered as a lightweight C++ library with Python bindings. Build with a few commands, then link the library or import the module to get started.

Features

Everything you need for high-performance vector search on disk.

HNSW Graph Index

Disk-oriented design built on an LSM-tree-based data structure

Dual Vector Storage

BasicVectorStorage (flat file) and PagedVectorStorage (4 KB pages managed with a FIFO cache)
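
To illustrate what a FIFO page cache over 4 KB pages might look like, here is a minimal Python sketch. It is not LSM-Vec's implementation: the class name FifoPageCache, the load_page callback, and the hit/miss counters are all hypothetical names chosen for this example. The point it shows is the FIFO policy itself: pages are evicted in the order they were loaded, and a cache hit does not extend a page's lifetime (unlike LRU).

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # 4 KB pages, matching the page size named above


class FifoPageCache:
    """Fixed-capacity page cache with first-in, first-out eviction.

    Hypothetical sketch: names and structure are assumptions, not LSM-Vec's API.
    """

    def __init__(self, capacity, load_page):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.load_page = load_page   # callable: page_id -> bytes of length PAGE_SIZE
        self.pages = OrderedDict()   # insertion order doubles as FIFO order
        self.hits = 0
        self.misses = 0

    def get(self, page_id):
        if page_id in self.pages:
            # FIFO: a hit does NOT move the page to the back of the queue.
            self.hits += 1
            return self.pages[page_id]
        self.misses += 1
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict the page that was loaded first
        page = self.load_page(page_id)
        self.pages[page_id] = page
        return page


# Stand-in for a disk read: every page is filled with its own id.
def fake_disk_read(page_id):
    return bytes([page_id % 256]) * PAGE_SIZE


demo_cache = FifoPageCache(capacity=2, load_page=fake_disk_read)
for pid in (0, 1, 0, 2):
    demo_cache.get(pid)

# Page 0 was loaded first, so it is evicted even though it was just reused.
print(list(demo_cache.pages))  # [1, 2]
```

FIFO is cheaper to maintain than LRU because a hit requires no bookkeeping, which can matter when the cache sits on the hot path of every graph traversal step.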

Batch Vector Read

Groups neighbor reads by page to reduce I/O during search
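
The idea of grouping neighbor reads by page can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the library's code: it assumes fixed-size float32 vectors packed back to back so that no vector straddles a page boundary, and the function name group_by_page is hypothetical.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # 4 KB pages


def group_by_page(vector_ids, dim, bytes_per_value=4):
    """Map each page id to the (vector_id, slot) pairs stored on that page.

    Assumes vectors are laid out contiguously by id and never cross a page
    boundary (an assumption made for this sketch).
    """
    vector_bytes = dim * bytes_per_value
    vectors_per_page = PAGE_SIZE // vector_bytes
    if vectors_per_page == 0:
        raise ValueError("a single vector does not fit in one page")

    groups = defaultdict(list)
    for vid in vector_ids:
        page_id, slot = divmod(vid, vectors_per_page)
        groups[page_id].append((vid, slot))
    return dict(groups)


# Five neighbors of a graph node, 128-dimensional float32 vectors
# (512 bytes each, so 8 vectors fit in one page).
neighbor_ids = [3, 17, 5, 16, 42]
plan = group_by_page(neighbor_ids, dim=128)

# The five vectors live on three pages, so three page reads suffice.
print(len(neighbor_ids), "vectors ->", len(plan), "page reads")
```

Each page in the resulting plan is fetched once and every requested vector on it is sliced out by slot, so the saving grows with how tightly a node's neighbors cluster on disk.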

Persistent Metadata

Close and reopen the database without re-indexing

SIMD Distance Metrics

L2 and cosine distance with AVX2/SSE2 acceleration
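
As a scalar reference for what these two metrics compute, here is a plain Python sketch. It makes two assumptions that the text above does not confirm: that L2 means the Euclidean distance (some libraries return the squared value to skip the square root), and that cosine distance is defined as 1 minus cosine similarity. The function names are illustrative only. A SIMD implementation would compute the same sums several lanes at a time.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity; undefined when either vector is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


print(l2_distance([3.0, 4.0], [0.0, 0.0]))      # 5.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0 (orthogonal vectors)
```

Both metrics reduce to a handful of multiply-and-accumulate loops over the vector components, which is why they map well onto vector instructions.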

Python SDK

Full Python bindings via pybind11 with NumPy support