
Python API Reference

Complete reference for the vectlite Python package.

import vectlite

Module Functions

open

vectlite.open(
    path: str,
    dimension: int | None = None,
    read_only: bool = False,
) -> Database

Open or create a vectlite database.

Parameters:

  • path (str): Path to the .vdb file. Created if it does not exist.
  • dimension (int | None): Vector dimension. Required when creating a new database. Omit when opening an existing one (the stored dimension is used).
  • read_only (bool): Open in read-only mode. Uses shared file locks so multiple readers can access the same file. Write operations raise VectLiteError.

Returns: Database
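For example, a minimal sketch of creating and later reopening a database (the file name and dimension here are illustrative):

```python
import vectlite

# First open: the file does not exist yet, so dimension is required.
db = vectlite.open("vectors.vdb", dimension=384)

# Later opens can omit dimension; the stored value is used. read_only
# takes a shared lock so multiple readers can use the file concurrently.
reader = vectlite.open("vectors.vdb", read_only=True)
```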


open_store

vectlite.open_store(root: str) -> Store

Open or create a collection store (a directory of independent databases).

Parameters:

  • root (str): Path to the directory that holds the collections. Created if it does not exist.

Returns: Store


restore

vectlite.restore(source: str, dest: str) -> Database

Restore a backup to a new database path.

Parameters:

  • source (str): Path to the backup directory (created by Database.backup()).
  • dest (str): Path where the restored .vdb file will be written.

Returns: Database -- the restored database, opened for read-write.


sparse_terms

vectlite.sparse_terms(text: str) -> dict[str, float]

Tokenize and weight text into a sparse term vector suitable for BM25 search.

Parameters:

  • text (str): Input text to analyze.

Returns: dict[str, float] -- mapping of terms to their weights.
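The resulting dict can be stored as a record's sparse vector so the record participates in BM25 and hybrid search. A brief sketch (path, vector, and text are illustrative):

```python
import vectlite

db = vectlite.open("docs.vdb", dimension=3)
text = "the quick brown fox jumps over the lazy dog"

# Index the sparse term weights alongside the dense vector.
db.upsert(
    "doc-1",
    [0.1, 0.2, 0.3],
    {"text": text},
    sparse=vectlite.sparse_terms(text),
)
```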


upsert_text

vectlite.upsert_text(
    db: Database,
    id: str,
    text: str,
    embed: Callable[[str], list[float]],
    metadata: Metadata | None = None,
    namespace: str | None = None,
) -> None

High-level helper that generates a dense embedding and sparse terms from text, then upserts the record.

Parameters:

  • db (Database): Target database.
  • id (str): Record identifier.
  • text (str): Text to embed and index.
  • embed (Callable[[str], list[float]]): Function that converts text to a dense vector.
  • metadata (Metadata | None): Optional metadata dict.
  • namespace (str | None): Optional namespace.

search_text

vectlite.search_text(
    db: Database,
    query: str,
    embed: Callable[[str], list[float]],
    *,
    k: int = 10,
    filter: Filter | None = None,
    namespace: str | None = None,
    all_namespaces: bool = False,
    dense_weight: float = 1.0,
    sparse_weight: float = 1.0,
    fetch_k: int = 0,
    mmr_lambda: float | None = None,
    vector_name: str | None = None,
    fusion: str = "linear",
    rrf_k: int = 60,
    explain: bool = False,
    rerank: RerankHook | None = None,
    rerank_k: int = 0,
) -> list[SearchResult]

High-level hybrid search. Generates a dense embedding and sparse terms from query, then runs a fused search.

Parameters:

  • db (Database): Target database.
  • query (str): Natural-language query.
  • embed (Callable[[str], list[float]]): Function that converts text to a dense vector.
  • k (int, default 10): Number of results to return.
  • filter (Filter | None, default None): MongoDB-style metadata filter.
  • namespace (str | None, default None): Restrict to a single namespace.
  • all_namespaces (bool, default False): Search across all namespaces.
  • dense_weight (float, default 1.0): Weight for the dense score component.
  • sparse_weight (float, default 1.0): Weight for the sparse (BM25) score component.
  • fetch_k (int, default 0): Number of candidates to fetch before re-ranking. 0 uses the engine default.
  • mmr_lambda (float | None, default None): Maximal Marginal Relevance diversity parameter (0 = max diversity, 1 = max relevance). None disables MMR.
  • vector_name (str | None, default None): Search a specific named vector space.
  • fusion (str, default "linear"): Fusion strategy: "linear" or "rrf".
  • rrf_k (int, default 60): RRF smoothing constant (only used when fusion="rrf").
  • explain (bool, default False): Include scoring breakdown in results.
  • rerank (RerankHook | None, default None): Optional reranker function. See vectlite.rerankers.
  • rerank_k (int, default 0): Number of candidates to pass to the reranker. 0 uses fetch_k.

Returns: list[SearchResult]
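A sketch of a hybrid text search using a stand-in embedding function (in practice, embed would wrap a real model such as a sentence-transformers encoder):

```python
import vectlite

def embed(text: str) -> list[float]:
    # Stand-in embedder for illustration only; replace with a real model.
    return [float(len(text)), 1.0, 0.5]

db = vectlite.open("docs.vdb", dimension=3)
results = vectlite.search_text(
    db,
    "quick brown fox",
    embed,
    k=5,
    fusion="rrf",           # rank-based fusion instead of linear weights
    filter={"lang": "en"},  # MongoDB-style metadata filter
)
for r in results:
    print(r["id"], r["score"])
```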


search_text_with_stats

vectlite.search_text_with_stats(
    db: Database,
    query: str,
    embed: Callable[[str], list[float]],
    *,
    # same parameters as search_text
) -> SearchResponse

Same as search_text but returns a SearchResponse containing both results and query statistics.


Database

Returned by open() and restore().

Properties

  • path (str): Absolute path to the .vdb file.
  • wal_path (str): Path to the write-ahead log file.
  • dimension (int): Vector dimension for this database.
  • read_only (bool): Whether the database was opened in read-only mode.

count

db.count(namespace: str | None = None) -> int

Return the number of records, optionally scoped to a namespace.

namespaces

db.namespaces() -> list[str]

Return a list of all namespaces present in the database.

transaction

db.transaction() -> Transaction

Begin a new transaction. Use as a context manager for automatic commit/rollback:

with db.transaction() as tx:
    tx.upsert("id", vector, metadata)

Returns: Transaction

insert

db.insert(
    id: str,
    vector: list[float],
    metadata: Metadata | None = None,
    *,
    namespace: str | None = None,
    sparse: dict[str, float] | None = None,
    vectors: dict[str, list[float]] | None = None,
) -> None

Insert a new record. Raises VectLiteError if a record with the same id (and namespace) already exists.

Parameters:

  • id (str): Record identifier.
  • vector (list[float]): Dense embedding vector.
  • metadata (Metadata | None): Optional metadata dict.
  • namespace (str | None): Target namespace.
  • sparse (dict[str, float] | None): Sparse term vector for BM25 search.
  • vectors (dict[str, list[float]] | None): Additional named vectors.

upsert

db.upsert(
    id: str,
    vector: list[float],
    metadata: Metadata | None = None,
    *,
    namespace: str | None = None,
    sparse: dict[str, float] | None = None,
    vectors: dict[str, list[float]] | None = None,
) -> None

Insert or update a record. If the id already exists the record is replaced.

Parameters are identical to insert().

insert_many

db.insert_many(
    records: list[Record],
    *,
    namespace: str | None = None,
) -> int

Batch insert multiple records. Raises on duplicate IDs.

Parameters:

  • records (list[Record]): List of record dicts with keys id, vector, and optionally metadata, sparse, vectors.
  • namespace (str | None): Target namespace for all records.

Returns: int -- number of records inserted.

upsert_many

db.upsert_many(
    records: list[Record],
    *,
    namespace: str | None = None,
) -> int

Batch upsert multiple records.

Returns: int -- number of records upserted.

bulk_ingest

db.bulk_ingest(
    records: Iterable[Record],
    *,
    namespace: str | None = None,
    batch_size: int = 1000,
    on_progress: Callable[[int], None] | None = None,
) -> int

Stream large datasets into the database in batches. Automatically commits every batch_size records.

Parameters:

  • records (Iterable[Record]): An iterable (or generator) of record dicts.
  • namespace (str | None): Target namespace.
  • batch_size (int): Commit every N records.
  • on_progress (Callable[[int], None] | None): Called after each batch with the cumulative count.

Returns: int -- total number of records ingested.
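A sketch of streaming a generator through bulk_ingest (the record contents are placeholders):

```python
import vectlite

def make_records():
    # A generator keeps memory flat regardless of dataset size.
    for i in range(10_000):
        yield {"id": f"rec-{i}", "vector": [0.0, 0.1, 0.2], "metadata": {"i": i}}

db = vectlite.open("big.vdb", dimension=3)
total = db.bulk_ingest(
    make_records(),
    batch_size=1000,
    on_progress=lambda n: print(f"{n} records ingested"),
)
db.compact()  # fold the accumulated WAL into the main file afterwards
```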

get

db.get(
    id: str,
    *,
    namespace: str | None = None,
) -> Record | None

Retrieve a record by ID. Returns None if not found.

delete

db.delete(
    id: str,
    *,
    namespace: str | None = None,
) -> bool

Delete a record by ID. Returns True if the record existed and was deleted.

delete_many

db.delete_many(
    ids: list[str],
    *,
    namespace: str | None = None,
) -> int

Delete multiple records by ID. Returns the number of records actually deleted.
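Together, these form the usual read/delete round-trip. A brief sketch (ids and vectors illustrative):

```python
import vectlite

db = vectlite.open("crud.vdb", dimension=2)
db.upsert("a", [1.0, 0.0], {"tag": "demo"})

rec = db.get("a")         # Record dict, or None if absent
deleted = db.delete("a")  # True, since the record existed
gone = db.get("a")        # None after deletion
```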

flush

db.flush() -> None

Flush pending writes to the WAL. Data is durable after this call but not yet compacted into the main file.

compact

db.compact() -> None

Merge the WAL into the main .vdb file and rebuild ANN indexes if necessary. Call this periodically or after large batch writes.

snapshot

db.snapshot(dest: str) -> None

Create a self-contained copy of the database at dest. Includes all committed data (call compact() first to include WAL entries).

backup

db.backup(dest: str) -> None

Full backup: copies the .vdb file and all ANN sidecar files to the dest directory.

search

db.search(
    vector: list[float] | None = None,
    *,
    k: int = 10,
    filter: Filter | None = None,
    namespace: str | None = None,
    all_namespaces: bool = False,
    sparse: dict[str, float] | None = None,
    dense_weight: float = 1.0,
    sparse_weight: float = 1.0,
    fusion: str = "linear",
    rrf_k: int = 60,
    fetch_k: int = 0,
    mmr_lambda: float | None = None,
    vector_name: str | None = None,
    query_vectors: dict[str, list[float]] | None = None,
    vector_weights: dict[str, float] | None = None,
    explain: bool = False,
    rerank: RerankHook | None = None,
    rerank_k: int = 0,
) -> list[SearchResult]

Run a search query. Supports dense, sparse, hybrid, and multi-vector search modes.

Parameters:

  • vector (list[float] | None, default None): Dense query vector. Pass None for sparse-only or multi-vector search.
  • k (int, default 10): Number of results to return.
  • filter (Filter | None, default None): MongoDB-style metadata filter.
  • namespace (str | None, default None): Restrict to a namespace.
  • all_namespaces (bool, default False): Search all namespaces.
  • sparse (dict[str, float] | None, default None): Sparse term vector for keyword search.
  • dense_weight (float, default 1.0): Weight for the dense component in hybrid search.
  • sparse_weight (float, default 1.0): Weight for the sparse component in hybrid search.
  • fusion (str, default "linear"): Fusion strategy: "linear" or "rrf".
  • rrf_k (int, default 60): RRF smoothing constant.
  • fetch_k (int, default 0): Number of candidates to retrieve before reranking. 0 uses the engine default.
  • mmr_lambda (float | None, default None): MMR diversity parameter. None disables MMR.
  • vector_name (str | None, default None): Search a specific named vector space.
  • query_vectors (dict[str, list[float]] | None, default None): Named query vectors for multi-vector search.
  • vector_weights (dict[str, float] | None, default None): Weights for multi-vector search.
  • explain (bool, default False): Include scoring breakdown in results.
  • rerank (RerankHook | None, default None): Reranker function.
  • rerank_k (int, default 0): Candidates to pass to the reranker.

Returns: list[SearchResult]
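The same method covers the different modes. A sketch (vectors and terms illustrative):

```python
import vectlite

db = vectlite.open("modes.vdb", dimension=3)

# Dense-only search.
hits = db.search([0.1, 0.2, 0.3], k=5)

# Hybrid: dense vector plus sparse terms, fused with reciprocal rank fusion.
hits = db.search(
    [0.1, 0.2, 0.3],
    sparse=vectlite.sparse_terms("fox"),
    fusion="rrf",
    k=5,
)

# Sparse-only keyword search: pass vector=None.
hits = db.search(None, sparse=vectlite.sparse_terms("fox"), k=5)
```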

search_with_stats

db.search_with_stats(
    # same parameters as search()
) -> SearchResponse

Same as search() but returns a SearchResponse with results and query statistics.


Store

Returned by open_store(). Manages a directory of independent database collections.

Properties

  • root (str): Absolute path to the store directory.

collections

store.collections() -> list[str]

List all collection names in the store.

create_collection

store.create_collection(name: str, dimension: int) -> Database

Create a new collection. Raises VectLiteError if it already exists.

open_or_create_collection

store.open_or_create_collection(name: str, dimension: int) -> Database

Open an existing collection or create a new one.

open_collection

store.open_collection(name: str) -> Database

Open an existing collection. Raises VectLiteError if it does not exist.

drop_collection

store.drop_collection(name: str) -> None

Delete a collection and all its data from disk.
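A sketch of a typical store lifecycle (names and dimension illustrative):

```python
import vectlite

store = vectlite.open_store("./collections")
products = store.open_or_create_collection("products", dimension=384)
products.upsert("p1", [0.0] * 384, {"name": "widget"})

print(store.collections())         # lists the collection names
store.drop_collection("products")  # removes the collection from disk
```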


Transaction

Returned by Database.transaction(). Supports use as a context manager (with statement) for automatic commit on success and rollback on exception.

Context Manager

with db.transaction() as tx:
    tx.upsert("id", vector, metadata)
    # auto-commits here; rolls back on exception

insert

tx.insert(
    id: str,
    vector: list[float],
    metadata: Metadata | None = None,
    *,
    namespace: str | None = None,
    sparse: dict[str, float] | None = None,
    vectors: dict[str, list[float]] | None = None,
) -> None

Queue an insert within the transaction.

upsert

tx.upsert(
    id: str,
    vector: list[float],
    metadata: Metadata | None = None,
    *,
    namespace: str | None = None,
    sparse: dict[str, float] | None = None,
    vectors: dict[str, list[float]] | None = None,
) -> None

Queue an upsert within the transaction.

insert_many

tx.insert_many(
    records: list[Record],
    *,
    namespace: str | None = None,
) -> int

Queue a batch insert. Returns the number of records queued.

upsert_many

tx.upsert_many(
    records: list[Record],
    *,
    namespace: str | None = None,
) -> int

Queue a batch upsert. Returns the number of records queued.

delete

tx.delete(
    id: str,
    *,
    namespace: str | None = None,
) -> None

Queue a delete within the transaction.

commit

tx.commit() -> None

Commit all queued operations atomically.

rollback

tx.rollback() -> None

Discard all queued operations.

__len__

len(tx) -> int

Return the number of queued operations in the transaction.


Types

MetadataValue

MetadataValue = str | int | float | bool | None | list | dict

A single metadata field value.

Metadata

Metadata = dict[str, MetadataValue]

A metadata dictionary attached to a record.

Filter

Filter = dict[str, Any]

MongoDB-style filter expression. See Metadata Filters for the full query syntax.
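For illustration, a hypothetical filter assuming common MongoDB-style operators; the Metadata Filters page is authoritative for the operators vectlite actually supports:

```python
# Matches English-language records from 2020 onwards tagged "news" or "blog".
flt = {
    "lang": "en",                      # equality match
    "year": {"$gte": 2020},            # range operator
    "tag": {"$in": ["news", "blog"]},  # membership test
}
```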

RerankHook

RerankHook = Callable[[str, list[SearchResult]], list[SearchResult]]

A function that receives the query string and a list of candidate results, and returns a reordered list. Used with the rerank parameter.

Record

class Record(TypedDict, total=False):
    id: str                          # Required
    vector: list[float]              # Required
    metadata: Metadata
    sparse: dict[str, float]
    vectors: dict[str, list[float]]
    namespace: str
    score: float                     # Present in search results

A record dictionary. Used for batch operations and returned by get().
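For instance, a minimal record dict for a batch call such as insert_many() (values illustrative):

```python
# Only id and vector are required; the remaining keys are optional.
rec = {
    "id": "doc-1",
    "vector": [0.1, 0.2, 0.3],
    "metadata": {"lang": "en"},
    "sparse": {"fox": 1.0, "jumps": 0.5},
}
```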

SearchResult

class SearchResult(TypedDict):
    id: str
    score: float
    metadata: Metadata
    namespace: str
    explain: ExplainDetails | None

A single search result.

ExplainDetails

class ExplainDetails(TypedDict):
    dense_score: float
    sparse_score: float
    fused_score: float
    rerank_score: float | None

Scoring breakdown when explain=True.

SearchStats

class SearchStats(TypedDict):
    candidates_evaluated: int
    index_type: str  # "hnsw" | "flat"
    timings: SearchTimings

Engine statistics for a search query.

SearchTimings

class SearchTimings(TypedDict):
    total_ms: float
    index_ms: float
    filter_ms: float
    rank_ms: float
    rerank_ms: float | None

Timing breakdown in milliseconds.

SearchResponse

class SearchResponse(TypedDict):
    results: list[SearchResult]
    stats: SearchStats

Returned by search_with_stats() and search_text_with_stats().


Exceptions

VectLiteError

class VectLiteError(Exception): ...

Base exception for all vectlite errors. Raised for:

  • Write operations on a read-only database
  • Duplicate ID on insert()
  • Dimension mismatch
  • Corrupt database file
  • File lock contention
  • I/O errors

Sub-modules

vectlite.analyzers

Text analysis utilities for customizing sparse tokenization.

Analyzer

class Analyzer:
    def __init__(
        self,
        *,
        lowercase: bool = True,
        stopwords: set[str] | None = None,
        stemmer: str | None = None,
        min_token_length: int = 1,
        max_token_length: int = 40,
    ) -> None: ...

    def tokenize(self, text: str) -> list[str]: ...
    def term_frequencies(self, text: str) -> dict[str, float]: ...

Parameters:

  • lowercase (bool, default True): Lowercase tokens before indexing.
  • stopwords (set[str] | None, default None): Set of stopwords to remove. Use the built-in constants or provide your own.
  • stemmer (str | None, default None): Stemmer algorithm name (e.g. "english", "french"). None disables stemming.
  • min_token_length (int, default 1): Discard tokens shorter than this.
  • max_token_length (int, default 40): Discard tokens longer than this.

Methods:

  • tokenize(text) -- returns a list of processed tokens.
  • term_frequencies(text) -- returns a term-frequency dict suitable for use as a sparse vector.

Constants

vectlite.analyzers.ENGLISH_STOPWORDS: frozenset[str]
vectlite.analyzers.FRENCH_STOPWORDS: frozenset[str]

Pre-built stopword sets.
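A sketch of a customized analyzer whose term frequencies feed the sparse side of hybrid search (the settings shown are illustrative):

```python
from vectlite.analyzers import Analyzer, ENGLISH_STOPWORDS

analyzer = Analyzer(
    lowercase=True,
    stopwords=ENGLISH_STOPWORDS,
    stemmer="english",
    min_token_length=2,
)

tokens = analyzer.tokenize("The quick brown foxes")
terms = analyzer.term_frequencies("The quick brown foxes")
# `terms` can be passed as the sparse= argument of upsert() or search().
```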


vectlite.rerankers

Composable reranking functions for search post-processing.

text_match

vectlite.rerankers.text_match(
    field: str = "text",
    weight: float = 1.0,
) -> RerankHook

Boost results where the query appears as a substring in the given metadata field.

metadata_boost

vectlite.rerankers.metadata_boost(
    field: str,
    values: dict[str, float],
    default: float = 0.0,
) -> RerankHook

Adjust scores based on a metadata field value. The values dict maps field values to score multipliers.

cross_encoder

vectlite.rerankers.cross_encoder(
    model: Any,
    query_field: str | None = None,
    doc_field: str = "text",
) -> RerankHook

Rerank using a cross-encoder model (e.g. from sentence-transformers). The model must implement a predict(pairs) method.

Parameters:

  • model (Any): A cross-encoder model with a predict() method.
  • query_field (str | None): Metadata field for the query text. None uses the raw query string.
  • doc_field (str): Metadata field containing the document text.

bi_encoder

vectlite.rerankers.bi_encoder(
    model: Any,
    doc_field: str = "text",
) -> RerankHook

Rerank using a bi-encoder model. The model must implement an encode(texts) method.

compose

vectlite.rerankers.compose(*hooks: RerankHook) -> RerankHook

Chain multiple rerankers in sequence. Each hook receives the output of the previous one.

reranker = vectlite.rerankers.compose(
    vectlite.rerankers.text_match("text", weight=0.5),
    vectlite.rerankers.metadata_boost("priority", {"high": 2.0, "low": 0.5}),
)
results = db.search(query_emb, k=10, rerank=reranker, rerank_k=50)