Reranking

vectlite includes built-in rerankers that re-score search results after the initial retrieval phase. Rerankers can improve relevance by incorporating signals beyond vector similarity.

Note: Rerankers are currently available in the Python binding only.

Text Match Reranker

Boosts results whose metadata text overlaps with the query:

import vectlite
from vectlite import rerankers

results = db.search(
    query_embedding,
    k=10,
    sparse=vectlite.sparse_terms("SSO authentication"),
    rerank=rerankers.text_match(),
)

Options

rerankers.text_match(
    text_key="text",           # Metadata field with document body
    title_key="title",         # Metadata field with document title
    text_weight=1.0,           # Weight for body text overlap
    title_weight=1.5,          # Weight for title text overlap
    matched_term_weight=0.25,  # Weight for BM25 matched terms
    phrase_boost=1.0,          # Boost for exact phrase matches
)
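
To make the role of the weights more concrete, here is a minimal sketch of how a text-overlap score could combine them. It is not vectlite's implementation: the helper names (`term_overlap`, `text_match_score`), the whitespace tokenization, and the additive formula are all assumptions made for illustration, and `matched_term_weight` is left out because it depends on the BM25 terms matched during retrieval.

```python
def term_overlap(query, text):
    """Fraction of distinct query terms that also appear in the text."""
    q_terms = set(query.lower().split())
    t_terms = set(text.lower().split())
    if not q_terms:
        return 0.0
    return len(q_terms & t_terms) / len(q_terms)


def text_match_score(base_score, query, metadata,
                     text_key="text", title_key="title",
                     text_weight=1.0, title_weight=1.5, phrase_boost=1.0):
    """Hypothetical scoring rule: base score plus weighted overlaps."""
    body = metadata.get(text_key, "")
    title = metadata.get(title_key, "")
    score = base_score
    score += text_weight * term_overlap(query, body)
    score += title_weight * term_overlap(query, title)
    # Extra boost when the whole query appears verbatim in the body
    if query.lower() in body.lower():
        score += phrase_boost
    return score


meta = {
    "title": "SSO setup",
    "text": "Configure SSO authentication for your workspace",
}
score = text_match_score(0.5, "SSO authentication", meta)
print(score)
```

In this toy rule a title hit counts for more than a body hit because `title_weight` is larger than `text_weight`, which mirrors the defaults listed above.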

Metadata Boost Reranker

Boosts results based on a metadata field value:

results = db.search(
    query_embedding,
    k=10,
    rerank=rerankers.metadata_boost("source", {"docs": 0.5, "blog": 0.2}),
)

Results with source: "docs" get +0.5 added to their score; results with source: "blog" get +0.2. Results with any other source value are left unchanged.
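
The additive behaviour described above can be sketched in plain Python. This is an illustration only: the function name `apply_metadata_boost` and the result shape (dicts with `score` and `metadata` keys) are assumptions, not vectlite's internals.

```python
def apply_metadata_boost(results, field, boosts):
    """Add a fixed bonus to each result whose metadata[field] is listed in boosts."""
    boosted = []
    for r in results:
        value = r["metadata"].get(field)
        bonus = boosts.get(value, 0.0)  # unlisted values get no boost
        boosted.append({**r, "score": r["score"] + bonus})
    # Highest score first after boosting
    return sorted(boosted, key=lambda r: r["score"], reverse=True)


hits = [
    {"id": "a", "score": 0.80, "metadata": {"source": "blog"}},
    {"id": "b", "score": 0.60, "metadata": {"source": "docs"}},
    {"id": "c", "score": 0.90, "metadata": {"source": "forum"}},
]
reranked = apply_metadata_boost(hits, "source", {"docs": 0.5, "blog": 0.2})
print([r["id"] for r in reranked])
```

Here the "docs" result overtakes the other two even though it started with the lowest score, which is why boost values should be chosen relative to the score range of the underlying search.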

Cross-Encoder Reranker

Uses a cross-encoder model for high-quality re-scoring. Requires sentence-transformers:

pip install sentence-transformers

results = db.search(
    query_embedding,
    k=10,
    rerank=rerankers.cross_encoder("cross-encoder/ms-marco-MiniLM-L-6-v2"),
)

Options

rerankers.cross_encoder(
    model_name_or_path="cross-encoder/ms-marco-MiniLM-L-6-v2",
    text_key="text",  # Metadata field with document text
    batch_size=32,    # Batch size for inference
    device=None,      # "cpu", "cuda", etc.
)

Bi-Encoder Reranker

Uses a bi-encoder model to compute embedding similarity between query and document text:

results = db.search(
    query_embedding,
    k=10,
    rerank=rerankers.bi_encoder("sentence-transformers/all-MiniLM-L6-v2"),
)
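
As a rough illustration of what "embedding similarity" means here, the sketch below ranks documents by cosine similarity between a query vector and document vectors. The choice of cosine similarity is an assumption (the similarity function vectlite uses is not stated), and the two-dimensional vectors stand in for real model embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


# Toy embeddings; a real bi-encoder would produce these from text
query_vec = [1.0, 0.0]
doc_vecs = {"doc-1": [1.0, 0.0], "doc-2": [0.0, 1.0], "doc-3": [1.0, 1.0]}

scores = {doc_id: cosine_similarity(query_vec, v) for doc_id, v in doc_vecs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Because the query and each document are embedded independently, document embeddings can be computed once and reused, which is the usual reason a bi-encoder is cheaper than a cross-encoder.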

Composing Rerankers

Combine multiple rerankers, either by chaining them sequentially or by merging their rankings with reciprocal rank fusion (RRF):

Sequential

results = db.search(
    query_embedding,
    k=10,
    rerank=rerankers.compose(
        rerankers.text_match(),
        rerankers.metadata_boost("source", {"docs": 0.5}),
    ),
)
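
Conceptually, sequential composition feeds the output of one reranker into the next. The sketch below shows that idea with plain functions; the stage signature `(query, results)` and the helper `compose_sequential` are hypothetical and do not reflect how vectlite represents rerankers internally.

```python
def compose_sequential(*stages):
    """Return a function that applies each stage in order to the result list."""
    def run(query, results):
        for stage in stages:
            results = stage(query, results)
        return results
    return run


def boost_docs(query, results):
    # Toy stage: add 0.5 to results that come from the "docs" source
    return [
        {**r, "score": r["score"] + (0.5 if r["source"] == "docs" else 0.0)}
        for r in results
    ]


def sort_by_score(query, results):
    # Toy stage: reorder by the (possibly updated) score, best first
    return sorted(results, key=lambda r: r["score"], reverse=True)


pipeline = compose_sequential(boost_docs, sort_by_score)
out = pipeline("sso", [
    {"id": "a", "score": 0.9, "source": "blog"},
    {"id": "b", "score": 0.6, "source": "docs"},
])
print([r["id"] for r in out])
```

With this design, order matters: a later stage sees the scores already modified by earlier stages.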

RRF Composition

results = db.search(
    query_embedding,
    k=10,
    rerank=rerankers.compose(
        rerankers.text_match(),
        rerankers.cross_encoder(),
        strategy="rrf",
        rank_constant=60,
    ),
)
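
For readers unfamiliar with RRF, the sketch below implements the standard formula, in which each document receives 1 / (rank_constant + rank) from every ranking it appears in and the contributions are summed. The function name and the 1-based rank convention are assumptions; vectlite's exact implementation may differ in detail.

```python
def reciprocal_rank_fusion(rankings, rank_constant=60):
    """Fuse several ranked lists of document ids into one ranking."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks start at 1
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    # Highest fused score first
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical orderings produced by two different rerankers
text_match_order = ["a", "b", "c"]
cross_encoder_order = ["b", "c", "a"]

fused_order = reciprocal_rank_fusion([text_match_order, cross_encoder_order])
print(fused_order)
```

RRF uses only rank positions, so it can merge rerankers whose raw scores live on different scales. A larger rank_constant flattens the difference between adjacent ranks.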

Rerank with Oversampling

Fetch more candidates than needed, rerank, then return the top k:

results = db.search(
    query_embedding,
    k=10,
    rerank=rerankers.cross_encoder(),
    rerank_k=50,  # Fetch 50 candidates, rerank, return top 10
)
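
The effect of oversampling can be seen in a small self-contained sketch. The helper `search_with_oversampling` and the toy corpus are hypothetical; they only mimic the fetch, rerank, truncate sequence described above.

```python
def search_with_oversampling(retrieve, rerank, k, rerank_k):
    """Fetch rerank_k candidates, rerank them, and keep the top k."""
    candidates = retrieve(rerank_k)
    reranked = rerank(candidates)
    return reranked[:k]


# Toy corpus: the reranker disagrees with the initial vector ranking
corpus = [
    {"id": "a", "vector_score": 0.9, "rerank_score": 0.2},
    {"id": "b", "vector_score": 0.8, "rerank_score": 0.4},
    {"id": "c", "vector_score": 0.7, "rerank_score": 0.9},
    {"id": "d", "vector_score": 0.6, "rerank_score": 0.8},
]


def retrieve(n):
    # Stand-in for the initial vector search
    return sorted(corpus, key=lambda d: d["vector_score"], reverse=True)[:n]


def rerank(candidates):
    # Stand-in for an expensive reranker
    return sorted(candidates, key=lambda d: d["rerank_score"], reverse=True)


without_oversampling = search_with_oversampling(retrieve, rerank, k=2, rerank_k=2)
with_oversampling = search_with_oversampling(retrieve, rerank, k=2, rerank_k=4)
print([d["id"] for d in without_oversampling])
print([d["id"] for d in with_oversampling])
```

Without oversampling, the reranker can only reorder the two candidates it was given; with a larger candidate pool it can surface documents the initial retrieval ranked lower. The cost is that the reranker runs on more candidates per query.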

Explain Mode

See detailed reranking traces with explain=True:

results = db.search(
    query_embedding,
    k=10,
    explain=True,
    rerank=rerankers.text_match(),
)

for r in results:
    print(r["explain"]["rerankers"])
    # [{"name": "text_match", "base_score": 0.85, "body_overlap": 0.6, ...}]