Search API
Relevance-ranked search with snippets, citations, metadata, and fine-grained controls. Designed for tool use: structured results that your agents can consume directly.
Core features
Everything you need for production search
Full-text search
BM25F ranking across title, body, headings, and metadata fields. Query with natural language or structured terms.
Attribute filters
Filter by any indexed attribute: date ranges, form types, agencies, domains, or custom fields. Hard filters are applied before ranking.
Ranking controls
Field weights (title vs body), recency boost with configurable decay, domain and path boosts. Override per query or save as presets.
Snippets with citations
Contextual snippets anchored to source content. Every result includes the source URL, document hash, and timestamps.
Hybrid retrieval
BM25 lexical matching combined with semantic similarity, so results reflect query intent while keeping keyword precision.
Pagination
Cursor-based pagination for stable iteration through large result sets. Configurable page size.
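To make the ranking controls above more concrete, here is a minimal sketch of how a recency boost with configurable decay might scale a base relevance score. The exponential half-life formula, the `half_life_days` parameter, and the function name `apply_recency_boost` are illustrative assumptions; this page does not specify the scoring the service actually uses.

```python
from datetime import datetime, timedelta, timezone


def apply_recency_boost(base_score, indexed_at, now, boost=1.5, half_life_days=180.0):
    """Scale base_score by a recency factor that decays with document age.

    Hypothetical formula (an assumption, not the service's documented behavior):
        factor = 1 + (boost - 1) * 0.5 ** (age_days / half_life_days)
    A brand-new document receives the full boost; the factor approaches 1
    as the document gets older.
    """
    # Clamp negative ages (timestamps in the future) to zero.
    age_days = max((now - indexed_at).total_seconds() / 86400.0, 0.0)
    decay = 0.5 ** (age_days / half_life_days)
    return base_score * (1.0 + (boost - 1.0) * decay)


if __name__ == "__main__":
    now = datetime(2024, 6, 15, tzinfo=timezone.utc)
    for age in (0, 180, 720):
        score = apply_recency_boost(10.0, now - timedelta(days=age), now)
        print(f"age={age:>3} days  boosted score={score:.3f}")
```

With this shape, a shorter half-life favors fresh documents more aggressively, while a boost of 1.0 turns the recency effect off.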
Example request
Search with filters and ranking boosts
POST /v1/corpora/:id/search
{
  "query": "cybersecurity risk disclosure",
  "filters": {
    "form_type": "10-K",
    "filed_after": "2024-01-01"
  },
  "boosts": {
    "recency": 1.5,
    "fields": { "title": 2.0 }
  },
  "limit": 10
}
Response format
Structured results with provenance
{
  "results": [
    {
      "document_id": "...",
      "url": "https://...",
      "title": "...",
      "snippet": "...highlighted match...",
      "score": 12.45,
      "attributes": { "form_type": "10-K", ... },
      "indexed_at": "2024-06-15T...",
      "content_hash": "sha256:..."
    }
  ],
  "cursor": "...",
  "total_estimate": 142
}
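Client sketch
The request and response shapes above can be combined into a small client that walks every page of results. This is a sketch under stated assumptions: the host `api.example.com`, the bearer-token header, and passing the cursor back as a `cursor` field in the request body are all hypothetical, since this page shows only the endpoint path and the JSON shapes. The transport is injectable so the paging logic can be exercised without a network call.

```python
import json
from typing import Any, Callable, Dict, Iterator, Optional
from urllib import request as urlrequest

BASE_URL = "https://api.example.com"  # placeholder host, not the real service URL


def http_post(path: str, payload: Dict[str, Any], api_key: str = "YOUR_API_KEY") -> Dict[str, Any]:
    """Send a JSON POST with the standard library. The auth scheme is assumed."""
    req = urlrequest.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urlrequest.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def iter_results(
    corpus_id: str,
    query: str,
    post: Callable[[str, Dict[str, Any]], Dict[str, Any]] = http_post,
    filters: Optional[Dict[str, Any]] = None,
    boosts: Optional[Dict[str, Any]] = None,
    limit: int = 10,
    max_pages: Optional[int] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield result objects page by page until no cursor is returned."""
    payload: Dict[str, Any] = {"query": query, "limit": limit}
    if filters:
        payload["filters"] = filters
    if boosts:
        payload["boosts"] = boosts
    path = f"/v1/corpora/{corpus_id}/search"
    pages = 0
    while True:
        page = post(path, payload)
        yield from page.get("results", [])
        pages += 1
        cursor = page.get("cursor")
        if not cursor or (max_pages is not None and pages >= max_pages):
            break
        # Assumption: the next page is requested by echoing the cursor in the body.
        payload = {**payload, "cursor": cursor}
```

A caller would iterate with something like `for hit in iter_results("my-corpus", "cybersecurity risk disclosure", filters={"form_type": "10-K"}): ...` and read `hit["url"]`, `hit["snippet"]`, and `hit["content_hash"]` for provenance. Setting `max_pages` gives an agent a hard bound on how many requests one query can trigger.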