Search API
Relevance-ranked search with snippets, citations, metadata, and fine-grained controls. Designed for tool use: results come back as structured JSON that your agents can consume directly.
Core features
Everything you need for production search
Full-text search
BM25F ranking across title, body, headings, and metadata fields. Query with natural language or structured terms.
Filters
Built-in filters (domain, path prefix, language, changed/published after) plus attribute filters with operators (eq, in, gt, lt, range) on any indexed attribute.
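As a rough illustration, attribute filters could be assembled client-side like this. The `field`/`op`/`value` keys follow the example request on this page; the helper name `attribute_filter` is hypothetical, and the value shape for `in` (a list) is an assumption rather than documented behavior.

```python
from typing import Any

# Operators named in the feature description above.
ALLOWED_OPS = {"eq", "in", "gt", "lt", "range"}


def attribute_filter(field: str, op: str, value: Any) -> dict:
    """Build one attribute filter object, rejecting unknown operators early."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    return {"field": field, "op": op, "value": value}


attribute_filters = [
    attribute_filter("form_type", "eq", "10-K"),
    # Assumption: "in" accepts a list of candidate values.
    attribute_filter("form_type", "in", ["10-K", "10-Q"]),
]
```

Validating the operator before sending keeps malformed filters from reaching the API at all, which is handy when an agent generates the query.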
Attribute boosts
Score multipliers based on attribute values: recency decay on datetime fields, numeric scaling, and exact-match boosts on string fields. Configurable per query.
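To give a feel for what a recency-decay boost might do, here is a minimal sketch. The exponential half-life formula, the function name `recency_multiplier`, and the `half_life_days` parameter are all assumptions for illustration; the decay function the service actually applies is not specified on this page.

```python
from datetime import datetime, timezone


def recency_multiplier(
    posted_at: datetime,
    now: datetime,
    max_multiplier: float = 1.5,
    half_life_days: float = 30.0,
) -> float:
    """Return a score multiplier that decays from max_multiplier toward 1.0.

    Assumed model: the boost above 1.0 halves every `half_life_days`.
    """
    age_days = max((now - posted_at).total_seconds() / 86400.0, 0.0)
    return 1.0 + (max_multiplier - 1.0) * 0.5 ** (age_days / half_life_days)


now = datetime(2024, 7, 15, tzinfo=timezone.utc)
fresh = recency_multiplier(now, now)
month_old = recency_multiplier(datetime(2024, 6, 15, tzinfo=timezone.utc), now)
```

Under this assumed model a brand-new document gets the full multiplier, and older documents drift back toward an unboosted score without ever being penalized below 1.0.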
Snippets with citations
Query-time snippets with term highlighting. Every result includes source URL, doc ID, score, and custom attributes.
Hybrid retrieval
BM25 lexical matching combined with semantic similarity, pairing intent-aware retrieval with keyword precision. Coming soon.
Pagination
Offset-based pagination with total count and query timing. Configurable page size up to 100 results.
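A client loop over offset-based pages might look like the sketch below. The `limit`, `results`, and `total_count` names come from the examples on this page; the `offset` parameter name and the `fetch_page` callable are assumptions, and `fake_fetch` is a stand-in for a real HTTP call.

```python
from typing import Callable, Iterator

MAX_PAGE_SIZE = 100  # page size cap stated above


def iter_results(
    fetch_page: Callable[[int, int], dict], limit: int = 50
) -> Iterator[dict]:
    """Yield every result by advancing the offset until total_count is reached."""
    if not 1 <= limit <= MAX_PAGE_SIZE:
        raise ValueError(f"limit must be between 1 and {MAX_PAGE_SIZE}")
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        results = page["results"]
        yield from results
        offset += len(results)
        # Stop on an empty page or once all matches have been seen.
        if not results or offset >= page["total_count"]:
            break


def fake_fetch(offset: int, limit: int) -> dict:
    """Stand-in for the search endpoint: serves slices of 120 fake documents."""
    docs = [{"doc_id": f"d_{i}"} for i in range(120)]
    return {"results": docs[offset:offset + limit], "total_count": len(docs)}


all_docs = list(iter_results(fake_fetch, limit=50))
```

Stopping on an empty page as well as on `total_count` guards against looping forever if the index changes between requests.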
Example request
Search with filters and ranking boosts
POST /v1/corpora/{corpus_id}/search
{
  "query": "cybersecurity risk disclosure",
  "filters": {
    "published_after": "2024-01-01T00:00:00Z"
  },
  "attribute_filters": [
    { "field": "form_type", "op": "eq", "value": "10-K" }
  ],
  "attribute_boosts": [
    { "field": "posted_at", "multiplier": 1.5 }
  ],
  "limit": 10
}
Response format
Structured results with provenance
{
  "results": [
    {
      "doc_id": "d_3f8a...",
      "url": "https://www.sec.gov/...",
      "title": "Annual Report (10-K)",
      "snippet": "...cybersecurity <mark>risk disclosure</mark>...",
      "score": 12.45,
      "attributes": {
        "form_type": "10-K",
        "posted_at": "2024-06-15T..."
      }
    }
  ],
  "total_count": 142,
  "query_time_ms": 23
}
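On the consuming side, an agent could turn a response like the one above into typed records and pull out the highlighted terms. This is a sketch only: the `SearchHit` class and both helper functions are hypothetical names, and the sample payload simply mirrors the example response, elided values included.

```python
import json
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SearchHit:
    doc_id: str
    url: str
    title: str
    snippet: str
    score: float
    attributes: dict


# Highlighted terms are wrapped in <mark> tags in the example response.
MARK_RE = re.compile(r"<mark>(.*?)</mark>")


def parse_response(raw: str) -> Tuple[List[SearchHit], int, int]:
    """Parse a search response into hits, total count, and query time (ms)."""
    data = json.loads(raw)
    hits = [
        SearchHit(
            doc_id=r["doc_id"],
            url=r["url"],
            title=r["title"],
            snippet=r["snippet"],
            score=r["score"],
            attributes=r.get("attributes", {}),
        )
        for r in data["results"]
    ]
    return hits, data["total_count"], data["query_time_ms"]


def highlighted_terms(snippet: str) -> List[str]:
    """Return the text inside each <mark>...</mark> pair."""
    return MARK_RE.findall(snippet)


sample = json.dumps({
    "results": [{
        "doc_id": "d_3f8a...",
        "url": "https://www.sec.gov/...",
        "title": "Annual Report (10-K)",
        "snippet": "...cybersecurity <mark>risk disclosure</mark>...",
        "score": 12.45,
        "attributes": {"form_type": "10-K", "posted_at": "2024-06-15T..."},
    }],
    "total_count": 142,
    "query_time_ms": 23,
})

hits, total_count, query_time_ms = parse_response(sample)
terms = highlighted_terms(hits[0].snippet)
```

Keeping `doc_id` and `url` on every hit lets the agent cite its sources without a second lookup.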