Search (RAG)

The POST endpoint /v1/search allows you to search for chunks within collections using a query prompt. This increases the context available to a language model through RAG (Retrieval-Augmented Generation). RAG search retrieves relevant chunks from your collections based on your query, enabling language models to generate responses grounded in your specific documents and knowledge base.

RAG search flow

Search Methods

OpenGateLLM supports multiple search methods:

Method	Description
`semantic`	Vector similarity search using embeddings
`lexical`	Keyword-based search (BM25)
`hybrid`	Combination of semantic and lexical search

Search query

prompt: Search query (required)
collections: List of collection IDs to search in (required)
method: Search method (default: semantic)
limit: Number of results to return (default: 10, max: 200)
offset: Pagination offset (default: 0)
rff_k: RRF constant for hybrid search (default: 20)
score_threshold: Minimum similarity score (0.0-1.0, only for semantic)

Semantic search
Hybrid search
With web search

curl -X POST http://localhost:8000/v1/search \
  -H "Authorization: Bearer <api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What is machine learning?",
    "collections": [1, 2],
    "method": "semantic",
    "limit": 10,
    "score_threshold": 0.7
  }'

curl -X POST http://localhost:8000/v1/search \
  -H "Authorization: Bearer <api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Python programming",
    "collections": [1],
    "method": "hybrid",
    "limit": 10,
    "rff_k": 20
  }'

curl -X POST http://localhost:8000/v1/search \
  -H "Authorization: Bearer <api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Latest AI developments",
    "collections": [1],
    "method": "semantic",
    "limit": 10
  }'

info

See Configuration for more details.

RAG search flow​

Search Methods​

Search query​

RAG search flow

Search Methods

Search query