OpenGateLLM (0.4.1)

Download OpenAPI specification:

License: MIT

OpenGateLLM connect to your models. You can configuration this swagger UI in the configuration file, like hide routes or change the title.

Admin

Create Provider

Authorizations:
HTTPBearer
Request Body schema: application/json
required
router
required
integer (Router)

ID of the model to create the provider for (router ID, eg. 123).

type
required
string (ProviderType)
Enum: "albert" "openai" "mistral" "tei" "vllm"

Model provider type.

Url (string) or Url (null) (Url)

Model provider API url. The url must only contain the domain name (without /v1 suffix for example). Depends of the model provider type, the url can be optional (Albert, OpenAI).

Key (string) or Key (null) (Key)

Model provider API key.

timeout
integer (Timeout)
Default: 300

Timeout for the model provider requests, after user receive an 503 error (model is too busy).

model_name
required
string (Model Name)

Model name from the model provider.

model_hosting_zone
string (ProviderCarbonFootprintZone)
Default: "WOR"
Enum: "ABW" "AFG" "AGO" "AIA" "ALA" "ALB" "AND" "ARE" "ARG" "ARM" "ASM" "ATA" "ATF" "ATG" "AUS" "AUT" "AZE" "BDI" "BEL" "BEN" "BES" "BFA" "BGD" "BGR" "BHR" "BHS" "BIH" "BLM" "BLR" "BLZ" "BMU" "BOL" "BRA" "BRB" "BRN" "BTN" "BVT" "BWA" "CAF" "CAN" "CCK" "CHE" "CHL" "CHN" "CIV" "CMR" "COD" "COG" "COK" "COL" "COM" "CPV" "CRI" "CUB" "CUW" "CXR" "CYM" "CYP" "CZE" "DEU" "DJI" "DMA" "DNK" "DOM" "DZA" "ECU" "EGY" "ERI" "ESH" "ESP" "EST" "ETH" "FIN" "FJI" "FLK" "FRA" "FRO" "FSM" "GAB" "GBR" "GEO" "GGY" "GHA" "GIB" "GIN" "GLP" "GMB" "GNB" "GNQ" "GRC" "GRD" "GRL" "GTM" "GUF" "GUM" "GUY" "HKG" "HMD" "HND" "HRV" "HTI" "HUN" "IDN" "IMN" "IND" "IOT" "IRL" "IRN" "IRQ" "ISL" "ISR" "ITA" "JAM" "JEY" "JOR" "JPN" "KAZ" "KEN" "KGZ" "KHM" "KIR" "KNA" "KOR" "KWT" "LAO" "LBN" "LBR" "LBY" "LCA" "LIE" "LKA" "LSO" "LTU" "LUX" "LVA" "MAC" "MAF" "MAR" "MCO" "MDA" "MDG" "MDV" "MEX" "MHL" "MKD" "MLI" "MLT" "MMR" "MNE" "MNG" "MNP" "MOZ" "MRT" "MSR" "MTQ" "MUS" "MWI" "MYS" "MYT" "NAM" "NCL" "NER" "NFK" "NGA" "NIC" "NIU" "NLD" "NOR" "NPL" "NRU" "NZL" "OMN" "PAK" "PAN" "PCN" "PER" "PHL" "PLW" "PNG" "POL" "PRI" "PRK" "PRT" "PRY" "PSE" "PYF" "QAT" "REU" "ROU" "RUS" "RWA" "SAU" "SDN" "SEN" "SGP" "SGS" "SHN" "SJM" "SLB" "SLE" "SLV" "SMR" "SOM" "SPM" "SRB" "SSD" "STP" "SUR" "SVK" "SVN" "SWE" "SWZ" "SXM" "SYC" "SYR" "TCA" "TCD" "TGO" "THA" "TJK" "TKL" "TKM" "TLS" "TON" "TTO" "TUN" "TUR" "TUV" "TWN" "TZA" "UGA" "UKR" "UMI" "URY" "USA" "UZB" "VAT" "VCT" "VEN" "VGB" "VIR" "VNM" "VUT" "WLF" "WOR" "WSM" "YEM" "ZAF" "ZMB" "ZWE"

Model hosting zone using ISO 3166-1 alpha-3 code format (e.g., WOR for World, FRA for France, USA for United States). This determines the electricity mix used for carbon intensity calculations. For more information, see https://ecologits.ai

model_total_params
integer (Model Total Params) >= 0
Default: 0

Total params of the model in billions of parameters for carbon footprint computation. For more information, see https://ecologits.ai

model_active_params
integer (Model Active Params) >= 0
Default: 0

Active params of the model in billions of parameters for carbon footprint computation. For more information, see https://ecologits.ai

Metric (string) or null

The metric to use for the quality of service policy. If not provided, no QoS policy is applied.

Qos Limit (number) or Qos Limit (null) (Qos Limit)

The value to use for the quality of service. Depends of the metric, the value can be a percentile, a threshold, etc.

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "router": 0,
  • "type": "albert",
  • "url": "string",
  • "key": "string",
  • "timeout": 300,
  • "model_name": "string",
  • "model_hosting_zone": "ABW",
  • "model_total_params": 0,
  • "model_active_params": 0,
  • "qos_metric": "ttft",
  • "qos_limit": 0
}

Response samples

Content type
application/json
{
  • "id": 0,
  • "router_id": 0,
  • "user_id": 0,
  • "type": "albert",
  • "url": "string",
  • "key": "string",
  • "timeout": 0,
  • "model_name": "string",
  • "model_hosting_zone": "ABW",
  • "model_total_params": 0,
  • "model_active_params": 0,
  • "qos_metric": "ttft",
  • "qos_limit": 0,
  • "created": 0,
  • "updated": 0
}

Get Providers

Get all model providers for a router.

Authorizations:
HTTPBearer
query Parameters
Router (integer) or Router (null) (Router)

Filter providers by router ID.

offset
integer (Offset) >= 0
Default: 0

The offset of the tokens to get.

limit
integer (Limit) [ 1 .. 100 ]
Default: 10

The limit of the tokens to get.

order_by
string (Order By)
Default: "id"
Enum: "id" "model_name" "created"

The field to order the tokens by.

order_direction
string (Order Direction)
Default: "asc"
Enum: "asc" "desc"

The direction to order the tokens by.

Responses

Response samples

Content type
application/json
{
  • "object": "list",
  • "data": [
    ]
}

Delete Provider

Authorizations:
HTTPBearer
path Parameters
provider_id
required
integer (Provider Id)

The ID of the provider to delete.

Responses

Response samples

Content type
application/json
{
  • "object": "provider",
  • "id": 0,
  • "router_id": 0,
  • "user_id": 0,
  • "type": "albert",
  • "url": "string",
  • "key": "string",
  • "timeout": 0,
  • "model_name": "string",
  • "model_hosting_zone": "ABW",
  • "model_total_params": 0,
  • "model_active_params": 0,
  • "qos_metric": "ttft",
  • "qos_limit": 0,
  • "created": 0,
  • "updated": 0
}

Update Provider

Update a model provider.

Authorizations:
HTTPBearer
path Parameters
provider
required
integer (Provider)

The ID of the provider to update.

Request Body schema: application/json
required
Router (integer) or Router (null) (Router)

The ID of the new router to assign to the provider.

Timeout (integer) or Timeout (null) (Timeout)

Timeout for the model provider requests, after user receive an 500 error (model is too busy).

ProviderCarbonFootprintZone (string) or null

Model hosting zone using ISO 3166-1 alpha-3 code format (e.g., WOR for World, FRA for France, USA for United States). This determines the electricity mix used for carbon intensity calculations. For more information, see https://ecologits.ai

Model Total Params (integer) or Model Total Params (null) (Model Total Params)

Total params of the model in billions of parameters for carbon footprint computation. If not provided, the active params will be used if provided, else carbon footprint will not be computed. For more information, see https://ecologits.ai

Model Active Params (integer) or Model Active Params (null) (Model Active Params)

Active params of the model in billions of parameters for carbon footprint computation. If not provided, the total params will be used if provided, else carbon footprint will not be computed. For more information, see https://ecologits.ai

Metric (string) or null

The metric to use for the quality of service policy. If not provided, no QoS policy is applied.

Qos Limit (number) or Qos Limit (null) (Qos Limit)

The value to use for the quality of service. Depends of the metric, the value can be a percentile, a threshold, etc.

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "router": 0,
  • "timeout": 0,
  • "model_hosting_zone": "ABW",
  • "model_total_params": 0,
  • "model_active_params": 0,
  • "qos_metric": "ttft",
  • "qos_limit": 0
}

Response samples

Content type
application/json
{
  • "detail": [
    ]
}

Get Provider

Get a model provider by router and provider IDs.

Authorizations:
HTTPBearer
path Parameters
provider
required
integer (Provider)

The ID of the provider to get.

Responses

Response samples

Content type
application/json
{
  • "object": "provider",
  • "id": 0,
  • "router_id": 0,
  • "user_id": 0,
  • "type": "albert",
  • "url": "string",
  • "key": "string",
  • "timeout": 0,
  • "model_name": "string",
  • "model_hosting_zone": "ABW",
  • "model_total_params": 0,
  • "model_active_params": 0,
  • "qos_metric": "ttft",
  • "qos_limit": 0,
  • "created": 0,
  • "updated": 0
}

Create Router

Authorizations:
HTTPBearer
Request Body schema: application/json
required
name
required
string (Name) non-empty

Name of the model router.

type
required
string (ModelType)
Enum: "automatic-speech-recognition" "image-text-to-text" "image-to-text" "text-embeddings-inference" "text-generation" "text-classification"

Type of the model router. It will be used to identify the model router type.

aliases
Array of strings (Aliases) [ items [ 1 .. 64 ] characters ]

Aliases of the model. It will be used to identify the model by users.

load_balancing_strategy
string (RouterLoadBalancingStrategy)
Default: "shuffle"
Enum: "shuffle" "least_busy"

Routing strategy for load balancing between providers of the model. It will be used to identify the model type.

cost_prompt_tokens
number (Cost Prompt Tokens) >= 0
Default: 0

Cost of a million prompt tokens (decrease user budget)

cost_completion_tokens
number (Cost Completion Tokens) >= 0
Default: 0

Cost of a million completion tokens (decrease user budget)

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "name": "model-router-1",
  • "type": "automatic-speech-recognition",
  • "aliases": [
    ],
  • "load_balancing_strategy": "shuffle",
  • "cost_prompt_tokens": 0,
  • "cost_completion_tokens": 0
}

Response samples

Content type
application/json
{
  • "id": 0,
  • "name": "model-router-1",
  • "type": "automatic-speech-recognition",
  • "aliases": [
    ],
  • "load_balancing_strategy": "shuffle",
  • "cost_prompt_tokens": 0,
  • "cost_completion_tokens": 0
}

Get Routers

Authorizations:
HTTPBearer
query Parameters
offset
integer (Offset) >= 0
Default: 0

Number of routers to skip.

limit
integer (Limit) [ 1 .. 100 ]
Default: 10

Maximum number of routers to return.

sort_by
string (RouterSortField)
Default: "id"
Enum: "id" "name" "created"

Field to sort by.

sort_order
string (SortOrder)
Default: "asc"
Enum: "asc" "desc"

Sort order.

Responses

Response samples

Content type
application/json
{
  • "object": "list",
  • "total": 0,
  • "offset": 0,
  • "limit": 0,
  • "data": [
    ]
}

Get Router

Authorizations:
HTTPBearer
path Parameters
router_id
required
integer (Router Id)

The router ID.

Responses

Response samples

Content type
application/json
{
  • "object": "router",
  • "id": 0,
  • "name": "string",
  • "user_id": 0,
  • "type": "automatic-speech-recognition",
  • "aliases": [
    ],
  • "load_balancing_strategy": "shuffle",
  • "vector_size": 0,
  • "max_context_length": 0,
  • "cost_prompt_tokens": 0,
  • "cost_completion_tokens": 0,
  • "providers": 0,
  • "created": 0,
  • "updated": 0
}

Delete Router

Authorizations:
HTTPBearer
path Parameters
router_id
required
integer (Router Id)

The ID of the router to delete (router ID, eg. 123).

Responses

Response samples

Content type
application/json
{
  • "object": "router",
  • "id": 0,
  • "name": "string",
  • "user_id": 0,
  • "type": "automatic-speech-recognition",
  • "aliases": [
    ],
  • "load_balancing_strategy": "shuffle",
  • "vector_size": 0,
  • "max_context_length": 0,
  • "cost_prompt_tokens": 0,
  • "cost_completion_tokens": 0,
  • "providers": 0,
  • "created": 0,
  • "updated": 0
}

Create Organization

Authorizations:
HTTPBearer
Request Body schema: application/json
required
name
required
string (Name) non-empty

The organization name.

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "name": "string"
}

Response samples

Content type
application/json
{
  • "id": 0
}

Get Organizations

Authorizations:
HTTPBearer
query Parameters
offset
integer (Offset) >= 0
Default: 0

The offset of the organizations to get.

limit
integer (Limit) [ 1 .. 100 ]
Default: 10

The limit of the organizations to get.

order_by
string (Order By)
Default: "id"
Enum: "id" "name" "created" "updated"

The field to order the organizations by.

order_direction
string (Order Direction)
Default: "asc"
Enum: "asc" "desc"

The direction to order the organizations by.

Responses

Response samples

Content type
application/json
{
  • "object": "list",
  • "data": [
    ]
}

Delete Organization

Authorizations:
HTTPBearer
path Parameters
organization
required
integer (Organization)

The ID of the organization to delete.

Responses

Response samples

Content type
application/json
{
  • "detail": [
    ]
}

Update Organization

Authorizations:
HTTPBearer
path Parameters
organization
required
integer (Organization)

The ID of the organization to update.

Request Body schema: application/json
required
Name (string) or Name (null) (Name)

The new organization name.

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "name": "string"
}

Response samples

Content type
application/json
{
  • "detail": [
    ]
}

Get Organization

Authorizations:
HTTPBearer
path Parameters
organization
required
integer (Organization)

The ID of the organization to get.

Responses

Response samples

Content type
application/json
{
  • "object": "organization",
  • "id": 0,
  • "name": "string",
  • "users": 0,
  • "created": 0,
  • "updated": 0
}

Create Role

Create a new role.

Authorizations:
HTTPBearer
Request Body schema: application/json
required
name
required
string (Name) non-empty
Array of Permissions (strings) or Permissions (null) (Permissions)
Default: []
Array of objects (Limits)
Default: []
property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "name": "string",
  • "permissions": [ ],
  • "limits": [ ]
}

Response samples

Content type
application/json
{
  • "id": 0
}

Get Roles

Get all roles.

Authorizations:
HTTPBearer
query Parameters
offset
integer (Offset) >= 0
Default: 0

The offset of the roles to get.

limit
integer (Limit) [ 1 .. 100 ]
Default: 10

The limit of the roles to get.

order_by
string (Order By)
Default: "id"
Enum: "id" "name" "created" "updated"

The field to order the roles by.

order_direction
string (Order Direction)
Default: "asc"
Enum: "asc" "desc"

The direction to order the roles by.

Responses

Response samples

Content type
application/json
{
  • "object": "list",
  • "data": [
    ]
}

Delete Role

Delete a role.

Authorizations:
HTTPBearer
path Parameters
role
required
integer (Role)

The ID of the role to delete.

Responses

Response samples

Content type
application/json
{
  • "detail": [
    ]
}

Update Role

Update a role.

Authorizations:
HTTPBearer
path Parameters
role
required
integer (Role)

The ID of the role to update.

Request Body schema: application/json
required
Name (string) or Name (null) (Name)

The new role name.

Array of Permissions (strings) or Permissions (null) (Permissions)

The new permissions.

Array of Limits (objects) or Limits (null) (Limits)

The new limits.

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "name": "string",
  • "permissions": [
    ],
  • "limits": [
    ]
}

Response samples

Content type
application/json
{
  • "detail": [
    ]
}

Get Role

Get a role by id.

Authorizations:
HTTPBearer
path Parameters
role
required
integer (Role)

The ID of the role to get.

Responses

Response samples

Content type
application/json
{
  • "object": "role",
  • "id": 0,
  • "name": "string",
  • "permissions": [
    ],
  • "limits": [
    ],
  • "users": 0,
  • "created": 0,
  • "updated": 0
}

Update Router

Update a router.

Authorizations:
HTTPBearer
path Parameters
router
required
integer (Router)

The ID of the router to update (router ID, eg. 123).

Request Body schema: application/json
required
Name (string) or Name (null) (Name)

Name of the model router.

ModelType (string) or null

Type of the model router. It will be used to identify the model router type.

Array of Aliases (strings) or Aliases (null) (Aliases)

Aliases of the model. It will be used to identify the model by users.

RouterLoadBalancingStrategy (string) or null

Routing strategy for load balancing between providers of the model. It will be used to identify the model type.

Cost Prompt Tokens (number) or Cost Prompt Tokens (null) (Cost Prompt Tokens)

Cost of a million prompt tokens (decrease user budget)

Cost Completion Tokens (number) or Cost Completion Tokens (null) (Cost Completion Tokens)

Cost of a million completion tokens (decrease user budget)

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "name": "model-router-1",
  • "type": "text-generation",
  • "aliases": [
    ],
  • "load_balancing_strategy": "least_busy",
  • "cost_prompt_tokens": 0,
  • "cost_completion_tokens": 0
}

Response samples

Content type
application/json
{
  • "detail": [
    ]
}

Create Token

Create a new token.

Authorizations:
HTTPBearer
Request Body schema: application/json
required
name
required
string (Name) non-empty
user
required
integer (User)

User ID to create the token for another user (by default, the current user). Required CREATE_USER permission.

Expires (integer) or Expires (null) (Expires)

Timestamp in seconds

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "name": "string",
  • "user": 0,
  • "expires": 0
}

Response samples

Content type
application/json
{
  • "id": 0,
  • "token": "string"
}

Get Tokens

Get all your tokens.

Authorizations:
HTTPBearer
query Parameters
User (integer) or User (null) (User)

The user ID of the user to get the tokens for.

offset
integer (Offset) >= 0
Default: 0

The offset of the tokens to get.

limit
integer (Limit) [ 1 .. 100 ]
Default: 10

The limit of the tokens to get.

order_by
string (Order By)
Default: "id"
Enum: "id" "name" "created"

The field to order the tokens by.

order_direction
string (Order Direction)
Default: "asc"
Enum: "asc" "desc"

The direction to order the tokens by.

Responses

Response samples

Content type
application/json
{
  • "object": "list",
  • "data": [
    ]
}

Delete Token

Delete a token.

Authorizations:
HTTPBearer
path Parameters
user
required
integer (User)

The user ID of the user to delete the token for.

token
required
integer (Token)

The token ID of the token to delete.

Responses

Response samples

Content type
application/json
{
  • "detail": [
    ]
}

Get Token

Get your token by id.

Authorizations:
HTTPBearer
path Parameters
token
required
integer (Token)

The token ID of the token to get.

Responses

Response samples

Content type
application/json
{
  • "object": "token",
  • "id": 0,
  • "name": "string",
  • "token": "string",
  • "user": 0,
  • "expires": 0,
  • "created": 0
}

Create User

Create a new user.

Authorizations:
HTTPBearer
Request Body schema: application/json
required
email
required
string (Email)

The user email.

Name (string) or Name (null) (Name)

The user name.

password
required
string (Password)

The user password.

role
required
integer (Role)

The role ID.

Organization (integer) or Organization (null) (Organization)

The organization ID.

Budget (number) or Budget (null) (Budget)

The budget.

Expires (integer) or Expires (null) (Expires)

The expiration timestamp.

Priority (integer) or Priority (null) (Priority)
Default: 0

The user priority. Higher value means higher priority. 0 is default.

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "email": "string",
  • "name": "string",
  • "password": "string",
  • "role": 0,
  • "organization": 0,
  • "budget": 0,
  • "expires": 0,
  • "priority": 0
}

Response samples

Content type
application/json
{
  • "id": 0
}

Get Users

Get all users.

Authorizations:
HTTPBearer
query Parameters
Role (integer) or Role (null) (Role)

The ID of the role to filter the users by.

Organization (integer) or Organization (null) (Organization)

The ID of the organization to filter the users by.

Email (string) or Email (null) (Email)

The email of the user to filter the users by.

offset
integer (Offset) >= 0
Default: 0

The offset of the users to get.

limit
integer (Limit) [ 1 .. 100 ]
Default: 10

The limit of the users to get.

order_by
string (Order By)
Default: "id"
Enum: "id" "name" "created" "updated"

The field to order the users by.

order_direction
string (Order Direction)
Default: "asc"
Enum: "asc" "desc"

The direction to order the users by.

Responses

Response samples

Content type
application/json
null

Delete User

Delete a user.

Authorizations:
HTTPBearer
path Parameters
user
required
integer (User)

The ID of the user to delete.

Responses

Response samples

Content type
application/json
{
  • "detail": [
    ]
}

Update User

Update a user.

Authorizations:
HTTPBearer
path Parameters
user
required
integer (User)

The ID of the user to update.

Request Body schema: application/json
required
Email (string) or Email (null) (Email)

The new user email. If None, the user email is not changed.

Name (string) or Name (null) (Name)

The new user name. If None, the user name is not changed.

Current Password (string) or Current Password (null) (Current Password)

The current user password.

Password (string) or Password (null) (Password)

The new user password. If None, the user password is not changed.

Role (integer) or Role (null) (Role)

The new role ID. If None, the user role is not changed.

Organization (integer) or Organization (null) (Organization)

The new organization ID. If None, the user will be removed from the organization if he was in one.

Budget (number) or Budget (null) (Budget)

The new budget. If None, the user will have no budget.

Expires (integer) or Expires (null) (Expires)

The new expiration timestamp. If None, the user will never expire.

Priority (integer) or Priority (null) (Priority)

The new user priority. Higher value means higher priority. If None, unchanged.

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "email": "string",
  • "name": "string",
  • "current_password": "string",
  • "password": "string",
  • "role": 0,
  • "organization": 0,
  • "budget": 0,
  • "expires": 0,
  • "priority": 0
}

Response samples

Content type
application/json
{
  • "detail": [
    ]
}

Get User

Get a user by id.

Authorizations:
HTTPBearer
path Parameters
user
required
integer (User)

The ID of the user to get.

Responses

Response samples

Content type
application/json
null

Audio

Audio Transcriptions

Transcribes audio into the input language.

Authorizations:
HTTPBearer
Request Body schema: multipart/form-data
required
file
required
string <application/octet-stream> (File)

The audio file object (not file name) to transcribe, in one of these formats: mp3 or wav.

model
required
string (Model)

ID of the model to use. Call /v1/models endpoint to get the list of available models, only automatic-speech-recognition model type is supported.

language
string (AudioTranscriptionLanguage)
Default: ""
Enum: "af" "afrikaans" "albanian" "am" "amharic" "ar" "arabic" "armenian" "as" "assamese" "az" "azerbaijani" "ba" "bashkir" "basque" "be" "belarusian" "bengali" "bg" "bn" "bo" "bosnian" "br" "breton" "bs" "bulgarian" "burmese" "ca" "cantonese" "castilian" "catalan" "chinese" "croatian" "cs" "cy" "czech" "da" "danish" "de" "dutch" "el" "en" "english" "es" "estonian" "et" "eu" "fa" "faroese" "fi" "finnish" "flemish" "fo" "fr" "french" "galician" "georgian" "german" "gl" "greek" "gu" "gujarati" "ha" "haitian" "haitian creole" "hausa" "haw" "hawaiian" "he" "hebrew" "hi" "hindi" "hr" "ht" "hu" "hungarian" "hy" "icelandic" "id" "indonesian" "is" "it" "italian" "ja" "japanese" "javanese" "jw" "ka" "kannada" "kazakh" "khmer" "kk" "km" "kn" "ko" "korean" "la" "lao" "latin" "latvian" "lb" "letzeburgesch" "lingala" "lithuanian" "ln" "lo" "lt" "luxembourgish" "lv" "macedonian" "malagasy" "malay" "malayalam" "maltese" "mandarin" "maori" "marathi" "mg" "mi" "mk" "ml" "mn" "moldavian" "moldovan" "mongolian" "mr" "ms" "mt" "my" "myanmar" "ne" "nepali" "nl" "nn" "no" "norwegian" "nynorsk" "oc" "occitan" "pa" "panjabi" "pashto" "persian" "pl" "polish" "portuguese" "ps" "pt" "punjabi" "pushto" "ro" "romanian" "ru" "russian" "sa" "sanskrit" "sd" "serbian" "shona" "si" "sindhi" "sinhala" "sinhalese" "sk" "sl" "slovak" "slovenian" "sn" "so" "somali" "spanish" "sq" "sr" "su" "sundanese" "sv" "sw" "swahili" "swedish" "ta" "tagalog" "tajik" "tamil" "tatar" "te" "telugu" "tg" "th" "thai" "tibetan" "tk" "tl" "tr" "tt" "turkish" "turkmen" "uk" "ukrainian" "ur" "urdu" "uz" "uzbek" "valencian" "vi" "vietnamese" "welsh" "yi" "yiddish" "yo" "yoruba" "yue" "zh" ""

The language of the output audio. If the output language is different than the audio language, the audio language will be translated into the output language. Supplying the output language in ISO-639-1 (e.g. en, fr) format will improve accuracy and latency.

prompt
string (Prompt)
Default: ""

An optional text to tell the model what to do with the input audio.

response_format
string (Response Format)
Default: "json"
Enum: "json" "text"

The format of the transcript output, in one of these formats: json or text.

temperature
number (Temperature) [ 0 .. 1 ]
Default: 0

The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.

Responses

Response samples

Content type
application/json
{
  • "id": "string",
  • "text": "string",
  • "model": "string",
  • "usage": {
    }
}

Auth

Login

Receive encrypted token from playground encoded with shared key via POST body. The token contains user id. Refresh and return playground api key associated with the user.

Request Body schema: application/json
required
email
required
string (Email) non-empty

The user email.

password
required
string (Password) non-empty

The user password.

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "email": "string",
  • "password": "string"
}

Response samples

Content type
application/json
{
  • "id": 0,
  • "key": "string"
}

Chat

Chat Completions

Creates a model response for the given chat conversation.

Authorizations:
HTTPBearer
query Parameters
required
boolean (Required)
Default: false
Request Body schema: application/json
required
messages
required
Array of any (Messages)

A list of messages comprising the conversation so far.

model
required
string (Model)

ID of the model to use. Call /v1/models endpoint to get the list of available models, only text-generation model type is supported.

Frequency Penalty (number) or Frequency Penalty (null) (Frequency Penalty)
Default: 0

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

Logit Bias (object) or Logit Bias (null) (Logit Bias)

Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.

Logprobs (boolean) or Logprobs (null) (Logprobs)
Default: false

Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.

Top Logprobs (integer) or Top Logprobs (null) (Top Logprobs)

An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used.

Presence Penalty (number) or Presence Penalty (null) (Presence Penalty)
Default: 0

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

Max Completion Tokens (integer) or Max Completion Tokens (null) (Max Completion Tokens)

An upper bound for the number of tokens that can be generated for a completion.

N (integer) or N (null) (N)
Default: 1

How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

Response Format (any) or Response Format (null) (Response Format)

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.
Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.

Seed (integer) or Seed (null) (Seed)

If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.

Stop (string) or Array of Stop (strings) or Stop (null) (Stop)

Up to 4 sequences where the API will stop generating further tokens.

Stream (boolean) or Stream (null) (Stream)
Default: false

If set, partial message deltas will be sent. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.

Stream Options (any) or Stream Options (null) (Stream Options)

Options for streaming response. Only set this when you set stream: true.

Temperature (number) or Temperature (null) (Temperature)
Default: 0.7

What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

Top P (number) or Top P (null) (Top P)
Default: 1

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.

(Tools ((Array of Tools (objects or SearchTool (object))) or Tools (null))) or Tools (null) (Tools)
tool_choice
any (Tool Choice)
Default: "none"

Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
none is the default when no tools are present. auto is the default if tools are present.

Parallel Tool Calls (boolean) or Parallel Tool Calls (null) (Parallel Tool Calls)
Default: false

Whether to call tools in parallel or sequentially. If true, the model will call tools in parallel. If false, the model will call tools sequentially. If None, the model will call tools in parallel if the model supports it, otherwise it will call tools sequentially.

User (string) or User (null) (User)

A unique identifier representing the user.

search
boolean (Search)
Deprecated
Default: false
SearchArgs (object) or null
Deprecated
property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "messages": [
    ],
  • "model": "string",
  • "frequency_penalty": 0,
  • "logit_bias": {
    },
  • "logprobs": false,
  • "top_logprobs": 0,
  • "presence_penalty": 0,
  • "max_completion_tokens": 0,
  • "n": 1,
  • "response_format": { },
  • "seed": 0,
  • "stop": "string",
  • "stream": false,
  • "stream_options": { },
  • "temperature": 0.7,
  • "top_p": 1,
  • "tools": [
    ],
  • "tool_choice": "none",
  • "parallel_tool_calls": false,
  • "user": "string",
  • "search": false,
  • "search_args": {
    }
}

Response samples

Content type
application/json
Example
{
  • "id": "string",
  • "choices": [
    ],
  • "created": 0,
  • "model": "string",
  • "object": "chat.completion",
  • "service_tier": "auto",
  • "system_fingerprint": "string",
  • "usage": {
    },
  • "search_results": [ ]
}

Chunks

Get Chunk Deprecated

Get a chunk of a document.

Authorizations:
HTTPBearer
path Parameters
document
required
integer (Document)

The document ID

chunk
required
integer (Chunk)

The chunk ID

query Parameters
required
boolean (Required)
Default: true

Responses

Response samples

Content type
application/json
{
  • "object": "chunk",
  • "id": 0,
  • "collection_id": 0,
  • "document_id": 0,
  • "content": "string",
  • "metadata": {
    },
  • "created": 0
}

Get Chunks Deprecated

Get chunks of a document.

Authorizations:
HTTPBearer
path Parameters
document
required
integer (Document)

The document ID

query Parameters
limit
integer (Limit) [ 1 .. 100 ]
Default: 10

The number of documents to return

offset
integer (Offset)
Default: 0

The offset of the first document to return

required
boolean (Required)
Default: true

Responses

Response samples

Content type
application/json
{
  • "object": "list",
  • "data": [
    ]
}

Collections

Create Collection

Create a new collection.

Authorizations:
HTTPBearer
Request Body schema: application/json
required
name
required
string (Name) non-empty

The name of the collection.

Description (string) or Description (null) (Description)

The description of the collection.

visibility
string (CollectionVisibility)
Default: "private"
Enum: "private" "public"

The type of the collection. Public collections are available to all users, private collections are only available to the user who created them.

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "name": "string",
  • "description": "string",
  • "visibility": "private"
}

Response samples

Content type
application/json
null

Get Collections

Get list of collections.

Authorizations:
HTTPBearer
query Parameters
name
string (Name)

Filter by collection name.

CollectionVisibility (string) or Visibility (null) (Visibility)

Filter by collection visibility.

offset
integer (Offset) >= 0
Default: 0

The offset of the collections to get.

limit
integer (Limit) [ 1 .. 100 ]
Default: 10

The limit of the collections to get.

order_by
string (Order By)
Default: "id"
Enum: "id" "name" "created" "updated"

The order by field to sort the collections by.

order_direction
string (Order Direction)
Default: "asc"
Enum: "asc" "desc"

The direction to order the collections by.

Responses

Response samples

Content type
application/json
{
  • "object": "list",
  • "data": [
    ]
}

Get Collection

Get a collection by ID.

Authorizations:
HTTPBearer
path Parameters
collection_id
required
integer (Collection Id)

The collection ID

Responses

Response samples

Content type
application/json
{
  • "object": "collection",
  • "id": 0,
  • "name": "string",
  • "owner": "string",
  • "description": "string",
  • "visibility": "private",
  • "created": 0,
  • "updated": 0,
  • "documents": 0
}

Delete Collection

Delete a collection.

Authorizations:
HTTPBearer
path Parameters
collection_id
required
integer (Collection Id)

The collection ID

query Parameters
required
boolean (Required)
Default: true

Responses

Response samples

Content type
application/json
{
  • "detail": [
    ]
}

Update Collection

Update a collection.

Authorizations:
HTTPBearer
path Parameters
collection_id
required
integer (Collection Id)

The collection ID

Request Body schema: application/json
required
Name (string) or Name (null) (Name)

The name of the collection.

Description (string) or Description (null) (Description)

The description of the collection.

CollectionVisibility (string) or null

The type of the collection. Public collections are available to all users, private collections are only available to the user who created them.

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "name": "string",
  • "description": "string",
  • "visibility": "private"
}

Response samples

Content type
application/json
{
  • "detail": [
    ]
}

Documents

Create Document

Upload a file, parse and split it into chunks, then create a document. If no file is provided, the document will be created without content, use POST /v1/documents/{document_id}/chunks to fill it.

Authorizations:
HTTPBearer
query Parameters
required
boolean (Required)
Default: true
Request Body schema: multipart/form-data
File (string) or File (null) (File)

The file to create a document from. If not provided, the document will be created without content, use POST /v1/documents/{document_id}/chunks to fill it.

Name (string) or Name (null) (Name)

Name of document if no file is provided or to override file name.

Collection Id (integer) or Collection Id (null) (Collection Id)

The collection ID to use for the file upload. The file will be vectorized with model defined by the collection.

Collection (integer) or Collection (null) (Collection)
Deprecated
disable_chunking
boolean (Disable Chunking)
Default: false

Whether to disable RecursiveCharacterTextSplitter chunking for the upload file.

chunk_size
integer (Chunk Size) >= 0
Default: 2048

The size in characters of the chunks to use for the upload file. If not provided, the document will not be split into chunks.

chunk_min_size
integer (Chunk Min Size) >= 0
Default: 0

The minimum size in characters of the chunks to use for the upload file.

chunk_overlap
integer (Chunk Overlap) >= 0
Default: 0

The overlap in characters of the chunks to use for the upload file.

is_separator_regex
boolean (Is Separator Regex)
Default: false

Whether the separator is a regex to use for the upload file.

separators
Array of strings (Separators) >= 0 items
Default: []

Delimiters used by RecursiveCharacterTextSplitter for further splitting. If provided, preset_separators is ignored.

preset_separators
string (PresetSeparators)
Default: "markdown"
Enum: "cpp" "go" "java" "kotlin" "js" "ts" "php" "proto" "python" "r" "rst" "ruby" "rust" "scala" "swift" "markdown" "latex" "html" "sol" "csharp" "cobol" "c" "lua" "perl" "haskell" "elixir" "powershell" "visualbasic6"

Preset separators used by RecursiveCharacterTextSplitter for further splitting. See implemented details.

metadata
string (Metadata)
Default: ""

Optional additional metadata to add to each chunk if a file is provided. Provide a stringified JSON object matching the Metadata schema.

Responses

Response samples

Content type
application/json
{
  • "id": 0
}

Get Documents

Get all documents ID from a collection.

Authorizations:
HTTPBearer
query Parameters
Name (string) or Name (null) (Name)

Filter documents by name

Collection Id (integer) or Collection Id (null) (Collection Id)

Filter documents by collection ID

limit
integer (Limit) [ 1 .. 100 ]
Default: 10

The number of documents to return

offset
integer (Offset)
Default: 0

The offset of the first document to return

order_by
string (Order By)
Default: "id"
Enum: "id" "name" "created"

The order by field to sort the documents by.

order_direction
string (Order Direction)
Default: "asc"
Enum: "asc" "desc"

The direction to order the documents by.

required
boolean (Required)
Default: true

Responses

Response samples

Content type
application/json
null

Get Document

Get a document by ID.

Authorizations:
HTTPBearer
path Parameters
document_id
required
integer (Document Id) >= 0

The document ID

query Parameters
required
boolean (Required)
Default: true

Responses

Response samples

Content type
application/json
{
  • "object": "document",
  • "id": 0,
  • "name": "string",
  • "collection_id": 0,
  • "created": 0,
  • "chunks": 0
}

Delete Document

Delete a document.

Authorizations:
HTTPBearer
path Parameters
document_id
required
integer (Document Id) > 0

The document ID

query Parameters
required
boolean (Required)
Default: true

Responses

Response samples

Content type
application/json
{
  • "detail": [
    ]
}

Create Document Chunks

Fill document with chunks.

Authorizations:
HTTPBearer
path Parameters
document_id
required
integer (Document Id) > 0

The document ID

query Parameters
required
boolean (Required)
Default: true
Request Body schema: application/json
required
required
Array of objects (Chunks) [ 1 .. 64 ] items

The list of chunks to create.

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "chunks": [
    ]
}

Response samples

Content type
application/json
null

Get Document Chunks

Get chunks of a document.

Authorizations:
HTTPBearer
path Parameters
document_id
required
integer (Document Id) > 0

The document ID

query Parameters
limit
integer (Limit) [ 1 .. 100 ]
Default: 10

The number of chunks to return

offset
integer (Offset)
Default: 0

The offset of the first chunk to return

required
boolean (Required)
Default: true

Responses

Response samples

Content type
application/json
null

Delete Document Chunk

Delete a chunk of a document.

Authorizations:
HTTPBearer
path Parameters
document_id
required
integer (Document Id) > 0

The document ID

chunk_id
required
integer (Chunk Id) >= 0

The chunk ID

query Parameters
required
boolean (Required)
Default: true

Responses

Response samples

Content type
application/json
{
  • "detail": [
    ]
}

Get Document Chunk

Get a chunk of a document.

Authorizations:
HTTPBearer
path Parameters
document_id
required
integer (Document Id) > 0

The document ID

chunk_id
required
integer (Chunk Id) >= 0

The chunk ID

query Parameters
required
boolean (Required)
Default: true

Responses

Response samples

Content type
application/json
null

Embeddings

Embeddings

Creates an embedding vector representing the input text.

Authorizations:
HTTPBearer
Request Body schema: application/json
required
required
Array of Input (integers) or Array of Input (integers) or Input (string) or Array of Input (strings) (Input)

Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings or array of token arrays. The input must not exceed the max input tokens for the model (call /v1/models endpoint to get the max_context_length by model) and cannot be an empty string.

model
required
string (Model)

ID of the model to use. Call /v1/models endpoint to get the list of available models, only text-embeddings-inference model type is supported.

Dimensions (integer) or Dimensions (null) (Dimensions)

The number of dimensions the resulting output embeddings should have.

"float" (string) or Encoding Format (null) (Encoding Format)
Default: "float"

The format of the output embeddings. Only float is supported.

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "input": [
    ],
  • "model": "string",
  • "dimensions": 0,
  • "encoding_format": "float"
}

Response samples

Content type
application/json
{
  • "data": [
    ],
  • "model": "string",
  • "object": "list",
  • "usage": {
    },
  • "id": "string"
}

Me

Get User

Get information about the current user.

Authorizations:
HTTPBearer

Responses

Response samples

Content type
application/json
{
  • "object": "userInfo",
  • "id": 0,
  • "email": "string",
  • "name": "string",
  • "organization": 0,
  • "budget": 0,
  • "permissions": [
    ],
  • "limits": [
    ],
  • "expires": 0,
  • "priority": 0,
  • "created": 0,
  • "updated": 0
}

Update User

Update information about the current user.

Authorizations:
HTTPBearer
Request Body schema: application/json
required
Name (string) or Name (null) (Name)

The user name.

Email (string) or Email (null) (Email)

The user email.

Current Password (string) or Current Password (null) (Current Password)

The current user password.

Password (string) or Password (null) (Password)

The new user password. If None, the user password is not changed.

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "name": "string",
  • "email": "string",
  • "current_password": "string",
  • "password": "string"
}

Response samples

Content type
application/json
{
  • "detail": [
    ]
}

Create Key

Create a new API key.

Authorizations:
HTTPBearer
Request Body schema: application/json
required
name
required
string (Name)
Expires (integer) or Expires (null) (Expires)

Timestamp in seconds

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "name": "string",
  • "expires": 0
}

Response samples

Content type
application/json
{
  • "id": 0,
  • "token": "string"
}

Get Keys

Get all your tokens.

Authorizations:
HTTPBearer
query Parameters
offset
integer (Offset) >= 0
Default: 0

The offset of the tokens to get.

limit
integer (Limit) [ 1 .. 100 ]
Default: 10

The limit of the tokens to get.

order_by
string (Order By)
Default: "id"
Enum: "id" "name" "created"

The field to order the tokens by.

order_direction
string (Order Direction)
Default: "asc"
Enum: "asc" "desc"

The direction to order the tokens by.

Responses

Response samples

Content type
application/json
{
  • "object": "list",
  • "data": [
    ]
}

Delete Key

Delete a API key.

Authorizations:
HTTPBearer
path Parameters
key
required
integer (Key)

The key ID of the key to delete.

Responses

Response samples

Content type
application/json
{
  • "detail": [
    ]
}

Get Key

Get your token by id.

Authorizations:
HTTPBearer
path Parameters
key
required
integer (Key)

The key ID of the key to get.

Responses

Response samples

Content type
application/json
{
  • "object": "key",
  • "id": 0,
  • "name": "string",
  • "token": "string",
  • "expires": 0,
  • "created": 0
}

Get Usage

Get usage for the current user.

Authorizations:
HTTPBearer
query Parameters
offset
integer (Offset) >= 0
Default: 0

The offset of the usages to get.

limit
integer (Limit) [ 1 .. 100 ]
Default: 10

The limit of the usages to get.

Start Time (integer) or Start Time (null) (Start Time)

Start time as Unix timestamp (if not provided, will be set to 30 days ago)

End Time (integer) or End Time (null) (End Time)

End time as Unix timestamp (if not provided, will be set to now)

EndpointUsage (string) or Endpoint (null) (Endpoint)

The endpoint to get usage for.

Responses

Response samples

Content type
application/json
{
  • "object": "list",
  • "data": [
    ]
}

Models

Get Model

Get a model by name and provide basic information.

Authorizations:
HTTPBearer
path Parameters
model
required
string (Model)

The name of the model to get.

Responses

Response samples

Content type
application/json
{
  • "id": "string",
  • "type": "automatic-speech-recognition",
  • "aliases": [
    ],
  • "created": 0,
  • "owned_by": "string",
  • "max_context_length": 0,
  • "costs": {
    },
  • "object": "model"
}

Get Models

Lists the currently available models and provides basic information.

Authorizations:
HTTPBearer

Responses

Response samples

Content type
application/json
{
  • "object": "list",
  • "data": [
    ]
}

OCR

Ocr

Extracts text from files using OCR.

Authorizations:
HTTPBearer
Request Body schema: application/json
required
ResponseFormat (object) or null

Specify the format that the model must output for the bounding boxes. By default it will use { "type": "text" }. Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is in JSON. When using JSON mode you MUST also instruct the model to produce JSON yourself with a system or a user message. Setting to { "type": "json_schema" } enables JSON schema mode, which guarantees the message the model generates is in JSON and follows the schema you provide.

required
DocumentURLChunk (object) or ImageURLChunk (object) (Document)

Document to run OCR on.

ResponseFormat (object) or null

Specify the format that the model must output for the document. By default it will use { "type": "text" }. Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is in JSON. When using JSON mode you MUST also instruct the model to produce JSON yourself with a system or a user message. Setting to { "type": "json_schema" } enables JSON schema mode, which guarantees the message the model generates is in JSON and follows the schema you provide.

Image Limit (integer) or Image Limit (null) (Image Limit)

Max images to extract

Image Min Size (integer) or Image Min Size (null) (Image Min Size)

Minimum height and width of image to extract

Include Image Base64 (boolean) or Include Image Base64 (null) (Include Image Base64)

Include image URLs in response

Model (string) or Model (null) (Model)

The model to use for the OCR.

Array of Pages (integers) or Pages (null) (Pages)

Specific pages user wants to process in various formats: single number, range, or list of both. Starts from 0

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "bbox_annotation_format": {
    },
  • "document": {
    },
  • "document_annotation_format": {
    },
  • "image_limit": 0,
  • "image_min_size": 0,
  • "include_image_base64": true,
  • "model": "string",
  • "pages": [
    ]
}

Response samples

Content type
application/json
{
  • "document_annotation": "string",
  • "id": "string",
  • "model": "string",
  • "pages": [
    ],
  • "usage": {
    },
  • "usage_info": {
    }
}

Parse

Parse Deprecated

Parse a PDF file into markdown.

Authorizations:
HTTPBearer
Request Body schema: multipart/form-data
required
file
required
string <application/octet-stream> (File)

The file to parse.

page_range
string (Page Range)
Default: ""

Page range to convert, specify comma separated page numbers or ranges. Example: '0,5-10,20'

force_ocr
boolean (Force Ocr)
Default: false

Force OCR on all pages of the PDF. Defaults to False. This can lead to worse results if you have good text in your PDFs (which is true in most cases).

Responses

Response samples

Content type
application/json
{
  • "object": "list",
  • "data": [
    ],
  • "usage": {
    }
}

Rerank

Rerank

Creates an ordered array with each text assigned a relevance score, based on the query.

Authorizations:
HTTPBearer
Request Body schema: application/json
required
query
required
string (Query) non-empty

The search query to use for the reranking. query and prompt cannot both be provided.

documents
required
Array of strings (Documents) [ items non-empty ]
model
required
string (Model) non-empty

The model to use for the reranking, call /v1/models endpoint to get the list of available models, only text-classification model type is supported.

Top N (integer) or Top N (null) (Top N)

The number of top results to return. If set to None, all results will be returned.

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "query": "string",
  • "documents": [
    ],
  • "model": "string",
  • "top_n": 1
}

Response samples

Content type
application/json
{
  • "object": "list",
  • "id": "string",
  • "results": [
    ],
  • "model": "string",
  • "usage": {
    }
}

Search

Search

Get relevant chunks from the collections and a query.

Authorizations:
HTTPBearer
query Parameters
required
boolean (Required)
Default: true
Request Body schema: application/json
required
collection_ids
Array of integers (Collection Ids) [ 0 .. 100 ] items [ items > 0 ]
Default: []

List of collections ID.

document_ids
Array of integers (Document Ids) [ 0 .. 100 ] items [ items > 0 ]
Default: []

List of document IDs

ComparisonFilter (object) or CompoundFilter (object) or Metadata Filters (null) (Metadata Filters)

Metadata filters to apply to the search.

limit
integer (Limit) ( 0 .. 100 ]
Default: 10

Number of results to return.

offset
integer (Offset) >= 0
Default: 0

Offset for pagination, specifying how many results to skip from the beginning.

method
string (SearchMethod)
Default: "semantic"
Enum: "hybrid" "semantic" "lexical"

Search method to use.

rff_k
integer (Rff K) [ 0 .. 16384 ]
Default: 60

Smoothing constant for Reciprocal Rank Fusion (RRF) algorithm in hybrid search (recommended: from 10 to 100).

score_threshold
number (Score Threshold) [ 0 .. 1 ]
Default: 0

Score of cosine similarity threshold for filtering results, only available for semantic search method.

Query (string) or Query (null) (Query)

Query related to the search.

property name*
additional property
any

Responses

Request samples

Content type
application/json
{
  • "collection_ids": [ ],
  • "document_ids": [ ],
  • "metadata_filters": {
    },
  • "limit": 10,
  • "offset": 0,
  • "method": "hybrid",
  • "rff_k": 60,
  • "score_threshold": 0,
  • "query": "string"
}

Response samples

Content type
application/json
{
  • "object": "list",
  • "data": [
    ],
  • "usage": {
    }
}

Monitoring

Get Metrics

Authorizations:
HTTPBearer

Responses

Response samples

Content type
application/json
null

Health

Responses

Response samples

Content type
application/json
null