OpenGateLLM (0.4.1)

License: MIT

OpenGateLLM connects to your models. You can configure this Swagger UI in the configuration file, for example to hide routes or change the title.

See documentation

Admin

Create Provider

Authorizations:

HTTPBearer

Request Body schema: application/json
required

router required	integer (Router) ID of the model router to create the provider for (e.g. 123).
type required	string (ProviderType) Enum: "albert" "openai" "mistral" "tei" "vllm" Model provider type.
url	Url (string) or Url (null) (Url) Model provider API URL. The URL must contain only the domain name (for example, without the `/v1` suffix). Depending on the model provider type, the URL can be optional (Albert, OpenAI).
key	Key (string) or Key (null) (Key) Model provider API key.
timeout	integer (Timeout) Default: 300 Timeout for model provider requests, after which the user receives a 503 error (model is too busy).
model_name required	string (Model Name) Model name from the model provider.
model_hosting_zone	string (ProviderCarbonFootprintZone) Default: "WOR" Enum: "ABW" "AFG" "AGO" "AIA" "ALA" "ALB" "AND" "ARE" "ARG" "ARM" "ASM" "ATA" "ATF" "ATG" "AUS" "AUT" "AZE" "BDI" "BEL" "BEN" "BES" "BFA" "BGD" "BGR" "BHR" "BHS" "BIH" "BLM" "BLR" "BLZ" "BMU" "BOL" "BRA" "BRB" "BRN" "BTN" "BVT" "BWA" "CAF" "CAN" "CCK" "CHE" "CHL" "CHN" "CIV" "CMR" "COD" "COG" "COK" "COL" "COM" "CPV" "CRI" "CUB" "CUW" "CXR" "CYM" "CYP" "CZE" "DEU" "DJI" "DMA" "DNK" "DOM" "DZA" "ECU" "EGY" "ERI" "ESH" "ESP" "EST" "ETH" "FIN" "FJI" "FLK" "FRA" "FRO" "FSM" "GAB" "GBR" "GEO" "GGY" "GHA" "GIB" "GIN" "GLP" "GMB" "GNB" "GNQ" "GRC" "GRD" "GRL" "GTM" "GUF" "GUM" "GUY" "HKG" "HMD" "HND" "HRV" "HTI" "HUN" "IDN" "IMN" "IND" "IOT" "IRL" "IRN" "IRQ" "ISL" "ISR" "ITA" "JAM" "JEY" "JOR" "JPN" "KAZ" "KEN" "KGZ" "KHM" "KIR" "KNA" "KOR" "KWT" "LAO" "LBN" "LBR" "LBY" "LCA" "LIE" "LKA" "LSO" "LTU" "LUX" "LVA" "MAC" "MAF" "MAR" "MCO" "MDA" "MDG" "MDV" "MEX" "MHL" "MKD" "MLI" "MLT" "MMR" "MNE" "MNG" "MNP" "MOZ" "MRT" "MSR" "MTQ" "MUS" "MWI" "MYS" "MYT" "NAM" "NCL" "NER" "NFK" "NGA" "NIC" "NIU" "NLD" "NOR" "NPL" "NRU" "NZL" "OMN" "PAK" "PAN" "PCN" "PER" "PHL" "PLW" "PNG" "POL" "PRI" "PRK" "PRT" "PRY" "PSE" "PYF" "QAT" "REU" "ROU" "RUS" "RWA" "SAU" "SDN" "SEN" "SGP" "SGS" "SHN" "SJM" "SLB" "SLE" "SLV" "SMR" "SOM" "SPM" "SRB" "SSD" "STP" "SUR" "SVK" "SVN" "SWE" "SWZ" "SXM" "SYC" "SYR" "TCA" "TCD" "TGO" "THA" "TJK" "TKL" "TKM" "TLS" "TON" "TTO" "TUN" "TUR" "TUV" "TWN" "TZA" "UGA" "UKR" "UMI" "URY" "USA" "UZB" "VAT" "VCT" "VEN" "VGB" "VIR" "VNM" "VUT" "WLF" "WOR" "WSM" "YEM" "ZAF" "ZMB" "ZWE" Model hosting zone using ISO 3166-1 alpha-3 code format (e.g., `WOR` for World, `FRA` for France, `USA` for United States). This determines the electricity mix used for carbon intensity calculations. For more information, see https://ecologits.ai
model_total_params	integer (Model Total Params) >= 0 Default: 0 Total number of model parameters, in billions, used for carbon footprint computation. For more information, see https://ecologits.ai
model_active_params	integer (Model Active Params) >= 0 Default: 0 Number of active model parameters, in billions, used for carbon footprint computation. For more information, see https://ecologits.ai
qos_metric	Metric (string) or null The metric to use for the quality of service policy. If not provided, no QoS policy is applied.
qos_limit	Qos Limit (number) or Qos Limit (null) (Qos Limit) The value to use for the quality of service. Depending on the metric, the value can be a percentile, a threshold, etc.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"router": 0,
"type": "albert",
"url": "string",
"key": "string",
"timeout": 300,
"model_name": "string",
"model_hosting_zone": "ABW",
"model_total_params": 0,
"model_active_params": 0,
"qos_metric": "ttft",
"qos_limit": 0
}

Response samples

Content type

application/json

{"id": 0,
"router_id": 0,
"user_id": 0,
"type": "albert",
"url": "string",
"key": "string",
"timeout": 0,
"model_name": "string",
"model_hosting_zone": "ABW",
"model_total_params": 0,
"model_active_params": 0,
"qos_metric": "ttft",
"qos_limit": 0,
"created": 0,
"updated": 0
}
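
The constraints in the Create Provider schema can be checked on the client before sending the request. The following is a minimal sketch, not part of OpenGateLLM: the helper name `build_provider_payload` is hypothetical, and treating the URL as mandatory for every provider type other than Albert and OpenAI is one reading of the description above.

```python
# Enum values copied from the Create Provider schema.
PROVIDER_TYPES = {"albert", "openai", "mistral", "tei", "vllm"}
# Assumption: the URL is optional only for the types the schema names.
URL_OPTIONAL_TYPES = {"albert", "openai"}


def build_provider_payload(router, provider_type, model_name, url=None, key=None,
                           timeout=300, hosting_zone="WOR",
                           total_params=0, active_params=0):
    """Return a JSON-serialisable body for the Create Provider request."""
    if provider_type not in PROVIDER_TYPES:
        raise ValueError(f"unknown provider type: {provider_type!r}")
    if url is None and provider_type not in URL_OPTIONAL_TYPES:
        raise ValueError(f"url is required for provider type {provider_type!r}")
    # The schema asks for the bare domain, without the /v1 suffix.
    if url is not None and url.rstrip("/").endswith("/v1"):
        raise ValueError("url must not include the /v1 suffix")
    if total_params < 0 or active_params < 0:
        raise ValueError("parameter counts must be >= 0")
    return {
        "router": router,
        "type": provider_type,
        "url": url,
        "key": key,
        "timeout": timeout,
        "model_name": model_name,
        "model_hosting_zone": hosting_zone,
        "model_total_params": total_params,
        "model_active_params": active_params,
    }
```

The defaults mirror the documented ones (`timeout` 300, `model_hosting_zone` "WOR", both parameter counts 0), so a caller only has to supply the three required fields plus a URL where needed.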

Get Providers

Get all model providers for a router.

Authorizations:

HTTPBearer

query Parameters

	Router (integer) or Router (null) (Router) Filter providers by router ID.
offset	integer (Offset) >= 0 Default: 0 The offset of the providers to get.
limit	integer (Limit) [ 1 .. 100 ] Default: 10 The limit of the providers to get.
order_by	string (Order By) Default: "id" Enum: "id" "model_name" "created" The field to order the providers by.
order_direction	string (Order Direction) Default: "asc" Enum: "asc" "desc" The direction to order the providers by.

Responses

Response samples

200
422

Content type

application/json

{"object": "list",
"data": [{"object": "provider",
"id": 0,
"router_id": 0,
"user_id": 0,
"type": "albert",
"url": "string",
"key": "string",
"timeout": 0,
"model_name": "string",
"model_hosting_zone": "ABW",
"model_total_params": 0,
"model_active_params": 0,
"qos_metric": "ttft",
"qos_limit": 0,
"created": 0,
"updated": 0
}
]
}
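
The query parameters of Get Providers can be assembled and range-checked as in the sketch below. The helper name is hypothetical, and the parameter name `router` for the router filter is an assumption, since the listing above does not show it.

```python
from urllib.parse import urlencode

# Enum values copied from the Get Providers query parameters.
ORDER_FIELDS = {"id", "model_name", "created"}
ORDER_DIRECTIONS = {"asc", "desc"}


def providers_query(router=None, offset=0, limit=10,
                    order_by="id", order_direction="asc"):
    """Return the URL-encoded query string for a Get Providers request."""
    if offset < 0:
        raise ValueError("offset must be >= 0")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if order_by not in ORDER_FIELDS:
        raise ValueError(f"unsupported order_by: {order_by!r}")
    if order_direction not in ORDER_DIRECTIONS:
        raise ValueError(f"unsupported order_direction: {order_direction!r}")
    params = {
        "offset": offset,
        "limit": limit,
        "order_by": order_by,
        "order_direction": order_direction,
    }
    # Assumed parameter name; only sent when a filter is requested.
    if router is not None:
        params["router"] = router
    return urlencode(params)
```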

Delete Provider

Authorizations:

HTTPBearer

path Parameters

provider_id

required

integer (Provider Id)

The ID of the provider to delete.

Responses

Response samples

200
422

Content type

application/json

{"object": "provider",
"id": 0,
"router_id": 0,
"user_id": 0,
"type": "albert",
"url": "string",
"key": "string",
"timeout": 0,
"model_name": "string",
"model_hosting_zone": "ABW",
"model_total_params": 0,
"model_active_params": 0,
"qos_metric": "ttft",
"qos_limit": 0,
"created": 0,
"updated": 0
}

Update Provider

Update a model provider.

Authorizations:

HTTPBearer

path Parameters

provider

required

integer (Provider)

The ID of the provider to update.

Request Body schema: application/json
required

router	Router (integer) or Router (null) (Router) The ID of the new router to assign to the provider.
timeout	Timeout (integer) or Timeout (null) (Timeout) Timeout for model provider requests, after which the user receives a 500 error (model is too busy).
model_hosting_zone	ProviderCarbonFootprintZone (string) or null Model hosting zone using ISO 3166-1 alpha-3 code format (e.g., `WOR` for World, `FRA` for France, `USA` for United States). This determines the electricity mix used for carbon intensity calculations. For more information, see https://ecologits.ai
model_total_params	Model Total Params (integer) or Model Total Params (null) (Model Total Params) Total number of model parameters, in billions, used for carbon footprint computation. If not provided, the active parameter count is used when available; otherwise the carbon footprint is not computed. For more information, see https://ecologits.ai
model_active_params	Model Active Params (integer) or Model Active Params (null) (Model Active Params) Number of active model parameters, in billions, used for carbon footprint computation. If not provided, the total parameter count is used when available; otherwise the carbon footprint is not computed. For more information, see https://ecologits.ai
qos_metric	Metric (string) or null The metric to use for the quality of service policy. If not provided, no QoS policy is applied.
qos_limit	Qos Limit (number) or Qos Limit (null) (Qos Limit) The value to use for the quality of service. Depending on the metric, the value can be a percentile, a threshold, etc.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"router": 0,
"timeout": 0,
"model_hosting_zone": "ABW",
"model_total_params": 0,
"model_active_params": 0,
"qos_metric": "ttft",
"qos_limit": 0
}

Response samples

422

Content type

application/json

{"detail": [{"loc": ["string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": { }
}
]
}
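
The 422 body shown above has the same shape for every endpoint in this reference, so one small helper can turn it into readable messages. This is a sketch only; the function name is hypothetical and the error values used in the usage note are invented for illustration.

```python
def format_validation_errors(body):
    """Turn a 422 response body into one readable line per error."""
    lines = []
    for err in body.get("detail", []):
        # "loc" is a path to the offending field, e.g. ["body", "timeout"].
        location = ".".join(str(part) for part in err.get("loc", []))
        lines.append(f"{location}: {err.get('msg', '')} ({err.get('type', '')})")
    return lines
```

For example, a body whose single error has `loc` set to `["body", "timeout"]`, `msg` set to `"bad value"` and `type` set to `"int_parsing"` is rendered as `body.timeout: bad value (int_parsing)`.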

Get Provider

Get a model provider by router and provider IDs.

Authorizations:

HTTPBearer

path Parameters

provider

required

integer (Provider)

The ID of the provider to get.

Responses

Response samples

200
422

Content type

application/json

{"object": "provider",
"id": 0,
"router_id": 0,
"user_id": 0,
"type": "albert",
"url": "string",
"key": "string",
"timeout": 0,
"model_name": "string",
"model_hosting_zone": "ABW",
"model_total_params": 0,
"model_active_params": 0,
"qos_metric": "ttft",
"qos_limit": 0,
"created": 0,
"updated": 0
}

Create Router

Authorizations:

HTTPBearer

Request Body schema: application/json
required

name required	string (Name) non-empty Name of the model router.
type required	string (ModelType) Enum: "automatic-speech-recognition" "image-text-to-text" "image-to-text" "text-embeddings-inference" "text-generation" "text-classification" Type of the model router. It will be used to identify the model router type.
aliases	Array of strings (Aliases) [ items [ 1 .. 64 ] characters ] Aliases of the model. Users can refer to the model by any of them.
load_balancing_strategy	string (RouterLoadBalancingStrategy) Default: "shuffle" Enum: "shuffle" "least_busy" Routing strategy for load balancing between providers of the model. It will be used to identify the model type.
cost_prompt_tokens	number (Cost Prompt Tokens) >= 0 Default: 0 Cost of a million prompt tokens (deducted from the user budget).
cost_completion_tokens	number (Cost Completion Tokens) >= 0 Default: 0 Cost of a million completion tokens (deducted from the user budget).
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"name": "model-router-1",
"type": "automatic-speech-recognition",
"aliases": ["model-alias",
"model-alias-2"
],
"load_balancing_strategy": "shuffle",
"cost_prompt_tokens": 0,
"cost_completion_tokens": 0
}

Response samples

201
401
403
409
422

Content type

application/json

{"id": 0,
"name": "model-router-1",
"type": "automatic-speech-recognition",
"aliases": ["model-alias",
"model-alias-2"
],
"load_balancing_strategy": "shuffle",
"cost_prompt_tokens": 0,
"cost_completion_tokens": 0
}
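
Because `cost_prompt_tokens` and `cost_completion_tokens` are expressed per million tokens, the amount deducted for one request can be estimated as below. This is a sketch of the arithmetic only: the function name is hypothetical, and the assumption that the two costs are simply added is not confirmed by this reference.

```python
def request_cost(prompt_tokens, completion_tokens,
                 cost_prompt_tokens, cost_completion_tokens):
    """Estimate the budget deducted for one request.

    Both cost arguments are prices per one million tokens, as in the
    Create Router schema.
    """
    total = (prompt_tokens * cost_prompt_tokens
             + completion_tokens * cost_completion_tokens)
    return total / 1_000_000
```

With a router priced at 2.0 per million prompt tokens and 6.0 per million completion tokens, a request with 1500 prompt tokens and 500 completion tokens works out to (1500 × 2.0 + 500 × 6.0) / 1,000,000 = 0.006.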

Get Routers

Authorizations:

HTTPBearer

query Parameters

offset	integer (Offset) >= 0 Default: 0 Number of routers to skip.
limit	integer (Limit) [ 1 .. 100 ] Default: 10 Maximum number of routers to return.
sort_by	string (RouterSortField) Default: "id" Enum: "id" "name" "created" Field to sort by.
sort_order	string (SortOrder) Default: "asc" Enum: "asc" "desc" Sort order.

Responses

Response samples

200
401
403
422

Content type

application/json

{"object": "list",
"total": 0,
"offset": 0,
"limit": 0,
"data": [{"object": "router",
"id": 0,
"name": "string",
"user_id": 0,
"type": "automatic-speech-recognition",
"aliases": ["model-alias",
"model-alias-2"
],
"load_balancing_strategy": "shuffle",
"vector_size": 0,
"max_context_length": 0,
"cost_prompt_tokens": 0,
"cost_completion_tokens": 0,
"providers": 0,
"created": 0,
"updated": 0
}
]
}
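
The `total`, `offset` and `limit` fields of the list response make it possible to walk through all routers page by page. The sketch below only computes the offsets to request; the function name is hypothetical, and it assumes `total` is the overall number of routers rather than the size of the current page.

```python
def page_offsets(total, limit=10):
    """Return the offset of every page needed to read `total` items."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if total < 0:
        raise ValueError("total must be >= 0")
    # One request per step of `limit`, starting at offset 0.
    return list(range(0, total, limit))
```

A first request with `offset=0` yields `total`; the remaining offsets can then be requested in turn with the same `limit`.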

Get Router

Authorizations:

HTTPBearer

path Parameters

router_id

required

integer (Router Id)

The router ID.

Responses

Response samples

200
401
403
404
422

Content type

application/json

{"object": "router",
"id": 0,
"name": "string",
"user_id": 0,
"type": "automatic-speech-recognition",
"aliases": ["model-alias",
"model-alias-2"
],
"load_balancing_strategy": "shuffle",
"vector_size": 0,
"max_context_length": 0,
"cost_prompt_tokens": 0,
"cost_completion_tokens": 0,
"providers": 0,
"created": 0,
"updated": 0
}

Delete Router

Authorizations:

HTTPBearer

path Parameters

router_id

required

integer (Router Id)

The ID of the router to delete (e.g. 123).

Responses

Response samples

200
401
403
404
422

Content type

application/json

{"object": "router",
"id": 0,
"name": "string",
"user_id": 0,
"type": "automatic-speech-recognition",
"aliases": ["model-alias",
"model-alias-2"
],
"load_balancing_strategy": "shuffle",
"vector_size": 0,
"max_context_length": 0,
"cost_prompt_tokens": 0,
"cost_completion_tokens": 0,
"providers": 0,
"created": 0,
"updated": 0
}

Create Organization

Authorizations:

HTTPBearer

Request Body schema: application/json
required

name required	string (Name) non-empty The organization name.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"name": "string"
}

Response samples

201
422

Content type

application/json

{"id": 0
}

Get Organizations

Authorizations:

HTTPBearer

query Parameters

offset	integer (Offset) >= 0 Default: 0 The offset of the organizations to get.
limit	integer (Limit) [ 1 .. 100 ] Default: 10 The limit of the organizations to get.
order_by	string (Order By) Default: "id" Enum: "id" "name" "created" "updated" The field to order the organizations by.
order_direction	string (Order Direction) Default: "asc" Enum: "asc" "desc" The direction to order the organizations by.

Responses

Response samples

200
422

Content type

application/json

{"object": "list",
"data": [{"object": "organization",
"id": 0,
"name": "string",
"users": 0,
"created": 0,
"updated": 0
}
]
}

Delete Organization

Authorizations:

HTTPBearer

path Parameters

organization

required

integer (Organization)

The ID of the organization to delete.

Responses

Response samples

422

Content type

application/json

{"detail": [{"loc": ["string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": { }
}
]
}

Update Organization

Authorizations:

HTTPBearer

path Parameters

organization

required

integer (Organization)

The ID of the organization to update.

Request Body schema: application/json
required

	Name (string) or Name (null) (Name) The new organization name.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"name": "string"
}

Response samples

422

Content type

application/json

{"detail": [{"loc": ["string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": { }
}
]
}

Get Organization

Authorizations:

HTTPBearer

path Parameters

organization

required

integer (Organization)

The ID of the organization to get.

Responses

Response samples

200
422

Content type

application/json

{"object": "organization",
"id": 0,
"name": "string",
"users": 0,
"created": 0,
"updated": 0
}

Create Role

Create a new role.

Authorizations:

HTTPBearer

Request Body schema: application/json
required

name required	string (Name) non-empty
	Array of Permissions (strings) or Permissions (null) (Permissions) Default: []
	Array of objects (Limits) Default: []
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"name": "string",
"permissions": [ ],
"limits": [ ]
}

Response samples

201
422

Content type

application/json

{"id": 0
}

Get Roles

Get all roles.

Authorizations:

HTTPBearer

query Parameters

offset	integer (Offset) >= 0 Default: 0 The offset of the roles to get.
limit	integer (Limit) [ 1 .. 100 ] Default: 10 The limit of the roles to get.
order_by	string (Order By) Default: "id" Enum: "id" "name" "created" "updated" The field to order the roles by.
order_direction	string (Order Direction) Default: "asc" Enum: "asc" "desc" The direction to order the roles by.

Responses

Response samples

200
422

Content type

application/json

{"object": "list",
"data": [{"object": "role",
"id": 0,
"name": "string",
"permissions": ["admin"
],
"limits": [{"router": 0,
"type": "tpm",
"value": 0
}
],
"users": 0,
"created": 0,
"updated": 0
}
]
}

Delete Role

Delete a role.

Authorizations:

HTTPBearer

path Parameters

role

required

integer (Role)

The ID of the role to delete.

Responses

Response samples

422

Content type

application/json

{"detail": [{"loc": ["string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": { }
}
]
}

Update Role

Update a role.

Authorizations:

HTTPBearer

path Parameters

role

required

integer (Role)

The ID of the role to update.

Request Body schema: application/json
required

	Name (string) or Name (null) (Name) The new role name.
	Array of Permissions (strings) or Permissions (null) (Permissions) The new permissions.
	Array of Limits (objects) or Limits (null) (Limits) The new limits.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"name": "string",
"permissions": ["admin"
],
"limits": [{"router": 0,
"type": "tpm",
"value": 0
}
]
}

Response samples

422

Content type

application/json

{"detail": [{"loc": ["string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": { }
}
]
}

Get Role

Get a role by id.

Authorizations:

HTTPBearer

path Parameters

role

required

integer (Role)

The ID of the role to get.

Responses

Response samples

200
422

Content type

application/json

{"object": "role",
"id": 0,
"name": "string",
"permissions": ["admin"
],
"limits": [{"router": 0,
"type": "tpm",
"value": 0
}
],
"users": 0,
"created": 0,
"updated": 0
}

Update Router

Update a router.

Authorizations:

HTTPBearer

path Parameters

router

required

integer (Router)

The ID of the router to update (e.g. 123).

Request Body schema: application/json
required

	Name (string) or Name (null) (Name) Name of the model router.
	ModelType (string) or null Type of the model router. It will be used to identify the model router type.
	Array of Aliases (strings) or Aliases (null) (Aliases) Aliases of the model. Users can refer to the model by any of them.
	RouterLoadBalancingStrategy (string) or null Routing strategy for load balancing between providers of the model. It will be used to identify the model type.
	Cost Prompt Tokens (number) or Cost Prompt Tokens (null) (Cost Prompt Tokens) Cost of a million prompt tokens (deducted from the user budget).
	Cost Completion Tokens (number) or Cost Completion Tokens (null) (Cost Completion Tokens) Cost of a million completion tokens (deducted from the user budget).
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"name": "model-router-1",
"type": "text-generation",
"aliases": ["model-alias",
"model-alias-2"
],
"load_balancing_strategy": "least_busy",
"cost_prompt_tokens": 0,
"cost_completion_tokens": 0
}

Response samples

422

Content type

application/json

{"detail": [{"loc": ["string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": { }
}
]
}

Create Token

Create a new token.

Authorizations:

HTTPBearer

Request Body schema: application/json
required

name required	string (Name) non-empty
user required	integer (User) User ID to create the token for another user (by default, the current user). Requires the CREATE_USER permission.
expires	Expires (integer) or Expires (null) (Expires) Timestamp in seconds.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"name": "string",
"user": 0,
"expires": 0
}

Response samples

201
422

Content type

application/json

{"id": 0,
"token": "string"
}
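
Since `expires` is a timestamp in seconds, a token lifetime expressed in days has to be converted before the request is sent. The sketch below assumes the timestamp is a Unix epoch value, which this reference does not state explicitly; the helper name and its `valid_for_days` argument are hypothetical.

```python
import time

SECONDS_PER_DAY = 86_400


def build_token_payload(name, user, valid_for_days=None, now=None):
    """Return a Create Token body, with an optional expiry in days."""
    if not name:
        raise ValueError("name must be non-empty")
    payload = {"name": name, "user": user}
    if valid_for_days is not None:
        # `now` can be injected for testing; defaults to the current time.
        current = time.time() if now is None else now
        payload["expires"] = int(current) + valid_for_days * SECONDS_PER_DAY
    return payload
```

When `valid_for_days` is omitted, the `expires` field is left out of the body entirely.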

Get Tokens

Get all your tokens.

Authorizations:

HTTPBearer

query Parameters

	User (integer) or User (null) (User) The ID of the user to get the tokens for.
offset	integer (Offset) >= 0 Default: 0 The offset of the tokens to get.
limit	integer (Limit) [ 1 .. 100 ] Default: 10 The limit of the tokens to get.
order_by	string (Order By) Default: "id" Enum: "id" "name" "created" The field to order the tokens by.
order_direction	string (Order Direction) Default: "asc" Enum: "asc" "desc" The direction to order the tokens by.

Responses

Response samples

200
422

Content type

application/json

{"object": "list",
"data": [{"object": "token",
"id": 0,
"name": "string",
"token": "string",
"user": 0,
"expires": 0,
"created": 0
}
]
}

Delete Token

Delete a token.

Authorizations:

HTTPBearer

path Parameters

user required	integer (User) The ID of the user to delete the token for.
token required	integer (Token) The ID of the token to delete.

Responses

Response samples

422

Content type

application/json

{"detail": [{"loc": ["string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": { }
}
]
}

Get Token

Get your token by id.

Authorizations:

HTTPBearer

path Parameters

token

required

integer (Token)

The ID of the token to get.

Responses

Response samples

200
422

Content type

application/json

{"object": "token",
"id": 0,
"name": "string",
"token": "string",
"user": 0,
"expires": 0,
"created": 0
}

Create User

Create a new user.

Authorizations:

HTTPBearer

Request Body schema: application/json
required

email required	string (Email) The user email.
name	Name (string) or Name (null) (Name) The user name.
password required	string (Password) The user password.
role required	integer (Role) The role ID.
organization	Organization (integer) or Organization (null) (Organization) The organization ID.
budget	Budget (number) or Budget (null) (Budget) The budget.
expires	Expires (integer) or Expires (null) (Expires) The expiration timestamp.
priority	Priority (integer) or Priority (null) (Priority) Default: 0 The user priority. Higher value means higher priority. 0 is default.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"email": "string",
"name": "string",
"password": "string",
"role": 0,
"organization": 0,
"budget": 0,
"expires": 0,
"priority": 0
}

Response samples

201
422

Content type

application/json

{"id": 0
}

Get Users

Get all users.

Authorizations:

HTTPBearer

query Parameters

	Role (integer) or Role (null) (Role) The ID of the role to filter the users by.
	Organization (integer) or Organization (null) (Organization) The ID of the organization to filter the users by.
	Email (string) or Email (null) (Email) The email of the user to filter the users by.
offset	integer (Offset) >= 0 Default: 0 The offset of the users to get.
limit	integer (Limit) [ 1 .. 100 ] Default: 10 The limit of the users to get.
order_by	string (Order By) Default: "id" Enum: "id" "name" "created" "updated" The field to order the users by.
order_direction	string (Order Direction) Default: "asc" Enum: "asc" "desc" The direction to order the users by.

Responses

Response samples

200
422

Content type

application/json

null

Delete User

Delete a user.

Authorizations:

HTTPBearer

path Parameters

user

required

integer (User)

The ID of the user to delete.

Responses

Response samples

422

Content type

application/json

{"detail": [{"loc": ["string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": { }
}
]
}

Update User

Update a user.

Authorizations:

HTTPBearer

path Parameters

user

required

integer (User)

The ID of the user to update.

Request Body schema: application/json
required

email	Email (string) or Email (null) (Email) The new user email. If null, the user email is not changed.
name	Name (string) or Name (null) (Name) The new user name. If null, the user name is not changed.
current_password	Current Password (string) or Current Password (null) (Current Password) The current user password.
password	Password (string) or Password (null) (Password) The new user password. If null, the user password is not changed.
role	Role (integer) or Role (null) (Role) The new role ID. If null, the user role is not changed.
organization	Organization (integer) or Organization (null) (Organization) The new organization ID. If null, the user is removed from their current organization, if any.
budget	Budget (number) or Budget (null) (Budget) The new budget. If null, the user will have no budget.
expires	Expires (integer) or Expires (null) (Expires) The new expiration timestamp. If null, the user will never expire.
priority	Priority (integer) or Priority (null) (Priority) The new user priority. Higher value means higher priority. If null, the priority is not changed.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"email": "string",
"name": "string",
"current_password": "string",
"password": "string",
"role": 0,
"organization": 0,
"budget": 0,
"expires": 0,
"priority": 0
}

Response samples

422

Content type

application/json

{"detail": [{"loc": ["string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": { }
}
]
}

Get User

Get a user by id.

Authorizations:

HTTPBearer

path Parameters

user

required

integer (User)

The ID of the user to get.

Responses

Response samples

200
422

Content type

application/json

null

Audio

Audio Transcriptions

Transcribes audio into the input language.

Authorizations:

HTTPBearer

Request Body schema: multipart/form-data
required

file required	string <application/octet-stream> (File) The audio file object (not file name) to transcribe, in one of these formats: mp3 or wav.
model required	string (Model) ID of the model to use. Call `/v1/models` endpoint to get the list of available models, only `automatic-speech-recognition` model type is supported.
language	string (AudioTranscriptionLanguage) Default: "" Enum: "af" "afrikaans" "albanian" "am" "amharic" "ar" "arabic" "armenian" "as" "assamese" "az" "azerbaijani" "ba" "bashkir" "basque" "be" "belarusian" "bengali" "bg" "bn" "bo" "bosnian" "br" "breton" "bs" "bulgarian" "burmese" "ca" "cantonese" "castilian" "catalan" "chinese" "croatian" "cs" "cy" "czech" "da" "danish" "de" "dutch" "el" "en" "english" "es" "estonian" "et" "eu" "fa" "faroese" "fi" "finnish" "flemish" "fo" "fr" "french" "galician" "georgian" "german" "gl" "greek" "gu" "gujarati" "ha" "haitian" "haitian creole" "hausa" "haw" "hawaiian" "he" "hebrew" "hi" "hindi" "hr" "ht" "hu" "hungarian" "hy" "icelandic" "id" "indonesian" "is" "it" "italian" "ja" "japanese" "javanese" "jw" "ka" "kannada" "kazakh" "khmer" "kk" "km" "kn" "ko" "korean" "la" "lao" "latin" "latvian" "lb" "letzeburgesch" "lingala" "lithuanian" "ln" "lo" "lt" "luxembourgish" "lv" "macedonian" "malagasy" "malay" "malayalam" "maltese" "mandarin" "maori" "marathi" "mg" "mi" "mk" "ml" "mn" "moldavian" "moldovan" "mongolian" "mr" "ms" "mt" "my" "myanmar" "ne" "nepali" "nl" "nn" "no" "norwegian" "nynorsk" "oc" "occitan" "pa" "panjabi" "pashto" "persian" "pl" "polish" "portuguese" "ps" "pt" "punjabi" "pushto" "ro" "romanian" "ru" "russian" "sa" "sanskrit" "sd" "serbian" "shona" "si" "sindhi" "sinhala" "sinhalese" "sk" "sl" "slovak" "slovenian" "sn" "so" "somali" "spanish" "sq" "sr" "su" "sundanese" "sv" "sw" "swahili" "swedish" "ta" "tagalog" "tajik" "tamil" "tatar" "te" "telugu" "tg" "th" "thai" "tibetan" "tk" "tl" "tr" "tt" "turkish" "turkmen" "uk" "ukrainian" "ur" "urdu" "uz" "uzbek" "valencian" "vi" "vietnamese" "welsh" "yi" "yiddish" "yo" "yoruba" "yue" "zh" "" The language of the output audio. If the output language is different than the audio language, the audio language will be translated into the output language. Supplying the output language in ISO-639-1 (e.g. en, fr) format will improve accuracy and latency.
prompt	string (Prompt) Default: "" An optional text to tell the model what to do with the input audio.
response_format	string (Response Format) Default: "json" Enum: "json" "text" The format of the transcript output, in one of these formats: `json` or `text`.
temperature	number (Temperature) [ 0 .. 1 ] Default: 0 The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.

Responses

Response samples

200
422

Content type

application/json

{"id": "string",
"text": "string",
"model": "string",
"usage": {"prompt_tokens": 0,
"completion_tokens": 0,
"total_tokens": 0,
"cost": 0,
"carbon": {"kWh": {"min": 0,
"max": 0
},
"kgCO2eq": {"min": 0,
"max": 0
}
},
"requests": 0
}
}
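
The non-file fields of the multipart form can be validated before upload against the ranges listed in the request schema. The following is a sketch under assumptions: the helper name is hypothetical, and checking the file format by extension is only a heuristic for the "mp3 or wav" requirement.

```python
import os

# Values copied from the Audio Transcriptions schema.
ALLOWED_EXTENSIONS = {".mp3", ".wav"}
RESPONSE_FORMATS = {"json", "text"}


def build_transcription_fields(model, filename, language="", prompt="",
                               response_format="json", temperature=0.0):
    """Return the text fields of the multipart form, as strings."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError("audio file must be mp3 or wav")
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    # Multipart form values are sent as text, so numbers are stringified.
    return {
        "model": model,
        "language": language,
        "prompt": prompt,
        "response_format": response_format,
        "temperature": str(temperature),
    }
```

The audio bytes themselves go in the separate `file` part of the form; this helper only covers the accompanying fields.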

Auth

Login

Receives an encrypted token from the playground, encoded with a shared key, via the POST body. The token contains the user ID. Refreshes and returns the playground API key associated with the user.

Request Body schema: application/json
required

email required	string (Email) non-empty The user email.
password required	string (Password) non-empty The user password.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"email": "string",
"password": "string"
}

Response samples

200
422

Content type

application/json

{"id": 0,
"key": "string"
}
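
A login request and the follow-up authorization header could be prepared as sketched below, without sending anything. Several points are assumptions rather than documented behaviour: the path `/v1/auth/login` is a placeholder because this reference does not list endpoint paths, and using the returned `key` as the HTTPBearer token for other endpoints is inferred, not stated.

```python
import json
from urllib.request import Request


def login_request(base_url, email, password, path="/v1/auth/login"):
    """Build (but do not send) a Login request. `path` is a placeholder."""
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def bearer_headers(login_response):
    """Build an Authorization header from a parsed Login response."""
    # Assumption: the returned key is accepted as the bearer token.
    return {"Authorization": f"Bearer {login_response['key']}"}
```

The prepared request can be passed to `urllib.request.urlopen` once the real path for the deployment is known.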

Chat

Chat Completions

Creates a model response for the given chat conversation.

Authorizations:

HTTPBearer

query Parameters

required

boolean (Required)

Default: false

Request Body schema: application/json
required

messages required	Array of any (Messages) A list of messages comprising the conversation so far.
model required	string (Model) ID of the model to use. Call `/v1/models` endpoint to get the list of available models, only `text-generation` model type is supported.
	Frequency Penalty (number) or Frequency Penalty (null) (Frequency Penalty) Default: 0 Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
	Logit Bias (object) or Logit Bias (null) (Logit Bias) Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.
	Logprobs (boolean) or Logprobs (null) (Logprobs) Default: false Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the `content` of `message`.
	Top Logprobs (integer) or Top Logprobs (null) (Top Logprobs) An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. `logprobs` must be set to `true` if this parameter is used.
presence_penalty	Presence Penalty (number) or Presence Penalty (null) (Presence Penalty) Default: 0 Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
max_completion_tokens	Max Completion Tokens (integer) or Max Completion Tokens (null) (Max Completion Tokens) An upper bound for the number of tokens that can be generated for a completion.
n	N (integer) or N (null) (N) Default: 1 How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep `n` as `1` to minimize costs.
response_format	Response Format (any) or Response Format (null) (Response Format) Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to `{ "type": "json_object" }` enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.
seed	Seed (integer) or Seed (null) (Seed) If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result. Determinism is not guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend.
stop	Stop (string) or Array of Stop (strings) or Stop (null) (Stop) Up to 4 sequences where the API will stop generating further tokens.
stream	Stream (boolean) or Stream (null) (Stream) Default: false If set, partial message deltas will be sent. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a `data: [DONE]` message.
stream_options	Stream Options (any) or Stream Options (null) (Stream Options) Options for streaming response. Only set this when you set `stream: true`.
temperature	Temperature (number) or Temperature (null) (Temperature) Default: 0.7 What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both.
top_p	Top P (number) or Top P (null) (Top P) Default: 1 An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with `top_p` probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both.
tools	(Tools ((Array of Tools (objects or SearchTool (object))) or Tools (null))) or Tools (null) (Tools)
tool_choice	any (Tool Choice) Default: "none" Controls which (if any) tool is called by the model. `none` means the model will not call any tool and instead generates a message. `auto` means the model can pick between generating a message or calling one or more tools. `required` means the model must call one or more tools. Specifying a particular tool via `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool. `none` is the default when no tools are present. `auto` is the default if tools are present.
parallel_tool_calls	Parallel Tool Calls (boolean) or Parallel Tool Calls (null) (Parallel Tool Calls) Default: false Whether to call tools in parallel or sequentially. If true, the model will call tools in parallel. If false, the model will call tools sequentially. If None, the model will call tools in parallel if the model supports it, otherwise it will call tools sequentially.
user	User (string) or User (null) (User) A unique identifier representing the user.
search	boolean (Search) Deprecated Default: false
search_args	SearchArgs (object) or null Deprecated
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"messages": [null
],
"model": "string",
"frequency_penalty": 0,
"logit_bias": {"property1": 0,
"property2": 0
},
"logprobs": false,
"top_logprobs": 0,
"presence_penalty": 0,
"max_completion_tokens": 0,
"n": 1,
"response_format": { },
"seed": 0,
"stop": "string",
"stream": false,
"stream_options": { },
"temperature": 0.7,
"top_p": 1,
"tools": [{ }
],
"tool_choice": "none",
"parallel_tool_calls": false,
"user": "string",
"search": false,
"search_args": {"collection_ids": [ ],
"document_ids": [ ],
"metadata_filters": {"key": "string",
"type": "eq",
"value": "string"
},
"limit": 10,
"offset": 0,
"method": "hybrid",
"rff_k": 60,
"score_threshold": 0
}
}
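
The JSON mode requirement described for `response_format` can be sketched in Python. This is a minimal illustration, not the project's client: the helper names, the model name, and the `role`/`content` message shape (assumed to follow the OpenAI chat format, which this section does not show) are assumptions. The sketch builds a body that enables JSON mode together with the required instruction, and refuses to parse a reply whose `finish_reason` is `length`.

```python
import json


def build_json_mode_payload(model: str, question: str) -> dict:
    """Build a chat completion body that requests JSON mode."""
    return {
        "model": model,
        "messages": [
            # JSON mode requires an explicit instruction to produce JSON.
            {"role": "system", "content": "Reply with a single valid JSON object."},
            {"role": "user", "content": question},
        ],
        "response_format": {"type": "json_object"},
        "max_completion_tokens": 256,
        "temperature": 0.2,
    }


def parse_json_reply(response: dict) -> dict:
    """Parse the first choice as JSON, rejecting replies that were cut off."""
    choice = response["choices"][0]
    if choice["finish_reason"] == "length":
        raise ValueError("reply was cut off before the JSON was complete")
    return json.loads(choice["message"]["content"])
```

Checking `finish_reason` before parsing avoids confusing a truncated reply with a malformed one.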

Response samples

200
404
422
503

Content type

application/json

Example

ChatCompletion

{"id": "string",
"choices": [{"finish_reason": "stop",
"index": 0,
"logprobs": {"content": [{"token": "string",
"bytes": [0
],
"logprob": 0,
"top_logprobs": [{"token": "string",
"bytes": [0
],
"logprob": 0
}
]
}
],
"refusal": [{"token": "string",
"bytes": [0
],
"logprob": 0,
"top_logprobs": [{"token": "string",
"bytes": [0
],
"logprob": 0
}
]
}
]
},
"message": {"content": "string",
"refusal": "string",
"role": "assistant",
"annotations": [{"type": "url_citation",
"url_citation": {"end_index": 0,
"start_index": 0,
"title": "string",
"url": "string"
}
}
],
"audio": {"id": "string",
"data": "string",
"expires_at": 0,
"transcript": "string"
},
"function_call": {"arguments": "string",
"name": "string"
},
"tool_calls": [{"id": "string",
"function": {"arguments": "string",
"name": "string"
},
"type": "function"
}
]
}
}
],
"created": 0,
"model": "string",
"object": "chat.completion",
"service_tier": "auto",
"system_fingerprint": "string",
"usage": {"prompt_tokens": 0,
"completion_tokens": 0,
"total_tokens": 0,
"cost": 0,
"carbon": {"kWh": {"min": 0,
"max": 0
},
"kgCO2eq": {"min": 0,
"max": 0
}
},
"requests": 0
},
"search_results": [ ]
}
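
When `stream` is true, the reply arrives as server-sent events terminated by `data: [DONE]` instead of the single object above. A rough sketch of reading such a stream follows. It assumes each `data:` line carries a JSON chunk whose `choices[].delta.content` holds the new text (the OpenAI chunk layout, which this section does not document), and the function names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line until `[DONE]`."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream marker
        yield json.loads(data)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the delta text across events (assumes n = 1)."""
    parts = []
    for event in iter_stream_events(lines):
        for choice in event.get("choices", []):
            delta = choice.get("delta") or {}
            parts.append(delta.get("content") or "")
    return "".join(parts)
```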

Chunks

Get Chunk Deprecated

Get a chunk of a document.

Authorizations:

HTTPBearer

path Parameters

document required	integer (Document) The document ID
chunk required	integer (Chunk) The chunk ID

query Parameters

required	boolean (Required) Default: true

Responses

Response samples

200
422

Content type

application/json

{"object": "chunk",
"id": 0,
"collection_id": 0,
"document_id": 0,
"content": "string",
"metadata": {"property1": "string",
"property2": "string"
},
"created": 0
}

Get Chunks Deprecated

Get chunks of a document.

Authorizations:

HTTPBearer

path Parameters

document required	integer (Document) The document ID

query Parameters

limit	integer (Limit) [ 1 .. 100 ] Default: 10 The number of chunks to return
offset	integer (Offset) Default: 0 The offset of the first chunk to return
required	boolean (Required) Default: true

Responses

Response samples

200
422

Content type

application/json

{"object": "list",
"data": [{"object": "chunk",
"id": 0,
"collection_id": 0,
"document_id": 0,
"content": "string",
"metadata": {"property1": "string",
"property2": "string"
},
"created": 0
}
]
}

Collections

Create Collection

Create a new collection.

Authorizations:

HTTPBearer

Request Body schema: application/json
required

name required	string (Name) non-empty The name of the collection.
description	Description (string) or Description (null) (Description) The description of the collection.
visibility	string (CollectionVisibility) Default: "private" Enum: "private" "public" The visibility of the collection. Public collections are available to all users; private collections are only available to the user who created them.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"name": "string",
"description": "string",
"visibility": "private"
}
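
A hedged sketch of preparing this request with Python's standard library. The base URL, the `/v1/collections` path, and the helper name are assumptions (the path is not stated in this section). The bearer token goes in the `Authorization` header, as the HTTPBearer scheme implies. The request is only built here, not sent.

```python
import json
import urllib.request

ALLOWED_VISIBILITY = ("private", "public")


def build_create_collection_request(base_url, token, name,
                                    description=None, visibility="private"):
    """Build (without sending) a POST request that creates a collection."""
    if not name:
        raise ValueError("name must be non-empty")
    if visibility not in ALLOWED_VISIBILITY:
        raise ValueError("visibility must be 'private' or 'public'")
    body = {"name": name, "visibility": visibility}
    if description is not None:
        body["description"] = description
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/collections",  # assumed path
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it.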

Response samples

201
422

Content type

application/json

null

Get Collections

Get list of collections.

Authorizations:

HTTPBearer

query Parameters

name	string (Name) Filter by collection name.
	CollectionVisibility (string) or Visibility (null) (Visibility) Filter by collection visibility.
offset	integer (Offset) >= 0 Default: 0 The offset of the collections to get.
limit	integer (Limit) [ 1 .. 100 ] Default: 10 The limit of the collections to get.
order_by	string (Order By) Default: "id" Enum: "id" "name" "created" "updated" The order by field to sort the collections by.
order_direction	string (Order Direction) Default: "asc" Enum: "asc" "desc" The direction to order the collections by.

Responses

Response samples

200
422

Content type

application/json

{"object": "list",
"data": [{"object": "collection",
"id": 0,
"name": "string",
"owner": "string",
"description": "string",
"visibility": "private",
"created": 0,
"updated": 0,
"documents": 0
}
]
}
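
The `offset` and `limit` parameters suggest a simple paging loop. The sketch below is one possible approach, assuming a caller-supplied `fetch_page(offset, limit)` function (hypothetical) that returns the list object shown above; it stops when a page comes back shorter than `limit`.

```python
from typing import Callable, Iterator


def iter_items(fetch_page: Callable[[int, int], dict], limit: int = 10) -> Iterator[dict]:
    """Yield every item by advancing `offset` until a short page is returned."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    offset = 0
    while True:
        page = fetch_page(offset, limit)["data"]
        yield from page
        if len(page) < limit:
            return  # last page reached
        offset += limit
```

When the total is an exact multiple of `limit`, the loop makes one extra call that returns an empty page.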

Get Collection

Get a collection by ID.

Authorizations:

HTTPBearer

path Parameters

collection_id required	integer (Collection Id) The collection ID

Responses

Response samples

200
422

Content type

application/json

{"object": "collection",
"id": 0,
"name": "string",
"owner": "string",
"description": "string",
"visibility": "private",
"created": 0,
"updated": 0,
"documents": 0
}

Delete Collection

Delete a collection.

Authorizations:

HTTPBearer

path Parameters

collection_id required	integer (Collection Id) The collection ID

query Parameters

required	boolean (Required) Default: true

Responses

Response samples

422

Content type

application/json

{"detail": [{"loc": ["string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": { }
}
]
}
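
The 422 body above has a regular shape, so it can be flattened into readable messages. A small sketch, with a hypothetical helper name and an invented sample error for illustration:

```python
from typing import List


def format_validation_errors(body: dict) -> List[str]:
    """Turn a 422 body into one 'location: message (type)' string per error."""
    lines = []
    for err in body.get("detail", []):
        # `loc` is a path such as ["query", "required"]; join it with dots.
        location = ".".join(str(part) for part in err.get("loc", []))
        lines.append(f"{location}: {err.get('msg', '')} ({err.get('type', '')})")
    return lines
```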

Update Collection

Update a collection.

Authorizations:

HTTPBearer

path Parameters

collection_id required	integer (Collection Id) The collection ID

Request Body schema: application/json
required

name	Name (string) or Name (null) (Name) The name of the collection.
description	Description (string) or Description (null) (Description) The description of the collection.
visibility	CollectionVisibility (string) or null The visibility of the collection. Public collections are available to all users; private collections are only available to the user who created them.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"name": "string",
"description": "string",
"visibility": "private"
}

Response samples

422

Content type

application/json

{"detail": [{"loc": ["string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": { }
}
]
}

Documents

Create Document

Upload a file, parse and split it into chunks, then create a document. If no file is provided, the document is created without content; use POST /v1/documents/{document_id}/chunks to fill it.

Authorizations:

HTTPBearer

query Parameters

required	boolean (Required) Default: true

Request Body schema: multipart/form-data

	File (string) or File (null) (File) The file to create a document from. If not provided, the document is created without content; use POST `/v1/documents/{document_id}/chunks` to fill it.
	Name (string) or Name (null) (Name) Name of document if no file is provided or to override file name.
	Collection Id (integer) or Collection Id (null) (Collection Id) The collection ID to use for the file upload. The file will be vectorized with model defined by the collection.
	Collection (integer) or Collection (null) (Collection) Deprecated
disable_chunking	boolean (Disable Chunking) Default: false Whether to disable `RecursiveCharacterTextSplitter` chunking for the uploaded file.
chunk_size	integer (Chunk Size) >= 0 Default: 2048 The size in characters of the chunks to use for the uploaded file. If not provided, the document will not be split into chunks.
chunk_min_size	integer (Chunk Min Size) >= 0 Default: 0 The minimum size in characters of the chunks to use for the uploaded file.
chunk_overlap	integer (Chunk Overlap) >= 0 Default: 0 The overlap in characters of the chunks to use for the uploaded file.
is_separator_regex	boolean (Is Separator Regex) Default: false Whether the separators are treated as regular expressions when splitting the uploaded file.
separators	Array of strings (Separators) >= 0 items Default: [] Delimiters used by `RecursiveCharacterTextSplitter` for further splitting. If provided, `preset_separators` is ignored.
preset_separators	string (PresetSeparators) Default: "markdown" Enum: "cpp" "go" "java" "kotlin" "js" "ts" "php" "proto" "python" "r" "rst" "ruby" "rust" "scala" "swift" "markdown" "latex" "html" "sol" "csharp" "cobol" "c" "lua" "perl" "haskell" "elixir" "powershell" "visualbasic6" Preset separators used by `RecursiveCharacterTextSplitter` for further splitting. See implementation details.
metadata	string (Metadata) Default: "" Optional additional metadata to add to each chunk if a file is provided. Provide a stringified JSON object matching the Metadata schema.

Responses

Response samples

201
422

Content type

application/json

{"id": 0
}
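
Because this endpoint takes `multipart/form-data`, every field is sent as text, and `metadata` must be a stringified JSON object. The sketch below prepares the non-file fields under those constraints. The helper name is hypothetical, and the `collection_id` field name is an assumption inferred from the "Collection Id" row, whose name is not shown in this section.

```python
import json


def build_document_form(collection_id: int, metadata: dict,
                        chunk_size: int = 2048, chunk_overlap: int = 0) -> dict:
    """Return the text form fields for a document upload (file sent separately)."""
    if chunk_size < 0 or chunk_overlap < 0:
        raise ValueError("chunk_size and chunk_overlap must be >= 0")
    return {
        "collection_id": str(collection_id),  # assumed field name
        "chunk_size": str(chunk_size),
        "chunk_overlap": str(chunk_overlap),
        "preset_separators": "markdown",
        # metadata travels as a JSON string, not as nested form fields
        "metadata": json.dumps(metadata),
    }
```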

Get Documents

Get all document IDs from a collection.

Authorizations:

HTTPBearer

query Parameters

	Name (string) or Name (null) (Name) Filter documents by name
	Collection Id (integer) or Collection Id (null) (Collection Id) Filter documents by collection ID
limit	integer (Limit) [ 1 .. 100 ] Default: 10 The number of documents to return
offset	integer (Offset) Default: 0 The offset of the first document to return
order_by	string (Order By) Default: "id" Enum: "id" "name" "created" The order by field to sort the documents by.
order_direction	string (Order Direction) Default: "asc" Enum: "asc" "desc" The direction to order the documents by.
required	boolean (Required) Default: true

Responses

Response samples

200
422

Content type

application/json

null

Get Document

Get a document by ID.

Authorizations:

HTTPBearer

path Parameters

document_id required	integer (Document Id) >= 0 The document ID

query Parameters

required	boolean (Required) Default: true

Responses

Response samples

200
422

Content type

application/json

{"object": "document",
"id": 0,
"name": "string",
"collection_id": 0,
"created": 0,
"chunks": 0
}

Delete Document

Delete a document.

Authorizations:

HTTPBearer

path Parameters

document_id required	integer (Document Id) > 0 The document ID

query Parameters

required	boolean (Required) Default: true

Responses

Response samples

422

Content type

application/json

{"detail": [{"loc": ["string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": { }
}
]
}

Create Document Chunks

Fill a document with chunks.

Authorizations:

HTTPBearer

path Parameters

document_id required	integer (Document Id) > 0 The document ID

query Parameters

required	boolean (Required) Default: true

Request Body schema: application/json
required

chunks required	Array of objects (Chunks) [ 1 .. 64 ] items The list of chunks to create.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"chunks": [{"content": "string",
"metadata": {"property1": "string",
"property2": "string"
}
}
]
}
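
Since one request accepts between 1 and 64 chunks, a longer list has to be split across several calls. A minimal sketch of that batching, with a hypothetical helper name and the same metadata applied to every chunk for simplicity:

```python
from typing import Dict, Iterator, List, Optional

MAX_CHUNKS_PER_REQUEST = 64  # upper bound stated in the request schema


def chunk_payloads(texts: List[str], metadata: Optional[Dict[str, str]] = None,
                   batch_size: int = MAX_CHUNKS_PER_REQUEST) -> Iterator[dict]:
    """Yield request bodies holding at most `batch_size` chunks each."""
    if not 1 <= batch_size <= MAX_CHUNKS_PER_REQUEST:
        raise ValueError("batch_size must be between 1 and 64")
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        yield {
            "chunks": [
                {"content": text, "metadata": dict(metadata or {})}
                for text in batch
            ]
        }
```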

Response samples

201
422

Content type

application/json

null

Get Document Chunks

Get chunks of a document.

Authorizations:

HTTPBearer

path Parameters

document_id required	integer (Document Id) > 0 The document ID

query Parameters

limit	integer (Limit) [ 1 .. 100 ] Default: 10 The number of chunks to return
offset	integer (Offset) Default: 0 The offset of the first chunk to return
required	boolean (Required) Default: true

Responses

Response samples

200
422

Content type

application/json

null

Delete Document Chunk

Delete a chunk of a document.

Authorizations:

HTTPBearer

path Parameters

document_id required	integer (Document Id) > 0 The document ID
chunk_id required	integer (Chunk Id) >= 0 The chunk ID

query Parameters

required	boolean (Required) Default: true

Responses

Response samples

422

Content type

application/json

{"detail": [{"loc": ["string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": { }
}
]
}

Get Document Chunk

Get a chunk of a document.

Authorizations:

HTTPBearer

path Parameters

document_id required	integer (Document Id) > 0 The document ID
chunk_id required	integer (Chunk Id) >= 0 The chunk ID

query Parameters

required	boolean (Required) Default: true

Responses

Response samples

200
422

Content type

application/json

null

Embeddings

Creates an embedding vector representing the input text.

Authorizations:

HTTPBearer

Request Body schema: application/json
required

input required	Array of Input (integers) or Array of Input (integers) or Input (string) or Array of Input (strings) (Input) Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings or array of token arrays. The input must not exceed the max input tokens for the model (call `/v1/models` endpoint to get the `max_context_length` by model) and cannot be an empty string.
model required	string (Model) ID of the model to use. Call `/v1/models` endpoint to get the list of available models, only `text-embeddings-inference` model type is supported.
dimensions	Dimensions (integer) or Dimensions (null) (Dimensions) The number of dimensions the resulting output embeddings should have.
encoding_format	"float" (string) or Encoding Format (null) (Encoding Format) Default: "float" The format of the output embeddings. Only `float` is supported.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"input": [0
],
"model": "string",
"dimensions": 0,
"encoding_format": "float"
}

Response samples

200
422

Content type

application/json

{"data": [{"embedding": [0
],
"index": 0,
"object": "embedding"
}
],
"model": "string",
"object": "list",
"usage": {"prompt_tokens": 0,
"completion_tokens": 0,
"total_tokens": 0,
"cost": 0,
"carbon": {"kWh": {"min": 0,
"max": 0
},
"kgCO2eq": {"min": 0,
"max": 0
}
},
"requests": 0
},
"id": "string"
}
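
A short sketch of both sides of this call, under stated assumptions: the helper names and model name are hypothetical, and the response is assumed to tag each vector with the `index` of its input, as the sample above suggests. The builder rejects empty strings, which the schema forbids; the reader returns vectors in input order.

```python
from typing import List


def build_embeddings_payload(texts: List[str], model: str) -> dict:
    """Build an embeddings request body for one or more strings."""
    if not texts or any(text == "" for text in texts):
        raise ValueError("input must contain at least one non-empty string")
    return {"input": texts, "model": model, "encoding_format": "float"}


def embeddings_in_order(response: dict) -> List[List[float]]:
    """Return the embedding vectors sorted by their `index` field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```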

Me

Get User

Get information about the current user.

Authorizations:

HTTPBearer

Responses

Response samples

200

Content type

application/json

{"object": "userInfo",
"id": 0,
"email": "string",
"name": "string",
"organization": 0,
"budget": 0,
"permissions": ["admin"
],
"limits": [{"router": 0,
"type": "tpm",
"value": 0
}
],
"expires": 0,
"priority": 0,
"created": 0,
"updated": 0
}

Update User

Update information about the current user.

Authorizations:

HTTPBearer

Request Body schema: application/json
required

name	Name (string) or Name (null) (Name) The user name.
email	Email (string) or Email (null) (Email) The user email.
current_password	Current Password (string) or Current Password (null) (Current Password) The current user password.
password	Password (string) or Password (null) (Password) The new user password. If None, the user password is not changed.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"name": "string",
"email": "string",
"current_password": "string",
"password": "string"
}

Response samples

422

Content type

application/json

{"detail": [{"loc": ["string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": { }
}
]
}

Create Key

Create a new API key.

Authorizations:

HTTPBearer

Request Body schema: application/json
required

name required	string (Name)
expires	Expires (integer) or Expires (null) (Expires) Timestamp in seconds
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"name": "string",
"expires": 0
}
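
Since `expires` is a timestamp in seconds, an expiry a number of days ahead can be computed from the current time. A minimal sketch with a hypothetical helper; the `now` argument exists only to make the result reproducible.

```python
import time
from typing import Optional

SECONDS_PER_DAY = 86400


def build_key_payload(name: str, valid_days: Optional[int] = None,
                      now: Optional[float] = None) -> dict:
    """Build a create-key body, optionally expiring `valid_days` from now."""
    payload = {"name": name}
    if valid_days is not None:
        current = time.time() if now is None else now
        payload["expires"] = int(current) + valid_days * SECONDS_PER_DAY
    return payload
```

Omitting `valid_days` leaves `expires` out of the body entirely.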

Response samples

201
422

Content type

application/json

{"id": 0,
"token": "string"
}

Get Keys

Get all your tokens.

Authorizations:

HTTPBearer

query Parameters

offset	integer (Offset) >= 0 Default: 0 The offset of the tokens to get.
limit	integer (Limit) [ 1 .. 100 ] Default: 10 The limit of the tokens to get.
order_by	string (Order By) Default: "id" Enum: "id" "name" "created" The field to order the tokens by.
order_direction	string (Order Direction) Default: "asc" Enum: "asc" "desc" The direction to order the tokens by.

Responses

Response samples

200
422

Content type

application/json

{"object": "list",
"data": [{"object": "key",
"id": 0,
"name": "string",
"token": "string",
"expires": 0,
"created": 0
}
]
}

Delete Key

Delete an API key.

Authorizations:

HTTPBearer

path Parameters

key required	integer (Key) The ID of the key to delete.

Responses

Response samples

422

Content type

application/json

{"detail": [{"loc": ["string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": { }
}
]
}

Get Key

Get your token by ID.

Authorizations:

HTTPBearer

path Parameters

key required	integer (Key) The ID of the key to get.

Responses

Response samples

200
422

Content type

application/json

{"object": "key",
"id": 0,
"name": "string",
"token": "string",
"expires": 0,
"created": 0
}

Get Usage

Get usage for the current user.

Authorizations:

HTTPBearer

query Parameters

offset	integer (Offset) >= 0 Default: 0 The offset of the usages to get.
limit	integer (Limit) [ 1 .. 100 ] Default: 10 The limit of the usages to get.
	Start Time (integer) or Start Time (null) (Start Time) Start time as Unix timestamp (if not provided, will be set to 30 days ago)
	End Time (integer) or End Time (null) (End Time) End time as Unix timestamp (if not provided, will be set to now)
	EndpointUsage (string) or Endpoint (null) (Endpoint) The endpoint to get usage for.

Responses

Response samples

200
422

Content type

application/json

{"object": "list",
"data": [{"object": "me.usage",
"model": "string",
"key": "string",
"endpoint": "string",
"method": "string",
"status": 0,
"usage": {"prompt_tokens": 0,
"completion_tokens": 0,
"total_tokens": 0,
"cost": 0,
"carbon": {"kWh": {"min": 0,
"max": 0
},
"kgCO2eq": {"min": 0,
"max": 0
}
},
"metrics": {"latency": 0,
"ttft": 0
}
},
"created": 0
}
]
}
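
The usage records above can be rolled up client-side. The following sketch (hypothetical helper, tolerant of missing or null fields) sums tokens, cost, and the `kgCO2eq` range over one page of results.

```python
def summarize_usage(listing: dict) -> dict:
    """Sum tokens, cost and the kgCO2eq min/max range over usage records."""
    totals = {"records": 0, "total_tokens": 0, "cost": 0.0,
              "kgCO2eq_min": 0.0, "kgCO2eq_max": 0.0}
    for record in listing.get("data", []):
        usage = record.get("usage") or {}
        carbon = (usage.get("carbon") or {}).get("kgCO2eq") or {}
        totals["records"] += 1
        totals["total_tokens"] += usage.get("total_tokens") or 0
        totals["cost"] += usage.get("cost") or 0.0
        totals["kgCO2eq_min"] += carbon.get("min") or 0.0
        totals["kgCO2eq_max"] += carbon.get("max") or 0.0
    return totals
```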

Models

Get Model

Get a model by name and return basic information about it.

Authorizations:

HTTPBearer

path Parameters

model required	string (Model) The name of the model to get.

Responses

Response samples

200
401
404
422

Content type

application/json

{"id": "string",
"type": "automatic-speech-recognition",
"aliases": ["model-alias",
"model-alias-2"
],
"created": 0,
"owned_by": "string",
"max_context_length": 0,
"costs": {"prompt_tokens": 0,
"completion_tokens": 0
},
"object": "model"
}

Get Models

Lists the currently available models and provides basic information.

Authorizations:

HTTPBearer

Responses

Response samples

200
401
404

Content type

application/json

{"object": "list",
"data": [{"id": "string",
"type": "automatic-speech-recognition",
"aliases": ["model-alias",
"model-alias-2"
],
"created": 0,
"owned_by": "string",
"max_context_length": 0,
"costs": {"prompt_tokens": 0,
"completion_tokens": 0
},
"object": "model"
}
]
}
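
Other endpoints ask callers to pick a model of a given type from this list. A small sketch of two lookups over the response, with hypothetical helper names: one resolves a name that may be an ID or an alias, the other filters by `type`.

```python
from typing import List, Optional


def find_model(listing: dict, name: str) -> Optional[dict]:
    """Return the model whose `id` or one of whose `aliases` equals `name`."""
    for model in listing.get("data", []):
        if name == model.get("id") or name in (model.get("aliases") or []):
            return model
    return None


def model_ids_of_type(listing: dict, model_type: str) -> List[str]:
    """Return the IDs of all models with the given `type`."""
    return [m["id"] for m in listing.get("data", []) if m.get("type") == model_type]
```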

OCR

Ocr

Extracts text from files using OCR.

Authorizations:

HTTPBearer

Request Body schema: application/json
required

bbox_annotation_format	ResponseFormat (object) or null Specify the format that the model must output for the bounding boxes. By default it will use `{ "type": "text" }`. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is in JSON. When using JSON mode you MUST also instruct the model to produce JSON yourself with a system or a user message. Setting to `{ "type": "json_schema" }` enables JSON schema mode, which guarantees the message the model generates is in JSON and follows the schema you provide.
document required	DocumentURLChunk (object) or ImageURLChunk (object) (Document) Document to run OCR on.
document_annotation_format	ResponseFormat (object) or null Specify the format that the model must output for the document. By default it will use `{ "type": "text" }`. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is in JSON. When using JSON mode you MUST also instruct the model to produce JSON yourself with a system or a user message. Setting to `{ "type": "json_schema" }` enables JSON schema mode, which guarantees the message the model generates is in JSON and follows the schema you provide.
image_limit	Image Limit (integer) or Image Limit (null) (Image Limit) Max images to extract
image_min_size	Image Min Size (integer) or Image Min Size (null) (Image Min Size) Minimum height and width of image to extract
include_image_base64	Include Image Base64 (boolean) or Include Image Base64 (null) (Include Image Base64) Include image URLs in response
model	Model (string) or Model (null) (Model) The model to use for the OCR.
pages	Array of Pages (integers) or Pages (null) (Pages) Specific pages user wants to process in various formats: single number, range, or list of both. Starts from 0
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"bbox_annotation_format": {"type": "text",
"json_schema": {"name": "string",
"schema_definition": { },
"strict": false,
"description": "string"
}
},
"document": {"document_name": "string",
"document_url": "string",
"type": "document_url"
},
"document_annotation_format": {"type": "text",
"json_schema": {"name": "string",
"schema_definition": { },
"strict": false,
"description": "string"
}
},
"image_limit": 0,
"image_min_size": 0,
"include_image_base64": true,
"model": "string",
"pages": [0
]
}

Response samples

200
404
422
503

Content type

application/json

{"document_annotation": "string",
"id": "string",
"model": "string",
"pages": [{"dimensions": {"dpi": 0,
"height": 0,
"width": 0
},
"images": [{"bottom_right_x": 0,
"bottom_right_y": 0,
"id": "string",
"image_annotation": "string",
"image_base64": "string",
"top_left_x": 0,
"top_left_y": 0
}
],
"index": 0,
"markdown": "string"
}
],
"usage": {"prompt_tokens": 0,
"completion_tokens": 0,
"total_tokens": 0,
"cost": 0,
"carbon": {"kWh": {"min": 0,
"max": 0
},
"kgCO2eq": {"min": 0,
"max": 0
}
},
"requests": 0
},
"usage_info": {"doc_size_bytes": 0,
"pages_processed": 0
}
}

Parse

Parse Deprecated

Parse a PDF file into markdown.

Authorizations:

HTTPBearer

Request Body schema: multipart/form-data
required

file required	string <application/octet-stream> (File) The file to parse.
page_range	string (Page Range) Default: "" Page range to convert, specify comma separated page numbers or ranges. Example: '0,5-10,20'
force_ocr	boolean (Force Ocr) Default: false Force OCR on all pages of the PDF. Defaults to False. This can lead to worse results if you have good text in your PDFs (which is true in most cases).
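
The `page_range` syntax can be validated client-side before uploading. The sketch below expands a value such as '0,5-10,20' into explicit page numbers. It assumes ranges are inclusive at both ends, which this section does not state, and the helper name is hypothetical.

```python
from typing import List


def parse_page_range(spec: str) -> List[int]:
    """Expand a comma-separated list of pages and ranges into sorted pages."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # the default "" selects nothing explicitly
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part}")
            pages.update(range(start, end + 1))  # assumed inclusive
        else:
            pages.add(int(part))
    return sorted(pages)
```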

Responses

Response samples

200
422

Content type

application/json

{"object": "list",
"data": [{"object": "documentPage",
"content": "string",
"images": {"property1": "string",
"property2": "string"
},
"metadata": {"document_name": "string",
"page": 0
}
}
],
"usage": {"prompt_tokens": 0,
"completion_tokens": 0,
"total_tokens": 0,
"cost": 0,
"carbon": {"kWh": {"min": 0,
"max": 0
},
"kgCO2eq": {"min": 0,
"max": 0
}
},
"requests": 0
}
}

Rerank

Creates an ordered array with each text assigned a relevance score, based on the query.

Authorizations:

HTTPBearer

Request Body schema: application/json
required

query required	string (Query) non-empty The search query to use for the reranking. `query` and `prompt` cannot both be provided.
documents required	Array of strings (Documents) [ items non-empty ]
model required	string (Model) non-empty The model to use for the reranking, call `/v1/models` endpoint to get the list of available models, only `text-classification` model type is supported.
top_n	Top N (integer) or Top N (null) (Top N) The number of top results to return. If set to None, all results will be returned.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"query": "string",
"documents": ["string"
],
"model": "string",
"top_n": 1
}

Response samples

200
422

Content type

application/json

{"object": "list",
"id": "string",
"results": [{"relevance_score": 0,
"index": 0
}
],
"model": "string",
"usage": {"prompt_tokens": 0,
"completion_tokens": 0,
"total_tokens": 0,
"cost": 0,
"carbon": {"kWh": {"min": 0,
"max": 0
},
"kgCO2eq": {"min": 0,
"max": 0
}
},
"requests": 0
}
}
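
The results carry only a score and an `index`, so the caller has to map them back to text. A minimal sketch, assuming `index` is the position of the document in the request's `documents` array (not stated explicitly in this section); the helper name is hypothetical.

```python
from typing import List, Tuple


def ranked_documents(documents: List[str], response: dict) -> List[Tuple[str, float]]:
    """Pair each result with its source text, highest relevance first."""
    results = sorted(response["results"],
                     key=lambda result: result["relevance_score"],
                     reverse=True)
    return [(documents[r["index"]], r["relevance_score"]) for r in results]
```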

Search

Get the chunks from the given collections that are relevant to a query.

Authorizations:

HTTPBearer

query Parameters

required	boolean (Required) Default: true

Request Body schema: application/json
required

collection_ids	Array of integers (Collection Ids) [ 0 .. 100 ] items [ items > 0 ] Default: [] List of collection IDs.
document_ids	Array of integers (Document Ids) [ 0 .. 100 ] items [ items > 0 ] Default: [] List of document IDs.
metadata_filters	ComparisonFilter (object) or CompoundFilter (object) or Metadata Filters (null) (Metadata Filters) Metadata filters to apply to the search.
limit	integer (Limit) ( 0 .. 100 ] Default: 10 Number of results to return.
offset	integer (Offset) >= 0 Default: 0 Offset for pagination, specifying how many results to skip from the beginning.
method	string (SearchMethod) Default: "semantic" Enum: "hybrid" "semantic" "lexical" Search method to use.
rff_k	integer (Rff K) [ 0 .. 16384 ] Default: 60 Smoothing constant for the Reciprocal Rank Fusion (RRF) algorithm in hybrid search (recommended: from 10 to 100).
score_threshold	number (Score Threshold) [ 0 .. 1 ] Default: 0 Cosine similarity score threshold for filtering results. Only available for the semantic search method.
query	Query (string) or Query (null) (Query) Query related to the search.
property name* additional property	any

Responses

Request samples

Payload

Content type

application/json

{"collection_ids": [ ],
"document_ids": [ ],
"metadata_filters": {"key": "string",
"type": "eq",
"value": "string"
},
"limit": 10,
"offset": 0,
"method": "hybrid",
"rff_k": 60,
"score_threshold": 0,
"query": "string"
}
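
To show what the `rff_k` smoothing constant does in hybrid search, here is the textbook Reciprocal Rank Fusion formula, where each ranking contributes 1 / (k + rank) to an item's score. This is an illustration of the general algorithm, not OpenGateLLM's implementation; the function name and the 1-based ranks are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[int]], k: int = 60) -> Dict[int, float]:
    """Fuse several ranked lists of chunk IDs into one score per chunk."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # A larger k flattens the gap between top and lower ranks.
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return dict(sorted(scores.items(), key=lambda item: item[1], reverse=True))
```

For example, fusing a semantic ranking `[1, 2, 3]` with a lexical ranking `[3, 1, 4]` puts chunk 1 first, because it ranks high in both lists.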

Response samples

200
422

Content type

application/json

{"object": "list",
"data": [{"method": "hybrid",
"score": 0,
"chunk": {"object": "chunk",
"id": 0,
"collection_id": 0,
"document_id": 0,
"content": "string",
"metadata": {"property1": "string",
"property2": "string"
},
"created": 0
}
}
],
"usage": {"prompt_tokens": 0,
"completion_tokens": 0,
"total_tokens": 0,
"cost": 0,
"carbon": {"kWh": {"min": 0,
"max": 0
},
"kgCO2eq": {"min": 0,
"max": 0
}
},
"requests": 0
}
}

Monitoring

Get Metrics

Authorizations:

HTTPBearer

Responses

Response samples

200

Content type

application/json

null

Health

Responses

Response samples

200

Content type

application/json

null
