
Configuration file

OpenGateLLM requires a configuration file that defines models, dependencies, and settings. Both the Playground and the API need a configuration file (they can share the same file); see API configuration and Playground configuration.

By default, the configuration file is expected at ./config.yml.

To use a different file, set the CONFIG_FILE environment variable.

You can reference environment variables in the configuration file with the pattern ${ENV_VARIABLE_NAME}. Every such reference is replaced with the value of the environment variable when the configuration file is loaded.

Example

models:
  [...]
  - name: my-language-model
    type: text-generation
    providers:
      - type: openai
        url: https://api.openai.com
        key: ${OPENAI_API_KEY}
        model_name: gpt-4o-mini
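To make the substitution rule concrete, here is a minimal Python sketch of how ${NAME} references could be expanded in a line of configuration. It is an illustration only, not OpenGateLLM's actual loader: the function name expand_env is hypothetical, and the shell-style ${NAME:-default} fallback is inferred from the full example in this page rather than from a documented rule.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} and ${NAME:-default}; the ":-default" part is optional.
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand_env(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Expand ${NAME} and ${NAME:-default} references (hypothetical helper)."""
    values = os.environ if env is None else env

    def _substitute(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        value = values.get(name)
        if default is not None:
            # Shell-style ":-": fall back when the variable is unset or empty.
            return value if value else default
        if value is None:
            raise KeyError(f"environment variable {name} is not set")
        return value

    return _ENV_PATTERN.sub(_substitute, text)


line = "url: redis://:${REDIS_PASSWORD:-changeme}@${REDIS_HOST:-localhost}:${REDIS_PORT:-6379}"
print(expand_env(line, {"REDIS_HOST": "cache.internal"}))
# prints: url: redis://:changeme@cache.internal:6379
```

Raising an error for a missing variable without a default is a design choice of this sketch; the real loader may behave differently.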

The following is an example of a configuration file:

# ----------------------------------- models ------------------------------------
models:
  - name: albert-testbed
    type: text-generation
    # aliases: ["model-alias"]
    # owned_by: Me
    # load_balancing_strategy: shuffle
    # cost_prompt_tokens: 0.10
    # cost_completion_tokens: 0.10
    providers:
      - type: vllm
        url: http://albert-testbed.etalab.gouv.fr:8000
        # key: sk-xxx
        model_name: "gemma3:1b"
        # timeout: 60
        # model_hosting_zone: FRA
        # model_total_params: 8
        # model_active_params: 8
# -------------------------------- dependencies ---------------------------------
dependencies:
  postgres: # required
    url: postgresql+asyncpg://${POSTGRES_USER:-postgres}:${POSTGRES_PASSWORD:-changeme}@${POSTGRES_HOST:-localhost}:${POSTGRES_PORT:-5432}/postgres
    echo: False
    pool_size: 5
    connect_args:
      server_settings:
        statement_timeout: "120s"
      command_timeout: 60
  redis: # required
    url: redis://:${REDIS_PASSWORD:-changeme}@${REDIS_HOST:-localhost}:${REDIS_PORT:-6379}
    max_connections: 200
    socket_connect_timeout: 5
    retry_on_timeout: True
    health_check_interval: 30
    decode_responses: False
    socket_keepalive: True
  elasticsearch: # optional
    index_name: opengatellm
    index_language: english
    number_of_shards: 1
    number_of_replicas: 0
    hosts: "http://${ELASTICSEARCH_HOST:-localhost}:${ELASTICSEARCH_PORT:-9200}"
    basic_auth:
      - "elastic"
      - ${ELASTICSEARCH_PASSWORD}
  # sentry:
  #   dsn: ${SENTRY_DSN}
# ---------------------------------- settings -----------------------------------
settings:
  # session_secret_key: ${SESSION_SECRET_KEY}
  # disabled_routers: ["admin", "audio"]
  # hidden_routers: ["auth"]
  # usage_tokenizer: tiktoken_gpt2
  # app_title: My OpenGateLLM API
  # log_level: INFO
  # log_format: [%(asctime)s][%(process)d:%(name)s][%(levelname)s] %(client_ip)s - %(message)s
  swagger_version: 0.4.1
  # swagger_contact_url: https://github.com/etalab-ia/OpenGateLLM
  # swagger_contact_email: john.doe@example.com
  # swagger_docs_url: /docs
  # swagger_redoc_url: /redoc
  auth_master_username: master
  auth_master_key: changeme
  # auth_max_token_expiration_days: 365
  # rate_limiting_strategy: fixed_window
  # monitoring_sentry_enabled: True
  # monitoring_postgres_enabled: True
  # monitoring_prometheus_enabled: True
  # vector_store_model: my-model
  # search_multi_agents_synthesis_model: my-model
  # search_multi_agents_reranker_model: my-model
  playground_opengatellm_url: ${OPENGATELLM_URL}
  # playground_default_model: my-model
  # playground_theme_has_background: True
  # playground_theme_accent_color: purple
  # playground_theme_appearance: dark
  # playground_theme_gray_color: gray
  # playground_theme_panel_background: solid
  # playground_theme_radius: medium
  # playground_theme_scaling: 100%

The configuration file is composed of 3 sections:

  • models: declares the models exposed by the API.
  • dependencies: declares both the required plugins for the API (e.g. PostgreSQL, Redis) and the optional ones (e.g. Elasticsearch).
  • settings: configures the API.

We do not recommend declaring models in the configuration file. Prefer declaring them through the API, either with its endpoints or in the Playground UI (see Models configuration).



| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| dependencies | object | Dependencies used by the API. For details of configuration, see the Dependencies section. | required | | |
| models | array | Models used by the API. For details of configuration, see the Model section. | required | | |
| settings | object | For details of configuration, see the Settings section. | required | | |



Settings

General settings configuration fields.

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| app_title | string | Display title of your API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | OpenGateLLM | | My API |
| auth_key_max_expiration_days | integer | Maximum number of days for a new API key to be valid. | None | | |
| auth_master_key | string | Master key for the API. It should be a random string with at least 32 characters. This key has all permissions and cannot be modified or deleted. This key is used to create the first role and the first user. This key is also used to encrypt user tokens: if you modify the master key, you will need to update all user API keys. | changeme | | |
| auth_playground_session_duration | integer | Duration of the playground session in seconds. | 3600 | | |
| disabled_routers | array | Disabled routers, to limit the services of the API. | [] | embeddings, … | ['embeddings'] |
| document_parsing_max_concurrent | integer | Maximum number of concurrent document parsing tasks per worker. | 10 | | |
| front_url | string | Front-end URL for the application. | http://localhost:8501 | | |
| hidden_routers | array | Routers that are enabled but hidden in the swagger and the documentation of the API. | [] | admin, … | ['admin'] |
| log_format | string | Logging format of the API. | [%(asctime)s][%(process)d:%(name)s][%(levelname)s] %(client_ip)s - %(message)s | | |
| log_level | string | Logging level of the API. | INFO | DEBUG, INFO, WARNING, ERROR, CRITICAL | |
| monitoring_postgres_enabled | boolean | If true, usage logs will be written to the PostgreSQL database. | True | | |
| monitoring_prometheus_enabled | boolean | If true, Prometheus metrics will be exposed in the /metrics endpoint. | True | | |
| rate_limiting_strategy | string | Rate limiting strategy for the API. | fixed_window | moving_window, fixed_window, sliding_window | |
| routing_max_priority | integer | Maximum allowed priority in routing tasks. | 4 | | |
| routing_max_retries | integer | Maximum number of retries for routing tasks. | 3 | | |
| routing_retry_countdown | integer | Number of seconds before retrying a failed routing task. | 3 | | |
| session_secret_key | string | Secret key for the session middleware. If not provided, the master key will be used. | None | | knBnU1foGtBEwnOGTOmszldbSwSYLTcE6bdibC8bPGM |
| swagger_contact | object | Contact information of the API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | None | | |
| swagger_description | string | Display description of your API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | See documentation | | See documentation |
| swagger_docs_url | string | Docs URL of swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | /docs | | |
| swagger_license_info | object | License information of the API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | {'name': 'MIT Licence', 'identifier': 'MIT', 'url': 'https://raw.githubusercontent.com/etalab-ia/opengatellm/refs/heads/main/LICENSE'} | | |
| swagger_openapi_tags | array | OpenAPI tags of the API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | [] | | |
| swagger_openapi_url | string | OpenAPI URL of swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | /openapi.json | | |
| swagger_redoc_url | string | Redoc URL of swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | /redoc | | |
| swagger_summary | string | Display summary of your API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | OpenGateLLM connect to your models. You can configuration this swagger UI in the configuration file, like hide routes or change the title. | | My API description. |
| swagger_terms_of_service | string | A URL to the Terms of Service for the API in swagger UI. If provided, this has to be a URL. | None | | https://example.com/terms-of-service |
| swagger_version | string | Display version of your API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | latest | | 2.5.0 |
| usage_tokenizer | string | Tokenizer used to compute usage of the API. | tiktoken_gpt2 | tiktoken_gpt2, tiktoken_r50k_base, tiktoken_p50k_base, tiktoken_p50k_edit, tiktoken_cl100k_base, tiktoken_o200k_base | |
| vector_store_model | string | Model used to vectorize the text in the vector store database. Required if a vector store dependency is provided (Elasticsearch). This model must be defined in the models section and have type text-embeddings-inference. | None | | |
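The auth_master_key description asks for a random string of at least 32 characters. One way to produce such a value, shown here as a suggestion rather than a documented OpenGateLLM procedure, is Python's standard secrets module:

```python
import secrets

# token_urlsafe(32) draws 32 random bytes and encodes them as URL-safe
# Base64 text, which yields a string of about 43 characters.
master_key = secrets.token_urlsafe(32)

# Sanity check against the 32-character minimum from the settings table.
assert len(master_key) >= 32
print(master_key)
```

The printed value can then be supplied through an environment variable and referenced in the configuration file, for example as ${AUTH_MASTER_KEY} (a hypothetical variable name).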



Model

In the models section, you define a list of models. Each model is a set of API providers for that model. Users access the models specified in this section by name. Load balancing is performed between the different providers of the requested model. All providers in a model must serve the same type of model (text-generation, text-embeddings-inference, etc.). We recommend that all providers of a model serve exactly the same model, otherwise users may receive responses of varying quality. For embedding models, the API verifies that all providers output vectors of the same dimension. You can define the load balancing strategy between the model's providers. By default, it is random (shuffle).

For more information on configuring model providers, see the ModelProvider section.
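The two load balancing strategies can be pictured with the following Python sketch. It is a simplified illustration under stated assumptions, not the routing code of OpenGateLLM: the Provider class, the inflight counter, and the pick_provider function are hypothetical, and the page does not say how least_busy measures load.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Provider:
    url: str
    inflight: int = 0  # assumed: number of requests currently in progress


def pick_provider(providers: List[Provider], strategy: str = "shuffle") -> Provider:
    """Choose one provider of a model according to the strategy."""
    if not providers:
        raise ValueError("a model needs at least one provider")
    if strategy == "shuffle":
        # Random choice, so requests spread evenly on average.
        return random.choice(providers)
    if strategy == "least_busy":
        # Assumption: "busy" means the most requests in progress.
        return min(providers, key=lambda p: p.inflight)
    raise ValueError(f"unknown strategy: {strategy}")


providers = [Provider("http://a:8000", inflight=3), Provider("http://b:8000", inflight=1)]
print(pick_provider(providers, "least_busy").url)  # prints: http://b:8000
```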

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| aliases | array | Aliases of the model. Users can use them to identify the model. | [] | | ['model-alias', 'model-alias-2'] |
| cost_completion_tokens | number | Cost of completion tokens for user budget computation, per 1M tokens. Set to 0.0 to disable budget computation for this model. | 0.0 | | 0.1 |
| cost_prompt_tokens | number | Cost of prompt tokens for user budget computation, per 1M tokens. | 0.0 | | 0.1 |
| load_balancing_strategy | string | Routing strategy for load balancing between providers of the model. | shuffle | shuffle, least_busy | least_busy |
| name | string | Unique name exposed to clients when selecting the model. | required | | gpt-4o |
| providers | array | API providers of the model. If there are multiple providers, the model will be load balanced between them according to the routing strategy. All providers must serve the same type of model. For details of configuration, see the ModelProvider section. | required | | |
| type | string | Type of the model. It will be used to identify the model type. | required | automatic-speech-recognition, image-text-to-text, image-to-text, text-embeddings-inference, text-generation, text-classification | text-generation |
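Because cost_prompt_tokens and cost_completion_tokens are expressed per 1M tokens, the cost of a single request is a small fraction of those values. The arithmetic can be sketched as follows; the function name request_cost is hypothetical, and how OpenGateLLM rounds or stores the result is not described on this page.

```python
def request_cost(
    prompt_tokens: int,
    completion_tokens: int,
    cost_prompt_tokens: float = 0.0,
    cost_completion_tokens: float = 0.0,
) -> float:
    """Cost of one request, with both prices given per 1M tokens."""
    total = prompt_tokens * cost_prompt_tokens + completion_tokens * cost_completion_tokens
    return total / 1_000_000


# 1200 prompt tokens and 300 completion tokens, both priced at 0.10 per 1M tokens.
print(request_cost(1200, 300, cost_prompt_tokens=0.10, cost_completion_tokens=0.10))
```

With both costs left at their default of 0.0, the function returns 0.0, which matches the idea that a model without costs does not consume any user budget.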



ModelProvider

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| key | string | Model provider API key. | None | | sk-1234567890 |
| model_active_params | integer | Active params of the model in billions of parameters for carbon footprint computation. For more information, see https://ecologits.ai | 0 | | 8 |
| model_hosting_zone | string | Model hosting zone using ISO 3166-1 alpha-3 code format (e.g., WOR for World, FRA for France, USA for United States). This determines the electricity mix used for carbon intensity calculations. For more information, see https://ecologits.ai | WOR | WOR, … | WOR |
| model_name | string | Model name from the model provider. | required | | gpt-4o |
| model_total_params | integer | Total params of the model in billions of parameters for carbon footprint computation. For more information, see https://ecologits.ai | 0 | | 8 |
| qos_limit | number | The value to use for the quality of service. Depending on the metric, the value can be a percentile, a threshold, etc. | None | | 0.5 |
| qos_metric | string | The metric to use for the quality of service. If not provided, no QoS policy is applied. | None | ttft, latency, inflight, performance | inflight |
| timeout | integer | Timeout for the model provider requests, after which the user receives a 500 error (model is too busy). | 300 | | 10 |
| type | string | Model provider type. | required | albert, openai, mistral, tei, vllm | openai |
| url | string | Model provider API URL. The URL must only contain the domain name (without the /v1 suffix, for example). Depending on the model provider type, the URL can be optional (Albert, OpenAI). | None | | https://api.openai.com |



Dependencies

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| albert | object | [DEPRECATED] For details of configuration, see the AlbertDependency section. | None | | |
| celery | object | [DEPRECATED] For details of configuration, see the CeleryDependency section. | None | | |
| elasticsearch | object | For details of configuration, see the ElasticsearchDependency section. | None | | |
| marker | object | [DEPRECATED] For details of configuration, see the MarkerDependency section. | None | | |
| postgres | object | For details of configuration, see the PostgresDependency section. | required | | |
| redis | object | For details of configuration, see the RedisDependency section. | required | | |
| sentry | object | For details of configuration, see the SentryDependency section. | None | | |



SentryDependency

Sentry is an optional dependency of OpenGateLLM. Sentry helps you identify, diagnose, and fix errors in real time. In this section, you can pass any argument of the Sentry Python SDK, see https://docs.sentry.io/platforms/python/configuration/options/ for more information.



RedisDependency

Redis is a required dependency of OpenGateLLM. Redis is used to store rate limiting counters and performance metrics. In this section, you can pass any argument accepted by the from_url() method of the redis.asyncio.connection.ConnectionPool class, see https://redis.readthedocs.io/en/stable/connections.html#redis.asyncio.connection.ConnectionPool.from_url for more information.

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| url | string | Redis connection URL. | required | | redis://:changeme@localhost:6379 |



PostgresDependency

Postgres is a required dependency of OpenGateLLM. In this section, you can pass any argument of the Postgres Python SDK, see https://github.com/etalab-ia/opengatellm/blob/main/docs/dependencies/postgres.md for more information. Only the url argument is required. The connection URL must use the asynchronous scheme, postgresql+asyncpg://. If you provide a standard postgresql:// URL, it will be automatically converted to use asyncpg.

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| url | string | PostgreSQL connection URL. | required | | postgresql+asyncpg://postgres:changeme@localhost:5432/postgres |
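The automatic conversion of a postgresql:// URL can be pictured with this short sketch. It only illustrates the rule stated above; the function name to_asyncpg_url is hypothetical and the actual conversion code in OpenGateLLM may differ.

```python
ASYNC_SCHEME = "postgresql+asyncpg://"
SYNC_SCHEME = "postgresql://"


def to_asyncpg_url(url: str) -> str:
    """Return the URL with the asynchronous scheme, converting it if needed."""
    if url.startswith(ASYNC_SCHEME):
        return url  # already asynchronous, nothing to do
    if url.startswith(SYNC_SCHEME):
        # Swap only the scheme; credentials, host, port and database are kept.
        return ASYNC_SCHEME + url[len(SYNC_SCHEME):]
    raise ValueError(f"unsupported PostgreSQL URL scheme: {url}")


print(to_asyncpg_url("postgresql://postgres:changeme@localhost:5432/postgres"))
# prints: postgresql+asyncpg://postgres:changeme@localhost:5432/postgres
```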



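MarkerDependency

[DEPRECATED]

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| headers | object | Marker API request headers. | {} | | {'Authorization': 'Bearer my-api-key'} |
| timeout | integer | Timeout for the Marker API requests. | 300 | | 10 |
| url | string | Marker API url. | required | | |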



ElasticsearchDependency

Elasticsearch is an optional dependency of OpenGateLLM. Elasticsearch is used as a vector store. If this dependency is provided, all document endpoints are enabled. In this section, you can pass any argument of the elasticsearch.Elasticsearch class, see https://elasticsearch-py.readthedocs.io/en/latest/api/elasticsearch.html for more information. The other arguments, declared below, are used to configure the Elasticsearch index.

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| index_language | string | Language of the Elasticsearch index. | english | english, french, german, italian, portuguese, spanish, swedish | english |
| index_name | string | Name of the Elasticsearch index. | opengatellm | | my_index |
| number_of_replicas | integer | Number of replicas for the Elasticsearch index. | 1 | | 1 |
| number_of_shards | integer | Number of shards for the Elasticsearch index. | 24 | | 1 |



CeleryDependency

[DEPRECATED]

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| broker_url | string | Celery broker url like Redis (redis://) or RabbitMQ (amqp://). If not provided, use redis dependency as broker. | None | | |
| enable_utc | boolean | Enable UTC. | True | | True |
| result_backend | string | Celery result backend url. If not provided, use redis dependency as result backend. | None | | |
| timezone | string | Timezone. | UTC | | UTC |



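AlbertDependency

[DEPRECATED]

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| headers | object | Albert API request headers. | {} | | {'Authorization': 'Bearer my-api-key'} |
| timeout | integer | Timeout for the Albert API requests. | 300 | | 10 |
| url | string | Albert API url. | https://albert.api.etalab.gouv.fr | | |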



Playground configuration

The following parameters allow you to configure the Playground application. The configuration file can be shared with the API, as the sections are identical and compatible. Some parameters are common to both the API and the Playground (for example, app_title).

For a Playground deployment, some environment variables must be set, such as the Reflex backend URL. See Environment variables for more information.

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| dependencies | object | Dependencies used by the playground. For details of configuration, see the Dependencies section. | required | | |
| settings | object | General settings configuration fields. Some fields are common to the API and the playground. For details of configuration, see the Settings section. | required | | |



Settings

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| app_title | string | The title of the application. | OpenGateLLM | | |
| auth_key_max_expiration_days | integer | Maximum number of days for a token to be valid. | None | | |
| documentation_url | string | Documentation URL. If not provided, the documentation link in the navigation bar is deactivated. | https://docs.opengatellm.org/docs | | |
| playground_default_model | string | The first model selected in the chat page. | None | | |
| playground_opengatellm_timeout | integer | The timeout in seconds for the OpenGateLLM API. | 60 | | |
| playground_opengatellm_url | string | The URL of the OpenGateLLM API. | http://localhost:8000 | | |
| playground_theme_accent_color | string | The primary color used for default buttons, typography, backgrounds, etc. See available colors at https://www.radix-ui.com/colors. | purple | | |
| playground_theme_appearance | string | The appearance of the theme. | light | | |
| playground_theme_gray_color | string | The secondary color used for default buttons, typography, backgrounds, etc. See available colors at https://www.radix-ui.com/colors. | gray | | |
| playground_theme_has_background | boolean | Whether the theme has a background. | True | | |
| playground_theme_panel_background | string | Whether panel backgrounds are translucent. Can be 'solid' or 'translucent'. | solid | | |
| playground_theme_radius | string | The radius of the theme. Can be 'small', 'medium', or 'large'. | medium | | |
| playground_theme_scaling | string | The scaling of the theme. | 100% | | |
| reference_url | string | Reference URL. If not provided, the reference link in the navigation bar is deactivated. | http://localhost:8000/redoc | | |
| routing_max_priority | integer | Maximum allowed priority in routing tasks. | 10 | | |
| swagger_url | string | Swagger URL. If not provided, the swagger link in the navigation bar is deactivated. | http://localhost:8000/docs | | |



Dependencies

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| redis | object | Set the Redis connection URL to use as state manager. See https://reflex.dev/docs/api-reference/config/ for more information. For details of configuration, see the RedisDependency section. | None | | |



RedisDependency

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| url | string | Redis connection url. | required | | redis://:changeme@localhost:6379 |