
Configuration file

OpenGateLLM requires a configuration file that defines models, dependencies, and settings. Both the Playground and the API need a configuration file (they can share the same file); see API configuration and Playground configuration.

By default, the configuration file is ./config.yml.

To use a different file, set the CONFIG_FILE environment variable to its path.

You can reference environment variables in the configuration file with the pattern ${ENV_VARIABLE_NAME}. Each reference is replaced by the value of the corresponding environment variable when the configuration file is loaded.

Example

models:
  [...]
  - name: my-language-model
    type: text-generation
    providers:
      - type: openai
        url: https://api.openai.com
        key: ${OPENAI_API_KEY}
        model_name: gpt-4o-mini

The following is an example of a complete configuration file:

# -------------------------------- dependencies ---------------------------------
dependencies:
  postgres: # required
    url: postgresql+asyncpg://${POSTGRES_USER:-postgres}:${POSTGRES_PASSWORD:-changeme}@${POSTGRES_HOST:-localhost}:${POSTGRES_PORT:-5432}/postgres
    echo: False
    pool_size: 5
    connect_args:
      server_settings:
        statement_timeout: "120s"
      command_timeout: 60
  redis: # required
    url: redis://:${REDIS_PASSWORD:-changeme}@${REDIS_HOST:-localhost}:${REDIS_PORT:-6379}
    max_connections: 200
    socket_connect_timeout: 5
    retry_on_timeout: True
    health_check_interval: 30
    decode_responses: False
    socket_keepalive: True
  elasticsearch: # optional
    index_name: opengatellm
    index_language: english
    number_of_shards: 1
    number_of_replicas: 0
    hosts: "http://${ELASTICSEARCH_HOST:-localhost}:${ELASTICSEARCH_PORT:-9200}"
    basic_auth:
      - "elastic"
      - ${ELASTICSEARCH_PASSWORD}
  # sentry:
  #   dsn: ${SENTRY_DSN}
  # langfuse:
  #   public_key: ${LANGFUSE_PUBLIC_KEY}
  #   secret_key: ${LANGFUSE_SECRET_KEY}
  #   base_url: http://localhost:3000
# ---------------------------------- settings -----------------------------------
settings:
  # disabled_routers: ["admin", "audio"]
  # hidden_routers: ["auth"]
  # usage_tokenizer: tiktoken_gpt2
  # app_title: My OpenGateLLM API
  # log_level: INFO
  # log_format: [%(asctime)s][%(process)d:%(name)s][%(levelname)s] %(client_ip)s - %(message)s
  swagger_version: 0.4.5
  # swagger_contact_url: https://github.com/etalab-ia/OpenGateLLM
  # swagger_contact_email: john.doe@example.com
  # swagger_docs_url: /docs
  # swagger_redoc_url: /redoc
  auth_secret_key: changeme
  auth_bootsrap_admin_username: admin
  auth_bootsrap_admin_password: changeme
  # rate_limiting_strategy: fixed_window
  # monitoring_sentry_enabled: True
  # monitoring_postgres_enabled: True
  # monitoring_prometheus_enabled: True
  # vector_store_model: my-model
  playground_opengatellm_url: ${OPENGATELLM_URL}
  # playground_opengatellm_timeout: 60
  # playground_disabled_pages: []
  # playground_default_model: my-model
  # playground_theme_has_background: True
  # playground_theme_accent_color: purple
  # playground_theme_appearance: dark
  # playground_theme_gray_color: gray
  # playground_theme_panel_background: solid
  # playground_theme_radius: medium
  # playground_theme_scaling: 100%
  # playground_swagger_url: http://localhost:8000/swagger
  # playground_reference_url: http://localhost:8000/redoc
  # playground_documentation_url: https://docs.opengatellm.org
# ----------------------------------- models ------------------------------------
# models:
#   - name: albert-testbed
#     type: text-generation
#     # aliases: ["model-alias"]
#     # owned_by: Me
#     # load_balancing_strategy: shuffle
#     # cost_prompt_tokens: 0.10
#     # cost_completion_tokens: 0.10
#     providers:
#       - type: vllm
#         url: http://albert-testbed.etalab.gouv.fr:8000
#         # key: sk-xxx
#         model_name: "gemma3:1b"
#         # timeout: 60
#         # model_hosting_zone: FRA
#         # model_total_params: 8
#         # model_active_params: 8

API configuration

The configuration file is composed of 3 sections:

  • models: to declare the models exposed by the API.
  • dependencies: to declare both the required dependencies of the API (e.g. PostgreSQL, Redis) and the optional ones (e.g. Elasticsearch).
  • settings: to configure the API.

We do not recommend declaring models in the configuration file. Prefer declaring them through the API, either via its endpoints or in the Playground UI (see Models configuration).



| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| models | array | Models used by the API. For details of configuration, see the Model section. | required | | |
| dependencies | | Dependencies used by the API. For details of configuration, see the Dependencies section. | required | | |
| settings | | For details of configuration, see the Settings section. | required | | |



Model

In the model section, you define a list of models (routers and providers). These models are only used for the initial bootstrap of the API. The model section of the configuration is ignored if any models are already registered in the database.



| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| name | string | Unique name exposed to clients when selecting the model. | required | | gpt-4o |
| type | string | Type of the model. It is used to identify the model type. | required | | text-generation |
| aliases | array | Aliases of the model. Users can use them to identify the model. | [] | | ['model-alias', 'model-alias-2'] |
| load_balancing_strategy | string | Routing strategy for load balancing between providers of the model. | shuffle | | least_busy |
| cost_prompt_tokens | number | Cost of prompt tokens, used for user budget computation. The cost is per 1M tokens. | 0.0 | | 0.1 |
| cost_completion_tokens | number | Cost of completion tokens, used for user budget computation. The cost is per 1M tokens. Set to 0.0 to disable budget computation for this model. | 0.0 | | 0.1 |
| providers | array | API providers of the model. If there are multiple providers, requests are load balanced between them according to the routing strategy. All providers must serve models of the same type. For details of configuration, see the ModelProvider section. | required | | |
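As an illustration of the attributes above, the following sketch declares one model served by two providers. Only the attribute names, the vllm provider type, and the least_busy strategy come from this page; the model name, alias, provider URLs, and cost values are placeholders invented for the example.

```yaml
models:
  - name: my-chat-model                 # hypothetical name exposed to clients
    type: text-generation
    aliases: ["my-chat-model-latest"]   # hypothetical alias
    load_balancing_strategy: least_busy
    cost_prompt_tokens: 0.1             # placeholder value, per 1M tokens
    cost_completion_tokens: 0.1         # placeholder value, per 1M tokens
    providers:
      # Requests for my-chat-model are balanced between these two providers.
      - type: vllm
        url: http://vllm-a.example.internal:8000   # placeholder URL
        model_name: "gemma3:1b"
      - type: vllm
        url: http://vllm-b.example.internal:8000   # placeholder URL
        model_name: "gemma3:1b"
```

Keep in mind that this section is only read during the initial bootstrap: if models are already registered in the database, it is ignored.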





ModelProvider

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| type | string | Model provider type. | required | | |
| url | null, string | Model provider API URL. The URL must only contain the domain name (without the /v1 suffix, for example). Depending on the model provider type, the URL can be optional (Albert, OpenAI). | None | | |
| key | null, string | Model provider API key. | None | | |
| timeout | integer | Timeout for model provider requests, after which the user receives a 503 error (model is too busy). | 300 | | |
| model_name | string | Model name from the model provider. | required | | |
| model_hosting_zone | string | Model hosting zone using ISO 3166-1 alpha-3 code format (e.g., WOR for World, FRA for France, USA for United States). This determines the electricity mix used for carbon intensity calculations. For more information, see https://ecologits.ai | WOR | | |
| model_total_params | integer | Total params of the model in billions of parameters for carbon footprint computation. For more information, see https://ecologits.ai | 0 | | |
| model_active_params | integer | Active params of the model in billions of parameters for carbon footprint computation. For more information, see https://ecologits.ai | 0 | | |
| qos_metric | null, string | The metric to use for the quality of service policy. If not provided, no QoS policy is applied. | None | | |
| qos_limit | null, number | The value to use for the quality of service. Depending on the metric, the value can be a percentile, a threshold, etc. | None | | |
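A provider entry combining these attributes might look like the sketch below. The hosting zone and parameter counts are placeholder values chosen for illustration, not published figures for this model. The qos_metric and qos_limit attributes are left out because this page does not list their accepted values.

```yaml
providers:
  - type: openai
    url: https://api.openai.com     # domain only, without the /v1 suffix
    key: ${OPENAI_API_KEY}          # read from the environment
    model_name: gpt-4o-mini
    timeout: 60                     # overrides the default of 300
    model_hosting_zone: USA         # placeholder ISO 3166-1 alpha-3 code
    model_total_params: 8           # placeholder, in billions of parameters
    model_active_params: 8          # placeholder, in billions of parameters
```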





Dependencies

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| elasticsearch | null | Elasticsearch is an optional dependency of OpenGateLLM. Elasticsearch is used as a vector store. If this dependency is provided, all document endpoints are enabled. For details of configuration, see the ElasticsearchDependency section. | None | | |
| langfuse | null | For details of configuration, see the LangfuseDependency section. | None | | |
| postgres | | Postgres is a required dependency of OpenGateLLM to store API data. For details of configuration, see the PostgresDependency section. | required | | |
| redis | | Redis is a required dependency of OpenGateLLM to store rate limiting counters and performance metrics. For details of configuration, see the RedisDependency section. | required | | |
| sentry | null | Sentry is an optional dependency of OpenGateLLM. Sentry helps you identify, diagnose, and fix errors in real-time. For details of configuration, see the SentryDependency section. | None | | |



ElasticsearchDependency

Elasticsearch is an optional dependency of OpenGateLLM. Elasticsearch is used as a vector store. If this dependency is provided, all document endpoints are enabled. You can pass any argument of the elasticsearch.Elasticsearch class; see https://elasticsearch-py.readthedocs.io/en/latest/api/elasticsearch.html for more information. The other arguments declared below are used to configure the Elasticsearch index.



| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| index_name | string | Name of the Elasticsearch index. | opengatellm | | my_index |
| index_language | string | The language of the Elasticsearch index, composed by the value, the stopwords and the stemmer. For more information about stemmer, see https://www.elastic.co/docs/reference/text-analysis/analysis-stemmer-tokenfilter#analysis-stemmer-tokenfilter-configure-parms. | english | | english |
| number_of_shards | integer | Number of shards for the Elasticsearch index. | 1 | | 24 |
| number_of_replicas | integer | Number of replicas for the Elasticsearch index. | 1 | | 1 |



LangfuseDependency

Langfuse is an optional dependency of OpenGateLLM. Langfuse is used for LLM observability and tracing. In this section, you can pass all Langfuse client arguments, see https://python.reference.langfuse.com/langfuse for more information.



| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| public_key | string | Langfuse public key. | required | | pk-lf-… |
| secret_key | string | Langfuse secret key. | required | | sk-lf-… |
| base_url | string | Langfuse server URL. | http://localhost:3000 | | http://localhost:3000 |



PostgresDependency

Postgres is a required dependency of OpenGateLLM. In this section, you can pass all postgres python SDK arguments, see https://docs.sqlalchemy.org/en/21/core/engines.html#engine-creation-api for more information. Only the url argument is required. The connection URL must use the asynchronous scheme, postgresql+asyncpg://. If you provide a standard postgresql:// URL, it is automatically converted to use asyncpg.



| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| url | string | PostgreSQL connection url. | required | | postgresql+asyncpg://postgres:changeme@localhost:5432/postgres |



RedisDependency

Redis is a required dependency of OpenGateLLM. Redis is used to store rate limiting counters and performance metrics. You can pass any argument accepted by the from_url() method of the redis.asyncio.connection.ConnectionPool class; see https://redis.readthedocs.io/en/stable/connections.html#redis.asyncio.connection.ConnectionPool.from_url for more information.



| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| url | string | Redis connection url. | required | | redis://:changeme@localhost:6379 |



SentryDependency

Sentry is an optional dependency of OpenGateLLM. Sentry helps you identify, diagnose, and fix errors in real-time. In this section, you can pass all sentry python SDK arguments, see https://docs.sentry.io/platforms/python/configuration/options/ for more information.



This section declares no attributes of its own.
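Since the Sentry section forwards its keys to the Sentry Python SDK, a minimal block could look like the sketch below. The dsn key appears in the example file earlier on this page; environment and traces_sample_rate are options of the Sentry Python SDK, and the values given to them here are placeholders. That OpenGateLLM passes them through unchanged is an assumption based on the description above.

```yaml
dependencies:
  sentry:
    dsn: ${SENTRY_DSN}          # read from the environment
    environment: production     # Sentry SDK option, placeholder value
    traces_sample_rate: 0.1     # Sentry SDK option, placeholder value
```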



Settings

General settings configuration fields.



| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| disabled_routers | array | Routers to disable, to limit the services exposed by the API. | [] | | ['embeddings'] |
| hidden_routers | array | Routers that stay enabled but are hidden in the swagger and the documentation of the API. | [] | | ['admin'] |
| app_title | string | Display title of your API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | OpenGateLLM | | My API |
| routing_max_retries | integer | Maximum number of retries for routing tasks. | 3 | | |
| routing_retry_countdown | integer | Number of seconds before retrying a failed routing task. | 3 | | |
| routing_max_priority | integer | Maximum allowed priority in routing tasks. | 4 | | |
| usage_tokenizer | string | Tokenizer used to compute usage of the API. | tiktoken_gpt2 | | |
| log_level | string | Logging level of the API. | INFO | | |
| log_format | string | Logging format of the API. | [%(asctime)s][%(process)d:%(name)s][%(levelname)s] %(client_ip)s - %(message)s | | |
| swagger_summary | string | Display summary of your API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | OpenGateLLM connect to your models. You can configuration this swagger UI in the configuration file, like hide routes or change the title. | | My API description. |
| swagger_version | string | Display version of your API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | latest | | 2.5.0 |
| swagger_description | string | Display description of your API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | See documentation | | See documentation |
| swagger_contact | object, null | Contact information of the API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | None | | |
| swagger_license_info | object | Licence information of the API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | {'name': 'MIT Licence', 'identifier': 'MIT', 'url': 'https://raw.githubusercontent.com/etalab-ia/opengatellm/refs/heads/main/LICENSE'} | | |
| swagger_terms_of_service | null, string | A URL to the Terms of Service for the API in swagger UI. If provided, this has to be a URL. | None | | https://example.com/terms-of-service |
| swagger_openapi_tags | array | OpenAPI tags of the API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | [] | | |
| swagger_openapi_url | string | OpenAPI URL of swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | /openapi.json | | |
| swagger_docs_url | string | Docs URL of swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | /docs | | |
| swagger_redoc_url | string | Redoc URL of swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information. | /redoc | | |
| auth_secret_key | null, string | Secret key for the API. It should be a random string with at least 32 characters. This key is used to encrypt user tokens. If you modify the secret key, you will need to update all user API keys. If not provided, the master key will be used. | None | | |
| auth_bootsrap_admin_username | string | Username of the admin user created at the first startup. | admin | | |
| auth_bootsrap_admin_password | string | Password of the admin user created at the first startup. | changeme | | |
| auth_key_max_expiration_days | null, integer | Maximum number of days for a new API key to be valid. | None | | |
| auth_playground_session_duration | integer | Duration of the playground session in seconds. | 3600 | | |
| rate_limiting_strategy | string | Rate limiting strategy for the API. | fixed_window | | |
| monitoring_postgres_enabled | boolean | If true, usage logs are written to the PostgreSQL database. | True | | |
| monitoring_prometheus_enabled | boolean | If true, Prometheus metrics are exposed at the /metrics endpoint. | True | | |
| vector_store_model | null, string | Model used to vectorize the text in the vector store database. Required if a vector store dependency is provided (Elasticsearch). This model must be defined in the models section and have type text-embeddings-inference. | None | | |
| document_parsing_max_concurrent | integer | Maximum number of concurrent document parsing tasks per worker. | 10 | | |
| front_url | string | Front-end URL for the application. | http://localhost:8501 | | |
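The vector_store_model description ties three parts of the file together: an Elasticsearch dependency, the setting itself, and a model of type text-embeddings-inference. The fragment below sketches how they could line up. The model name my-embeddings-model is hypothetical, and pairing the openai provider type with the text-embedding-3-small model is an assumption made for illustration. The required postgres and redis dependencies are omitted for brevity.

```yaml
dependencies:
  elasticsearch:
    index_name: opengatellm
    hosts: "http://${ELASTICSEARCH_HOST:-localhost}:${ELASTICSEARCH_PORT:-9200}"

settings:
  # Must match the name of a model declared below.
  vector_store_model: my-embeddings-model     # hypothetical model name

models:
  - name: my-embeddings-model                 # hypothetical model name
    type: text-embeddings-inference           # type required for the vector store model
    providers:
      - type: openai                          # assumed provider for this sketch
        url: https://api.openai.com
        key: ${OPENAI_API_KEY}
        model_name: text-embedding-3-small    # assumed embeddings model
```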



Playground configuration

The following parameters allow you to configure the Playground application. The configuration file can be shared with the API, as the sections are identical and compatible. Some parameters are common to both the API and the Playground (for example, app_title).

For Playground deployment, some environment variables must be set, such as the Reflex backend URL. See Environment variables for more information.



| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| dependencies | | Dependencies used by the playground. For details of configuration, see the Dependencies section. | required | | |
| settings | | General settings configuration fields. Some fields are common to the API and the playground. For details of configuration, see the Settings section. | required | | |





Dependencies

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| redis | null | Set the Redis connection url to use as state manager. See https://reflex.dev/docs/api-reference/config/ for more information. For details of configuration, see the RedisDependency section. | None | | |





RedisDependency

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| url | string | Redis connection url. | required | | redis://:changeme@localhost:6379 |





Settings

| Attribute | Type | Description | Default | Values | Examples |
| --- | --- | --- | --- | --- | --- |
| auth_key_max_expiration_days | null, integer | Maximum number of days for a token to be valid. | None | | |
| routing_max_priority | integer | Maximum allowed priority in routing tasks. | 10 | | |
| app_title | string | The title of the application. | OpenGateLLM | | |
| playground_opengatellm_url | string | The URL of the OpenGateLLM API. | http://localhost:8000 | | |
| playground_opengatellm_timeout | integer | The timeout in seconds for the OpenGateLLM API. | 60 | | |
| playground_disabled_pages | array | List of pages to disable from the navigation bar. | required | | |
| playground_default_model | null, string | The first model selected in chat page. | None | | |
| playground_theme_has_background | boolean | Whether the theme has a background. | True | | |
| playground_theme_accent_color | string | The primary color used for default buttons, typography, backgrounds, etc. See available colors at https://www.radix-ui.com/colors. | purple | | |
| playground_theme_appearance | string | The appearance of the theme. | light | | |
| playground_theme_gray_color | string | The secondary color used for default buttons, typography, backgrounds, etc. See available colors at https://www.radix-ui.com/colors. | gray | | |
| playground_theme_panel_background | string | Whether panel backgrounds are translucent. Can be 'solid' or 'translucent'. | solid | | |
| playground_theme_radius | string | The radius of the theme. Can be 'small', 'medium', or 'large'. | medium | | |
| playground_theme_scaling | string | The scaling of the theme. | 100% | | |
| playground_swagger_url | null, string | Swagger URL. If not provided, the swagger link in the navigation bar is deactivated. | http://localhost:8000/docs | | |
| playground_reference_url | null, string | Reference URL. If not provided, the reference link in the navigation bar is deactivated. | http://localhost:8000/redoc | | |
| playground_documentation_url | null, string | Documentation URL. If not provided, the documentation link in the navigation bar is deactivated. | https://docs.opengatellm.org | | |
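Putting the Playground attributes together, a Playground-only configuration might look like the following sketch. The attribute names and the Redis URL pattern come from this page; the title and the default model name are placeholders, and the theme values are just one possible combination.

```yaml
dependencies:
  redis:   # optional for the Playground
    url: redis://:${REDIS_PASSWORD:-changeme}@${REDIS_HOST:-localhost}:${REDIS_PORT:-6379}

settings:
  app_title: My OpenGateLLM Playground            # placeholder title
  playground_opengatellm_url: http://localhost:8000
  playground_opengatellm_timeout: 60
  playground_disabled_pages: []
  playground_default_model: my-language-model     # hypothetical model name
  playground_theme_appearance: dark
  playground_theme_accent_color: purple
  playground_theme_radius: medium
```

Because the sections are compatible with the API configuration, the same keys can also live in a single file shared by both applications.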