User budget

OpenGateLLM allows you to define the costs for each model router. For more information about model routers, see the setup your models documentation. It then attaches a budget to each user to limit that user's usage. The compute cost of a request is calculated from the number of tokens used and the costs defined for the model router, using the following formula:

cost = round((prompt_tokens / 1000000 * router.costs.prompt_tokens) + (completion_tokens / 1000000 * router.costs.completion_tokens), ndigits=6)
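As a worked example, the formula above can be written as a small Python function. This is a minimal sketch, not the project's implementation: the function name `compute_cost` and its parameter names are chosen here for illustration, and the prices (0.1 and 0.3 per million tokens) are example values.

```python
TOKENS_PER_MILLION = 1_000_000


def compute_cost(prompt_tokens, completion_tokens, prompt_cost, completion_cost):
    """Return the request cost, given prices expressed per million tokens."""
    cost = (prompt_tokens / TOKENS_PER_MILLION * prompt_cost) + (
        completion_tokens / TOKENS_PER_MILLION * completion_cost
    )
    # Same rounding as the formula: six decimal places.
    return round(cost, ndigits=6)


# 10 prompt tokens and 20 completion tokens at example prices of 0.1 and 0.3.
example_cost = compute_cost(10, 20, prompt_cost=0.1, completion_cost=0.3)
print(f"{example_cost:.6f}")
```

With these example values, the prompt side contributes 0.000001 and the completion side 0.000006, so the request costs 0.000007.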

The compute cost is returned in the response, in the usage.cost field. After the request is processed, the user's budget is updated by the hooks decorator attached to each endpoint. The request cost is also stored in the usage table; see the usage monitoring documentation for more information.
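To illustrate the idea of a decorator that updates the budget once a request has been processed, here is a simplified sketch. It is not the actual hooks decorator: `deduct_budget`, `user_budgets`, and `fake_chat_endpoint` are hypothetical names, budgets are kept in an in-memory dictionary, and clamping the budget at zero is an assumption made for this example.

```python
import functools

# Hypothetical in-memory budget store, used only for this sketch.
user_budgets = {"alice": 1.0}


def deduct_budget(endpoint):
    """Illustrative hook: subtract usage.cost from the caller's budget after the call."""

    @functools.wraps(endpoint)
    def wrapper(user_id, *args, **kwargs):
        response = endpoint(user_id, *args, **kwargs)
        cost = response.get("usage", {}).get("cost", 0.0)
        # Assumption: the budget is clamped so it never drops below zero.
        user_budgets[user_id] = round(max(0.0, user_budgets[user_id] - cost), 6)
        return response

    return wrapper


@deduct_budget
def fake_chat_endpoint(user_id, prompt):
    # Stand-in for a real endpoint; returns a fixed usage block.
    return {"usage": {"prompt_tokens": 10, "completion_tokens": 20, "cost": 0.000007}}


fake_chat_endpoint("alice", "hello")
print(user_budgets["alice"])
```

The point of the pattern is that the endpoint itself stays unaware of budgets: the deduction happens in the wrapper, after the response (and therefore the cost) is known.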

Configuration

There are three ways to configure model pricing used for budget computation: Playground UI, API, or configuration file.

To define pricing in the Playground, go to the Provider page and create or edit a provider with:

Prompt token cost: Cost per million input tokens.
Completion token cost: Cost per million output tokens.

To define pricing through the API, see the POST and PUT /v1/admin/routers endpoints in the API reference, which set the model_cost_prompt_tokens and model_cost_completion_tokens fields of a router. Prompt and completion token costs are expressed per million tokens.

API Reference

To define model pricing in the configuration file, set the following fields for each provider:

model_cost_prompt_tokens: Cost per million prompt/input tokens.
model_cost_completion_tokens: Cost per million completion/output tokens.

Example:

models:
  [...]
  - name: my-language-model
    type: text-generation
    providers:
      - type: openai
        url: https://api.openai.com
        key: ${OPENAI_API_KEY}
        model_name: gpt-4o-mini
        model_cost_prompt_tokens: 0.1
        model_cost_completion_tokens: 0.3

Configuration file documentation

Assign budget to a user

Each user has a budget, set in the budget field through the create user or update user endpoint. You need admin permission to create or update a user.

See the POST and PATCH /v1/admin/users endpoints in the API reference for more information.
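As a rough sketch, a budget update could be prepared as follows with Python's standard library. Only the PATCH method, the /v1/admin/users path, and the budget field come from this page; the base URL, the user identifier appended to the path, the bearer-token header, and the helper name `build_budget_request` are assumptions, so check the API reference for the exact request shape.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: address of your deployment
ADMIN_API_KEY = "sk-admin-example"  # placeholder: a key with admin permission


def build_budget_request(user_id, budget):
    """Build (without sending) a PATCH request that sets a user's budget."""
    body = json.dumps({"budget": budget}).encode("utf-8")
    return urllib.request.Request(
        # Assumption: the user identifier is passed as a path segment.
        url=f"{BASE_URL}/v1/admin/users/{user_id}",
        data=body,
        method="PATCH",
        headers={
            # Assumption: the API expects a bearer token.
            "Authorization": f"Bearer {ADMIN_API_KEY}",
            "Content-Type": "application/json",
        },
    )


request = build_budget_request(user_id=42, budget=10.0)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a server.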

Budget monitoring

Users can see the cost of each request in the API response, in the usage.cost field. In addition, the Usage page in the Playground shows users their request history and the cost of each request.

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "my-language-model",
  "choices": [
    ...
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30,
    "cost": 0.000015,
    "carbon": {"kWh": 0.0001456, "kgCO2eq": 0.0000672 }
  }
}
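On the client side, the usage.cost field can be read from each response and summed to track spending. The snippet below is a minimal sketch with hypothetical helper names (`request_cost`, `total_cost`); it uses a trimmed copy of the response above and treats a missing cost as zero, which is an assumption of this example.

```python
import json

# Trimmed copy of the example response; only the fields used here are kept.
raw_response = """
{
  "id": "chatcmpl-123",
  "model": "my-language-model",
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30,
    "cost": 0.000015
  }
}
"""


def request_cost(response):
    """Return usage.cost for one response (0.0 if the field is absent)."""
    return response.get("usage", {}).get("cost", 0.0)


def total_cost(responses):
    """Sum the cost of several responses, rounded to six decimal places."""
    return round(sum(request_cost(r) for r in responses), 6)


response = json.loads(raw_response)
print(f"{request_cost(response):.6f}")
print(f"{total_cost([response, response]):.6f}")
```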