User budget
OpenGateLLM lets you define costs for each model router. For more information about model routers, see the documentation on setting up your models. A budget is then attached to each user to limit that user's usage. The compute cost of a request is calculated from the number of tokens used and the costs defined for the model router, using the following formula:
cost = round((prompt_tokens / 1000000 * router.costs.prompt_tokens) + (completion_tokens / 1000000 * router.costs.completion_tokens), ndigits=6)
The compute cost is returned in the response, in the usage.cost field. After the request is processed, the user's budget is updated by the hooks decorator attached to each endpoint. The request cost is stored in the usage table; see the usage monitoring documentation for more information.
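As a rough illustration, the cost formula can be written as a small standalone Python function. This is a minimal sketch, not OpenGateLLM's own code: the function name `compute_cost` and its parameter names are hypothetical, and the two per-million prices stand in for `router.costs.prompt_tokens` and `router.costs.completion_tokens`.

```python
def compute_cost(
    prompt_tokens: int,
    completion_tokens: int,
    prompt_cost_per_million: float,
    completion_cost_per_million: float,
) -> float:
    """Compute the cost of one request from token counts and per-million prices.

    Hypothetical helper mirroring the documented formula; the names are
    illustrative and not part of the OpenGateLLM API.
    """
    return round(
        (prompt_tokens / 1_000_000 * prompt_cost_per_million)
        + (completion_tokens / 1_000_000 * completion_cost_per_million),
        ndigits=6,
    )


if __name__ == "__main__":
    # 1,000 prompt tokens at 0.1 per million plus 500 completion tokens at 0.3 per million.
    print(compute_cost(1_000, 500, 0.1, 0.3))
```

With prices of 0.1 and 0.3 per million tokens, a request using 1,000 prompt tokens and 500 completion tokens costs 0.0001 + 0.00015 = 0.00025. Because the result is rounded to six decimal places, very small requests at low prices can round down to zero.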
Configuration
There are three ways to configure model pricing used for budget computation: Playground UI, API, or configuration file.
To define pricing in the Playground, go to the Provider page and create or edit a provider with:
Prompt token cost: Cost per million input tokens.
Completion token cost: Cost per million output tokens.
To define pricing through the API, see the POST and PUT /v1/admin/routers endpoints in the API reference, which set the model_cost_prompt_tokens and model_cost_completion_tokens of a router.
Prompt and completion token costs are expressed per million tokens.
To define model pricing in the configuration file, set the following fields for each provider:
model_cost_prompt_tokens: Cost per million prompt/input tokens.
model_cost_completion_tokens: Cost per million completion/output tokens.
Example:
models:
  [...]
  - name: my-language-model
    type: text-generation
    providers:
      - type: openai
        url: https://api.openai.com
        key: ${OPENAI_API_KEY}
        model_name: gpt-4o-mini
        model_cost_prompt_tokens: 0.1
        model_cost_completion_tokens: 0.3
Assign budget to a user
Each user has a budget, set through the create user endpoint or the update user endpoint. The budget is defined in the budget field. You need admin permission to create or update a user.
See the POST and PATCH /v1/admin/users endpoints in the API reference for more information.
Budget monitoring
Users can see the cost of each request in the API response. The cost is returned in the usage.cost field.
In addition, the Usage page in the Playground shows users the history of their requests and the cost of each.
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "my-language-model",
  "choices": [ ... ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30,
    "cost": 0.000015,
    "carbon": {
      "kWh": 0.0001456,
      "kgCO2eq": 0.0000672
    }
  }
}
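A client that wants to track its own spending could read usage.cost from each response and keep a running total. The sketch below is only an illustration: the helper names `request_cost` and `total_cost` are hypothetical, the sample body is a trimmed copy of the response above with an empty choices list, and treating a missing cost as 0.0 is an assumption rather than documented behavior.

```python
import json

# Trimmed copy of the example response; "choices" is left empty for brevity.
RESPONSE_BODY = """
{
  "id": "chatcmpl-123",
  "model": "my-language-model",
  "choices": [],
  "usage": {"prompt_tokens": 10, "completion_tokens": 20, "total_tokens": 30, "cost": 0.000015}
}
"""


def request_cost(response_body: str) -> float:
    """Return usage.cost from a JSON response body.

    Assumption: a missing or null usage/cost field is treated as 0.0.
    """
    payload = json.loads(response_body)
    usage = payload.get("usage") or {}
    return float(usage.get("cost") or 0.0)


def total_cost(response_bodies: list[str]) -> float:
    """Sum the cost of several responses, rounded to 6 decimals like the per-request cost."""
    return round(sum(request_cost(body) for body in response_bodies), 6)


if __name__ == "__main__":
    print(request_cost(RESPONSE_BODY))
    print(total_cost([RESPONSE_BODY, RESPONSE_BODY]))
```

A total computed on the client this way is only an estimate of what has been spent; the authoritative history remains the Usage page and the usage table.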