Supported models

When you create a model router, you need to specify the model type. The model type determines the capabilities of the model and enables a specific set of endpoints.

| Model type | Endpoint enabled | Description |
| --- | --- | --- |
| `text-generation` | `/v1/chat/completions` | LLM models for text generation. |
| `text-classification` | `/v1/rerank` | Reranking models for text classification. |
| `text-embeddings-inference` | `/v1/embeddings` | Text embedding models for text similarity and clustering. |
| `image-to-text` | `/v1/ocr` | OCR models for image-to-text conversion (only Mistral On-Premise supported). |
| `image-text-to-text` | `/v1/chat/completions` | Multi-modal models to chat with and analyze images. |
| `automatic-speech-recognition` | `/v1/audio/transcriptions` | Automatic speech recognition models for audio-to-text conversion. |
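Each model type enables exactly one endpoint path, so resolving the full URL for a router is a simple lookup. A minimal sketch in Python (the base URL and the `endpoint_url` helper are illustrative placeholders, not part of any real client library):

```python
# Hypothetical mapping from model type to the endpoint it enables,
# taken directly from the table above.
ENDPOINTS = {
    "text-generation": "/v1/chat/completions",
    "text-classification": "/v1/rerank",
    "text-embeddings-inference": "/v1/embeddings",
    "image-to-text": "/v1/ocr",
    "image-text-to-text": "/v1/chat/completions",
    "automatic-speech-recognition": "/v1/audio/transcriptions",
}

def endpoint_url(base_url: str, model_type: str) -> str:
    """Build the full endpoint URL for a given model type."""
    try:
        return base_url.rstrip("/") + ENDPOINTS[model_type]
    except KeyError:
        raise ValueError(f"unsupported model type: {model_type}") from None

# Example: a text-generation router exposes the chat completions endpoint.
print(endpoint_url("https://router.example.com", "text-generation"))
# → https://router.example.com/v1/chat/completions
```

Note that `text-generation` and `image-text-to-text` share the same `/v1/chat/completions` path; they differ only in whether image inputs are accepted.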

vLLM is an open-source, production-grade LLM server. It supports a wide range of models and is a great choice for serving self-hosted models.

Supported model types:

| Model type | Example model |
| --- | --- |
| `text-generation` | `openai/gpt-oss-120b` |
| `image-text-to-text` | `mistralai/Mistral-Small-3.2-24B-Instruct-2506` |
| `text-classification` | `BAAI/bge-reranker-v2-m3` |
| `text-embeddings-inference` | `intfloat/e5-mistral-7b-instruct` |
| `automatic-speech-recognition` | `openai/whisper-large-v3` |
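Serving one of the example models with vLLM is typically a single command. A sketch, assuming vLLM is installed locally (flag names such as `--task` vary between vLLM versions, so verify them against the vLLM documentation for your release):

```shell
# Serve a text-generation model on vLLM's default OpenAI-compatible port (8000).
vllm serve openai/gpt-oss-120b

# Embedding models need the matching task selected at startup
# (the exact flag value depends on your vLLM version).
vllm serve intfloat/e5-mistral-7b-instruct --task embed
```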
vLLM documentation