Speech-to-Text Providers
VEXYL supports several STT providers so you can trade off accuracy and latency for your language and use case.
Provider Comparison
| Provider | Mode | Latency | Best For |
|---|---|---|---|
| Sarvam | Streaming | 300-800ms | Indian Languages |
| Deepgram | Streaming | 300-500ms | English, Speed |
| Groq | Batch | 1-3s | Accuracy (Whisper) |
| OpenAI | Batch | 2-5s | General Purpose |
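The comparison above can also be read as a lookup: given a latency budget, which providers fit? The sketch below is only an illustration of that reading. The `ProviderProfile` class, the `providers_within` helper, and the lowercase provider names are hypothetical and not part of VEXYL; the latency figures are copied from the table, with the upper bound treated as the worst case.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderProfile:
    """One row of the comparison table (hypothetical helper type)."""
    name: str
    mode: str                     # "streaming" or "batch"
    latency_ms: tuple[int, int]   # (typical low, typical high) from the table
    best_for: str


# Values transcribed from the Provider Comparison table.
PROVIDERS = [
    ProviderProfile("sarvam", "streaming", (300, 800), "Indian Languages"),
    ProviderProfile("deepgram", "streaming", (300, 500), "English, Speed"),
    ProviderProfile("groq", "batch", (1000, 3000), "Accuracy (Whisper)"),
    ProviderProfile("openai", "batch", (2000, 5000), "General Purpose"),
]


def providers_within(max_latency_ms: int, mode: str | None = None) -> list[str]:
    """Return names of providers whose upper latency bound fits the budget."""
    return [
        p.name
        for p in PROVIDERS
        if p.latency_ms[1] <= max_latency_ms and (mode is None or p.mode == mode)
    ]
```

For example, `providers_within(800)` keeps only the two streaming providers, while `providers_within(3000, mode="batch")` keeps only Groq.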
Configuration
To configure a provider, set its API key and the matching STT_PROVIDER value in your environment variables. The Deepgram and Groq examples also set a model.
Sarvam (Recommended for India)
SARVAM_API_KEY=your_key
STT_PROVIDER=sarvam
Deepgram (Recommended for Speed)
DEEPGRAM_API_KEY=your_key
STT_PROVIDER=deepgram
DEEPGRAM_STT_MODEL=nova-2
Groq (Recommended for Accuracy)
GROQ_API_KEY=your_key
STT_PROVIDER=groq
GROQ_MODEL=whisper-large-v3-turbo
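A common mistake with this kind of setup is selecting a provider without supplying its key. The sketch below shows one way an application might check that before starting, assuming the variable names shown above. The `check_stt_config` function and the `REQUIRED_KEY` mapping are hypothetical, not part of VEXYL, and the sketch covers only the three providers whose variables are documented here.

```python
import os

# Assumed pairing of STT_PROVIDER values with the API key variables shown above.
REQUIRED_KEY = {
    "sarvam": "SARVAM_API_KEY",
    "deepgram": "DEEPGRAM_API_KEY",
    "groq": "GROQ_API_KEY",
}


def check_stt_config(env=None):
    """Return the configured provider, or raise ValueError if its key is missing."""
    env = os.environ if env is None else env
    provider = env.get("STT_PROVIDER", "")
    if provider not in REQUIRED_KEY:
        raise ValueError(f"STT_PROVIDER={provider!r} is not covered by this check")
    key_name = REQUIRED_KEY[provider]
    if not env.get(key_name):
        raise ValueError(f"{key_name} must be set when STT_PROVIDER={provider}")
    return provider
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise, for example `check_stt_config({"STT_PROVIDER": "groq", "GROQ_API_KEY": "your_key"})`.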
Auto-Selection
Set STT_PROVIDER=auto to let VEXYL choose a provider based on the session language: Sarvam for Indian languages, Groq for all others.
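The selection rule can be sketched as a small function. This is an illustration of the rule as described, not VEXYL's implementation: the `resolve_provider` name, the use of BCP 47 style language tags such as `hi-IN`, and the particular set of language codes are all assumptions, since the source does not say how the session language is represented or which languages count as Indian.

```python
# Assumption: an illustrative subset of ISO 639-1 codes for Indian languages.
INDIAN_LANGUAGE_CODES = {"hi", "bn", "ta", "te", "ml", "kn", "mr", "gu", "pa", "or"}


def resolve_provider(configured: str, session_language: str) -> str:
    """Resolve STT_PROVIDER, applying the auto rule only when it is 'auto'."""
    if configured != "auto":
        # An explicit provider choice is returned unchanged.
        return configured
    # Compare on the primary language subtag, e.g. "hi" from "hi-IN".
    base = session_language.split("-")[0].lower()
    return "sarvam" if base in INDIAN_LANGUAGE_CODES else "groq"
```

Under these assumptions, `resolve_provider("auto", "hi-IN")` yields `sarvam` and `resolve_provider("auto", "en-US")` yields `groq`.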