Open Source · Apache 2.0

VEXYL-TTS
Indian Language
Text‑to‑Speech

Self-hosted synthesis server for 22 Indian languages.
WebSocket streaming + Batch REST API. Zero API costs. Full data sovereignty.

22 Languages
44+ Speakers
Sub-200ms Inference Latency
$0 API Cost

Two modes,
one port.

A single container on port 8080 serves both real-time streaming via WebSocket and async batch synthesis via REST — designed for every workflow.

WebSocket Streaming
Real-time text-to-speech. Send JSON text requests and receive base64-encoded WAV audio chunks as soon as they are synthesized.
Real-time chunked synthesis
Sub-200ms first-byte latency
Up to 50 concurrent WebSocket connections
In-memory LRU cache for repeated phrases
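The in-memory LRU cache mentioned above could be sketched like this. This is illustrative only, not the server's actual internals — the class name, the `(text, lang, style)` cache key, and the entry limit are all assumptions:

```python
from collections import OrderedDict


class PhraseCache:
    """Tiny in-memory LRU for repeated phrases, keyed on (text, lang, style)."""

    def __init__(self, max_entries: int = 256):
        self.max_entries = max_entries
        self._store: "OrderedDict[tuple, bytes]" = OrderedDict()

    def get(self, text: str, lang: str, style: str = "default"):
        key = (text, lang, style)
        if key in self._store:
            self._store.move_to_end(key)  # mark as most recently used
            return self._store[key]
        return None

    def put(self, text: str, lang: str, style: str, wav: bytes) -> None:
        key = (text, lang, style)
        self._store[key] = wav
        self._store.move_to_end(key)
        if len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict least recently used
```

On a cache hit the server can skip synthesis entirely, which is why repeated phrases return well under the usual first-byte latency.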
📦
Batch REST API
Submit text and poll for the result. Handles cloud cold starts gracefully — the job queues instantly while the model loads in the background, so clients never block.
Up to 5,000 characters per request
22 languages and 44+ pre-built voices
1,000 concurrent pending jobs
Auto-cleanup after 1-hour TTL
// WebSocket session lifecycle
SERVER {"type":"ready","model":"indic-parler-tts","sample_rate":22050}
CLIENT {"type":"synthesize","text":"നമസ്കാരം","lang":"ml-IN","style":"default","request_id":"abc123"}
SERVER {"type":"audio","request_id":"abc123","audio_b64":"...","latency_ms":2400}
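The client side of the lifecycle above can be sketched in Python. These helpers only build and decode the JSON messages shown; the WebSocket transport itself (e.g. the `websockets` package) and request-ID generation are left to the caller, and the truncated `request_id` format here is an assumption:

```python
import base64
import json
import uuid


def build_synthesize_request(text: str, lang: str, style: str = "default") -> str:
    """Build the JSON 'synthesize' message from the session lifecycle above."""
    return json.dumps({
        "type": "synthesize",
        "text": text,
        "lang": lang,
        "style": style,
        "request_id": uuid.uuid4().hex[:12],  # illustrative ID scheme
    })


def decode_audio_message(raw: str) -> bytes:
    """Extract raw WAV bytes from a server 'audio' message."""
    msg = json.loads(raw)
    if msg.get("type") != "audio":
        raise ValueError(f"unexpected message type: {msg.get('type')}")
    return base64.b64decode(msg["audio_b64"])
```

A real client would wait for the server's `ready` message, send the request, then feed each decoded chunk to an audio player as it arrives.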

Built for production,
open by design.

01 🔒
Full Data Sovereignty
Text never leaves your infrastructure. Deploy on-premise or in your own cloud account. No third-party API calls, no data sharing, no API costs.
02 🧠
ai4bharat Model
Powered by the indic-parler-tts model — fine-tuned on Indian language corpora, supporting 44+ pre-built voices and emotion control.
03 ☁️
Scale to Zero
Designed for Google Cloud Run with session affinity, CPU boost, and a batch API that absorbs cold starts. Pay $0 when idle. Scale to 250 connections across 5 instances.
04 🛡️
API Key Auth
Timing-safe shared-secret authentication on every endpoint. The /health endpoint is always exempt for Cloud Run probes. Backwards-compatible when key is unset.
05 🔌
Voice Gateway Integration
Plug-and-play with the VEXYL AI Voice Gateway. Replace or supplement cloud TTS providers (ElevenLabs, OpenAI, Deepgram) with zero code changes using the drop-in client library.
06 🐳
Docker-First
The ~6.0 GB image bakes the model at build time — no large download on every cold start. One-command Cloud Build deployment. ffmpeg included for diverse audio formats.

22 Indian languages,
one model.

From Hindi to Malayalam, Telugu to Sanskrit — a single model handles the full breadth of India's linguistic landscape.

हिं Hindi hi-IN
മല Malayalam ml-IN
தமி Tamil ta-IN
తెలు Telugu te-IN
ಕನ್ Kannada kn-IN
বাং Bengali bn-IN
ગુ Gujarati gu-IN
मरा Marathi mr-IN
ਪੰਜ Punjabi pa-IN
ଓଡ଼ Odia or-IN
অস Assamese as-IN
اردو Urdu ur-IN
संस् Sanskrit sa-IN
नेपा Nepali ne-IN
बो Bodo brx-IN
डोग Dogri doi-IN
En English en-IN
कों Konkani kok-IN
मै Maithili mai-IN
মৈ Manipuri mni-IN
ᱥᱟ Santali sat-IN
سن Sindhi sd-IN

Three commands
to production.

Local setup, Docker, or Cloud Run — pick your environment. The deploy script enables the required GCP APIs, sets up Artifact Registry, runs Cloud Build, and deploys automatically.

☁️ Cloud Run
Serverless — Scale to Zero
deploy.sh
$ export GCP_PROJECT_ID=my-project
$ export HF_TOKEN=hf_xxxx
$ ./deploy.sh

→ Enabling GCP APIs...
→ Building via Cloud Build (~15 min)...
→ Deploying to asia-south1...

✓ Service URL: https://vexyl-tts-xxx.run.app
✓ WebSocket:   wss://vexyl-tts-xxx.run.app
💻 Local / Self-Hosted
On-Premise Setup
bash
# One-command setup
$ ./setup.sh
  Downloads model, sets up venv...

# Start the server
$ ./run.sh
✓ Listening on ws://127.0.0.1:8092

# Health check
$ curl http://localhost:8092/health
{"status":"ok","model":"indic-parler-tts"}
🐳 Docker
Containerised Deployment
docker
# Build (bakes model at build time)
$ docker build \
    --build-arg HF_TOKEN=$HF_TOKEN \
    -t vexyl-tts .

# Run with API key
$ docker run -p 8080:8080 \
    -e VEXYL_TTS_API_KEY=secret \
    vexyl-tts
⚙️ Node.js Client
Voice Gateway Integration
vexyl-tts-client.js
// .env
VEXYL_TTS_URL=wss://vexyl-tts-xxx.run.app
VEXYL_TTS_API_KEY=your-secret
TTS_PROVIDER=vexyl-tts

// Usage
const tts = new VexylTTS('ml-IN');
await tts.connect();
tts.synthesize('നമസ്കാരം', audioChunk => play(audioChunk));

Batch API

Submit text and poll for results. CORS-enabled and API-key protected.

Batch Synthesis — Submit → Poll → Result
# 1. Submit a job
$ curl -X POST https://vexyl-tts-xxx.run.app/batch/synthesize \
     -H "X-API-Key: your-secret" \
     -H "Content-Type: application/json" \
     -d '{"text":"नमस्ते दुनिया","lang":"hi-IN"}'

{"job_id":"batch_a1b2c3d4e5f6","status":"queued","language":"hi-IN","text_length":14}

# 2. Poll for completion
$ curl https://vexyl-tts-xxx.run.app/batch/status/batch_a1b2c3d4e5f6 \
     -H "X-API-Key: your-secret"

{"job_id":"batch_a1b2c3d4e5f6","status":"completed","audio_b64":"...","latency_ms":2400}

# Health check (no auth required)
$ curl https://vexyl-tts-xxx.run.app/health
{"status":"ok","active_connections":0,"batch_jobs_queued":0,"uptime_seconds":42.3}
| Limit | Value | Notes |
|---|---|---|
| Max text length | 5,000 characters | HTTP 400 returned if exceeded |
| Max pending jobs | 1,000 | HTTP 429 when queue is full |
| Job result TTL | 1 hour | Cleaned up every 5 minutes |
| Voice styles | default · warm · formal | Controls the speaker selection |
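The submit-then-poll flow shown in the curl examples can be wrapped in a small helper. A sketch of the polling side only: `fetch_status` is an injected callable (e.g. a `requests.get` against `/batch/status/<job_id>` with the `X-API-Key` header) so the loop stays independent of any HTTP client, and the terminal status names are assumptions based on the responses above:

```python
import time
from typing import Callable


def poll_batch_job(job_id: str,
                   fetch_status: Callable[[str], dict],
                   interval_s: float = 1.0,
                   timeout_s: float = 120.0) -> dict:
    """Poll the batch status endpoint until the job reaches a terminal state."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status(job_id)
        if status.get("status") in ("completed", "failed"):
            return status
        time.sleep(interval_s)  # job still queued or processing
    raise TimeoutError(f"job {job_id} still pending after {timeout_s}s")
```

A generous timeout matters here: on a scale-to-zero deployment the first job after idle also absorbs the model's cold-start load time.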

~$0.0006 per request.
$0 when idle.

Cloud Run bills per-second. With --min-instances=0, you pay nothing when there's no traffic. The free tier covers most light usage entirely.

| Usage | Requests / Month | Estimated Cost |
|---|---|---|
| Light — Testing / Dev | ~100 | ~$0.06 |
| Medium — Internal Tool | ~1,000 | ~$0.60 |
| Heavy — Production | ~10,000 | ~$6.00 |
| Always-Warm (min-instances=1) | Any | ~$50–70 / month |
| GCP Free Tier | First 180K vCPU-sec + 360K GiB-sec | FREE |
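The per-usage figures follow from a flat per-request estimate. A sketch, assuming the ~$0.0006-per-request figure quoted above and ignoring the free tier (actual Cloud Run billing is per vCPU-second and GiB-second, not per request):

```python
COST_PER_REQUEST_USD = 0.0006  # rough estimate quoted above


def estimated_monthly_cost(requests_per_month: int) -> float:
    """Flat linear cost estimate; ignores the GCP free tier."""
    return round(requests_per_month * COST_PER_REQUEST_USD, 2)
```

In practice the free tier absorbs the light and medium tiers entirely, so this is an upper bound for low-volume use.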

Start synthesizing
in three commands.

Apache 2.0 licensed. Self-host it, fork it, integrate it into your stack. Contributions welcome.

Apache 2.0 Python 3.10+ 22 Languages WebSocket + REST Docker Ready Cloud Run Ready