New V2 Model Available

Turn text into lifelike speech instantly.

Build apps that speak. Our API provides the most realistic AI voices for your content, applications, and workflows. Simple integration, powerful results.

TRUSTED BY TEAMS AT

ACME INC, GLOBEX, SOYENT, UMBRELLA

audio_generation.ts

// Assumes `voicekit` is an already-initialized VoiceKit client.
const audio = await voicekit.audio.speech.create({
  model: "tts-1-hd",
  voice: "clarion",
  input: "The quick brown fox jumps over the lazy dog."
});
audio.play();

Everything you need

Designed for developers and creators who need high-fidelity speech synthesis at scale.

View Documentation

Diverse Voice Portfolio

Choose from our carefully curated selection of voices including Clarion, Resonance, Vibe, Pulse, and Nova. Each voice is optimized for performance.

Clarion

Resonance

Vibe

+ 5 More

Real-time Latency

Engineered for real-time applications. The first byte of audio arrives in under 200ms, fast enough for conversational AI.

Multi-lingual Support

Generate spoken audio in over 50 languages with automatic language detection and native-like pronunciation.

HD Quality Audio

Crystal clear 48kHz audio output available in MP3, Opus, AAC, and FLAC formats for any use case.
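
When you serve or play back generated audio, each listed format needs a matching media type. A small sketch follows; the lowercase format identifiers and the `contentTypeFor` helper are hypothetical, and the Opus entry assumes the audio is delivered in an Ogg container, which this page does not specify.

```typescript
// Hypothetical identifiers for the four listed output formats.
type AudioFormat = "mp3" | "opus" | "aac" | "flac";

const CONTENT_TYPES: Record<AudioFormat, string> = {
  mp3: "audio/mpeg",
  opus: "audio/ogg; codecs=opus", // assumption: Opus in an Ogg container
  aac: "audio/aac",
  flac: "audio/flac",
};

// Type guard so untrusted input (query strings, config) can be narrowed safely.
function isAudioFormat(value: string): value is AudioFormat {
  return Object.prototype.hasOwnProperty.call(CONTENT_TYPES, value);
}

// Returns the media type for a format, or throws if the format is not listed.
function contentTypeFor(format: string): string {
  if (!isAudioFormat(format)) {
    throw new Error(`Unsupported audio format: ${format}`);
  }
  return CONTENT_TYPES[format];
}

console.log(contentTypeFor("mp3"));  // audio/mpeg
console.log(contentTypeFor("flac")); // audio/flac
```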

Usage Based Pricing

Pay only for the characters you synthesize. No hidden fees.

$0.015 / 1K characters
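
To make the per-character rate concrete, here is a minimal cost estimate in TypeScript. It assumes billing is strictly linear at the listed $0.015 per 1,000 characters and that a "character" is one Unicode code point. How VoiceKit actually counts characters, rounds totals, or applies free-tier credits is not stated here, so `countCharacters` and `estimateCostUsd` are hypothetical helpers, not part of the SDK.

```typescript
// Listed rate: $0.015 per 1,000 characters = 15 micro-dollars per character.
// Multiplying in integer micro-dollars avoids floating-point drift.
const MICRO_USD_PER_CHARACTER = 15;

// Assumption: one billed character = one Unicode code point.
function countCharacters(text: string): number {
  return Array.from(text).length;
}

// Hypothetical helper: estimated cost in US dollars for a character count.
function estimateCostUsd(characters: number): number {
  if (!Number.isInteger(characters) || characters < 0) {
    throw new RangeError("characters must be a non-negative integer");
  }
  return (characters * MICRO_USD_PER_CHARACTER) / 1000000;
}

const sample = "The quick brown fox jumps over the lazy dog.";
console.log(countCharacters(sample)); // 44
console.log(estimateCostUsd(1000));   // 0.015
console.log(estimateCostUsd(250000)); // 3.75
```

At this rate, the 44-character sample sentence costs well under a tenth of a cent, and a 250,000-character audiobook chapter set costs $3.75.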

Ultra-low Latency

Engineered for immediate conversation.

Traditional TTS pipelines add seconds of delay. VoiceKit uses a proprietary streaming architecture to deliver the first byte of audio in under 200ms, making natural, interruption-friendly AI conversations possible.

WebSocket Streaming: Persistent bi-directional connections for instant throughput.
Edge Caching: Deployed to 25+ regions globally to be close to your users.
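
The under-200ms figure refers to time to first byte, which you can check against any chunked audio stream. The sketch below is SDK-agnostic: it assumes only that the stream is exposed as an `AsyncIterable<Uint8Array>`, which is an assumption about how you might wrap a VoiceKit WebSocket and not a documented interface. `measureFirstByte` and `fakeStream` are hypothetical names used for illustration.

```typescript
interface StreamTiming {
  firstByteMs: number | null; // null if the stream produced no audio
  totalMs: number;
  totalBytes: number;
}

// Consume a chunked audio stream and record when the first non-empty chunk arrives.
async function measureFirstByte(
  stream: AsyncIterable<Uint8Array>
): Promise<StreamTiming> {
  const start = Date.now();
  let firstByteMs: number | null = null;
  let totalBytes = 0;
  for await (const chunk of stream) {
    if (firstByteMs === null && chunk.length > 0) {
      firstByteMs = Date.now() - start;
    }
    totalBytes += chunk.length;
  }
  return { firstByteMs, totalMs: Date.now() - start, totalBytes };
}

// Stand-in for a real audio stream: yields fixed-size chunks after a delay.
async function* fakeStream(
  chunks: number,
  chunkBytes: number,
  delayMs: number
): AsyncGenerator<Uint8Array> {
  for (let i = 0; i < chunks; i++) {
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    yield new Uint8Array(chunkBytes);
  }
}

measureFirstByte(fakeStream(3, 1024, 20)).then((timing) => {
  console.log(timing.totalBytes); // 3072
  console.log(timing.firstByteMs, timing.totalMs); // values vary per run
});
```

Measuring the first chunk separately from the total matters for conversational use: playback can begin as soon as the first chunk arrives, so perceived latency tracks `firstByteMs`, not `totalMs`.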

Request-to-response latency: Standard TTS 800ms+, VoiceKit 150ms.

import { VoiceClient } from '@voicekit/sdk';

// Supply your own API key, for example from an environment variable.
const client = new VoiceClient(apiKey);

// Generate speech with a voice cloned from a short audio sample
const response = await client.clone({
  audio_sample: './sample.mp3',
  text: 'Hello world, this is my cloned voice.',
  stability: 0.75
});

Developer Experience

Instant Voice Cloning & Customization

Create a digital twin of any voice with just 30 seconds of audio. Our SDK allows you to fine-tune pitch, stability, and similarity to create the perfect persona for your brand.

Python

Node.js

Loved by developers

Join thousands of engineers building the future of voice.

"The latency is undetectable. We switched from AWS Polly and our user engagement for the conversational agent went up by 40% overnight."

James Smith

CTO at TechFlow

"VoiceKit's documentation is best-in-class. I integrated the voice cloning feature into our game engine in less than two hours."

Ana Lee

Lead Dev at IndieGame

"The emotional range of the 'Resonance' voice is scary good. It actually understands context and adjusts intonation perfectly."

Mike K.

Product at Podcaster

1B+ Characters/Day
99.9% Uptime SLA
50ms Avg Latency
50+ Languages

Ready to give your app a voice?

Start with our free tier, no credit card required. Upgrade as you scale with flexible usage-based pricing.

By subscribing, you agree to our Terms of Service and Privacy Policy.