New V2 Model Available

Turn text into lifelike speech instantly.

Build apps that speak. Our API provides the most realistic AI voices for your content, applications, and workflows. Simple integration, powerful results.

TRUSTED BY TEAMS AT
ACME INC GLOBEX SOYENT UMBRELLA
audio_generation.ts
const audio = await voicekit.audio.speech.create({
  model: "tts-1-hd",
  voice: "clarion",
  input: "The quick brown fox jumps over the lazy dog."
});
audio.play();

Everything you need

Designed for developers and creators who need high-fidelity speech synthesis at scale.

View Documentation

Diverse Voice Portfolio

Choose from our carefully curated selection of voices including Clarion, Resonance, Vibe, Pulse, and Nova. Each voice is optimized for performance.

Clarion
Resonance
Vibe
+ 5 More

Real-time Latency

Engineered for real-time applications. Get audio back in milliseconds, suitable for conversational AI.


Multi-lingual Support

Generate spoken audio in over 50 languages with automatic language detection and native-like pronunciation.


HD Quality Audio

Crystal clear 48kHz audio output available in MP3, Opus, AAC, and FLAC formats for any use case.
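When serving generated audio from your own backend, each of the listed formats needs a matching Content-Type header. The sketch below maps format names to standard MIME types. The format identifiers ("mp3", "opus", "aac", "flac") and the helper names are assumptions for illustration, not confirmed VoiceKit parameters, and Opus is assumed to be delivered in an Ogg container.

```typescript
// Hypothetical format identifiers; the actual API values may differ.
type AudioFormat = "mp3" | "opus" | "aac" | "flac";

// Standard MIME types for each format (Opus assumed to be in an Ogg container).
const MIME_TYPES: Record<AudioFormat, string> = {
  mp3: "audio/mpeg",
  opus: "audio/ogg; codecs=opus",
  aac: "audio/aac",
  flac: "audio/flac",
};

// Type guard: true only for one of the four formats listed above.
function isAudioFormat(value: string): value is AudioFormat {
  return Object.prototype.hasOwnProperty.call(MIME_TYPES, value);
}

// Returns the Content-Type header value to use for a given format.
function contentTypeFor(format: AudioFormat): string {
  return MIME_TYPES[format];
}
```

A handler could call isAudioFormat on a user-supplied query parameter before passing it to contentTypeFor, so unsupported formats are rejected early.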


Usage Based Pricing

Pay only for the characters you synthesize. No hidden fees.

$0.015 / 1K characters
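To see what the listed rate means for a given workload, here is a minimal cost estimator. It assumes the $0.015 per 1K characters rate is applied linearly with no minimums or rounding, and it takes a character count as input because how characters are counted for billing is not specified here. The function name is hypothetical.

```typescript
// $0.015 per 1,000 characters equals 15 micro-dollars per character.
// Working in integer micro-dollars avoids floating-point drift on large counts.
const MICRO_USD_PER_CHAR = 15;

// Estimates the cost in US dollars for synthesizing `characterCount` characters.
// Assumption: linear pricing, no minimum charge, no rounding to billing units.
function estimateCostUsd(characterCount: number): number {
  if (
    !isFinite(characterCount) ||
    characterCount < 0 ||
    Math.floor(characterCount) !== characterCount
  ) {
    throw new RangeError("characterCount must be a non-negative integer");
  }
  return (characterCount * MICRO_USD_PER_CHAR) / 1_000_000;
}

// The 44-character sample sentence costs a small fraction of a cent.
const sample = "The quick brown fox jumps over the lazy dog.";
console.log(estimateCostUsd(sample.length)); // 0.00066
console.log(estimateCostUsd(1_000_000));     // 15
```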

Engineered for immediate conversation.

Traditional TTS pipelines add seconds of delay. VoiceKit uses a proprietary streaming architecture to deliver the first byte of audio in under 200ms, making natural, interruption-friendly AI conversations possible.

  • WebSocket Streaming: Persistent bi-directional connections for instant throughput.
  • Edge Caching: Deployed to 25+ regions globally to be close to your users.
[Figure: request-to-response latency, from packet_init to audio_buffer_full. Standard TTS: 800ms+. VoiceKit (VK): 150ms.]
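One way to check the first-byte figure in your own integration is to time the gap between sending a request and receiving the first non-empty audio chunk. The sketch below is independent of any SDK: the class name, its methods, and the default 200ms budget are illustrative assumptions, and the clock is injectable so the logic can be verified without a live connection.

```typescript
// Tracks time-to-first-audio for one streamed synthesis request.
// Hypothetical helper; not part of the VoiceKit SDK.
class FirstByteTimer {
  private now: () => number;
  private startedAt: number | null = null;
  private firstChunkAt: number | null = null;

  // `now` returns the current time in milliseconds (defaults to Date.now).
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Call immediately before sending the synthesis request.
  start(): void {
    this.startedAt = this.now();
    this.firstChunkAt = null;
  }

  // Call for every binary message received; only the first non-empty chunk is recorded.
  onChunk(chunk: Uint8Array): void {
    if (this.startedAt !== null && this.firstChunkAt === null && chunk.byteLength > 0) {
      this.firstChunkAt = this.now();
    }
  }

  // Milliseconds from start() to the first audio chunk, or null if none has arrived yet.
  latencyMs(): number | null {
    if (this.startedAt === null || this.firstChunkAt === null) return null;
    return this.firstChunkAt - this.startedAt;
  }

  // True when the first chunk arrived in strictly less than `budgetMs` milliseconds.
  withinBudget(budgetMs: number = 200): boolean {
    const ms = this.latencyMs();
    return ms !== null && ms < budgetMs;
  }
}
```

With a standard WebSocket, start() would typically be called just before send(), and onChunk() from the message handler after wrapping the received ArrayBuffer in a Uint8Array.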
import { VoiceClient } from '@voicekit/sdk';

const client = new VoiceClient(apiKey);

// Generate speech with cloning
const response = await client.clone({
  audio_sample: './sample.mp3',
  text: 'Hello world, this is my cloned voice.',
  stability: 0.75
});

Instant Voice Cloning & Customization

Create a digital twin of any voice with just 30 seconds of audio. Our SDK allows you to fine-tune pitch, stability, and similarity to create the perfect persona for your brand.

Python
Node.js
Go

Loved by developers

Join thousands of engineers building the future of voice.


"The latency is undetectable. We switched from AWS Polly and our user engagement for the conversational agent went up by 40% overnight."


"VoiceKit's documentation is best-in-class. I integrated the voice cloning feature into our game engine in less than two hours."


"The emotional range of the 'Resonance' voice is scary good. It actually understands context and adjusts intonation perfectly."

1B+ Characters/Day
99.9% Uptime SLA
50ms Avg Latency
50+ Languages

Ready to give your app a voice?

Start with our free tier, no credit card required. Upgrade as you scale with flexible usage-based pricing.

By subscribing, you agree to our Terms of Service and Privacy Policy.