xAI Launches Voice Cloning API: 80+ Voices, 28 Languages, 2-Min Training

xAI has released a production voice cloning API with more than 80 stock voices across 28 languages. Creating a custom voice requires only a 2-minute reference audio sample, and speech-to-speech is priced at $3/hr. A verification gate restricts cloning to the user's own voice, which prevents arbitrary speaker impersonation. All custom voices inherit xAI's full TTS feature surface, including speech tags, multilingual output, and both REST and WebSocket streaming.
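For budgeting, the quoted $3/hr speech-to-speech rate can be turned into a rough per-job estimate. The sketch below is illustrative only: it assumes the rate applies to audio duration and is prorated linearly, neither of which the announcement states, and the function and constant names are hypothetical.

```python
# Rate quoted in the announcement. How it is metered (audio duration vs.
# wall-clock time, rounding, minimum charges) is NOT stated; linear
# proration by audio duration is an assumption made for this sketch.
SPEECH_TO_SPEECH_USD_PER_HOUR = 3.00


def estimate_speech_to_speech_cost(audio_seconds: float) -> float:
    """Estimate cost in USD for a given amount of speech-to-speech audio."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    hours = audio_seconds / 3600
    return round(hours * SPEECH_TO_SPEECH_USD_PER_HOUR, 4)


if __name__ == "__main__":
    # A short clip, a half-hour session, and a ten-hour batch.
    for minutes in (2, 30, 600):
        cost = estimate_speech_to_speech_cost(minutes * 60)
        print(f"{minutes} min of audio: ${cost:.2f}")
```

Under these assumptions, per-minute cost is five cents, so the actual billing terms should be confirmed before relying on the estimate at volume.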

Why It Matters

Building a custom-voice agent shrinks from a multi-week R&D effort to a 2-minute setup. Brand impersonation risk now hinges on xAI's identity verification mechanism rather than on technical capability, which makes the verification gate the critical governance point for enterprises adopting the API.