Mistral has launched Voxtral, a family of open automatic speech recognition (ASR) models that aims to disrupt the ASR market by breaking the traditional trade-off between cost and quality.

Why it matters
Using ASR in production has often meant choosing between open-source models with high error rates and proprietary APIs that are more accurate but expensive. Mistral claims Voxtral bridges this gap by delivering state-of-the-art accuracy and native semantic understanding at less than half the price of leading APIs.

The big picture
OpenAI charges $0.006 per minute for its Whisper API, and GPT-4o-mini-transcribe costs around $0.003 per minute. Voxtral starts at $0.001 per minute and scales up to $0.004, while reportedly outperforming these competitors on key benchmarks, including multilingual transcription and short-form English.
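To make the per-minute prices concrete, here is a minimal sketch that converts them into a monthly bill. The prices are the ones quoted above; the workload of 10,000 hours of audio per month, the dictionary labels, and the `transcription_cost` helper are assumptions made purely for illustration, not figures or code from Mistral or OpenAI.

```python
from decimal import Decimal

# Per-minute prices in USD, as quoted in the text above.
# The labels are illustrative, not official product names.
PRICE_PER_MINUTE = {
    "Whisper (OpenAI API)": Decimal("0.006"),
    "GPT-4o-mini-transcribe": Decimal("0.003"),
    "Voxtral (low end)": Decimal("0.001"),
    "Voxtral (high end)": Decimal("0.004"),
}

# Hypothetical workload (an assumption): 10,000 hours of audio per month.
MONTHLY_MINUTES = 10_000 * 60


def transcription_cost(minutes: int, price_per_minute: Decimal) -> Decimal:
    """Return the cost in USD of transcribing `minutes` of audio."""
    return Decimal(minutes) * price_per_minute


if __name__ == "__main__":
    for name, price in PRICE_PER_MINUTE.items():
        cost = transcription_cost(MONTHLY_MINUTES, price)
        print(f"{name}: ${cost:,.2f} per month")
```

On these figures, the assumed workload costs $3,600 a month with Whisper, $1,800 with GPT-4o-mini-transcribe, and between $600 and $2,400 with Voxtral. Note that at its $0.004 upper price Voxtral costs more per minute than GPT-4o-mini-transcribe, so the savings depend on which Voxtral tier is used.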

Zoom in
Mistral claims Voxtral beats Whisper large-v3, GPT-4o mini Transcribe, Gemini 2.5 Flash, and ElevenLabs Scribe across all tested tasks. However, unlike with Whisper, hallucination rates, a key quality metric for ASR, have not been disclosed for Voxtral.

Yes, but
The ASR space is competitive, and real-world adoption depends on ease of integration, transparency of benchmark data, and ecosystem support. Voxtral's open approach and pricing could pressure incumbents, but the model must still prove itself in production environments.
