Mistral’s Voxtral challenges closed speech models with open performance and lower costs

Mistral has launched Voxtral, an open family of automatic speech recognition (ASR) models that aims to disrupt the ASR market by breaking the traditional trade-off between cost and quality.

Why it matters
Using ASR in production has often meant choosing between open-source models with high error rates and expensive proprietary ones with better accuracy. Voxtral claims to bridge this gap by delivering state-of-the-art accuracy and native semantic understanding at less than half the price of leading APIs.

The big picture
OpenAI’s Whisper API costs $0.006 per minute, and GPT-4o mini Transcribe costs around $0.003 per minute. Voxtral starts at $0.001 per minute and scales up to $0.004, while reportedly outperforming these competitors on key benchmarks, including multilingual transcription and short-form English.
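
To put those per-minute rates in context, here is a minimal Python sketch that turns them into a bill for a given volume of audio. The prices are the ones quoted above; the 10,000-minute workload, the tier labels, and the function names are illustrative assumptions, not figures or identifiers from Mistral or OpenAI.

```python
# Per-minute list prices in USD, as quoted in this article.
# The tier labels are informal names chosen for this sketch.
PRICE_PER_MINUTE = {
    "Whisper (OpenAI API)": 0.006,
    "GPT-4o mini Transcribe": 0.003,
    "Voxtral (entry tier)": 0.001,
    "Voxtral (top tier)": 0.004,
}


def transcription_cost(minutes: float, price_per_minute: float) -> float:
    """Return the cost in USD of transcribing `minutes` of audio."""
    return minutes * price_per_minute


def cost_table(minutes: float) -> dict[str, float]:
    """Return the cost per service for `minutes` of audio, rounded to cents."""
    return {
        name: round(transcription_cost(minutes, price), 2)
        for name, price in PRICE_PER_MINUTE.items()
    }


if __name__ == "__main__":
    # Hypothetical workload: 10,000 minutes of audio.
    for name, cost in cost_table(10_000).items():
        print(f"{name:<24} ${cost:,.2f}")
```

On these numbers, only Voxtral's entry tier undercuts both OpenAI options by more than half; the $0.004 top tier sits between GPT-4o mini Transcribe and Whisper.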

Zoom in
Mistral claims Voxtral beats Whisper large-v3, GPT-4o mini Transcribe, Gemini 2.5 Flash, and ElevenLabs Scribe across all tested tasks. However, unlike with Whisper, hallucination rates for Voxtral have not been disclosed, and they are a key quality metric for ASR.

Yes, but
The ASR space is competitive, and real-world adoption depends on ease of integration, transparency of benchmark data, and ecosystem support. Voxtral’s open approach and pricing could pressure incumbents, but the model must still prove itself in production environments.
