Speech to Text
Transcribe speech from a microphone, audio file, or video file using AI. No uploads, no account required.
What it does
Speech to Text transcribes audio from your microphone, audio file, or video file into plain text using an AI speech recognition model.
How to use
- Choose source: switch between Microphone and Audio file.
- Select language: pick the spoken language or leave Auto-detect to let the model identify it.
- Microphone mode: hold the button while speaking, release to transcribe.
- With punctuation: enable to add commas and periods automatically based on pause length. Punctuation follows pauses, not grammar rules, so results are approximate.
- File mode: drop an audio or video file (MP3, WAV, M4A, OGG, MP4, WebM…) or click to choose, then press Transcribe.
- Copy the result: use the Copy button or select text manually.
Each new recording or file replaces the previous transcript. Use Clear to empty it manually.
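To illustrate why pause-based punctuation is only approximate, here is a minimal sketch of the idea, not the tool's actual implementation. It assumes the recognizer returns words with start and end times. The `TimedWord` shape, the function name, and both thresholds are hypothetical values chosen for the example.

```typescript
// One recognized word with its start and end time in seconds (hypothetical shape).
interface TimedWord {
  text: string;
  start: number;
  end: number;
}

// Assumed thresholds, not taken from the tool: a pause at least this long
// becomes a comma or a period.
const COMMA_PAUSE_S = 0.35;
const PERIOD_PAUSE_S = 0.8;

// Join words, choosing the separator from the silence between neighbours.
function punctuateByPauses(words: TimedWord[]): string {
  let out = "";
  words.forEach((word, i) => {
    out += word.text;
    const next = words[i + 1];
    if (next === undefined) {
      out += "."; // close the final sentence
      return;
    }
    const pause = next.start - word.end;
    if (pause >= PERIOD_PAUSE_S) {
      out += ". ";
    } else if (pause >= COMMA_PAUSE_S) {
      out += ", ";
    } else {
      out += " ";
    }
  });
  return out;
}
```

Because only the gaps between words are inspected, a speaker who hesitates mid-phrase gets a comma there, and a speaker who runs two sentences together gets none.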
First-run download
On first use, the AI model is downloaded and stored in your browser cache. Later uses load it from the cache, so it is not downloaded again.
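The download-once behaviour described above can be sketched as a cache-first loader. This is an illustration under stated assumptions, not the tool's code: `ModelStore`, `Downloader`, `loadModel`, and `memoryStore` are hypothetical names, and the in-memory store stands in for whatever browser storage the tool actually uses.

```typescript
// Minimal async key-value store (hypothetical interface). In a browser,
// persistent storage such as the Cache API could play this role.
interface ModelStore {
  get(key: string): Promise<Uint8Array | undefined>;
  put(key: string, data: Uint8Array): Promise<void>;
}

// Function that fetches the model bytes over the network (hypothetical).
type Downloader = (url: string) => Promise<Uint8Array>;

// Return the cached model if present; otherwise download it once and store it.
async function loadModel(
  url: string,
  store: ModelStore,
  download: Downloader
): Promise<Uint8Array> {
  const cached = await store.get(url);
  if (cached !== undefined) {
    return cached; // later uses: no network request
  }
  const data = await download(url); // first use only
  await store.put(url, data);
  return data;
}

// In-memory store used here only so the sketch runs anywhere.
function memoryStore(): ModelStore {
  const entries = new Map<string, Uint8Array>();
  return {
    get: async (key) => entries.get(key),
    put: async (key, data) => {
      entries.set(key, data);
    },
  };
}
```

Calling `loadModel` twice with the same URL and store triggers the downloader only on the first call.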
Supported formats
Any format your browser can decode: MP3, WAV, FLAC, OGG, M4A, WebM, MP4, and more.
Accuracy
The model handles everyday speech well in most major languages. Accuracy may drop for heavily accented or technical speech. Best results come from clean recordings without heavy background noise.
Languages
Supports 90+ languages. When Auto-detect is selected, the model identifies the language from the first 30 seconds of audio.
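As a rough sketch of what "the first 30 seconds" means for decoded audio, the helper below returns only the leading window of a sample buffer. How the tool slices audio internally is an assumption here, and `detectionWindow` is a hypothetical name.

```typescript
// Window length stated in the text above.
const DETECTION_WINDOW_S = 30;

// Return a view of at most the first 30 seconds of mono PCM samples.
// subarray() does not copy, so the result shares memory with the input.
function detectionWindow(samples: Float32Array, sampleRate: number): Float32Array {
  const maxSamples = Math.floor(DETECTION_WINDOW_S * sampleRate);
  return samples.subarray(0, Math.min(samples.length, maxSamples));
}
```

If detection works this way, audio after the window has no influence on the detected language, so choosing the language manually is likely the safer option when a recording opens with silence, music, or a different language.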
Privacy
Audio never leaves your device. No account required.