WhatsApp added a built-in speech-to-text feature that transcribes individual voice messages directly inside the app: tap a voice note and the transcript appears right there, processed on-device, without sending audio anywhere. It is genuinely useful for reading a voice note when you cannot press play. In my experience using it across different chats, it is the right choice for a quick in-context read. What it does not do is batch-transcribe a whole chat, export the transcripts, or produce anything you can save outside WhatsApp. For that, you need a dedicated export tool like ChatToPDF's Premium+Voice tier at $49 per chat.

How WhatsApp's built-in speech to text works

WhatsApp's transcription feature processes voice messages on your device — the audio does not leave the phone when you request a transcript. This is a deliberate design choice that keeps the feature privacy-preserving and available offline (as long as the relevant language model has been downloaded to your device).
The steps to use it vary slightly across WhatsApp versions and operating systems, and the exact menu wording has changed over time — check your current WhatsApp for the precise label — but the general workflow runs like this:
1. Open a chat that contains a voice message. Navigate to any WhatsApp conversation that has a voice note bubble (the dark waveform pill with a play button). One-on-one chats and group chats both work the same way.
2. Tap or long-press the voice message to reveal the transcription option. On most recent Android and iPhone versions, a transcription option appears when you tap or long-press the voice message bubble. Depending on your version it may be labelled Transcribe, Show transcript, or a similar phrase. If you do not see it, make sure WhatsApp is updated to the latest version; the feature rolled out gradually from late 2024.
3. Wait for on-device processing. WhatsApp processes the audio using a language model stored on your device. For a short voice note of 30 seconds or less, the transcript usually appears within a couple of seconds. Longer voice notes take proportionally longer. If your device does not yet have the language model downloaded, WhatsApp may prompt you to download it before the first transcript.
4. Read the transcript in-line beneath the voice note. The transcript appears directly below the voice note bubble in the chat. You can read it there and copy the text, but the transcript disappears when you navigate away: it is not saved separately in WhatsApp, and it does not appear in the chat export.

WhatsApp's transcription support has expanded over time; check your app for current language availability. The feature was initially English-first and has been adding languages in subsequent releases. For the most up-to-date details on what is supported, the WhatsApp FAQ is the authoritative source.
The limits of WhatsApp's native transcription


WhatsApp's built-in transcription is designed for a specific use case: quickly reading one voice note without playing audio. It does that job well. Outside that use case, there are genuine constraints worth understanding before you rely on it for anything beyond casual convenience.
Per-message only — no batching. The feature transcribes one voice note at a time. A conversation with twenty voice notes requires twenty separate taps, each producing a transcript you have to read and manually copy before it disappears. There is no "transcribe all voice notes in this chat" option.
Transcripts stay inside WhatsApp. The text appears as an overlay in the chat but is not saved as a separate record anywhere. It does not appear in the WhatsApp chat export. If you export the chat as a ZIP and upload it elsewhere, the transcripts are not in the text file — only the placeholder markers for the audio files are. The transcript exists on your screen momentarily; it is not archived.
No export to text, PDF, or any other format. You can manually copy a single transcript from the overlay, but there is no export-to-email, export-to-notes, or export-to-document option built in. Saving multiple transcripts means manually copying each one into another app, which is tedious for anything more than a handful of messages.
Language availability varies. WhatsApp's transcription supports a limited set of languages, and what you can use depends on which language model your device has downloaded. Support has expanded since the feature launched, so check the WhatsApp documentation or in-app settings for the current list; languages outside it will not offer a transcription option.
Not designed for records, evidence, or archives. Because transcripts are not stored, cannot be exported, and appear only as a transient overlay, the feature is not useful if you need a permanent record of what was said in a voice note — for a legal matter, a business archive, an HR file, or a personal keepsake. The transcript is gone once you move away from the message.
These are not criticisms: they reflect the design intent of the feature, which is real-time in-chat convenience rather than record-keeping. They do explain why a different approach is needed when the goal is a document you can keep.
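
To see what the export limitation above means in practice, here is a minimal Python sketch that scans exported chat text for voice-note placeholders. The two line shapes it recognises (`<attached: NAME>` and `NAME (file attached)`), the audio extensions, and the sample filenames are assumptions for illustration; real exports differ by platform, locale, and app version, so treat the pattern as a starting point rather than a description of WhatsApp's format.

```python
import re

# Assumed placeholder shapes; real exports vary by platform, locale, and app version.
ATTACHMENT = re.compile(
    r"(?:<attached:\s*(?P<ios>[^>]+)>|(?P<android>\S+)\s+\(file attached\))"
)
# Assumed audio extensions, not an official list.
AUDIO_EXTENSIONS = (".opus", ".ogg", ".m4a", ".mp3", ".aac")


def find_voice_note_placeholders(chat_text: str) -> list[str]:
    """Return audio filenames referenced by attachment placeholders."""
    found = []
    for line in chat_text.splitlines():
        match = ATTACHMENT.search(line)
        if not match:
            continue
        name = (match.group("ios") or match.group("android")).strip()
        if name.lower().endswith(AUDIO_EXTENSIONS):
            found.append(name)
    return found


# Invented sample lines standing in for an exported chat file.
sample = "\n".join([
    "[12/01/2024, 10:15:32] Alice: Running late, see voice note",
    "[12/01/2024, 10:15:40] Alice: <attached: 00000012-AUDIO-2024-01-12-10-15-40.opus>",
    "12/01/2024, 10:16 - Bob: PTT-20240112-WA0001.opus (file attached)",
    "12/01/2024, 10:17 - Bob: IMG-20240112-WA0002.jpg (file attached)",
])
voice_notes = find_voice_note_placeholders(sample)
print(voice_notes)
```

Every filename it returns is a voice note whose spoken words are absent from the exported text, which gives a quick count of how much content a text-only read of the export leaves out.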
When you need a dedicated speech-to-text tool


The gap that a dedicated tool fills is precisely what WhatsApp's built-in transcription does not cover: the whole chat, as a document, exported with sender attribution and timestamps, in a format you can save, search, and share outside WhatsApp.
| Feature | WhatsApp built-in | ChatToPDF Premium+Voice |
|---|---|---|
| Scope | One voice message at a time | Entire chat — all voice notes batched |
| Output format | Stays inside WhatsApp — no export | PDF (and optionally XLSX/CSV) |
| Language support | Varies; check your app version | 17 high-accuracy languages; 30+ auto-detected |
| Sender name + timestamp | Visible in chat but not in transcript | Inline with every transcript entry |
| Searchable archive | Not archived — transient overlay | Full-text searchable PDF |
| Best for | Quickly reading one voice note without playing audio | Legal records, business archives, accessibility, long chats |
| Cost | Free (built into WhatsApp) | $49 per chat — one-time, no subscription |
To use ChatToPDF for batch transcription, you export the WhatsApp chat with the "Including Media" option — this puts the voice note audio files inside the ZIP — then upload the ZIP to chattopdf.app and pick the Premium+Voice tier. Every voice note in the conversation is transcribed inline in the resulting PDF, at its correct position in the chat, with the sender name and timestamp. The sibling guide WhatsApp voice to text walks through that workflow in full, step by step.
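
Before uploading, it can be worth confirming that the export really was made with media included. The Python sketch below opens a ZIP and counts the audio files inside it. The extension list and the demo filenames are assumptions, not a specification of what WhatsApp writes, and the helper name `summarize_export` is hypothetical.

```python
import io
import zipfile

# Assumed audio extensions, not an official list.
AUDIO_EXTENSIONS = (".opus", ".ogg", ".m4a", ".mp3", ".aac")


def summarize_export(zip_source) -> dict:
    """Count chat text files and audio files inside an exported ZIP."""
    with zipfile.ZipFile(zip_source) as archive:
        names = [info.filename for info in archive.infolist() if not info.is_dir()]
    audio = [n for n in names if n.lower().endswith(AUDIO_EXTENSIONS)]
    text = [n for n in names if n.lower().endswith(".txt")]
    return {"text_files": text, "audio_files": audio, "has_voice_notes": bool(audio)}


# Build a tiny in-memory ZIP standing in for a real export (hypothetical filenames).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("_chat.txt", "12/01/2024, 10:16 - Bob: PTT-20240112-WA0001.opus (file attached)\n")
    demo.writestr("PTT-20240112-WA0001.opus", b"\x00")
buffer.seek(0)

summary = summarize_export(buffer)
print(summary)
```

For a real export, pass the path to the ZIP file instead of the in-memory buffer. If `has_voice_notes` comes back false, the chat was probably exported without media and there is no audio for any tool to transcribe.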
The cases where the dedicated approach makes more sense than the built-in feature:
Legal and compliance situations. If you need a record of what was said in a voice note — for a dispute, an HR matter, a court filing — a transient in-app overlay is not a record. A PDF with inline transcripts, sender attribution, and timestamps is. The WhatsApp to PDF guide covers formatting options for formal use cases including legal styling.
Business chats with many voice notes. Sales calls, client updates, project approvals — if a substantive WhatsApp conversation contains a dozen voice notes from different participants, transcribing them one at a time and manually compiling the results is slow and error-prone. A single batch conversion produces one document with everything in context.
Accessibility. Voice notes are inaccessible to people who are deaf or hard of hearing. A transcript document is accessible; a transient in-app overlay helps in the moment but does not solve the archive problem.
Long-term preservation. Phones change, WhatsApp accounts close, device storage gets cleared. A PDF of a conversation — voice notes and all — outlasts the app.
Accuracy

For ChatToPDF's voice transcription, the engine is Deepgram Nova-3. In my own testing, clear recordings in supported languages — recorded indoors, phone close to the speaker — produce transcripts that are clean and useful with very few errors. Background noise (a busy café, wind, a moving car) degrades accuracy noticeably. The transcribe WhatsApp audio pillar covers the accuracy picture in detail, including what happens at different noise levels.
For WhatsApp's own built-in feature, I am not in a position to benchmark it precisely — the underlying model is not publicly documented, and accuracy will vary by language, device, and audio conditions. The honest framing is: for a short, clear voice note in a well-supported language, both approaches generally produce readable output. For long or noisy recordings, or for languages on the edge of each system's support, results vary and testing with your own audio before committing to an approach is the sensible call.
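
One simple way to run that kind of test is to type out what a voice note actually says, then compare it with each tool's output using word error rate (WER): the number of word substitutions, deletions, and insertions divided by the number of words in your reference. The Python sketch below is a generic WER calculation, not something either product exposes, and the sample sentences are invented.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # Classic dynamic-programming edit distance, computed over words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Invented example: what was said versus what a tool produced.
reference = "please send the signed contract by friday"
hypothesis = "please send the sign contract friday"
rate = word_error_rate(reference, hypothesis)
print(f"{rate:.2%}")
```

Here the hypothesis has one wrong word and one missing word against seven reference words, so the rate is 2/7, about 29%. Lower is better, and a handful of your own recordings will tell you more than any general claim.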
Key takeaways
- WhatsApp has a built-in speech-to-text feature that transcribes individual voice messages on-device. It is free and works without sending audio to external servers.
- The built-in feature transcribes one message at a time. Transcripts appear as a temporary overlay in the app — they are not saved and do not appear in the WhatsApp export.
- WhatsApp's native transcription is the right tool for quickly reading a single voice note. It is not suited for batch transcription, exporting, or archiving.
- For the whole chat — all voice notes transcribed inline, with sender names and timestamps, in a searchable PDF — ChatToPDF's Premium+Voice tier ($49 per chat) is the dedicated tool for that job.
- WhatsApp's transcription support has expanded over time; check your app version for current language availability.
- For the step-by-step ChatToPDF workflow, see WhatsApp voice to text. For the full technical depth on accuracy and the Deepgram pipeline, see the transcribe WhatsApp audio pillar.
- The two approaches are not competing alternatives for the same job — WhatsApp's built-in is for convenience in the moment; a dedicated tool is for documentation and archiving.
FAQ
Does WhatsApp transcribe voice messages itself?
Yes — WhatsApp added an on-device speech-to-text feature that rolled out from late 2024. When you tap or long-press a voice message bubble, an option to transcribe it appears (the exact label varies by app version and OS). The transcript is processed locally on your device and appears as an overlay beneath the voice note. The feature is free and does not require sending audio to external servers. It covers a limited set of languages, and availability depends on which language model has been downloaded to your device — check your WhatsApp settings or the in-app prompt for the current list.
Why can't I export WhatsApp's transcription?
WhatsApp's built-in transcription is designed as a convenience feature for reading a voice note in the moment — not as a record-keeping or documentation tool. The transcript text appears temporarily in the app but is not saved to the chat log, does not appear in the WhatsApp export ZIP, and cannot be sent or saved in bulk. To copy a single transcript you can manually select and copy the text while it is displayed, but there is no batch-export or save-to-file option. If you need transcripts as a document — for legal, business, or archival purposes — you need a different approach: export the chat with Including Media and use a dedicated tool to batch-process the voice notes.
What is the difference between speech to text and voice to text for WhatsApp?
In everyday use the terms are interchangeable — both refer to converting spoken audio into written text. "Speech to text" is the generic technical term for the process; "voice to text" is how the same capability is often described in consumer contexts, particularly in the WhatsApp ecosystem where "voice notes" is the product term for the audio messages in a chat. Whether you search "WhatsApp speech to text" or "WhatsApp voice to text," you are looking for the same thing: a way to read the content of a voice note as written text rather than listening to the audio. This page covers WhatsApp's built-in feature and its limitations; the WhatsApp voice to text guide covers the ChatToPDF workflow for converting an entire chat's voice notes to a transcript PDF.
Which languages does WhatsApp's built-in transcription support?
WhatsApp's transcription support has expanded over time, and the current language list is best checked in your app — language availability depends on the version of WhatsApp you are running, your device's operating system, and which language model has been downloaded. The feature launched with a focus on English and has added languages in subsequent releases. If transcription is not available for your language, the option may not appear on the voice note bubble, or WhatsApp may not yet have released it for your region or device type.
Does WhatsApp speech to text work offline?
WhatsApp's on-device transcription is designed to work without an internet connection once the relevant language model has been downloaded to your device. The initial download of the language model requires connectivity. After that, individual transcriptions are processed locally — so if you are in an area with poor signal and you have already downloaded the model, the feature should still work. Note that this applies to the built-in WhatsApp feature specifically; any third-party transcription tool that processes audio via a cloud API (including ChatToPDF) requires an internet connection, because the audio is sent to a remote transcription engine.

I'm Paul. I built ChatToPDF after watching a friend try to print a 4-year-old WhatsApp chat across forty-something one-page PDFs. I write here about exporting WhatsApp chats, converting them to PDF, transcribing voice notes, and the messy edge cases nobody else writes about (40,000-message export limits, broken emojis, RTL Arabic, Samsung Secure Folder).