Free AI Audio Transcription – Convert Speech to Text Offline with Whisper

Transcribe speech to text locally with Whisper AI. No uploads, no minute limits, no subscription. Free for Windows, Mac & Linux.

Transcribe audio in 4 steps:

  1. Install the free OpenVINO AI plugin for Audacity
  2. Import your recording and select the track or region
  3. Go to Analyze → OpenVINO Whisper Transcription
  4. Pick a Whisper model and language, then click Apply

What Is AI Transcription?

AI transcription turns spoken audio into time-stamped written text. Audacity's OpenVINO Whisper Transcription effect runs OpenAI's Whisper speech-recognition model entirely on your own computer, with no uploads, no minute limits, and no subscription. Feed it an interview, podcast, lecture, or voice memo and it writes the transcript into a label track you can edit, search, and export as SRT subtitles, VTT, or plain text. Because it runs offline, your audio never leaves your machine, which makes it a privacy-friendly alternative to cloud transcription services.

How to Transcribe Audio in Audacity

Step 1: Install the OpenVINO AI Plugin

Download and install the free OpenVINO AI plugin for Audacity from the official Audacity plugins page. The plugin adds a set of AI-powered effects and analysis tools, including Whisper Transcription.

Step 2: Import and Select Your Recording

Open your audio file with File → Open. Click and drag to select the region you want to transcribe, or press Ctrl+A / Cmd+A to select the whole track.

Step 3: Open Analyze → OpenVINO Whisper Transcription

Go to Analyze → OpenVINO Whisper Transcription. The dialog opens with model selection, language, and inference device options.

Step 4: Choose Model, Language and Apply

Select a Whisper model size (base, small, medium, or large), set the source language or leave it on auto-detect, then click Apply. The transcription is written to a new label track below your audio, with each phrase as a time-stamped label.
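Once the transcript is in a label track, it can also be processed outside Audacity. The sketch below assumes the labels were exported as plain text in Audacity's tab-separated label format (start time, end time, text, with times in seconds) and shows one way to find the timestamps where a phrase was spoken. The names `Label`, `parse_labels`, and `find_phrase` are hypothetical helpers for illustration, not part of Audacity or the plugin.

```python
from typing import List, NamedTuple


class Label(NamedTuple):
    start: float  # seconds
    end: float    # seconds
    text: str


def parse_labels(raw: str) -> List[Label]:
    """Parse tab-separated label lines of the form start<TAB>end<TAB>text."""
    labels = []
    for line in raw.splitlines():
        # Skip blank lines and backslash-prefixed lines (frequency-range data).
        if not line.strip() or line.startswith("\\"):
            continue
        parts = line.split("\t", 2)
        if len(parts) < 2:
            continue  # not a label line
        text = parts[2].strip() if len(parts) > 2 else ""
        labels.append(Label(float(parts[0]), float(parts[1]), text))
    return labels


def find_phrase(labels: List[Label], query: str) -> List[Label]:
    """Return labels whose text contains the query, ignoring case."""
    needle = query.lower()
    return [label for label in labels if needle in label.text.lower()]


# Made-up sample data standing in for an exported label file.
sample = (
    "0.000000\t2.500000\tWelcome to the show.\n"
    "2.500000\t6.120000\tToday we talk about offline transcription.\n"
)
labels = parse_labels(sample)
hits = find_phrase(labels, "offline")
for hit in hits:
    print(f"{hit.start:.2f}s to {hit.end:.2f}s: {hit.text}")
```

For a real export, read the file with `open(path, encoding="utf-8").read()` and pass the result to `parse_labels`.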

Transcription Settings Explained

Whisper Model (base / small / medium / large)

Model size trades speed for accuracy: larger models are more accurate but slower. Base is fastest and works well for clean English. Small and medium handle accents, noisy audio, and most non-English languages. Large (v1/v2/v3) is the most accurate. A special small.en-tdrz model adds experimental speaker diarization.

Mode (Transcribe vs Translate)

Transcribe keeps the spoken language in the output. Translate converts speech in any of Whisper's 99 supported source languages into English text, which is useful for subtitling foreign-language clips without a separate translation step.

Source Language

Defaults to auto-detect, which samples the first seconds of audio to guess the language. Pick a language manually for short clips, code-switching, or when auto-detect lands on the wrong one. Whisper supports 99 languages.

Inference Device (CPU / GPU / NPU)

Picks which chip runs the model. CPU works everywhere. GPU is faster on discrete or integrated graphics. NPU uses the neural accelerator on modern Intel Core Ultra laptops. Click Device Details to see what Audacity detected on your system.

Advanced Options (Initial Prompt, Max Segment Length, Beam Size)

Use Initial Prompt to steer the spelling of names, jargon, or acronyms. Max Segment Length controls how long each label can be; shorter values help with word-level editing and subtitle formatting. A larger Beam Size can improve accuracy at the cost of processing time.

Whisper Model Reference

Model         | Best For                                | Accuracy                | Typical Speed (CPU)
base          | Clean English, quick drafts             | Good                    | ~0.3× audio length
small         | Accents, noisier audio, many languages  | Better                  | ~0.7× audio length
small.en-tdrz | English + speaker diarization           | Better + speaker change | ~0.7× audio length
medium        | Reliable multi-language transcription   | High                    | ~1.5× audio length
large-v3      | Final subtitles, tough recordings       | Highest                 | ~3–5× audio length
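To make the speed column concrete, here is a small worked example that multiplies a recording's length by the approximate CPU factors from the table above. The factors are rough figures, so treat the results as ballpark estimates. Actual times depend on your hardware and the inference device you select. The function name `estimate_minutes` is hypothetical.

```python
# Approximate CPU speed factors from the table (processing time / audio length).
# large-v3 is listed as a range, so every entry is stored as (low, high).
SPEED_FACTORS = {
    "base": (0.3, 0.3),
    "small": (0.7, 0.7),
    "small.en-tdrz": (0.7, 0.7),
    "medium": (1.5, 1.5),
    "large-v3": (3.0, 5.0),
}


def estimate_minutes(audio_minutes: float, model: str) -> tuple:
    """Return a (low, high) estimate of CPU processing time in minutes."""
    low, high = SPEED_FACTORS[model]
    return audio_minutes * low, audio_minutes * high


# A one-hour interview on the medium model: roughly 90 minutes on CPU.
medium_low, medium_high = estimate_minutes(60, "medium")

# The same interview on large-v3: roughly 3 to 5 hours on CPU.
large_low, large_high = estimate_minutes(60, "large-v3")
print(f"medium: ~{medium_low:.0f} min, large-v3: ~{large_low:.0f} to {large_high:.0f} min")
```

An estimate like this can help decide whether to draft with a smaller model first and rerun only the final cut on large-v3.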

Common Use Cases

Tips for Best Results

Frequently Asked Questions

Is AI transcription in Audacity really free?
Yes. Audacity is free and open source, and the OpenVINO AI plugin that powers Whisper Transcription is also free. There are no minute caps, no subscriptions, and no watermarks. Everything runs locally on your own computer.

Does Audacity audio transcription work offline?
Yes. Audacity runs Whisper entirely on your local machine. Your audio and transcript never leave your computer — ideal for interviews, legal recordings, and anything you'd rather not upload to the cloud.

How accurate is Whisper transcription in Audacity?
Very accurate for clean English on the medium and large models — comparable to paid cloud tools. Noisy audio, strong accents, or overlapping speakers reduce accuracy; denoise first and use a larger model for best results.

How many languages does Audacity transcription support?
Whisper supports 99 languages for transcription. In Translate mode, audio in any of those languages can be converted directly into English text.

How do I export a transcript from Audacity as SRT or text?
Go to File → Export Other → Export Labels and pick SRT, WebVTT, or plain text. SRT and VTT are ready to drop into YouTube, Premiere, or DaVinci Resolve as subtitles.
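For readers curious about what the SRT export contains, the sketch below builds SRT text from a list of (start, end, text) entries with times in seconds. It illustrates the SubRip layout (a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time line, then the caption text). It is not Audacity's own exporter, and the function names are hypothetical.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp form used in SRT files."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(labels: Iterable[Tuple[float, float, str]]) -> str:
    """Turn (start, end, text) entries into SRT cues separated by blank lines."""
    cues = []
    for index, (start, end, text) in enumerate(labels, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(cues)


# Made-up example entries standing in for transcription labels.
example = [
    (0.0, 2.5, "Welcome to the show."),
    (2.5, 6.12, "Today we talk about offline transcription."),
]
print(to_srt(example))
```

A converter along these lines is mainly useful if you post-process labels in a script, for example to merge or retime cues before creating subtitles.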

Can Audacity identify different speakers in a recording?
Yes — experimentally. Pick the small.en-tdrz model to enable speaker diarization. Audacity creates two label tracks and alternates labels when it detects a speaker change.
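Because the diarized output is split across two label tracks, reading it as a single conversation means interleaving the two tracks by start time. The sketch below shows one way to do that, assuming each track has already been loaded as a list of (start, end, text) entries. The track tags and the `merge_turns` helper are hypothetical, and a track only corresponds to one speaker when the recording has two speakers and every change is detected.

```python
from typing import List, Tuple

Entry = Tuple[float, float, str]  # (start seconds, end seconds, text)


def merge_turns(track_1: List[Entry], track_2: List[Entry]) -> List[str]:
    """Interleave two label tracks by start time, tagging each line with its track."""
    tagged = [(start, "Track 1", text) for start, _end, text in track_1]
    tagged += [(start, "Track 2", text) for start, _end, text in track_2]
    tagged.sort(key=lambda item: item[0])  # order turns chronologically
    return [f"[{track}] {text}" for _start, track, text in tagged]


# Made-up example: labels alternate between tracks at each speaker change.
track_1 = [
    (0.0, 2.0, "Hi, thanks for joining."),
    (5.0, 8.0, "Let's start with your background."),
]
track_2 = [(2.0, 5.0, "Happy to be here.")]

merged = merge_turns(track_1, track_2)
for line in merged:
    print(line)
```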

Download Audacity Free

Ready to transcribe your audio offline? Download Audacity for free on Windows, macOS, or Linux.

Download Audacity 3.7.7
