Software tool for transcribing from WAV/Mp3 files


Bhanu Neti
December 9th, 2009, 03:45 PM
I am looking for a software tool that can take a WAV or MP3 file and generate the dialogue text. The audio files will mostly have one voice talking, with some background sound/noise.

I looked around but could not locate any. Appreciate any pointers.

Marshall Staton
December 9th, 2009, 07:31 PM
Dragon Dictate may be able to do it. If not, I would check out some medical transcription software.

Chris Barcellos
December 9th, 2009, 09:05 PM
Premiere Pro in CS4 ??

Streaming Media - Creating Automatic Transcripts in Flash Video Using Adobe CS4 (http://www.streamingmedia.com/article.asp?id=10923)

David B. Sanders
December 18th, 2009, 04:18 PM
I have Dragon NaturallySpeaking 10 from nuance.com. I have only used it to turn my own speech into text, but it is very quirky. You have to speak slowly and very clearly, mostly in a monotone. It takes time to train the program to your voice before you can start using it. In your case, you wouldn’t be able to do that with your recordings because the training is done by YOU reading predetermined text that the program displays for you.

I bet you could find someone who is fast at typing and you could pay them to type out the dialog from your recordings. Maybe a college student.

Battle Vaughan
December 18th, 2009, 08:20 PM
Quote (Chris Barcellos): "Premiere Pro in CS4 ?? Streaming Media - Creating Automatic Transcripts in Flash Video Using Adobe CS4 (http://www.streamingmedia.com/article.asp?id=10923)"

In my experience, this feature(?) in PPro is only good for yucks. If you want a hoot, just talk into it in Soundbooth or Premiere and see what it reads back. I tried it with my own non-professional voice and recordings from pro tracks and even clear music lyrics and the results were hysterical but not particularly useful. /Battle Vaughan

Warren Kawamoto
October 2nd, 2011, 11:50 AM
Fast forward to 2011... Have any good programs emerged? Dragon NaturallySpeaking 11?

Paul Elertson
June 13th, 2013, 08:08 AM
Fast forward to 2013... Anything new?

Garrett Low
June 13th, 2013, 09:35 AM
Some of the best speech recognition I've experienced is actually on my Motorola Android phone. If someone has something like that available in a standalone package, it would probably be about 90% accurate when speaking in a natural way. On my phone it can also take a while to transcribe.

Mark Ahrens
November 7th, 2013, 04:24 PM
Google's speech recognition is quite good - I wonder why they don't put out a product?

Richard Crowley
November 7th, 2013, 04:59 PM
Quote (Mark Ahrens): "Google's speech recognition is quite good - I wonder why they don't put out a product?"
Google (and Apple Siri, et al.) voice recognition is not a smartphone "app". It runs on huge banks of computers in gigantic server farms back at the other end of the internet. Dunno how you could "productize" that, unless you sold a "service" where you send them your audio file and they return a text transcription.

Speaking of which, there are many transcription services available online that do this with live humans at competitive prices. These services are popular with producers of "reality" or even newsmagazine shows (60 Minutes, et al.). Many (most?) are in Asia, where you can send them files at the end of the day's shooting and have the transcripts in your email the next morning.

There is also a web-based "app" where you can play a sound file in "chunks" while you type the transcription. I have used it and it seemed pretty useful (and free!): Transcribe - online transcription software with free trial (http://transcribe.wreally.com)

Jim Michael
November 7th, 2013, 05:35 PM
Medical transcription software works in a specialized domain where there is a more predefined grammar, so speech recognition in that area would be expected to be better than more free-form recognition. There are other types of recognition that work off a predefined grammar, such as "enter or say your phone number", which would expect a string of digits of a certain length. The holy grail is a universal voice recognition system that understands every dialect without training. The technology has gotten quite good, but it's still not there yet. The software would have to analyze what was heard and interpret it in a grammatical context in order to avoid confusion with homonyms and the like. What you'll find is that you'll spend nearly as much time fixing the errors as you would typing it, so a transcriptionist is still your best value.
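
To make the predefined-grammar point concrete, here is a toy sketch in Python. It is not how any real recognizer works internally (real systems usually apply the grammar while decoding the audio, not afterwards), and the function name and word tables are made up for illustration. It only shows why knowing that the answer must be a fixed-length string of digits lets software resolve sound-alike words that would trip up free-form recognition.

```python
# Hypothetical lookup tables, invented for this sketch (not from any product).
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}

# Sound-alike words a free-form recognizer might output instead of a digit word.
SOUND_ALIKES = {"to": "two", "too": "two", "for": "four", "won": "one", "ate": "eight"}


def constrain_to_digits(words, expected_length):
    """Map recognized words onto a digit string.

    Returns None when the words do not fit the "string of digits of a
    certain length" grammar, so the caller can ask the speaker to repeat.
    """
    digits = []
    for word in words:
        word = word.lower()
        # Inside a digits-only grammar, a sound-alike can only mean the digit.
        word = SOUND_ALIKES.get(word, word)
        if word not in DIGIT_WORDS:
            return None  # word is outside the grammar
        digits.append(DIGIT_WORDS[word])
    if len(digits) != expected_length:
        return None  # wrong number of digits
    return "".join(digits)


if __name__ == "__main__":
    # A free-form transcript like "for to won ate" only makes sense as digits
    # once the grammar says that digits are the only possibility.
    print(constrain_to_digits("for to won ate".split(), 4))
```

Free-form dictation has no such constraint to lean on, which is one reason it still needs a human to clean up the result.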

Marco Leavitt
November 8th, 2013, 08:39 AM
I've done a fair amount of research into this and I'm pretty sure the product you want still doesn't exist, although Google has obviously developed the technology.

Greg Miller
November 8th, 2013, 09:21 AM
Quote (Mark Ahrens): "Google's speech recognition is quite good - I wonder why they don't put out a product?"

Not if I were to judge based on the transcripts that Google Voice generates from my incoming phone calls. Some of the transcripts are literally obscene. I'm surprised they even had those words in their vocabulary. ;-) Always good for a laugh, but I have to play back the voice file to find out what the caller really said.