Mechanical Turk for transcribing audio recordings and captioning
June 13, 2013
Audio

I sometimes record lectures or meetings for later reference, and have found Mechanical Turk to be an efficient and affordable way to get them transcribed. Getting started with MTurk can be quite confusing, though, and the blog posts available on the subject don't really help you navigate the current MTurk interface. Here is the step-by-step process I use to have a lecture transcribed:

Cut your audio into 5 or 10 minute clips

Download Audacity, a free audio editing tool that works on both Windows and Mac.

Import your voice-recording file. (File > Import > Audio)

If you see a warning about FFmpeg being missing, just ignore it.


Posted by ellen at June 13, 2013 11:48 PM

Click on the waveform timeline and drag to the right or left to select a region about 10 minutes long.

Select Add Label At Selection from the Tracks menu.

Select slightly overlapping regions of similar duration and label each one until you reach the end of the recording. 

If you make a mistake and need to delete a label, select Edit Labels...

Edit or remove labels in this dialog.

Once your labels are set correctly, select Export Multiple from the File menu.

Select a directory to save to, and choose naming options for the files.
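
If you would rather plan the slightly overlapping regions ahead of time than drag them out by hand, here is a minimal Python sketch. It is not part of the workflow above: it computes clip boundaries for a recording of known length and prints them as tab-separated start, end, and label lines, which is the plain-text layout Audacity uses for label files (File > Import > Labels). The function names, the 10-minute clip length, and the 15-second overlap are assumptions chosen for illustration.

```python
def plan_clips(total_seconds, clip_seconds=600, overlap_seconds=15):
    """Return (start, end) pairs covering the recording with small overlaps."""
    if overlap_seconds >= clip_seconds:
        raise ValueError("overlap must be shorter than the clip length")
    clips = []
    start = 0
    while start < total_seconds:
        end = min(start + clip_seconds, total_seconds)
        clips.append((start, end))
        if end == total_seconds:
            break
        # Step back by the overlap so no words are lost at a cut point.
        start = end - overlap_seconds
    return clips


def to_label_lines(clips):
    """Format clips as 'start<TAB>end<TAB>label' lines, times in seconds."""
    total = len(clips)
    return [
        f"{start:.6f}\t{end:.6f}\tpart {i} of {total}"
        for i, (start, end) in enumerate(clips, start=1)
    ]


if __name__ == "__main__":
    # Example: a hypothetical 35-minute recording (2100 seconds).
    for line in to_label_lines(plan_clips(2100)):
        print(line)
```

Save the printed lines to a text file and import it as labels; Export Multiple can then split on those labels just as it does with hand-made ones.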

Set up your HITs on Mechanical Turk

Once the files are exported, upload them to some publicly accessible location such as an FTP site, a public folder on box.com, or a public folder on Google Docs.

Go to Amazon's Requester page for Mechanical Turk, at https://requester.mturk.com/

Create an account on MTurk and put some money in it for paying the workers.  

I create my HITs (Human Intelligence Tasks) individually because I usually only have a few of them, but if you have a lot of audio to transcribe, you will probably want to create a batch. I would test your batch-making process first, using a small set of HITs in a CSV file. Don't risk going live with a lot of incorrect HITs.
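
As a rough illustration of a small test batch, here is a Python sketch that writes a three-row CSV file. MTurk batch templates fill `${name}` placeholders from CSV columns with the same name; the column names `audio_url` and `segment`, the output file name, and the example URLs are all made up for this sketch, so adjust them to match your own template.

```python
import csv


def write_batch_csv(path, base_url, file_names):
    """Write one CSV row per audio clip; each row becomes one HIT."""
    total = len(file_names)
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["audio_url", "segment"])
        writer.writeheader()
        for index, name in enumerate(file_names, start=1):
            writer.writerow({
                "audio_url": f"{base_url.rstrip('/')}/{name}",
                "segment": f"{index} of {total}",
            })
    return total


if __name__ == "__main__":
    # Hypothetical location and file names; replace with your own.
    count = write_batch_csv(
        "test_batch.csv",
        "http://path.to.recording",
        ["part1.mp3", "part2.mp3", "part3.mp3"],
    )
    print(f"Wrote {count} rows")
```

Open the resulting file and check every URL before uploading it, since each row turns into a live HIT.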

To create an individual HIT, click Create HITs Individually:

Fill out all the fields marked in yellow. I usually put some numbering (1 of 4, 1 of 3) in the title, so I can tell at a glance which of the recording segments this HIT covers.

[Screenshot: the HIT form, with the fields to fill out marked in yellow]

It also helps the worker if you put the duration of the audio in the title, so they can tell what the real rate is on the HIT. A typical title would be:

Transcribe this 10 min. MP3 of a lecture or meeting [Learning Lecture 4-29-13 30-45]

For instructions I usually put something like: 

Listen to a short audio clip and transcribe what is said. Do not include "hmm" and "errs" in the transcription. Do not correct for grammar mistakes but transcribe as spoken. Use punctuation where appropriate. Indicate different speakers with "speaker 1" or "speaker 2". 

There are some people who speak extremely softly on the clip - don't try too hard to transcribe them, just get what you can of them and indicate inaudible parts with "[inaudible]". If you get a few words from them that's great. Concentrate on the clearer speakers.

Audio is at:

http://path.to.recording/audio.mp3

Always test the location given for the recording, since the worker will be forced to write and ask for help if the audio is not where it is supposed to be. 
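
One way to test the location before posting is a quick script. This is only a sketch, assuming the clip is served over plain HTTP or HTTPS; the function name `audio_is_reachable` is made up for illustration. Some servers refuse HEAD requests, so a failure here is a prompt to check the link in a browser rather than proof that the file is missing.

```python
import urllib.error
import urllib.request


def audio_is_reachable(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if a HEAD request to the URL comes back with status 200."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Covers HTTP errors, DNS or connection failures, and malformed URLs.
        return False


# Usage (performs a real network request):
#     audio_is_reachable("http://path.to.recording/audio.mp3")
```

The `opener` parameter exists only so the function can be tried out without a network connection; in normal use you can leave it alone.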

I use both a typed path to the audio AND click the "Add Audio" button just below the instructions field. I've found that sometimes the audio gets lost when you just use the button to create a link.

I usually set the HIT to expire in 12 hours and give the worker a few hours to complete it after they accept it.

What to pay:

The higher the payment offered, the more likely your HITs will be completed quickly by high-quality workers. Most of the transcription HITs are from big crowdsourcing companies who have to add a percentage. Scanning the available HITs on mturk.com, the prices are all over the map, mostly very low. I've seen recommendations of $2.00 for 5 minutes of audio, but it seemed too low, so for transcription for my own use, I pay $5.00 for 10 minutes or $30.00 for an hour of audio. Basically, pay as much as you can afford to pay, given the volume of HITs you intend to offer. 
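
To compare offers made for clips of different lengths, it helps to convert them to a price per hour of audio. The sketch below does that arithmetic for the two rates mentioned above and estimates the total for a whole recording. The function names are made up, the 35-minute recording is a hypothetical example, and the total assumes a short final clip is paid at the full clip rate and leaves out any fees charged on top of the worker payment.

```python
import math


def hourly_audio_rate(payment_dollars, audio_minutes):
    """Convert a per-clip payment into dollars per hour of audio."""
    return payment_dollars * 60 / audio_minutes


def recording_cost(recording_minutes, clip_minutes, payment_per_clip):
    """Total worker payment for a recording cut into fixed-length clips."""
    clips = math.ceil(recording_minutes / clip_minutes)
    return clips * payment_per_clip


if __name__ == "__main__":
    print(hourly_audio_rate(2.00, 5))    # the $2.00 per 5 minutes recommendation
    print(hourly_audio_rate(5.00, 10))   # $5.00 per 10 minutes
    print(recording_cost(35, 10, 5.00))  # a hypothetical 35-minute recording
```

The first two lines work out to $24 and $30 per hour of audio, and the 35-minute example comes to four clips, or $20.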

Once you have previewed and submitted the HIT, you will get the chance to create another new HIT based on the one you just completed. 

When you get a message that the HITs are completed, make sure to approve them quickly. Go back to the Requester site and click Manage HITs Individually. HITs you have not yet approved are in blue; approved ones are gray. Check over the transcription, check the Approve box and click Submit.

Approved individual HITs will remain in your "Manage HITs Individually" area unless you delete them. To review a transcription after you've approved a HIT, go to "Manage HITs Individually."

Click Download Results.

Although the text will look like a solid block of words in this screen, I've found that if I select the text and "Copy", then paste it into a Google Doc, all the paragraph spacing reappears. The CSV file doesn't preserve the formatting in the same way.
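
If a recording was split across several HITs and you do end up working from CSV results, the pieces can be stitched back together in order with a short script. This is a rough sketch, assuming a results file with one column holding the HIT title and one holding the transcription; the column names `Title` and `Answer.transcription` are assumptions, so check the header row of your own file. It sorts by the title text, which only works if the numbering in your titles sorts correctly as plain text (for example, fewer than ten parts).

```python
import csv
import io


def join_transcripts(csv_text, order_column="Title",
                     text_column="Answer.transcription"):
    """Return all transcriptions as one string, sorted by the order column."""
    rows = list(csv.DictReader(io.StringIO(csv_text, newline="")))
    rows.sort(key=lambda row: row[order_column])
    # Separate the parts with a blank line so the seams are easy to find.
    return "\n\n".join(row[text_column].strip() for row in rows)


if __name__ == "__main__":
    # Made-up sample data standing in for a downloaded results file.
    sample = (
        "Title,Answer.transcription\r\n"
        '"Lecture 2 of 2","Speaker 1: And that is all."\r\n'
        '"Lecture 1 of 2","Speaker 1: Welcome.\nSpeaker 2: Thanks."\r\n'
    )
    print(join_transcripts(sample))
```

Because the clips overlap slightly, expect a sentence or two to be repeated at each seam; those are easiest to trim by hand.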

For more information on Mechanical Turk, see the references below.

 Resources

