Thinking about outsourcing transcription or doing it myself

| analysis, decision, kaizen, speaking

I like reading much more than I like listening to someone talk, and much, much more than listening to myself talk. Text can be quickly read and shared. Audio isn’t very searchable. Besides, I still need to work on breathing between sentences and avoiding the temptation to let a sentence run on and on because another cool idea has occurred to me. Perhaps that’s what I’d focus on next, if I ever resume Toastmasters; my prepared speeches can be nice and tight, but my ad-libbed ones wander. More pausing needed.

So. Transcription. I could do it myself. I type quickly. Unfortunately, I speak quite a bit faster than I type, so I usually need to slow the recording down to 50% and rewind occasionally. ExpressScribe keyboard shortcuts are handy. I’ve remapped rewind to Ctrl-H so that I don’t need to take my fingers off the home row. But there’s still the argh factor of listening to myself. This is useful for reminding me to breathe, yes, but it only takes five minutes for me to get that point. ;) The other night, it took me an hour to get through fifteen minutes of audio, which is slower than I expected. At that rate, an hour-long podcast interview should take about four hours of work.

I could use transcription as an excuse to train Dragon NaturallySpeaking 11, the dictation software I’d bought for this very purpose but haven’t used as much as I thought I would. It recognizes many words, but I have a lot of training to do before I get it up to speed, and I still need to edit. This would be a time investment for uncertain rewards. I still need to time how long it takes me to dictate and edit a segment.

Foot pedals would be neat, particularly if I could reprogram them for other convenient shortcuts. Three-button pedals cost from $50 to $130, not including shipping. In addition to using them to stop, play, and rewind recordings, I’d love to use them for scrolling webpages or pressing modifier keys. I often work with two laptops, so it’s tempting. (And then there’s the idea of learning how to build my own human interface device using the Arduino… ) – UPDATE: I’ve built one using the Arduino! I can’t wait to try it out.

In terms of trading money for time, I’ve been thinking about trying Casting Words, which is an Amazon Mechanical Turk-based business that slices up submitted files into short chunks. Freelancers work on transcribing these chunks, which are then reassembled and edited. The budget option costs USD 0.75 per audio minute, which means an hour-long interview will cost about USD 45 to transcribe. That option doesn’t have a guaranteed turnaround, though, so I could be waiting for weeks. In addition, I tend to talk quickly, so that might trigger a “Difficult Audio” surcharge of another USD 0.75 per minute, bringing the total to USD 1.50 per minute, or about USD 90 per audio hour.

For better quality at a higher price, I could work with other transcription companies. For example, Transcript Divas will transcribe audio for CAD 1.39/minute, and they guarantee a 3-day turnaround (total for 1 hour: CAD 83.40). Production Transcripts charges USD 2.05/minute for phone interviews.

I could hire a contractor through oDesk or similar services. One of the benefits of hiring someone is that he or she can become familiar with my voice and way of speaking. Pricing is based on effort instead of a flat rate per audio minute, and it can vary quite a bit. One of my virtual assistants took 14 hours to transcribe three recordings that came to 162 minutes total. At $5.56 per work hour, that came to $0.48 per audio minute, or about $29 per audio hour. oDesk contractors are usually okay with working on an as-needed basis, which is good because I’ve scaled down my talks a lot. (I enjoy writing more!)
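As a rough check on that arithmetic, here is a minimal Python sketch that converts an effort-based bill into a rate per audio minute. The function name `effective_rate` is made up for illustration; the inputs are the figures from the virtual assistant example above.

```python
def effective_rate(work_hours, hourly_rate, audio_minutes):
    """Return (total cost, cost per audio minute, cost per audio hour)."""
    total = work_hours * hourly_rate
    per_minute = total / audio_minutes
    return total, per_minute, per_minute * 60


# 14 hours of work at $5.56/hour for 162 minutes of audio
total, per_minute, per_hour = effective_rate(14, 5.56, 162)
print(f"total ${total:.2f}, ${per_minute:.2f}/audio minute, ${per_hour:.2f}/audio hour")
```

Because the cost depends on how long the contractor takes, the same function can be rerun with a different number of work hours to see how much the per-minute rate might swing.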

So here are the options:

  • Type it myself: 4 hours of discretionary time
  • Dictation: Unknown hours of discretionary time, possible training improvements for Dragon NaturallySpeaking
  • Foot pedals: Probably down to 3.5 hours per audio hour, but requires a little money; bonus points for hackability
  • Casting Words: USD 90 per audio hour, unknown timeframe
  • Transcript Divas: CAD 84 per audio hour, 3-day turnaround
  • Contractor: Can be around USD 30 per audio hour, depending on contractor
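For the flat-rate services, the per-hour figures can be reproduced with a small Python sketch. This is only an illustration: the `RATES` table and helper names are my own, the rates are the ones quoted in this post, and CAD and USD amounts are kept separate rather than converted.

```python
# Per-audio-minute rates as quoted in this post (illustrative table).
# Currencies are kept as-is; no CAD/USD conversion is attempted.
RATES = {
    "Casting Words (budget)": ("USD", 0.75),
    "Casting Words (budget + difficult audio)": ("USD", 1.50),
    "Transcript Divas": ("CAD", 1.39),
    "Production Transcripts (phone interviews)": ("USD", 2.05),
}


def price_table(audio_minutes=60):
    """Map each service to (currency, cost for the given audio length)."""
    return {
        name: (currency, round(audio_minutes * rate, 2))
        for name, (currency, rate) in RATES.items()
    }


for name, (currency, cost) in price_table(60).items():
    print(f"{name}: {currency} {cost:.2f}")
```

Calling `price_table(15)` instead would price a fifteen-minute segment, which is handy for deciding whether a short test run is worth it.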

I’m going to go with dictating into Dragon NaturallySpeaking because I need to train it before I can get a sense of how good it is. It takes advantage of something I already own and am underusing. Who knows, if I can get the hang of this, I might use it to control more functionality. We’ll see!
