REAL-TIME TRANSCRIPTION is a vital accommodation for deaf and hard of hearing people in their daily lives. Captioning is typically expensive because captionists need years of training.
LEGION:SCRIBE introduced a method that used multiple non-experts to caption audio with high quality and low latency, at far lower cost. We recently developed TimeWarp to make the task easier for individual workers without hurting collective performance. TimeWarp makes each captionist’s job easier by selectively slowing down and speeding up audio playback.
OFFLINE CAPTIONISTS OFTEN SLOW DOWN AUDIO to make it easier to caption. However, this necessarily puts the worker behind real-time. That’s fine for offline captioning, but it means a single captionist who slows the audio cannot keep up with real-time speech.
TimeWarp relies on:
- People’s ability to hear faster than they can type
- Scribe’s need for workers to only caption a small part of what they hear
For the parts of the audio that workers are asked to type, the audio is played slower. To catch up with real time, the audio is played slightly faster during the parts in between, where the worker listens only for context.
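The slow-then-fast trade-off can be illustrated with a little arithmetic. The sketch below is not taken from the paper: the function name catch_up_rate, the idea of a fixed repeating cycle, and all numbers are assumptions made for illustration. It computes how fast the listen-only audio would need to play so that a worker who hears their assigned part slowed down is back at real time by the end of the cycle.

```python
def catch_up_rate(period_s: float, typed_s: float, slow_rate: float) -> float:
    """Return the playback rate for the listen-only audio in one cycle.

    Hypothetical model (not from the paper): each cycle covers `period_s`
    seconds of audio, of which `typed_s` seconds are the part the worker
    types. That part plays at `slow_rate` (for example 0.5 = half speed);
    the rest must play fast enough for the cycle to finish on real time.
    """
    if not 0 < slow_rate <= 1:
        raise ValueError("slow_rate must be in (0, 1]")
    if not 0 < typed_s < period_s:
        raise ValueError("typed_s must be positive and shorter than period_s")

    # Wall-clock time consumed by the slowed-down part.
    slow_wall_s = typed_s / slow_rate
    # Wall-clock time left in the cycle for the listen-only audio.
    remaining_wall_s = period_s - slow_wall_s
    if remaining_wall_s <= 0:
        raise ValueError("slowed part uses the whole cycle; cannot catch up")

    # Audio still to be heard, divided by the wall-clock time left to hear it.
    return (period_s - typed_s) / remaining_wall_s


# Example: a 16 s cycle where 4 s are typed at half speed.
print(catch_up_rate(16, 4, 0.5))
```

With these made-up numbers, the 4 seconds of typed audio take 8 seconds to play, leaving 8 seconds of wall-clock time for the other 12 seconds of audio, so the listen-only part plays at 1.5x. The smaller the share of audio each worker types, the gentler the speed-up needed to catch up.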
WARPING TIME IMPROVES ACCURACY, COVERAGE, AND EVEN LATENCY. Our experiments showed:
- 12.6% mean improvement in accuracy
- 11.4% mean improvement in coverage
- 19.1% mean improvement in latency
The surprising improvement in latency is due to workers being able to keep up with each word as it was said, instead of memorizing it and then typing it later.
For more, see our full paper, Warping Time for More Effective Real-Time Crowdsourcing.
Walter S. Lasecki, University of Rochester
Christopher D. Miller, University of Rochester
Jeffrey P. Bigham, University of Rochester