Final Project Design Document – voice-activated video player

November 22, 2011 by Jonathan Fretheim. Filed under cs491 mobile, fall 2011, postmortems.

For my final project, I’ll be building an incremental prototype that will be used for the Reader-animated storybooks research project I’m involved with.

The big idea is to build a framework for multi-page animated storybooks in which the animation on each page is triggered by the user reading the on-screen words aloud. Along the way, we will experiment with different methods of animation (e.g., video files, a flip-book approach, or perhaps an interface that an artist familiar with Flash could use directly for simple keyframe animation) and, hopefully, different methods of speech recognition (the built-in Android SpeechRecognizer class for a cloud-based solution, CMU Sphinx for on-device, offline functionality, or even some direct audio signal processing).

That said, this assignment will be a small first step. Users will be able to select videos from a menu screen and activate controls via voice commands such as “Play Video 3”, “Stop”, “Rewind”, and “Menu”.
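
To make the command set above concrete, here is a minimal sketch of how a recognized phrase might be mapped to a player action. It is plain Java with no Android dependencies, and the names `VoiceCommandParser`, `Action`, and `Command` are hypothetical, not part of the project or of any Android API. It also assumes the recognizer returns the video number as digits; a real recognizer may return "three" instead of "3", which this sketch does not handle.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser that maps a recognized phrase to a player command. */
class VoiceCommandParser {

    /** Actions the prototype's player might support (illustrative names). */
    enum Action { PLAY, STOP, REWIND, MENU, UNKNOWN }

    /** A parsed command; videoNumber is -1 unless the action is PLAY. */
    static final class Command {
        final Action action;
        final int videoNumber;

        Command(Action action, int videoNumber) {
            this.action = action;
            this.videoNumber = videoNumber;
        }
    }

    // Matches "play video 3" after normalization; digits only (an assumption).
    private static final Pattern PLAY_VIDEO =
            Pattern.compile("^play video (\\d{1,3})$");

    /** Normalizes case and whitespace, then matches against the known commands. */
    static Command parse(String phrase) {
        if (phrase == null) {
            return new Command(Action.UNKNOWN, -1);
        }
        String text = phrase.trim().toLowerCase(Locale.US).replaceAll("\\s+", " ");

        Matcher play = PLAY_VIDEO.matcher(text);
        if (play.matches()) {
            return new Command(Action.PLAY, Integer.parseInt(play.group(1)));
        }
        switch (text) {
            case "stop":
                return new Command(Action.STOP, -1);
            case "rewind":
                return new Command(Action.REWIND, -1);
            case "menu":
                return new Command(Action.MENU, -1);
            default:
                // Anything else is ignored rather than guessed at.
                return new Command(Action.UNKNOWN, -1);
        }
    }
}
```

For example, `VoiceCommandParser.parse("Play Video 3")` yields a `PLAY` command with `videoNumber` 3, while an unrecognized phrase yields `UNKNOWN` so the player can simply do nothing.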

In addition to producing custom video player code that we can re-use, this assignment should help the research group start to understand how the network latency that comes with SpeechRecognizer will affect the user experience.
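
One way to start putting numbers on that latency is to time the gap between the end of the user's speech and the arrival of results. The sketch below is a plain-Java helper with a hypothetical name (`RecognitionLatencyTimer`). How it gets wired up is an assumption: on Android, one option would be to call it from the `onEndOfSpeech` and `onResults` callbacks of a `RecognitionListener`, passing in `System.nanoTime()`.

```java
/** Hypothetical helper for timing how long recognition results take to arrive. */
class RecognitionLatencyTimer {

    // Clock reading taken when speech ended, or -1 if no measurement is pending.
    private long speechEndedAtNanos = -1;

    /** Call when the user stops speaking; pass a monotonic clock reading. */
    void markSpeechEnded(long nowNanos) {
        speechEndedAtNanos = nowNanos;
    }

    /**
     * Call when results arrive. Returns the elapsed time in milliseconds,
     * or -1 if markSpeechEnded was not called first.
     */
    long markResultsArrived(long nowNanos) {
        if (speechEndedAtNanos < 0) {
            return -1;
        }
        long elapsedMs = (nowNanos - speechEndedAtNanos) / 1_000_000L;
        speechEndedAtNanos = -1; // reset so each utterance is measured once
        return elapsedMs;
    }
}
```

Logging these values across different network conditions would give the group a rough distribution to compare against what feels responsive to a reader.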