by Rob Miller (MIT CSAIL), workshop organizer
with Jeff Bigham (University of Rochester) and Michael Bernstein (MIT CSAIL)
Many applications of crowd computing to date have involved batch computation. Consider the ESP Game, LabelMe, reCAPTCHA, and most tasks seen on Mechanical Turk. In these applications, people are labeling and transcribing data for a corpus that will eventually be used for indexing or machine learning. This work has two interesting properties. First, it’s asynchronous: you submit work to the system, and eventually the crowd gets around to processing it, but you wouldn’t want to sit around waiting for the answer. Second, it’s computation, and purely functional computation at that. The crowd is working entirely in an online world, taking input data in one digital form (like images or audio) and producing output in another digital form (like text or labels). All the work happens on a web site, in cyberspace.
What if we relax those limits: first, that crowd work is asynchronous; and second, that it’s only about computation? What if we could gather a crowd right now to work synchronously with the system, with an end user, with each other? What if the work involved the physical world, not just the digital one? What new applications would be possible? What new challenges would arise?
Our group at MIT CSAIL has already been studying what might be possible if we can do crowd work on demand, on behalf of an end user. In collaboration with Jeff Bigham at the University of Rochester, Joel Brandt at Adobe, and Bjoern Hartmann at Berkeley, among others, we have built several systems that explore aspects of this idea. Soylent puts a crowd inside Microsoft Word, so that selecting some text and pressing a button gathers people on demand to help you edit. VizWiz puts a crowd inside a blind person’s smartphone, letting them take photos and ask for help from sighted people on the web.
VizWiz is a point in this new design space for crowd computing. First, it’s synchronous; the blind person is waiting for the answer, so VizWiz has to get an answer fast. Using a basket of techniques, it often manages to get an answer (from people hired on Mechanical Turk) in 30 seconds or less. Second, the end user is mobile, out in the physical world, effectively carrying and directing a sensor (the phone’s camera) for the sake of the crowd. The crowd’s effort is still purely computational, but the real world is directly involved.
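One simple way to think about getting a fast answer from a synchronous crowd is to pose the same question to several workers in parallel and return as soon as the first response arrives. The sketch below is purely illustrative: `ask_worker` is a hypothetical stand-in (simulated with a random delay), not any real Mechanical Turk API.

```python
import concurrent.futures
import random
import time

def ask_worker(worker_id, question):
    """Hypothetical stand-in for sending a question to one crowd worker.
    Simulates variable human response latency."""
    time.sleep(random.uniform(0.1, 0.5))
    return f"answer from worker {worker_id}"

def first_answer(question, n_workers=3, timeout=30):
    """Ask n_workers the same question in parallel; return the first
    reply, or None if nobody answers within the timeout."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(ask_worker, i, question)
                   for i in range(n_workers)]
        done, _ = concurrent.futures.wait(
            futures, timeout=timeout,
            return_when=concurrent.futures.FIRST_COMPLETED)
        return next(iter(done)).result() if done else None

print(first_answer("What does this label say?"))
```

Recruiting the workers ahead of time, so they are already waiting when the question arrives, is the other half of the latency battle; this sketch only covers the ask-and-race step.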
What if the crowd were also situated in the real world? What if the crowd carried the sensors on their own mobile devices? Google’s GPS traffic sensing and CarTel’s Pothole Patrol are good examples of crowd sensing, but still asynchronous, not on demand. What if the crowd did physical work as well? A brilliant example of this is the San Ramon Fire Department’s iPhone app. If you have this app and someone near you is in cardiac arrest, the 911 dispatcher can pop up a notice on your phone with the location of the nearest automated external defibrillator, asking you to bring it to the victim. A small amount of effort exerted at the right time can save a life. What are the more everyday applications for crowd work in the real world?
Finally, a major research challenge for real-time, real-world crowd computing is the nature of the crowd itself. “Crowd” typically implies a group of people making small contributions that may not be correct, or even well-intentioned. How can we get a high-quality answer from noisy contributions made in a short time? Soylent tackles the quality problem using algorithms, at the cost of greater latency; real-time requirements make this approach still more challenging. VizWiz uses redundancy, getting multiple answers from the crowd. How does redundancy work in the real world of limited resources and side effects? Multiple defibrillators arriving at a heart-attack scene can’t hurt, but would I have to ask the crowd for three cups of coffee just to guarantee that I’ll get at least one? If I ask a crowd to buy the last donut on the shelf, what will the latecomers do?
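In the purely computational setting, redundancy usually reduces to some form of voting: gather several answers to the same question and keep the most common one, along with a rough measure of agreement. A minimal sketch (the example answers are invented for illustration):

```python
from collections import Counter

def majority_answer(answers):
    """Return the most common answer among redundant crowd responses,
    plus the fraction of workers who agreed with it."""
    counts = Counter(answers)
    best, votes = counts.most_common(1)[0]
    return best, votes / len(answers)

# Three redundant answers to the same question; two agree.
answer, agreement = majority_answer(["diet cola", "diet cola", "lemonade"])
# answer == "diet cola"; agreement == 2/3
```

This is exactly the move that fails to transfer to the physical world: you cannot take the majority vote of three cups of coffee.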
Let’s think about real-time, real-world crowd computing, because it’s coming.
Rob Miller is an associate professor of computer science at MIT. His research interests focus on people and programming.
- Luis von Ahn and Laura Dabbish. Labeling images with a computer game. CHI 2004.
- Bryan C. Russell, Antonio Torralba, Kevin P. Murphy, and William T. Freeman. LabelMe: A Database and Web-Based Tool for Image Annotation. Int. J. Comput. Vision 77, 1-3 (May 2008), 157-173.
- Luis von Ahn, Ben Maurer, Colin McMillen, David Abraham, and Manuel Blum. reCAPTCHA: Human-Based Character Recognition via Web Security Measures. Science, 321 (September 2008): 1465–1468.
- Michael S. Bernstein, Greg Little, Robert C. Miller, Björn Hartmann, Mark S. Ackerman, David R. Karger, David Crowell, and Katrina Panovich. Soylent: a word processor with a crowd inside. UIST 2010.
- Jeffrey P. Bigham, Chandrika Jayant, Hanjie Ji, Greg Little, Andrew Miller, Robert C. Miller, Robin Miller, Aubrey Tatarowicz, Brandyn White, Samual White, and Tom Yeh. VizWiz: nearly real-time answers to visual questions. UIST 2010.
- Jakob Eriksson, Lewis Girod, Bret Hull, Ryan Newton, Samuel Madden, and Hari Balakrishnan. The pothole patrol: using a mobile sensor network for road surface monitoring. MobiSys 2008.