What if the crowd acted like a single (really awesome) worker?
We already know that the crowd can do amazing things, and Michael Bernstein and I have already argued on this blog that getting this work done quickly matters for all sorts of cool interactive applications.
But, what do you do once you have your crowd in 2 seconds? I argue that one of the most interesting things would be to get the crowd to act like a single high-quality worker.
Crowds aren’t reliable (take as witness the litany of work trying to make them more reliable). The work of individual crowd workers can be of low quality for a host of reasons, and you can’t count on any particular high-quality worker you’ve identified to be around when you need them — let alone for your whole job.
I just want one really high-quality worker recruited on-demand for the duration of my job!
An individual worker could execute a long-term strategy and respond to changes in the application in real-time (without having to wait for a quorum of other workers to vet their actions). My really awesome worker could do complex control tasks with the interfaces I already have: “Hey, boss, want me to drive your robot, complete that Excel spreadsheet, or fill in for you on WoW while you’re taking a break? No problem!”
But, I also want my worker to have the advantages of the crowd — to be available on-demand all the time, and to benefit from collective intelligence. I want to choose the crowd that’s in my worker’s head — maybe paid workers, but maybe my friends or family instead.
That’s what our paper is about: the Legion system that we introduce turns the unreliable dynamic crowd into really awesome workers that you can recruit on-demand to reliably control your existing interfaces in real-time.
The fact that Legion works on existing interfaces is important. Many of us in this area build substantial one-off systems and support infrastructure so the crowd can do our tasks, but Legion allows new crowd-powered systems to be created from the interfaces you already have. We’ve used Legion to create a robot that follows natural language commands, to outsource bits of office work, to make an assistive keyboard more accurate, and, yes, even to fill in for us in video games.
To use Legion, users first select a portion of their desktop interface that they would like the crowd to control, provide a natural language description of the task for the crowd to perform, and offer a price that they are willing to pay. Legion forwards a video feed of the interface to the crowd and forwards key presses and mouse clicks made by the crowd back to the interface (think VNC). To improve reliability, multiple workers are recruited to collaboratively complete the task. A fundamental question is how to effectively mediate crowd work to balance reliability with the desire for real-time, closed-loop control of the interface. Legion coordinates task completion by recruiting crowd workers, distributing the video feed, and providing a flexible mediation framework to synthesize the input of multiple crowd workers.
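To make the idea of a mediator concrete, here is a minimal sketch of one simple strategy: collect each worker's input over a short time window and forward an input only if enough of the crowd agrees on it. This is an illustration under stated assumptions, not Legion's actual code. The function name `mediate_by_vote`, the per-window batching, and the `min_share` threshold are all hypothetical.

```python
from collections import Counter
from typing import Optional


def mediate_by_vote(window_inputs: dict[str, str],
                    min_share: float = 0.5) -> Optional[str]:
    """Pick the input to forward for one time window.

    window_inputs maps a worker id to the key press or click that worker
    made during the window (hypothetical representation).
    Returns the most common input if more than min_share of the workers
    chose it, otherwise None (nothing is forwarded).
    """
    if not window_inputs:
        return None
    counts = Counter(window_inputs.values())
    action, votes = counts.most_common(1)[0]
    # Forward only when agreement is strong enough.
    if votes / len(window_inputs) > min_share:
        return action
    return None


# Two of three workers press "up", so "up" is forwarded.
forwarded = mediate_by_vote({"w1": "up", "w2": "up", "w3": "left"})
```

The trade-off shows up directly in this sketch: a higher `min_share` makes the forwarded input more reliable, but more windows pass with nothing forwarded, which hurts real-time control.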
So, how do we decide which input actually makes it to the interface being controlled? We use crowd agreement to figure out which input is likely to be the best, and forward it on. We evaluated a number of these mediation strategies, and (spoiler alert!) the best ended up being a mediator that used crowd agreement to dynamically select a leader who would temporarily (but unwittingly) assume full control.
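The leader idea can be sketched roughly as follows. This is a hedged illustration, not the mediator from the paper: it assumes input arrives in short windows, scores each worker by how often they matched the crowd's most common input, and forwards only the current top scorer's input. The class name `LeaderMediator`, the exponential `decay` weighting, and the tie-breaking are all assumptions made for the example.

```python
from collections import Counter
from typing import Optional


class LeaderMediator:
    """Forward the input of whichever worker has agreed with the crowd most."""

    def __init__(self, decay: float = 0.8):
        self.decay = decay                    # how quickly old agreement fades (assumed)
        self.scores: dict[str, float] = {}    # worker id -> agreement score
        self.leader: Optional[str] = None     # current leader, unknown to workers

    def step(self, window_inputs: dict[str, str]) -> Optional[str]:
        """Process one window of worker inputs and return the input to forward."""
        if not window_inputs:
            return None
        # The crowd's most common input in this window.
        plurality, _ = Counter(window_inputs.values()).most_common(1)[0]
        # Update each present worker's running agreement score.
        for worker, action in window_inputs.items():
            agreed = 1.0 if action == plurality else 0.0
            prev = self.scores.get(worker, 0.0)
            self.scores[worker] = self.decay * prev + (1 - self.decay) * agreed
        # The leader is the best-scoring worker present in this window.
        self.leader = max(window_inputs, key=lambda w: self.scores[w])
        return window_inputs[self.leader]


mediator = LeaderMediator()
mediator.step({"a": "up", "b": "up", "c": "left"})
mediator.step({"a": "left", "b": "up", "c": "up"})
# Worker "b" has agreed with the crowd in both windows and now leads.
```

Because leadership is recomputed every window, a leader who drifts away from the crowd loses control without ever being told they had it.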
Here are some visual results of these mediation strategies on a robot navigation task:
Work in human-computer interaction usually assumes either a single user, or groups of users collaborating in the same virtual space, each in control of a personal cursor. Legion advances a new model in which a diverse and dynamic group collectively acts as a single operator, which introduces all kinds of interesting problems regarding feedback, quality control, and reliability.
We’re all really excited about this project, and think it brings up a whole slew of interesting research questions. The paper outlines a number of angles future work might explore, such as how to give good feedback to workers when any one worker’s actions might not be the ones taken, the potential of what we call desktop mash-ups, and how Legion might enable the crowd to be programmed by demonstration.
This blog post brought to you by Walter Lasecki, Kyle Murray, Samuel White, Rob Miller, and Jeffrey Bigham.
Jeffrey P. Bigham is an assistant professor at the University of Rochester. His research is about improving access for everyone, which leads him on adventures in HCI, AI, Human Computation, Disability Studies, and Systems. He tweets from @jeffbigham.