crowdsourcing general computation, one application at a time

If you could leverage a crowd to do anything, what would it be?

My collaborators and I are studying ways to harness the crowd to do more by coupling the wealth of algorithmic understanding from computer science with our ongoing discoveries of how the crowd works. I tend to think of this as crowdsourcing 2.0, or crowd programming 102: now that we know a crowd exists and that we have programmatic access to it, what algorithms/interfaces/crowd-interfaces do we use to direct the crowd toward solving complex tasks?

I strongly believe this is quickly becoming a hot area, because there is so much we don’t know about how to organize the crowd around more complex tasks. My position paper with Eric Horvitz, Rob Miller, and David Parkes sets out an agenda identifying three subareas of study in this space, and recent works like Turkomatic and CrowdForge are building the tools that will help us explore this space (as well as exploring it in interesting ways themselves). Instead of rehashing the arguments in our paper and these works, let me argue a slightly different point:

We should build super novel crowd-powered applications that require an understanding of how to harness the collective power of the crowd to solve larger, more complex problems.

I believe crowdsourcing 2.0 applications will help move us forward as an academic community, and provide tremendous value to end users in the meantime. In this vein, I am particularly excited about my recent and ongoing work with Edith Law on collaborative planning, where we are exploring how to leverage a crowd to come up with a plan for solving a problem, in the context of
(a) breaking down high level search queries into actionable steps as a new approach to web search, and
(b) collaborative event planning, either with family and friends, or crowdsourced out [*].

Since Edith and I love food, we recently planned a potluck using our tool (or rather, the potluck participants did), where people specify dishes they can bring, add to a wish list, make requests, fulfill wishes and requests, and so on, to collaboratively plan a menu. Here is a picture of most of the entrees (appetizers/salads/desserts were in a different room, and yes, we ate in courses):

Entrees at our crowdsourced potluck (3/25/11)

These and other crowdsourcing 2.0 applications will draw on innovations in task decomposition (how we should break up and combine the work), crowd control of program flow (having the crowd tell us what needs work and where to search), and human program synthesis (having humans come up with the steps that make up a plan). But while we went into these applications thinking algorithmic paradigm first, we find more and more that designing for how people can best think, work, and decompose plays an equally important role in enabling such applications. How these pieces fit together is something we should study academically, but let’s have the applications drive us (and feed us… I had a great meal).
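To make the algorithmic framing concrete, here is a minimal sketch, in Python, of a decompose-solve-recombine crowd workflow. The ask_crowd_* helpers are hypothetical stand-ins for posted crowd tasks (stubbed here so the sketch runs); they are not part of any of the systems mentioned above.

```python
# Minimal sketch of a decompose-solve-recombine crowd workflow.
# The ask_crowd_* helpers are hypothetical placeholders for crowd tasks;
# here they return canned values so the example runs end to end.

def ask_crowd_to_split(goal):
    # In practice: ask workers "is this goal simple? if not, list subgoals."
    return [] if len(goal) < 30 else [f"{goal} (part {i})" for i in (1, 2)]

def ask_crowd_to_solve(goal):
    # In practice: ask workers for a concrete, actionable step.
    return f"step for: {goal}"

def ask_crowd_to_merge(goal, partial_plans):
    # In practice: ask workers to edit the combined sub-plans into one plan.
    return [step for plan in partial_plans for step in plan]

def crowd_plan(goal, depth=0, max_depth=3):
    """Recursively decompose a goal, solve the leaves, and recombine."""
    subgoals = ask_crowd_to_split(goal) if depth < max_depth else []
    if not subgoals:
        return [ask_crowd_to_solve(goal)]
    partial_plans = [crowd_plan(g, depth + 1, max_depth) for g in subgoals]
    return ask_crowd_to_merge(goal, partial_plans)

print(crowd_plan("plan a potluck dinner for twenty guests"))
```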

Haoqi Zhang is a 4th year PhD candidate at Harvard University. Many of the ideas expressed here are from collaborations and conversations with Eric Horvitz, Edith Law, Rob Miller, and David Parkes.

[*] Please be patient with us if you are looking forward to seeing the first crowdsourced wedding. If you’d like to have your wedding crowdsourced, please contact me immediately.

Humanizing Human Computation

The Internet is packed with crowds of people building, interpreting, synthesizing, and establishing a hodgepodge of interesting and valuable artifacts. Whether the crowds are creating something as grand as an encyclopedia of all world knowledge or as mundane as a discussion of good restaurants in Pittsburgh, PA, the human capability to interact socially and to create an ad hoc whole out of many individual accomplishments is staggering. However, current efforts in human computation largely do not take advantage of these amazing human capabilities. They focus on single workers and rigid functions. The common computational tasks suggested to newcomers on Amazon’s Mechanical Turk include, among others, tagging images and classifying web content. In these tasks a worker is given some input data (source images) and performs some ‘human’ function on it to produce useful output (tags) that the job requester then incorporates into their final product.
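To make this input-to-output framing concrete, here is a minimal sketch; post_question is a hypothetical stand-in for a platform call (not Mechanical Turk’s actual API) and is stubbed so the example runs.

```python
# Sketch of the input -> human function -> output framing of microtasks.
# post_question is a hypothetical helper standing in for a real platform
# call (e.g., posting a HIT); here it returns a canned answer.

def post_question(prompt, image_url):
    return "sunset, beach"   # stub: one worker's answer

def tag_images(image_urls):
    """Map each input image to worker-provided tags."""
    tags = {}
    for url in image_urls:
        answer = post_question("List a few tags describing this image.", url)
        tags[url] = [t.strip() for t in answer.split(",")]
    return tags

print(tag_images(["http://example.com/photo1.jpg"]))
```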

While perhaps expedient, such tasks do not leverage some key, unique capabilities that separate human workers from input-output machines. Without training or delay, humans can think creatively, interact socially, and make highly nuanced judgments. The next generation of human computation and crowdsourcing ought to leverage more of the ‘human’-ness in workers. Yet how do we incorporate these uniquely human characteristics like creativity and social interaction into crowdsourcing and encode them into markets? Efforts like CrowdForge suggest there may in fact be an answer to this question, demonstrating just how powerful crowd workers can be at highly complex, generative tasks like writing news articles. Similarly, I’ve seen success in allowing Turkers to self-organize to complete a task in a collaborative text editor.

Check out this YouTube video: http://www.youtube.com/watch?v=VEGhXNcyTRg
MTurkers collaborating to translate in an Etherpad shared text editor

Collaboration might be one way to get at the core of the ‘human’ element of human computation. Workers in real-world organizations are well adapted for teamwork, dividing and directing individual expertise where it is needed and providing social motivation. This might be extended into crowdsourcing. Could a future market enable projects rather than tasks that require a team of people who curate their own final product, with milestones and payment based both on individual achievement and overall progress? Might workers take on extemporized or formal roles, for example having experts in editing proofread the work of those more skilled in content generation? Can social interaction methods such as work teams provide encouragement as they already have in Wikipedia and also foster higher quality end products? On the other hand, what are the costs of collaboration in rapid-fire microtasks? Are there certain types of tasks for which collaboration is well suited?

By pushing the boundaries of both the types of tasks we use in human computation and the expectations we hold for workers, we can enable a host of new possibilities in crowdsourcing. The melding of social interaction with microtasks is worthy of much more consideration.

Jeff Rzeszotarski (rez-oh-tar-ski) is a first year PhD student in human-computer interaction at Carnegie Mellon University. His research primarily concerns synthesis and interpretation in online content generation communities and extending crowdsourcing techniques into the social realm.

Crowdsourcing Contextual User Information

by Brian Tidball, PhD Student (ID StudioLab, Delft University of Technology)

The creative activities common in crowdsourcing have promising links to the creative activities used in generative and participatory user research.

As Pieter Jan Stappers and I wrote in our position paper, the ID-StudioLab has been working with and developing design tools and methods that engage users and elicit user-driven information for the design process. These participatory and generative techniques gather rich, multilayered information about users and their lives: building empathy, informing, and inspiring the design process. Unfortunately, these techniques are resource-intensive (time, money, expertise), which impedes their use in practice. We see crowdsourcing as an opportunity to more readily access rich information from and about users.

MT Sustainable: by asking Turkers to submit personal photos of sustainable living, we gained new insights into the role of sustainability in people’s lives.

Our initial studies explore this idea of crowdsourcing user insights. Preliminary findings highlight the ability to collect rich and personal information, emphasize the role of intrinsic motivations (interest in the topic, supporting others, etc.), and show that we can not only elicit a single focused response from users but also engage them in a creative dialog (see our position paper for a little more info).

From these experiences I developed a framework to depict the key elements of the crowdsourcing process as they relate to accessing user insights.

The designer’s crowdsourcing framework

The blue elements identify the items that the designer (solicitor) can influence in order to access a segment of the crowd and motivate them to provide a desired response. The cyclical elements of feedback and discussion, especially, appeal to a view of crowdsourcing that goes beyond the limitations of mere outsourcing. This framework provides a foundation for further study of both the process and the results of crowdsourcing user information, as we continue to build our understanding of crowdsourcing as a tool for HCI.

Leading the Crowd

by Kurt Luther (Georgia Tech)

Who tells the crowd what to do? In the mid-2000s, when online collaboration was just beginning to attract mainstream attention, common explanations included phrases like “self-organization” and “the invisible hand.” These ideas, as Steven Weber has noted, served mainly as placeholders for more detailed, nuanced theories that had yet to be developed [6]. Fortunately, the last half-decade has filled many of these gaps with a wealth of empirical research looking at how online collaboration really works.

One of the most compelling findings from this literature is the central importance of leadership. Rather than self-organizing, or being guided by an invisible hand, the most successful crowds are led by competent, communicative, charismatic individuals [2,4,5]. For example, Linus Torvalds started Linux, and Jimmy Wales co-founded Wikipedia. The similar histories of these projects suggest a more general lesson about the close coupling between success and leadership. With both Wikipedia and Linux, the collaboration began when the project founder brought some compelling ideas to a community and asked for help. As the project gained popularity, its success attracted new members. Fans wanted to get involved. Thousands of people sought to contribute, but how could they coordinate their efforts?

(from “The Wisdom of the Chaperones” by Chris Wilson, Slate, Feb. 22, 2008)

Part of the answer, as with traditional organizations, includes new leadership roles. For a while, the project founder may lead alone, acting as a “benevolent dictator.” But eventually, most dictators crowdsource leadership, too. They step back, decentralizing their power into an increasingly stratified hierarchy of authority. As Wikipedia has grown to be the world’s largest encyclopedia, Wales has delegated most day-to-day responsibilities to hundreds of administrators, bureaucrats, stewards, and other sub-leaders [1]. As Linux exploded in popularity, Torvalds appointed lieutenants and maintainers to assist him [6]. When authority isn’t decentralized among the crowd, however, leaders can become overburdened. Amy Bruckman and I have studied hundreds of crowdsourced movie productions and found that because leaders lack technological support to be anything other than benevolent dictators, they struggle mightily, and most fail to complete their movies [2,3].

This last point is a potent reminder: all leadership is hard, but leading online collaborations brings special challenges. As technologists and researchers, we can help alleviate these challenges. At Georgia Tech, we are building Pipeline, a movie crowdsourcing platform meant to ease the burden on leaders, but also to help us understand which leadership styles work best. Of course, Pipeline is just the tip of the iceberg; many experiments, studies, and software designs can help us understand this new type of creative collaboration. We’re all excited about the wisdom of crowds, but let us not forget the leaders of crowds.

Kurt Luther is a fifth-year Ph.D. candidate in social computing at the Georgia Institute of Technology. His dissertation research explores the role of leadership in online creative collaboration.

References

  1. Andrea Forte, Vanesa Larco, and Amy Bruckman, “Decentralization in Wikipedia Governance,” Journal of Management Information Systems 26, no. 1 (Summer 2009): 49-72.
  2. Kurt Luther, Kelly Caine, Kevin Ziegler, and Amy Bruckman, “Why It Works (When It Works): Success Factors in Online Creative Collaboration,” in Proceedings of GROUP 2010 (New York, NY, USA: ACM, 2010), 1–10.
  3. Kurt Luther and Amy Bruckman, “Leadership in Online Creative Collaboration,” in Proceedings of CSCW 2008 (San Diego, CA, USA: ACM, 2008), 343-352.
  4. Siobhán O’Mahony and Fabrizio Ferraro, “The Emergence of Governance in an Open Source Community,” Academy of Management Journal 50, no. 5 (October 2007): 1079-1106.
  5. Joseph M. Reagle, “Do As I Do: Authorial Leadership in Wikipedia,” in Proceedings of WikiSym 2007 (Montreal, Quebec, Canada: ACM, 2007), 143-156.
  6. Steven Weber, The Success of Open Source (Harvard University Press, 2004).

Workshop paper: Fast, Accurate, and Brilliant: Realizing the Potential of Crowdsourcing and Human Computation

Capitalizing on Mobile Moments

When mobile, the time people have to engage in an activity is generally short: on the order of minutes, and sometimes as little as a few seconds. Unlike non-mobile situations such as being at the office or at home, these time periods, which we characterize as mobile moments, are fleeting. Tasks performed at such times need to be facilitated by a mobile interface that lets users get to the core of their activity as quickly and easily as possible, with minimal overhead.

Mobile moments are also potential opportunities to harness human resources for computation, especially when people have free time on their hands. The smartphone, being always available and on, enables people to spend such free time on activities that are pleasant and entertaining. If the activities, as a side effect, are beneficial to others, mobile moments can be leveraged for the greater good. Thus, crowdsourcing efforts can tap such users, empowered by their smartphones, in their mobile moments to perform human computation tasks. These tasks could be location-based, but need not be; they simply need to fit within those serendipitous moments.

Our work on FishMarket, a mobile-based prediction market game, was born out of an interest in crowdsourcing amongst enterprise workers during their mobile moments.  The game enables these workers to use their mobile devices, anytime and anyplace, to share specialized knowledge quickly and efficiently.  The game’s user experience evolved through several iterations as we attempted to make the game concepts accessible and engaging, and game play easy and quick, to encourage people to play the game during their brief mobile moments.
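FishMarket’s market mechanism is not described here, but for readers unfamiliar with prediction markets, the sketch below shows one standard market maker, Hanson’s logarithmic market scoring rule (LMSR). It illustrates how such a market turns workers’ bets into aggregate probabilities; it is not a description of FishMarket’s implementation.

```python
import math

# Minimal logarithmic market scoring rule (LMSR) market maker, a standard
# prediction-market mechanism. A generic illustration of how bets are
# aggregated into probabilities, not FishMarket's implementation.

class LMSRMarket:
    def __init__(self, n_outcomes, b=10.0):
        self.b = b                       # liquidity parameter
        self.q = [0.0] * n_outcomes      # outstanding shares per outcome

    def _cost(self, q):
        return self.b * math.log(sum(math.exp(qi / self.b) for qi in q))

    def prices(self):
        """Current prices, interpretable as the crowd's probabilities."""
        z = sum(math.exp(qi / self.b) for qi in self.q)
        return [math.exp(qi / self.b) / z for qi in self.q]

    def buy(self, outcome, shares):
        """Charge the trader the cost of the shares and update the market."""
        new_q = list(self.q)
        new_q[outcome] += shares
        cost = self._cost(new_q) - self._cost(self.q)
        self.q = new_q
        return cost

market = LMSRMarket(n_outcomes=2)
print(market.buy(0, 5.0))   # a worker bets on outcome 0 during a mobile moment
print(market.prices())      # prices shift toward outcome 0
```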

The space and the types of possible human computation tasks for mobile moments are largely unmapped;  we are interested in exploring these possibilities.  Also, we are particularly interested in the design aspects (e.g., UI, game, social) as well as attributes of the crowdsourcing tools. Examples of attributes include how the tools channel experts’ desire to solve problems, how the tools tap into people’s willingness to share, and how the tools use the crowd to sort through the solutions to find the best one.

Alison Lee and Richard Hankins are Principal Research Scientists at Nokia Research Center in Palo Alto.  Alison is developing mobile services that enhance mobile work, mobile collaboration, and mobile recreation.  Richard’s research focus is on future mobile devices and systems. They both hold a Ph.D. in Computer Science — Alison from the University of Toronto and Richard from the University of Michigan.

Just hiring people to do stuff

As many of you know, my recent interest has been “just hiring people to do stuff”.

Let me make a case for why I think this is research, and why it is important.

Mankind has never before had such easy, affordable, and fast access to expert labor at such a small scale.

I’m not talking about Mechanical Turk. I’m talking about real expertise: people who know how to program, people who know how to draw, people who know how to write. These people can be found on sites like oDesk, Freelancer and Elance. They can be hired within a day, sometimes within an hour, for bite-sized projects as small as $5. Few people do this, however. Few people know they can, but the day is coming.

We are on the cusp of a new way of working.

Consider the effect web search had on information. As I write this blog post, I make Google queries to gain and verify information. I think about information differently because of web search — I need less of it in my head.

Consider the effect outsourcing may have on expertise. As I write this blog post, why am I not dictating in crude Greg-isms to an expert wordsmith whom I hired just now to craft these sentences? We will think about expertise differently because of outsourcing — we will need to acquire less of it ourselves.

We needed to learn how to use web search as part of our everyday workflow. We didn’t know how at first. Not everyone knows how even now. My mom has difficulty forming effective search queries. But it is a crucial skill to acquire.

We need to learn how to outsource as part of our everyday workflow. Practically nobody knows how. Most outsourcing is large scale — an entire website, or an entire program. It is like searching Google for a book on Java programming, and then reading the book, rather than searching for specific information needs when they arise.

The game is changing. This isn’t just bridging the gap in AI until we get there, this is the industrial revolution of knowledge work. It will change the economic, cultural and political landscape of mankind. It is worth researching.

Greg Little is an n-year PhD student at MIT. He is finishing his thesis as we speak, on human computation algorithms.

Would you be a worker in your crowdsourcing system?

As a computer scientist, I am interested in two primary research questions about crowdsourcing:

  1. How might new systems broaden the range and increase the utility of crowdsourced work?
  2. What models, tools, and languages can help designers and developers create new applications that rely on crowdsourcing at their core?

I am investigating these questions together with my students at the Berkeley Institute of Design, in our Crowdsourcing Course, and through external collaborations (e.g., Soylent). At CHI, we will present works-in-progress on letting workers recursively divide and conquer complex tasks and on integrating feedback loops into work processes.

As a humanist, I believe it is incumbent upon us to also think about the values our systems embody. I have a recurring uneasiness with the brave new world conjured by some of our projects, for two reasons. The first has been articulated before: many crowdsourcing research projects (including my own) rely at their core on a supply of cheap labor on microtask markets. The techniques we introduce to ensure quality and responsiveness (e.g., redundancy, busy-waiting) are fundamentally inefficient ways of organizing labor that are only feasible because we exploit orders-of-magnitude differences in global incomes [1].
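To see why redundancy is such an inefficient way of organizing labor, consider this minimal sketch of the standard pattern (ask_one_worker is a hypothetical stub): a single trusted answer consumes n workers’ time and n payments.

```python
from collections import Counter

# Redundancy-based quality control: ask n workers the same question and
# keep the majority answer. ask_one_worker is a hypothetical stand-in for
# posting a microtask; note that one reliable answer costs n workers' time.

def ask_one_worker(question):
    return "cat"   # stub; in practice, one worker's (possibly noisy) answer

def majority_answer(question, n=5):
    votes = Counter(ask_one_worker(question) for _ in range(n))
    answer, count = votes.most_common(1)[0]
    return answer, count / n   # answer plus the share of workers who agreed

print(majority_answer("What animal is in this photo?"))
```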

My second reservation is that the language used to describe how our systems decompose, monitor, and regulate the efforts of online workers recalls that of Taylor’s scientific management. By observing, measuring, and codifying skilled work, Taylorism moved knowledge from people into processes. This increased efficiency and made mass manufacturing possible, but it also led to the creation of entire classes of repetitive, undesirable, deskilled jobs.

I believe Stu Card had it right when he wrote that “We should be careful to design a world we actually want to live in.” As a step in this direction, we might want to consider whether we ourselves would participate as workers in our own crowdsourcing systems. An exercise in my class, in which students had to earn at least $1 as workers on Mechanical Turk, suggests that the answer today is a resounding “No.”

This leads me to ask a third research question, one I am less prepared to answer, but one whose answer matters if we believe that crowdsourcing will actually grow into a significant role in our future economy:

  3. How might we increase the utility, satisfaction, and beneficence of crowdsourcing for workers?

I am looking forward to discussing these questions with you at the workshop.

[1] Thanks to Volker Wulf for this thought.

Shepherding the Crowd: An Approach to More Creative Crowd Work

By Steven Dow and Scott Klemmer (Stanford HCI Group)

Why should we approach crowdsourcing differently than any collaborative computing system? Sure, crowdsourcing platforms make on-demand access to people easier than ever before. And this access provides new opportunities for distributed systems and social experiments.  However, workers are not simply “artificial artificial intelligence,” but real people with different skills, motivations, and aspirations. At what point did we stop treating people like human beings?

Our work focuses on people. Can we help workers improve their abilities? Can we keep them motivated? Can workers effectively carry out more creative and complex projects? Our experiments show that simple changes in work processes can significantly affect the quality of results. Our goal is to understand the cognitive, social, and motivational factors that govern creative work.

Along with our Berkeley colleagues Björn Hartmann and Anand Kulkarni, we introduce the Shepherd system for managing and providing feedback to workers on content-creation tasks. We propose two key features to help modern micro-task platforms accomplish more complex and creative work. First, formal feedback can improve worker motivation and task performance. Second, real-time visualizations of completed tasks can give requesters a means to monitor and shepherd workers. We hypothesize that providing infrastructural support for timely, task-specific feedback and worker interaction will lead to better-educated, more motivated workers and better work results. Our next experiment will compare externally provided feedback with self-assessment. Is the added cost of assessing others’ work worth it, compared with simpler mechanisms such as asking workers to evaluate their own work?
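Here is a minimal sketch of the kind of feedback loop described above: a requester watches submissions arrive and sends task-specific feedback that workers can act on before the task closes. The classes and fields are hypothetical placeholders meant to illustrate the idea, not Shepherd’s actual design.

```python
from dataclasses import dataclass, field

# Illustrative requester-in-the-loop feedback cycle for content-creation
# tasks. These classes are hypothetical placeholders, not Shepherd's design.

@dataclass
class Submission:
    worker_id: str
    text: str
    feedback: list = field(default_factory=list)

class FeedbackBoard:
    def __init__(self):
        self.submissions = []

    def submit(self, worker_id, text):
        sub = Submission(worker_id, text)
        self.submissions.append(sub)
        return sub

    def dashboard(self):
        """Real-time view for the requester: who has submitted what so far."""
        return [(s.worker_id, len(s.text), len(s.feedback)) for s in self.submissions]

    def give_feedback(self, submission, note):
        """Task-specific feedback the worker sees before revising."""
        submission.feedback.append(note)

board = FeedbackBoard()
s = board.submit("worker_42", "First draft of a product description...")
board.give_feedback(s, "Good start; add a concrete example of using the product.")
print(board.dashboard())
```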

What’s the potential for creative crowd work?  Check out The Johnny Cash Project and Star Wars Uncut.

Steven Dow examines design thinking, prototyping practices, and crowdsourcing as a Stanford postdoc and Scott Klemmer advocates for high-speed rail in America and co-directs the Stanford HCI group.

Leveraging Online Virtual Agents to Crowdsource Human-Robot Interaction

Human-Robot Interaction (HRI) studies the social aspects of robotic behaviors. Results in HRI have emphasized the need to apply not only human-computer interaction design principles but also principles of psychology. Non-verbal social cues such as gaze, attention, prosody, transparency of goal-oriented behavior, and intentions are just a few aspects of behavior that become important with actuated agents. Along with these HRI principles, the community must also focus on traditional topics in robotics and machine learning: dialog management, navigation, manipulation, and learning by demonstration, to name a few.

The A.I. community has had many positive results with knowledge-based agency. Whether in the form of policy learning, symbolic (so-called classic) A.I., or straightforward sense-think-act or sense-act architectures, these results have been very promising. Much of this work has focused on knowledge acquisition from a large corpus, either from crowds or from standard benchmark data sources like the WSJ corpus. The analog in the robotics community has been learning agent behaviors tabula rasa (from scratch) from direct user interaction, sometimes with the emphasis that we must raise robots as if they were “babies” (motor babbling, learning by demonstration, kinematic learning, etc.).

Our paper argues that the HRI community can also benefit from a data-driven approach in which the agent mimics observed non-verbal behaviors and learns from observed dialog, from tasks performed online, and from labeled objects it can perceive. In a preliminary study we collected more than 50,000 interactions in our online game, Mars Escape, and used them to train our real-world robot to play the role that human players had taken in the game: that of the robot. While our game does not cover the entire space we ultimately hope to draw from, we are just now establishing what a virtual agent can be trained to do using data from a virtual world or from the internet more generally. I hope to present these results and to discuss with a group of experts, who have had success in harnessing the crowd, how to gather more appropriate data and more participants.

Video for this preliminary work can be found here. [Warning: file is large]
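As a simple illustration of the data-driven idea above (not the specific training method used with the Mars Escape data), a nearest-neighbor policy copies the action a human player took in the most similar logged game state:

```python
import math

# Nearest-neighbor behavior mimicry: given logged (state, action) pairs
# from game play, act by copying the action taken in the most similar
# recorded state. An illustration of the data-driven idea only; it is not
# the specific training method used with the Mars Escape data.

logged_interactions = [
    # (state features: [distance_to_goal, num_objects_held], action)
    ([5.0, 0.0], "walk_to_goal"),
    ([0.5, 0.0], "pick_up_object"),
    ([0.5, 1.0], "hand_object_to_human"),
]

def nearest_neighbor_action(state):
    best_state, best_action = min(logged_interactions,
                                  key=lambda sa: math.dist(state, sa[0]))
    return best_action

print(nearest_neighbor_action([0.6, 1.0]))   # -> "hand_object_to_human"
```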

Making Databases more Human

by Adam Marcus (MIT CSAIL), Ph.D. Student

As Eugene Wu and I wrote in our crowd research workshop submission, it’s time to involve the (computer science) systems community in supporting human computation. We’re certainly not the only ones thinking about the topic, but I’d like to talk to you about two systems we’re building at MIT: Qurk for declarative specification of human computation workflows, and Djurk for standardizing human computation platform development.

Qurk lets you write queries in a declarative language (like SQL) that merges crowd- and silicon-powered operations. A simple query in Qurk to select images of males from a table of pictures would be “SELECT image_url FROM images WHERE gender(image_url) == ‘Male’;” In this case, gender is a user-defined function which would ask the crowd to identify the gender of the person in the image.
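One way a crowd-powered predicate like this might be evaluated is sketched below; ask_crowd is a hypothetical helper that would post the question to workers (typically with redundancy) and return a consensus answer. The code illustrates the idea rather than Qurk’s actual implementation.

```python
# Sketch of evaluating the query above with a crowd-powered predicate.
# ask_crowd is a hypothetical helper that would post the question to
# workers and return a consensus answer; it is stubbed here so the example
# runs. This illustrates the idea, not Qurk's implementation.

def ask_crowd(question, options):
    return "Male"   # stub; in practice, e.g., a majority vote over workers

def gender(image_url):
    return ask_crowd(f"What is the gender of the person in {image_url}?",
                     options=["Male", "Female", "Unsure"])

images = [{"image_url": "http://example.com/a.jpg"},
          {"image_url": "http://example.com/b.jpg"}]

# SELECT image_url FROM images WHERE gender(image_url) == 'Male';
result = [row["image_url"] for row in images if gender(row["image_url"]) == "Male"]
print(result)
```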

Human computation and databases research have traditionally been separate. Why cross the streams?

  • Databases eat, speak, and breathe adaptive optimization. The parameters (money, accuracy, latency) and models are different, but databases can integrate these new models into traditional workflows.
  • Common operators, such as filters, joins, and sorts, give us common optimization goals. Databases speak a limited number of common and useful operations. Once we cast popular human computation tasks into this common language, the community can iteratively improve operator implementations.
  • Best practices can be encoded into a package of user-defined functions. Want to batch or verify HITs? Someone will likely have written a package you can use for it in Qurk.

The challenges in integrating databases and human computation are fourfold. First, we need to identify the signals (e.g., worker agreement rate) through which Qurk should adapt query execution. Second, we must learn how common building blocks (e.g., item comparisons or ratings) of larger algorithms (e.g., joins or sorts) are best implemented with the crowd. Third, we have to identify how new challenges (e.g., extremely high operation latency) change how we implement traditional query execution engines. Finally, we should identify the ideal crowd workflow specification language. Will we build workflows through traditional languages like SQL, visual workflow builders, or something completely different?
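As an illustration of the second challenge, here is a sketch of a crowd-powered sort built from pairwise comparison tasks; crowd_compare is a hypothetical helper that would ask workers which of two items ranks higher, and a real implementation would batch comparisons, add redundancy, and cache answers, because each call costs money and minutes of latency.

```python
from functools import cmp_to_key

# Sketch of a crowd-powered sort built from pairwise comparison tasks.
# crowd_compare is a hypothetical helper; a real implementation would batch
# comparisons, add redundancy, and cache answers, since each call costs
# money and has latency measured in minutes.

def crowd_compare(a, b):
    # In practice: ask several workers "which image is funnier, a or b?"
    # and map their majority answer to -1, 0, or 1. Stubbed here.
    return -1 if a < b else (1 if a > b else 0)

def crowd_sort(items):
    return sorted(items, key=cmp_to_key(crowd_compare))

print(crowd_sort(["squirrel.jpg", "cat.jpg", "dog.jpg"]))
```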

We also offer a call to arms to the open source platform-building community. It pains us to see so many human computation platforms being built from scratch, each with its own set of quirks and limitations. Developer time should not be wasted re-implementing common human computation platform kernels. Like Hadoop does for distributed computation and WordPress does for publishing, we would like to see a pluggable, white-label platform for human computation. This platform, which we call Djurk, would let developers innovate on the questions that matter, such as incentives and interfaces, rather than building yet another job submission framework.

We’re excited to meet everyone at the workshop!

Adam Marcus and Eugene Wu are Ph.D. students at MIT collectively advised by Sam Madden, Rob Miller, and David Karger.