Email search is often difficult for tasks such as:

  • What is my flight confirmation number?
  • What is my ACM member number?
  • Where is my meeting with Dan?
  • Was there an event I was supposed to go to today?
  • What deadlines do I have coming up?

We also often need answers to these questions on the go, such as when we’re in a taxi to the airport.

WearMail System

We built WearMail, a system where you can speak to your watch and the watch will search your inbox. When the user requests a specific type of information, such as a flight confirmation number, it triggers a special search that returns only that data.


Currently, WearMail works on any Android Wear device and searches Gmail using the API provided by Google.
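As a rough illustration of the "special search" idea, the sketch below scans message bodies and returns only the matched value instead of whole emails. It is not WearMail's implementation: the `SPECIAL_PATTERNS` table, the `special_search` function, and the six-character confirmation format are all assumptions made for this example, and the messages are plain strings rather than results fetched from Gmail.

```python
import re

# Hypothetical pattern table: maps a request type to a regex whose first
# group captures the value to return. The six-character code format is an
# assumption; real confirmation numbers vary by airline.
SPECIAL_PATTERNS = {
    "flight confirmation": re.compile(
        r"\b[Cc]onfirmation(?:\s+(?:number|code))?\s*[:#]?\s*([A-Z0-9]{6})\b"
    ),
}


def special_search(request, messages):
    """Return only the extracted values for a recognized request type.

    Returns None when the request is not a special type, so the caller
    can fall back to an ordinary keyword search.
    """
    lowered = request.lower()
    for request_type, pattern in SPECIAL_PATTERNS.items():
        if request_type in lowered:
            hits = []
            for body in messages:
                hits.extend(pattern.findall(body))
            return hits
    return None
```

Returning `None` for unrecognized requests keeps the special path separate from the ordinary inbox search described above.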

Crowd Constructed Queries

We deployed two surveys in order to determine how well the crowd is able to generate useful Gmail queries based on natural language queries from the watch. In the first survey, we asked both workers on Amazon Mechanical Turk and workshop attendees to provide keyword search terms for three questions:

  1. “What is my Delta flight confirmation for today?”
  2. “I want to find my ACM Membership Number in my email.”
  3. “What room was I supposed to meet Dan Weld in today?”

Overall, both groups did reasonably well in constructing queries from the example questions, although most simply reused words from the original questions, e.g., “Meeting Dan Weld”, “ACM Membership”, “Delta confirmation”. Some workers tried to add additional information, such as today’s date. Workshop attendees were able to bring a bit of additional expertise to formulating their queries, especially for the ACM membership number: one query included the word “registration” and another included the word “renewal,” presumably because these attendees thought those keywords would find the emails where the membership number was most likely to be mentioned.

Interfaces for Crowds to Create Search Patterns

We also asked survey participants to provide information that could be useful for constructing regular expression queries, both in terms of minimum and maximum range values and in terms of whether the target terms contained numbers, letters, or a combination of both. The results were largely inconsistent; a preliminary interface for this approach is shown in the figure below. Given that inconsistency, we hypothesized that a more promising approach may be to ask workers to find examples of the target terms on the internet, and to generalize from those. This worked reasonably well for some targets: you can find examples of flight confirmation numbers, license plates, and room numbers. But workers could not find examples of others, such as ACM membership numbers. With our current UI, we had mixed success in getting workers to generalize the examples they found to other examples that could be reasonable.

Figure: preliminary interface for crowd workers to create search patterns.
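To make the two strategies above concrete, here is a minimal sketch of how worker input might be turned into a regular expression. It assumes the "range values" are character lengths, which the survey description does not state outright, and the function names `pattern_from_spec` and `pattern_from_examples` are hypothetical.

```python
import re


def pattern_from_spec(min_len, max_len, letters, digits):
    """Build a regex from a worker's description of the target term.

    Assumes min_len and max_len are character lengths (our reading of
    the survey's "range values").
    """
    chars = ""
    if letters:
        chars += "A-Za-z"
    if digits:
        chars += "0-9"
    if not chars:
        raise ValueError("target must contain letters, digits, or both")
    return re.compile(rf"\b[{chars}]{{{min_len},{max_len}}}\b")


def pattern_from_examples(examples):
    """Generalize from examples a worker found, using only length and character class."""
    lengths = [len(e) for e in examples]
    has_letters = any(c.isalpha() for e in examples for c in e)
    has_digits = any(c.isdigit() for e in examples for c in e)
    return pattern_from_spec(min(lengths), max(lengths), has_letters, has_digits)
```

A pattern this coarse also matches ordinary six-letter words, which hints at why generalizing from a handful of examples is hard to get right.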


WearMail was one of the group projects pursued at the CMO-BIRS 2016 WORKSHOP ON MODELS AND ALGORITHMS FOR CROWDS AND NETWORKS.

Crowdsorcery: A Proposal for an Open-Source Toolkit Integrating Best Lessons from Industry & Academia

Want to collect data using the crowd but afraid of poor quality results? Unsure how to best design your task to ensure success? Want to use all the power of modern machine learning algorithms for quality control, but without having to understand all that math? Want to solve a complex task but unsure how to effectively piece multiple tasks together in a single automated workflow? Want to make your workflow faster, cheaper, and more accurate using the latest and greatest optimization techniques from databases and decision theory, but without having to develop those techniques yourself? Like the programmatic power of Mechanical Turk micro-tasks and the high-level expertise found on UpWork? Want to combine the best of both worlds in a seamless workflow?

We do too. In reflecting on these challenges, we’ve realized that one reason they have been difficult to solve is the lack of any integrated framework for the entire crowdsourcing process, one that encompasses the design of workflows and UIs; the implementation and selection of optimization and quality assurance algorithms; and the design of the final task primitives that are assigned to workers on crowdsourcing platforms.

To help address this, we put together a proposal for an end-to-end, integrated ecosystem for crowdsourcing task design, implementation, execution, monitoring, and evaluation. We call it Crowdsorcery in the hope that it will take some of the magic out of designing crowdsourcing tasks (or that it would make us all crowd sorcerers). Our goal is that Crowdsorcery would enable

  • new requesters to easily create and run common crowdsourcing tasks which consistently deliver quality results.
  • experts to more easily create specialized tasks via a built-in support for rapid interactive prototyping and evaluation of alternative, complex workflows.
  • task designers to easily integrate task optimization as a core service.
  • requesters to access various populations of workers and different underlying platform functionalities in a seamless fashion.
  • and researchers and practitioners to contribute the latest advances in task design as plug-and-play modules, which can be rapidly deployed in practical applications as open-source software.

Proposal. Crowdsorcery would implement the software stack (at right) with five key components. The arrows on the left are the two ways in which a requester can interface with the toolkit: either programmatically through its API, or through its user interface, built as a wrapper on top of the API.

Inspirations. We’ve realized that achieving any of the above five visions requires defining an integrated solution across the “crowdsourcing stack” that cuts across the user specification interface (whether through a GUI or programming language), through the optimization and primitives library, down to the actual platform specific bindings.

While existing work in research and industry has considered many of these aspects (e.g., B12’s Orchestra), no single platform or requester tool integrates all of them. For example, one popular platform lists best practices for common task types, but does not provide a way to combine these tasks into a larger workflow. Another popular platform provides a GUI for chaining together these tasks, but in rather simplistic ways that don’t take advantage of optimization algorithms. Automan (Barowy et al., 2012) is a programming language in which complex workflows combining human and machine intelligence can be easily coded (see the Follow the Crowd post on Automan), but it locks the user into default optimization approaches and does not surface the platform-specific bindings needed for requester customization. We also do not know of any existing tools that can seamlessly pool together workers from different marketplaces.

Crowdsorcery Software Stack

  • Platform-specific bindings. At the bottom of the software stack, the Platform-specific bindings layer will enable Crowdsorcery to run on diverse worker platforms, such as Mechanical Turk, Upwork, and Facebook (e.g. to facilitate friendsourcing). This layer encapsulates specifics of each platform and abstracts away such details from the higher layers.
  • Primitives. Above this, the Primitives layer will encompass a pre-built library of atomic primitives, such as “binary choice”, “rate an item”, “fill in the blank”, “draw a bounding box”, etc. These will form the basic building blocks from which all crowdsourcing tasks will be composed. Furthermore, more complex primitives can be hierarchically architected from atomic primitives. For example, a “sort a list of items” primitive could combine rating and binary choice primitives with some appropriate control logic.
  • Optimization. A key focus of Crowdsorcery is providing rich support for optimization, implemented in the next layer up. Crowdsorcery integrates underlying task optimization as a core service and capability, providing a valuable separation of concerns for task designers and enabling them to benefit as methods for automatic task optimization continue to improve over time.
  • Programming API. Continuing up the software stack, Crowdsorcery’s Programming API will provide an environment for an advanced requester to quickly prototype a complex workflow combining existing and new primitive task types. Existing optimization routines could help with parameter optimization. Advanced users would be able to access the logic in these routines, and retarget or reimplement them for their specific use case.
  • GUI. Finally, the GUI layer will provide a wrapper on top of the programming API for lay requesters, which will hide many technical details but will expose an interface for monitoring the execution of running tasks.
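As a small illustration of how a complex primitive might be composed from an atomic one, the sketch below builds a “sort a list of items” primitive on top of “binary choice”. Crowdsorcery is a proposal, so none of these names come from a real toolkit: `binary_choice`, `sort_items`, and the `ask` callback (a stand-in for whatever the platform-specific bindings layer would do to post a task and collect an answer) are all hypothetical. The example in the text also mentions rating primitives; for brevity this version uses binary choice alone.

```python
from functools import cmp_to_key


def binary_choice(question, a, b, ask):
    """Atomic primitive: True if the worker says `a` should come before `b`.

    `ask` is a hypothetical callback standing in for posting the task to
    a crowd platform and returning the worker's answer.
    """
    return bool(ask(question, a, b))


def sort_items(items, question, ask):
    """Complex primitive: sort items using only pairwise binary choices."""

    def compare(a, b):
        return -1 if binary_choice(question, a, b, ask) else 1

    # The control logic here is simply a comparison sort; an optimization
    # layer could instead choose which pairs to ask about.
    return sorted(items, key=cmp_to_key(compare))
```

Because the sort only sees the `ask` callback, the same control logic could run against a simulated worker during prototyping and against a real platform later, which is the separation of concerns the stack aims for.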

While research and industry solutions have been proposed for each of the above layers, they have typically addressed each layer in isolation. No single platform or requester tool integrates all of them today. This means that it is virtually impossible for (1) novice requesters to ever take advantage of optimization libraries and workflows, (2) optimization libraries to be used in practical settings, which would necessarily require worker interfaces, or (3) best practices to be integrated into primitives for workflows. Crowdsorcery’s end-to-end toolkit will enable these novel possibilities in an effective and user-friendly manner. Its open-source nature will allow distributed maintenance and rapid incorporation of the latest developments into the toolkit.

What’s the next step? In this blog post, our goal is simple: consider all aspects of crowdsourcing task design and create a framework that integrates them. In retrospect, the software stack we came up with is pretty obvious. Our hope is that this can be the starting point for a more detailed document laying out specific research directions in each of these domains (as related to the entire stack), and ultimately, for a crowdsourcing compiler (see the great 2016 Theoretical Foundations for Social Computing Workshop Report) or IDE which takes the magic out of crowdsourcing.

Crowdsorcery Team
Aditya Parameswaran (UIUC)
David Lee (UC Santa Cruz)
Matt Lease (UT Austin)
(IIT Delhi & U. Washington)

Crowdsorcery was one of the group projects pursued at the CMO-BIRS 2016 WORKSHOP ON MODELS AND ALGORITHMS FOR CROWDS AND NETWORKS.


Report: CMO-BIRS Workshop on Models and Algorithms for Crowds and Networks

The Banff International Research Station (BIRS) along with the Casa Matemática Oaxaca (CMO) generously sponsored a 4-day workshop on Models and Algorithms for Crowds and Networks, which was held in Oaxaca, Mexico from August 29 to September 1, 2016. It was a stimulating week of tutorials, conversations, and research meetings in a tranquil environment, free of one’s routine daily responsibilities. Our goal was to help find common ground and research directions across the multiple subfields of computer science that all use crowds, including making a connection between crowds and networks.

More than a year ago, Elisa Celis, Panos Ipeirotis, Dan Weld, and I (Yiling Chen) proposed the workshop to BIRS. It was accepted in September 2015. Lydia Chilton later joined us and provided incredible insights and leadership on running the Research-a-Thon at the workshop. Twenty-eight researchers from North America, India, and Europe attended the workshop. We mingled, exchanged ideas and perspectives, and worked together closely during the week.

The workshop featured nine excellent high-level, tutorial-style talks spanning many areas of computer science related to models, crowds, and networks:

  • Auction theory for crowds,
  • Design, crowds and markets,
  • Random walk and network properties,
  • Real-time crowdsourcing,
  • Decision making at scale: a practical perspective,
  • The collaboration and communication networks within the crowd,
  • Mining large-scale networks,
  • Crowd-powered data management, and
  • Bandits in crowdsourcing.

Videos of several of these talks are now available, so take a look if you get a chance.

Outside of talks, much of our time was spent in small groups participating in a Research-a-Thon, similar to CrowdCamp, a crowdsourcing hack-a-thon run at several HCI conferences. People teamed up and worked on their chosen projects over a period of two and a half days. It was my first experience of a Research-a-Thon and I was totally sold. It worked as follows:

  • Each participant gave a brief pitch of two project ideas.
  • The group did an open brainstorming session and “speed dating” for exchanging project ideas.
  • Six teams were formed and set off to explore their respective problems.
  • At the end of the Research-a-Thon, teams came back and shared their progress.

The groups were able to make productive use of their time: one formalized a social sampling model and proved initial results on the group-level behavior, another built a prototype where a user can ask the crowd to search for information in their emails while preserving the privacy of the content, and another had already launched their MTurk experiment. (I wish I could be this productive all the time!) In the next few blog posts, several of the teams will share their Research-a-Thon findings with readers of this blog.

Following a CCC workshop on Mathematical Foundations for Social Computing, we also had a visioning and long-term future discussion at the workshop. Participants collectively identified the following five directions or problems that are believed to be important for the healthy growth of the field:

  • Identifying a quantifiable grand challenge problem. Identifying and running a grand challenge can be one of the best ways to push the frontier of research on crowds and networks.
  • Comparisons, benchmarks and reproducibility. It’s been difficult to compare research results and hence difficult to know whether progress has been made. This has led to a desire for benchmarks for research comparisons, as well as formal good-practice guidelines and ideas on how to increase the reproducibility of research in this field.
  • Theory, guarantees and formal models. Participants recognized the benefits and challenges of developing formal models and theoretical guarantees for systems that involve humans. Some fields, such as economics, have had enormous success despite such challenges — one suggestion is to identify reachable goals towards formalizing models and theoretical approaches.
  • Human interpretable components. Many visions of joint human-machine systems require that humans be interchangeable blocks and map computational models onto humans. More progress can potentially be made if we change our perspective and try to make components of systems more interpretable to humans.
  • The future of the crowd. The excellent paper The Future of Crowd Work, published in 2013, continues to represent the concerns about and promises of crowd work.

The next blog post is from the Crowdsorcery team, discussing their Research-a-Thon project. Stay tuned!

Accessible Crowdwork?

Crowdsourcing is an important, and growing, new class of digital labor that may well transform the future of work. Microtasking marketplaces, such as Amazon’s Mechanical Turk, are a key example of this new work form, which is still evolving and is largely unregulated. In addition to concern about lack of regulation regarding minimum wages for workers, it is unclear to what extent, if any, crowd labor marketplaces must comply with the Americans with Disabilities Act, which, among other provisions, requires employers to make “reasonable accommodations” for workers with disabilities. While several people have reported on the demographics of turkers with respect to factors like gender, age, country of origin, and socioeconomic status, such demographic studies have not reported on turkers’ disability status. Our research set out to understand whether people with disabilities are currently participating in crowd labor as workers, and what challenges they face in participating fully and equally in this new class of work.

Our methods included interviewing eight crowd workers with disabilities and nine professional job coaches who help match disabled workers with employment opportunities. Based on these interviews, we then developed an online survey which was taken by 486 American adults with disabilities, 12.4% of whom had tried crowd work.

Our findings established that people with disabilities are indeed participating in crowd work, and that many are eager to do so — benefits include a feeling of accomplishment, the ability to work from home and on a flexible schedule, and the ability to conceal one’s disability status if desired. However, crowd workers with disabilities faced several challenges, in three tiers of accessibility:

  • The first level of accessibility challenge was with the basic usability of crowdwork software, including both the crowdwork platform itself (e.g., AMT) and the many third-party sites that requesters’ tasks linked to. These sites had highly varying levels of compliance with issues such as compatibility with screen reader technology used by people who are blind, and these issues make completing microtasks frustrating, slow, or impossible for many people with disabilities.
  • The second level of accessibility challenge was in the way that microtasking workflows are structured. Identifying tasks that are a match for a user’s abilities is quite difficult. For example, a hearing-impaired user might select a survey task, and discover halfway through the survey that they will be required to listen to audio for one of the survey questions. Time limits that are often embedded in tasks to weed out spammers or other types of malicious or lazy workers are quite problematic; workers with cognitive challenges such as dyslexia or motor impairments that require the use of alternative input devices might legitimately need a few extra seconds or minutes to complete a task.
  • The third level of accessibility challenge was in the fundamental accessibility of new job experiences; for example, learning about the availability of crowd work as an employment opportunity. Seven of the nine job coaches we interviewed had not been familiar with crowd work, for instance, suggesting a lack of dissemination of information about digital labor opportunities to the disability community. Further, those workers with disabilities who had tried crowd work were limited in their ability to grow in their crowd work careers by factors such as reputation systems in which they were penalized due to accessibility issues such as not completing tasks within the allotted time or not completing tasks that they discovered to be inaccessible partway through.

Our findings have several implications for the design of more accessible crowd work systems. For example, tasks could include metadata that indicates what abilities are required (sight, hearing, speed, etc.); such metadata could be manually added by task requesters or by other crowd workers, or could be generated automatically through machine learning or recommender algorithms. Platforms could allow workers to sub-contract components of tasks to other workers, e.g., addressing our example of the hearing-impaired worker who encountered an audio-based component within a task. Optional self-identification of abilities or disabilities in a worker’s profile could help with better task suggestions or allow automatic time extensions or microbreak opportunities. Creating an online community specifically for crowd workers with disabilities could also provide important social and organizational opportunities, as well as build up a repository of knowledge about platforms and task requesters that are and are not accessible.
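A minimal sketch of how ability metadata and optional worker profiles could drive task suggestions and time extensions is given below. Everything in it is hypothetical: no crowd platform we know of exposes these fields, and the `Task`, `Worker`, and `suggest_tasks` names, as well as the idea of a single time-extension multiplier, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    title: str
    required_abilities: frozenset  # hypothetical metadata, e.g. {"hearing"}
    time_limit_s: int


@dataclass(frozen=True)
class Worker:
    abilities: frozenset          # optional self-identified abilities
    time_extension: float = 1.0   # hypothetical multiplier on time limits


def suggest_tasks(worker, tasks):
    """Return (task, adjusted time limit) pairs the worker can complete.

    A task is suggested only if every required ability is in the
    worker's profile; the time limit is scaled by the worker's extension.
    """
    return [
        (task, int(task.time_limit_s * worker.time_extension))
        for task in tasks
        if task.required_abilities <= worker.abilities
    ]
```

With metadata like this, the hearing-impaired worker in the earlier example would never be offered the survey with the hidden audio question, instead of discovering it halfway through.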

Currently, platform operators and task requesters do not appear to be legally compelled to provide accommodations such as those suggested above. However, we believe that it would benefit platform owners to voluntarily make their platforms and workflows more accessible, and to enforce compliance by third-party requesters; this would broaden their pool of eligible workers and help increase task completion rates, as well as enhance the platform operator’s reputation.

Full access to participate in emerging forms of labor is important not only as an economic opportunity for people with disabilities, but as a social recognition of their full participation in all aspects of society; our research has highlighted important considerations for platform operators, job requesters, and policy makers to consider as a next step along the path to making this full access a reality.

You can read more in our CSCW 2015 paper: Zyskowski, K., Morris, M.R., Bigham, J.P., Gray, M.L., and Kane, S.K. Accessible Crowdwork? Understanding the Value in and Challenge of Microtask Employment for People with Disabilities. 


Meredith Ringel Morris is a Senior Researcher at Microsoft Research, where she designs, develops, and evaluates collaborative and social technologies.

CrowdCamp Report: Situated Crowdsourced Access

Navigating through streets and within buildings might seem like a trivial activity; however, it’s often a challenge for people with visual impairments. Over the last few years, innovations in sensors, devices, and smartphone apps have attempted to improve universal access and make navigation easier. However, the technology is not there yet.

Current approaches still raise safety concerns: unexpected danger from construction or vehicle placement can hurt a user who could have benefited from real-time notifications or help. For example, a visually impaired person uses a white cane as a primary mobility tool, which helps in keeping track of objects that obstruct the path being taken. But in some situations, such as heavy hanging objects or objects protruding from a wall (artistic displays, for instance), the hazard cannot be detected by the white cane and can cause severe head injury if not noticed. Understanding these situations is beyond the capabilities of current sensors, but easy for humans.

In this project, we designed an approach to help address some of these problems by adding humans in the loop as sensors and actors to assist with accessibility questions and problems. For example, in the figure below, if a user (green) has to go to a coffee shop (A or B), she can quickly query the route through a location-based service such as Twitter or another crowdsourcing approach, where the crowd is the people (orange) in the neighborhood, who can help by notifying her of a problem if one exists. This can inform both the user’s decision-making process and the navigation system’s path recommendation.

Sample route map

The approach can be implemented using the workflow architecture shown below:

Workflow architecture of the proposed set up

In this approach, an end user can make a request by setting ability preferences with respect to time, cost, location, and more. The request can then be broadcast to the people or volunteers in the neighborhood. The volunteers can then respond with the click of a button, providing the system with updated information about the situation. To create this system, we envisioned the possibility of using Twitter or a custom app.
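The request, broadcast, and response steps can be sketched roughly as follows. This is an illustration under stated assumptions rather than the prototype's code: the function names, the fixed search radius, and the majority-vote aggregation are all hypothetical choices, and volunteers are represented as a plain dictionary of names to (latitude, longitude) pairs.

```python
import math

EARTH_RADIUS_KM = 6371.0


def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearby_volunteers(request_location, volunteers, radius_km):
    """Names of volunteers close enough to receive the broadcast request."""
    return [name for name, location in volunteers.items()
            if distance_km(request_location, location) <= radius_km]


def summarize(responses):
    """Majority vote over yes/no button clicks; None on a tie or no responses."""
    yes = sum(1 for r in responses if r)
    no = len(responses) - yes
    if yes == no:
        return None
    return yes > no
```

Returning `None` on a tie lets the navigation system treat the situation as unknown rather than guessing.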

Twitter Approach: Twitter is a very popular social medium, and its many users can help others in their area without a request ever being made to a specific volunteer or user. To understand whether this is feasible, we first need to know whether the user base in a given area is large enough and whether there are sufficient tweets from that region. We considered Pittsburgh as our point of interest and calculated the average frequency of tweets. As shown below, on average there is at least one tweet every 12 seconds, which is a promising result.

Tweets on average
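For readers who want to repeat this kind of measurement, the average gap between tweets can be computed from a list of timestamps as sketched below. The function name `mean_gap_seconds` is hypothetical, and the sketch assumes the timestamps have already been collected for the region of interest; it does not show how the Pittsburgh data was gathered.

```python
from datetime import datetime, timedelta


def mean_gap_seconds(timestamps):
    """Average number of seconds between consecutive tweets.

    Equivalent to the observed time span divided by the number of gaps.
    """
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        raise ValueError("need at least two timestamps")
    span = (ordered[-1] - ordered[0]).total_seconds()
    return span / (len(ordered) - 1)


# Made-up timestamps for illustration only, not the Pittsburgh data.
start = datetime(2014, 11, 2, 12, 0, 0)
sample = [start + timedelta(seconds=s) for s in (0, 10, 24, 36)]
average_gap = mean_gap_seconds(sample)
```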

Custom Application: As part of the brainstorming and prototyping process, we also developed a homescreen app by extending the concept of Twitch Crowdsourcing [1]. This lets users provide an answer just by unlocking their phone. One particular use case of this app is shown below. If a visually impaired user (VU) has a question about the presence of a curb ramp nearby or the number of steps, the request will be visible to people in the location of the VU’s interest, who can make a binary decision by simply checking their mobile phones. This makes the entire experience seamless, with minimal cognitive load.

Screenshot of unlock screen application

We believe that by using situated crowdsourcing [2], we can overcome the limitations of current sensor technology and real-world deployment, and better empower people with visual disabilities to navigate through buildings or cities more independently.

[1] Rajan Vaish, Keith Wyngarden, Jingshu Chen, Brandon Cheung, and Michael S. Bernstein. 2014. Twitch crowdsourcing: crowd contributions in short bursts of time. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’14). ACM, New York, NY, USA, 3645-3654. http://dl.acm.org/citation.cfm?id=2556288.2556996

[2] Simo Hosio, Jorge Goncalves, Vili Lehdonvirta, Denzil Ferreira, and Vassilis Kostakos. 2014. Situated crowdsourcing using a market model. In Proceedings of the 27th annual ACM symposium on User interface software and technology (UIST ’14). ACM, New York, NY, USA, 55-64. http://dl.acm.org/citation.cfm?id=2647362

Rajan Vaish, University of California, Santa Cruz, USA
Walter Lasecki, University of Rochester, USA
Lydia Manikonda, Arizona State University, USA

CrowdCamp Report: Collaborative Learning in a Video Lecture

During CrowdCamp at HCOMP 2014, we developed a project called “The Dancing Professor” that earned its name from the motions professors make when trying to explain visual ideas with physical body movements instead of making an illustration. While participating in online courses, students are often confused by ideas presented in a video lecture. How can we aid learning and improve the illustration of concepts and ideas for online courses? With an augmented web-learning interface, students can give and receive help at certain time intervals throughout a lecture.

To test this approach, we built a prototype that wraps a YouTube video and interacts with the YouTube API to present events and interactions at certain timestamps of the video. Users have three options to interact with the video aside from the normal YouTube controls: “I’m Confused,” “I Know This!” and “Proceed, I’m Good.”


When a user clicks “I’m Confused,” a request is sent to see if there is any help available at the current timeframe. If help is available, it is presented as a learning opportunity with multiple choice answers, ranging from good answers to bad answers. A user can see explanations of both good and bad answers and select an answer they would like to learn about. When an answer is selected, feedback is given on whether it is good or bad and how it relates to the best answer. If no help is available, users are notified and asked if they can contribute information about the timeframe.

Users can contribute new learning experiences for a portion of video through an “I Know this!” link. They can provide their own questions or answers and view or vote on input from other students. Clicking either of the “I’m Confused” or “I Know This!” links will pause the video. Clicking “Proceed, I’m Good!” will play the video and clear the help interface below.
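The lookup behind the “I’m Confused” and “I Know This!” links can be sketched as a store of help items keyed by video time ranges. This is an illustrative data structure, not the prototype's code: the `HelpItem` and `LectureHelp` names and the half-open time ranges are assumptions, and the sketch leaves out the YouTube player integration and voting.

```python
from dataclasses import dataclass, field


@dataclass
class HelpItem:
    start_s: float   # timeframe of the video this help applies to
    end_s: float
    question: str
    # Each answer is (text, is_good, explanation).
    answers: list = field(default_factory=list)


class LectureHelp:
    """Hypothetical in-memory store of learner-contributed help for one video."""

    def __init__(self):
        self.items = []

    def contribute(self, item):
        """Handle "I Know This!": add a new learning experience."""
        self.items.append(item)

    def confused_at(self, t):
        """Handle "I'm Confused": return help covering playback time t.

        An empty list means no help exists yet, so the interface can ask
        the user to contribute instead.
        """
        return [item for item in self.items if item.start_s <= t < item.end_s]
```

In the prototype's setting, the playback time passed to `confused_at` would come from the video player at the moment the user clicks the link.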

Juho Kim’s prior research on identifying confusion, interest, and importance in videos was integrated into this interface. To ensure understanding during confusing timeframes, the interface anticipates confusion and automatically pauses the video immediately after a confusing concept is covered. The learning experience is shown and the user can choose to either interact with it or continue on.

Feedback for the project at CrowdCamp included improving the design of the question-answer interface, improving the incentives to provide explanations, and re-framing the help to be less of a question-answer interaction and more of a scenario-explanation interaction. Ideas for future work include allowing instructors to moderate and improve learning experiences in the interface, recording user interactions with the interface to track whether learning is being improved, and abstracting the interface for reuse on other video-learning platforms.

Josh Hibschman, Northwestern University
Juho Kim, MIT
Kanya (Pao) Siangliulue, Harvard University
Michael Richardson, Carnegie Mellon University

JAIR: a new venue for publication of research on human computation

The Journal of Artificial Intelligence Research (JAIR) is pleased to announce the launch of the JAIR Special Track on Human Computation and Artificial Intelligence. This special track has been established as a natural home for the publication of leading research at the intersection of human computation and artificial intelligence. Please see http://www.jair.org/specialtrack-hcomp.html for more information. There is no specific deadline for submissions; submitted articles will be evaluated and reviewed on an on-going basis.

Articles published in the Human Computation and Artificial Intelligence track must meet the highest quality standards as measured by originality and significance of the contribution and clarity of presentation. We seek full-length original articles, research notes, and survey articles. Research notes are very brief papers that extend or evaluate previous work. Survey articles are tutorials or literature reviews that contribute an analysis or perspective that advances understanding of the subject matter.

A Workshop Connecting Crowdsourcing and Online Education at HComp 2014

The online education and crowdsourcing communities are addressing similar problems in educating, motivating, and evaluating students and workers. The online learning community succeeds in increasing the supply side of the cognitively skilled labor market, and the crowdsourcing-at-scale community creates a larger marketplace for cognitively skilled work. WorkLearn 2014 will be held at HComp 2014 in Pittsburgh on November 2, 2014.

Workshop: http://www.worklearn.org/
Venue: http://www.humancomputation.com/2014/

Call for Proposals

WorkLearn 2014 is a full-day workshop at HCOMP 2014 which will bring together researchers and practitioners from the crowdsourcing and online education communities to explore connections between learning and working online. We want to spark knowledge sharing and discussions on topics such as: integrating online learning platforms and online work platforms; solving shared problems like training and evaluation of both students and high-skill crowd workers; and how crowdsourcing methodologies can be used to scale the labor-intensive components of education. We invite submission of short (1-2 page) position papers which identify and motivate key problems at the intersection of crowd work/human computation and online learning/education. You are invited to include a short bio in your submission to provide context for your fit with the workshop. Please send your submission to work.learn.workshop@gmail.com. Authors of submissions invited to participate in the workshop will be notified in September. We encourage submission of position papers focusing on:

  • Challenges and demands of industry
    • What skills do we need to train (crowd and online) workers for?
    • What can crowdsourcing do for learning at scale?
  • Proposals for platforms and software to connect online work and learning
    • How can a platform for online learning be linked to a platform for crowd work in a way that creates a more skilled workforce and better crowd work?
    • Visionary papers on the future of online work and learning

We are looking forward to seeing you in Pittsburgh:

  • Markus Krause, Leibniz University, Germany
  • Praveen Paritosh, Google, USA
  • Joseph Jay Williams, Stanford University, USA

Turkers’ guidelines for academic requesters on Amazon Mechanical Turk

If you’ve spent time talking with Turkers, you probably know that academic requesters have been a continuous source of strain. Research surveys with horrendous pay and arbitrary rejections are common. Despite Mechanical Turk’s attractive availability, a large number of researchers make innocent missteps and cause serious stress. Recently, the tension came to a head on Turkopticon. An IRB-approved researcher experimented on the platform unannounced. The result was Turker confusion, strife, and wasted time, in a system where time is what it takes to make ends meet.

Turkers have had to deal with research problems on a case-by-case basis through e-mail or by calling human subjects review boards (e.g. IRBs, HRPPs) for help. Now, a collective of Turkers and researchers has created guidelines making Turkers’ expectations and rights available in advance to mitigate these tensions from the start. They address how to be a good requester, how to pay fairly, and what Turkers can do if HITs are questionable. They apply to Turkers both as experimental subjects and as data processing workers who fuel academic research.

We’ll publicly maintain these guidelines so IRBs and researchers can easily find them, and Turkers can easily point to them in advocating for themselves.

Read the guidelines: http://guidelines.wearedynamo.org

They were developed over several weeks, and have been circulated and debated by workers. Turkers have been signing them to show their support.

As a requester, you are part of a very powerful group on AMT. Your signature in support of this document will help give Turkers a sense of cooperation and goodwill, and make Mechanical Turk a better place to work.

Today is Labor Day, a US holiday to honor the achievements of worker organizations. Honor Turkers by signing the guidelines as a researcher, and treating Turkers with the respect they deserve.

If you have any questions, you can email them to info@wearedynamo.org or submit a reply to this post.

The Dynamo Collective

The Human Flesh Search: Large-Scale Crowdsourcing for a Decade and Beyond

Human Flesh Search (HFS, 人肉搜索 in Chinese), a Web-enabled large-scale crowdsourcing phenomenon (mostly based on voluntary crowd power without cash rewards), originated in China a decade ago. It is a new form of search and problem solving that involves collaboration among a potentially large number of voluntary Web users. The term “human flesh,” an unfortunately poor translation of its Chinese name, refers to human empowerment (in fact, crowd-powered search is a more appropriate English name). HFS has seen tremendous growth since its inception in 2001 (Figure 1).

Figure 1. (a) Types of HFS episodes, and (b) evolution of HFS episodes based on social desirability

HFS has been a unique Web phenomenon for just over 10 years. It presents a valuable test-bed for scientists to validate existing and new theories in social computing, sociology, behavioral sciences, and so forth. Based on a comprehensive dataset of HFS episodes collected from participants’ discussions on the Internet, we performed a series of empirical studies, focusing on the scope of HFS activities, the patterns of the HFS crowd collaboration process, and the unique characteristics and dynamics of HFS participant networks. More results from the analysis of HFS participant networks can be found in two papers published in 2010 and 2012 (Additional readings 1 and 2).

In this paper, we surveyed HFS participants to provide an in-depth understanding of the HFS community and the factors that motivate these participants to contribute. The survey results shed light on HFS participants and, more broadly, on people involved in crowdsourcing systems. Most participants contribute to HFS voluntarily, without expecting monetary rewards (either real-world or virtual-world money).

The findings indicate great potential for researchers to explore how to design more effective and efficient crowdsourcing systems, and how to better use the power of the crowd for social good, for solving complex problems, and even for business purposes like marketing and management.

For more, see our full paper, The Chinese “Human Flesh” Web: the first decade and beyond (free download link; preprint is also available upon request).

Qingpeng Zhang, City University of Hong Kong

Additional readings:

  1. Wang F-Y, Zeng D, Hendler J A, Zhang Q, et al (2010). A study of the human flesh search engine: Crowd-powered expansion of online knowledge. Computer, 43: 45-53. doi:10.1109/MC.2010.216
  2. Zhang Q, Wang F-Y, Zeng D, Wang T (2012). Understanding crowd-powered search groups: A social network perspective. PLoS ONE 7(6): e39749. doi:10.1371/journal.pone.0039749