Amazon Mechanical Turk: Gold Mine or Coal Mine?

Karën Fort, INIST-CNRS / LIPN, France
(Gilles Adda, LIMSI-CNRS, France
Kevin Bretonnel Cohen, U. Colorado School of Medicine / U. Colorado at Boulder, USA)

It’s supposed to be cheap, fast, good AND a hobby for “Turkers”. A dream come true, especially for people developing high-cost language resources in our field of research, Natural Language Processing.
But what if the gold mine was more of a coal mine?

At a time when it’s getting more and more difficult to get funding to create language resources (lexicons, translations, annotations, etc.), some pressured researchers might want to believe the dream-come-true story and use MTurk without knowing exactly what’s under the table. Our goal is therefore to give some insight into the reality of the system, in particular the people working on it and the issues behind it.

MTurk is an online crowdsourcing, microworking system that enables elementary tasks (Human Intelligence Tasks, or HITs) to be performed by a huge number of people (typically called “Turkers”).

Who are the Turkers and why do they use MTurk?

The Turker inside the Mechanical Turk

Although 500k people are registered as Turkers in the MTurk system, their real number is between 15,059 and 42,912. 80% of the HITs are performed by the 20% most active Turkers, who spend more than 15 hours per week in the MTurk system.
The observed mean hourly wage for performing jobs in the MTurk system is below US$2. However, money is an important motivation for a majority of the Turkers (20% use MTurk as their primary source of income, and 50% as their secondary source of income), and leisure is important for only a minority (30%).

So much for the hobby.

What about Ethics?

Besides the very low wages, the Turkers have no guarantee of payment for work properly performed, no governmental or private benefits whatsoever, and no recourse to any channel for redress of employer wrongdoing (apart from trying to avoid bad employers by using a tool like Turkopticon or checking the TurkerNation Hall of Shame).
Also, keep in mind that most tasks performed on MTurk used to be done by conventional employees, including at agencies like LDC or ELRA, with traditional contracts, real wages and protection against employer wrongdoing by federal labour laws.

Some researchers may not care, but we think (hope) the majority does.

The future of language resources development?

MTurk costs may soon become the standard costs, and it may become very difficult to obtain funding for a project proposing fair wages.
Therefore, our community’s use of MTurk not only supports a workplace model that is unfair and open to abuses of a variety of sorts, but also creates a de facto standard for the development of linguistic resources that may have long-term funding consequences.

What can we do?

In France, the learned society ATALA and the professional association APIL are working on a Charter for ethically produced resources, so that researchers and funding agencies will be able to easily identify such resources and promote them. We think that’s a good start.

“People do have choices, but some have more choices than others.” (Turkopticon blog)

We have the choice to use more ethical systems, so let’s do it:
let’s play GWAP (Games With A Purpose), let’s use ethical platforms, let’s volunteer, let’s Max Havelaar the creation of language resources!

For more, in particular on the quality that you can expect from this system, see our full paper, Amazon Mechanical Turk (MTurk): Gold Mine or Coal Mine?

5 Comments

  • Hi! Interesting thoughts. I think there are many researchers (myself included) who do try hard to pay a fair wage, e.g., by testing the HITs ourselves and aiming for an hourly wage between $6 and $9. I understand that funding may be an issue, but would this resolve most of your concerns about doing research on Turk?

  • Hi Haoqi,

    I’m not sure what you mean by “funding”. From the context, I understand it as “wage” or “money”, whereas in my post it meant “national agencies’ money” (maybe a bad translation on my part). Now, if I read your question replacing “funding” with “wage”, my answer would be: not really.
    First, paying a fair wage on MTurk apparently attracts even more spammers and robots. I don’t know if you experienced that, and I’d be interested in your feedback on that point.

    From the Turkers’ point of view, now: they have no more rights if you offer more money, and if you decide not to pay them, they still have no official recourse against you [Of course, I’m sure you’re a fair employer, but I took the question as a general one].

    Finally, I’d say: why use MTurk, then? The downsides of the system are such (for example, you cannot really know who is working for you or whether they are really native speakers; yes, there are ways to try and control that, but all of them can be bypassed) that I suggest you use other solutions (we list a few of them in an article to be published at LTC).
    Anyway, I’m glad and relieved to see that researchers make efforts to try and pay people more decent wages.

  • Giving fair wages is good practice, but certainly not the definitive answer. In a very recent post (http://www.behind-the-enemy-lines.com/2011/11/does-lack-of-reputation-help.html), Panos Ipeirotis pointed out that the very low observed mean wages ($1/hr) are a result of the defective reputation system and, more importantly, that this lack of a reputation system is intentional. It suits a few big requesters and the crowdsourcing business as a whole, who have built alternative reputation systems that let them pay higher wages to good workers and obtain good quality, because it keeps wages very low: the “standard” wage for working in crowdsourcing is $1/hr, so $4/hr appears “generous”.

    Furthermore, by using MTurk for research we give Amazon the opportunity to advertise that MTurk is not a defective workplace but a useful tool for research (see the NAACL 2010 workshop sponsored by Amazon: http://sites.google.com/site/amtworkshop2010/).

    Thus I think that we, as researchers, should not contribute to a defective system (MTurk, not crowdsourcing) that has lowering wages in its genes.

  • Hi Karen and Gilles,

    Thanks for your responses! A few thoughts:

    – Yes, there are spammers on Turk. There is also quite a literature being built up now on how to do quality control effectively and efficiently, e.g., see papers in HCOMP 2010, 2011. Less state-of-the-art but still effective solutions include restricting to workers with a high approval rating (95%, or even 98%), using gold standards, and inserting unrelated questions that ensure that people are paying attention.

    – Turker rights: I totally agree, and there I think individual best practices are a good starting point. Two good things to do are to hang out on TurkerNation, and to go through IRB as your paper suggests (even if the project is exempt, I have found that I learn a lot by talking to folks at the IRB office).

    – Upsides of MTurk: prototyping. For researchers designing new crowd-powered systems, MTurk is still the only general-purpose platform on which to recruit a crowd (through programmatic access in particular, but also more generally). I think we all agree that if there is a better platform then we should consider it. Cost is an issue for researchers.

    – Contributing to a defective system: tough one. Would be in any other context…

    – By funding I did actually mean funding, as in national agency funding. The point being that higher wages mean you need more money to run the experiment. But I think we understood each other just fine 🙂

    Thanks!
    Haoqi

  • I read with great interest both your blog post and the full journal article referenced above. I think you bring up some very good ethical points about using Amazon Mechanical Turk.

    I have seen articles from the academic community that were very disappointing. It was clear that they were either using old data or ignoring the new data. These articles speak very disparagingly about ‘stay-at-home moms’ and the ‘cheap labor from India.’ Most of the Turkers I have met are not just looking to kill time or make extra money but look to AMT as a source of income for meeting basic needs in this depressed economy.

    There is a very popular forum among Turkers called TurkerNation. I would suggest that anyone doing research about AMT join this forum and start reading. This is especially true for those who are considering using AMT for their research tasks.

    Thanks for this great article!

    Robert