Report: Second GroupSight Workshop on Human Computation for Image and Video Analysis

What would be possible if we could accelerate the analysis of images and videos, especially at scale? This question is generating widespread interest across research communities as diverse as computer vision, human-computer interaction, computer graphics, and multimedia.

The second Workshop on Human Computation for Image and Video Analysis (GroupSight) took place in Quebec City, Canada on October 24, 2017, as part of HCOMP 2017. The goal of the workshop was to foster greater interaction among this diverse group of researchers and practitioners, who study how to combine human and computer effort to convert visual data into discoveries and innovations that benefit society at large.

This was the second edition of the GroupSight workshop to be held at HCOMP. It was also the first time the workshop and conference were co-located with UIST. A website and blog post on the first edition of GroupSight are also available.

The workshop featured two keynote speakers in HCI doing research on crowdsourced image analysis. Meredith Ringel Morris (Microsoft Research) presented work on combining human and machine intelligence to describe images to people with visual impairments (slides). Walter Lasecki (University of Michigan) discussed projects using real-time crowdsourcing to rapidly and scalably generate training data for computer vision systems.

Participants also presented papers along three emergent themes:

Leveraging the visual capabilities of crowd workers:

  • Abdullah Alshaibani and colleagues at Purdue University presented InFocus, a system enabling untrusted workers to redact potentially sensitive content from imagery. (Best Paper Award)
  • Kyung Je Jo and colleagues at KAIST presented Exprgram (paper, video). This paper introduced a crowd workflow that supports language learning while annotating and searching videos. (Best Paper Runner-Up Award)
  • GroundTruth (paper, video), a system by Rachel Kohler and colleagues at Virginia Tech, combined expert investigators and novice crowds to identify the precise geographic location where images and videos were created.

Kurt Luther hands the best paper award to Alex Quinn.

Creating synergies between crowdsourced human visual analysis and computer vision:

  • Steven Gutstein and colleagues from the U.S. Army Research Laboratory presented a system that integrated a brain-computer interface with computer vision techniques to support rapid triage of images.
  • Divya Ramesh and colleagues from CloudSight presented an approach for real-time captioning of images by combining crowdsourcing and computer vision.

Improving methods for aggregating results from crowdsourced image analysis:

  • Jean Song and colleagues at the University of Michigan presented research showing that tool diversity can improve aggregate crowd performance on image segmentation tasks.
  • Anuparna Banerjee and colleagues at UT Austin presented an analysis of ways that crowd workers disagree in visual question answering tasks.

The workshop also had break-out groups where participants used a bottom-up approach to identify topical clusters of common research interests and open problems. These clusters included real-time crowdsourcing, worker abilities, applications (to computer vision and in general), and crowdsourcing ethics.

A group of researchers talking and seated around a poster board covered in sticky notes.

For more, including keynote slides and papers, check out the workshop website: https://groupsight.github.io/

Danna Gurari, UT Austin
Kurt Luther, Virginia Tech
Genevieve Patterson, Brown University and Microsoft Research New England
Steve Branson, Caltech
James Hays, Georgia Tech
Pietro Perona, Caltech
Serge Belongie, Cornell Tech


Crowdsourcing the Location of Photos and Videos

How can crowdsourcing help debunk fake news and prevent the spread of misinformation? In this paper, we explore how crowds can help expert investigators verify the claims around visual evidence they encounter during their work.

A key step in image verification is geolocation, the process of identifying the precise geographic location where a photo or video was created. Geotags or other metadata can be forged or missing, so expert investigators will often try to manually locate the image using visual clues, such as road signs, business names, logos, distinctive architecture or landmarks, vehicles, and terrain and vegetation.

However, sometimes there are not enough clues to make a definitive geolocation. In these cases, the expert will often draw an aerial diagram, such as the one shown below, and then try to find a match by analyzing miles of satellite imagery.

An aerial diagram of a ground-level photo, and the corresponding satellite imagery of that location.

Source: Bellingcat

This can be a very tedious and overwhelming task – essentially finding a needle in a haystack. We proposed that crowdsourcing might help, because crowds have good visual recognition skills and can scale up, and satellite image analysis can be highly parallelized. However, novice crowds would have trouble translating the ground-level photo or video into an aerial diagram, a process that experts told us requires lots of practice.

Our approach to solving this problem was right in front of us: what if crowds also use the expert’s aerial diagram? The expert was going to make the diagram anyway, so it’s no extra work for them, but it would allow novice crowds to bridge the gap between ground-level photo and satellite imagery.

To evaluate this approach, we conducted two experiments. The first experiment looked at how the level of detail in the aerial diagram affected the crowd’s geolocation performance. We found that in only ten minutes, crowds could consistently narrow down the search area by 40-60%, while missing the correct location only 2-8% of the time, on average.
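The paper does not include an implementation, but the core aggregation idea can be sketched simply: divide the satellite search region into tiles, collect independent crowd judgments per tile, and keep only the tiles that enough workers judged a possible match. The function and threshold below are illustrative assumptions, not the paper's actual method.

```python
from collections import defaultdict

def filter_tiles(votes, threshold=0.5):
    """Aggregate per-tile crowd votes and keep tiles that at least
    `threshold` of the workers judged a possible match.

    votes: list of (tile_id, is_match) pairs, one per worker judgment.
    Returns the set of tile_ids to keep in the search area.
    """
    yes = defaultdict(int)
    total = defaultdict(int)
    for tile_id, is_match in votes:
        total[tile_id] += 1
        if is_match:
            yes[tile_id] += 1
    return {t for t in total if yes[t] / total[t] >= threshold}

# Three workers compare four satellite tiles against the aerial diagram.
votes = [("A", True), ("A", True), ("A", False),
         ("B", False), ("B", False),
         ("C", True), ("C", False),
         ("D", False)]
filter_tiles(votes)  # keeps {"A", "C"}; tiles B and D are ruled out
```

Because each tile is judged independently, this kind of aggregation parallelizes naturally across many workers, which is what lets crowds narrow the search area quickly.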


In our second experiment, we looked at whether to show crowds the ground-level photo, the aerial diagram, or both. The results confirmed our intuition: the aerial diagram was best. When we gave crowds just the ground-level photo, they missed the correct location 22% of the time – not bad, but probably not good enough to be useful, either. On the other hand, when we gave crowds the aerial diagram, they missed the correct location only 2% of the time – a game-changer.

Bar chart showing the diagram condition performed significantly better than the ground photo condition.

For next steps, we are building a system called GroundTruth (video) that brings together experts and crowds to support image geolocation. We’re also interested in ways to synthesize our crowdsourcing results with recent advances in image geolocation from the computer vision research community.

For more, see our full paper, Supporting Image Geolocation with Diagramming and Crowdsourcing, which received the Notable Paper Award at HCOMP 2017.

Rachel Kohler, Virginia Tech
John Purviance, Virginia Tech
Kurt Luther, Virginia Tech

Redistributing Leadership in Online Creative Collaboration

Online creative collaboration is complex, and leaders frequently become overwhelmed, causing their projects to fail. We introduce Pipeline, a collaboration tool designed to ease the burden on leaders, and describe how Pipeline helped redistribute leadership in a successful 28-person artistic collaboration.

For the Holiday Flood, 28 artists from around the world used Pipeline to create 24 artworks and release them in the days leading up to Christmas.

Leadership is important in many types of online creative collaboration, from writing encyclopedias to developing software to proving mathematical theorems. In previous work, we studied leaders of online animation projects, called collabs, organized on websites like Newgrounds. These leaders take on a huge variety of responsibilities, and many become desperately overwhelmed. They also struggle with poor technological support, relying on discussion forums designed for conversation, not complex multimedia collaboration. To manage these challenges, leaders attempt less ambitious projects and embrace top-down leadership styles. Still, fewer than 20% of collabs result in a finished product, like a movie, video game, or artwork.

Our goal was to encourage complex, creative, and successful collabs by designing a technology to ease the burden on leaders. Two theories guided our approach:

  • Distributed cognition holds that cognitive processes can be distributed across people, objects, and time.
  • Distributed leadership suggests that leadership roles can be decoupled from leadership behaviors, which could be performed by any member of a group.

We integrated these theories and used them to design a system that helps leaders distribute their efforts across both people and technology.

The result is Pipeline, a free, open-source collaboration tool. Pipeline enables redistributed leadership through the notion of “trust.” Projects have two types of members:

  • Trusted members, who can create and lead tasks (among other privileges)
  • Regular members, whose privileges are limited to working on existing tasks

At one extreme, creators can replicate the old “benevolent dictator” model popular on Newgrounds by trusting only themselves. At the other extreme, creators can trust every member of their projects, creating an open, wiki-like environment. Most Pipeline users will opt for something in between, making real-time adjustments as needed.
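Pipeline's trust model amounts to simple role-based permissions: only trusted members may create and lead tasks, while regular members work on existing ones. The sketch below is hypothetical (the class and method names are illustrative, not Pipeline's actual code), but it captures how a single trust flag spans the spectrum from benevolent dictator to wiki-like openness.

```python
class Project:
    """Minimal sketch of a trust-based permission model."""

    def __init__(self, creator):
        self.members = {creator}
        self.trusted = {creator}   # trusted members can create/lead tasks
        self.tasks = []

    def add_member(self, user, trusted=False):
        self.members.add(user)
        if trusted:
            self.trusted.add(user)

    def create_task(self, user, title):
        if user not in self.trusted:
            raise PermissionError(f"{user} is not trusted to create tasks")
        self.tasks.append(title)

# "Benevolent dictator": only the creator is trusted.
p = Project("alice")
p.add_member("bob")                   # regular member: works on tasks only
p.create_task("alice", "storyboard")  # allowed
# p.create_task("bob", "music")       # would raise PermissionError
```

Trusting every member on join reproduces the open, wiki-like extreme; most projects would sit in between, promoting members to trusted status as needed.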

The Pipeline tasks system. In this example, Spagneti posts a new version of a work-in-progress, and RAMATSU provides feedback. The right column includes information about the task, links to other versions of this work, and a recent activity feed.

We launched Pipeline in 2011 and have seen users organize a variety of creative projects, including movies, video games, and even a global scavenger hunt. Our paper focuses on one case study, an artistic collaboration called Holiday Flood. Over six weeks, 28 artists from at least 12 countries used Pipeline to create a digital Advent calendar with 24 original Christmas-themed artworks, along with an interactive Flash gallery. Every aspect of the project was completed on schedule, and the Newgrounds community responded with high ratings and a staff award.

The main menu of the interactive Flash gallery for Operation Holiday Flood. Clicking any of the square thumbnails reveals one of 24 Christmas-themed artworks.

Our research suggests that Pipeline contributed to Holiday Flood’s success in several key ways. It emboldened the project creators to attempt something more complex and ambitious than anything they had tried previously. Pipeline also helped members perform leadership behaviors previously reserved for leaders, like planning, problem solving, and providing feedback.

For more, see our full paper, Redistributing Leadership in Online Creative Collaboration.
Kurt Luther, Carnegie Mellon University
Casey Fiesler, Georgia Institute of Technology
Amy Bruckman, Georgia Institute of Technology

Leading the Crowd

by Kurt Luther (Georgia Tech)

Who tells the crowd what to do? In the mid-2000s, when online collaboration was just beginning to attract mainstream attention, common explanations included phrases like “self-organization” and “the invisible hand.” These ideas, as Steven Weber has noted, served mainly as placeholders for more detailed, nuanced theories that had yet to be developed [6]. Fortunately, the last half-decade has filled many of these gaps with a wealth of empirical research looking at how online collaboration really works.

One of the most compelling findings from this literature is the central importance of leadership. Rather than self-organizing, or being guided by an invisible hand, the most successful crowds are led by competent, communicative, charismatic individuals [2,4,5]. For example, Linus Torvalds started Linux, and Jimmy Wales co-founded Wikipedia. The similar histories of these projects suggest a more general lesson about the close coupling between success and leadership. With both Wikipedia and Linux, the collaboration began when the project founder brought some compelling ideas to a community and asked for help. As the project gained popularity, its success attracted new members. Fans wanted to get involved. Thousands of people sought to contribute–but how could they coordinate their efforts?

(from “The Wisdom of the Chaperones” by Chris Wilson, Slate, Feb. 22, 2008)

Part of the answer, as with traditional organizations, includes new leadership roles. For a while, the project founder may lead alone, acting as a “benevolent dictator.” But eventually, most dictators crowdsource leadership, too. They step back, decentralizing their power into an increasingly stratified hierarchy of authority. As Wikipedia has grown to be the world’s largest encyclopedia, Wales has delegated most day-to-day responsibilities to hundreds of administrators, bureaucrats, stewards, and other sub-leaders [1]. As Linux exploded in popularity, Torvalds appointed lieutenants and maintainers to assist him [6]. When authority isn’t decentralized among the crowd, however, leaders can become overburdened. Amy Bruckman and I have studied hundreds of crowdsourced movie productions and found that because leaders lack technological support to be anything other than benevolent dictators, they struggle mightily, and most fail to complete their movies [2,3].

This last point is a potent reminder: all leadership is hard, but leading online collaborations brings special challenges. As technologists and researchers, we can help alleviate these challenges. At Georgia Tech, we are building Pipeline, a movie crowdsourcing platform meant to ease the burden on leaders, but also help us understand which leadership styles work best. Of course, Pipeline is just the tip of the iceberg–many experiments, studies, and software designs can help us understand this new type of creative collaboration. We’re all excited about the wisdom of crowds, but let us not forget the leaders of crowds.

Kurt Luther is a fifth-year Ph.D. candidate in social computing at the Georgia Institute of Technology. His dissertation research explores the role of leadership in online creative collaboration.

References

  1. Andrea Forte, Vanesa Larco, and Amy Bruckman, “Decentralization in Wikipedia Governance,” Journal of Management Information Systems 26, no. 1 (Summer 2009): 49-72.
  2. Kurt Luther, Kelly Caine, Kevin Ziegler, and Amy Bruckman, “Why It Works (When It Works): Success Factors in Online Creative Collaboration,” in Proceedings of GROUP 2010 (New York, NY, USA: ACM, 2010), 1–10.
  3. Kurt Luther and Amy Bruckman, “Leadership in Online Creative Collaboration,” in Proceedings of CSCW 2008 (San Diego, CA, USA: ACM, 2008), 343-352.
  4. Siobhán O’Mahony and Fabrizio Ferraro, “The Emergence of Governance in an Open Source Community,” Academy of Management Journal 50, no. 5 (October 2007): 1079-1106.
  5. Joseph M. Reagle, “Do As I Do: Authorial Leadership in Wikipedia,” in Proceedings of WikiSym 2007 (Montreal, Quebec, Canada: ACM, 2007), 143-156.
  6. Steven Weber, The Success of Open Source (Harvard University Press, 2004).

Workshop Paper
Fast, Accurate, and Brilliant: Realizing the Potential of Crowdsourcing and Human Computation