
Report: Second GroupSight Workshop on Human Computation for Image and Video Analysis

What would be possible if we could accelerate the analysis of images and videos, especially at scale? This question is generating widespread interest across research communities as diverse as computer vision, human-computer interaction, computer graphics, and multimedia.

The second Workshop on Human Computation for Image and Video Analysis (GroupSight) took place in Québec City, Canada on October 24, 2017, as part of HCOMP 2017. The goal of the workshop was to promote greater interaction among the diverse researchers and practitioners who examine how to combine human and computer efforts to convert visual data into discoveries and innovations that benefit society at large.

This was the second edition of the GroupSight workshop to be held at HCOMP. It was also the first time the workshop and conference were co-located with UIST. A website and blog post on the first edition of GroupSight are also available.

The workshop featured two keynote speakers in HCI whose research focuses on crowdsourced image analysis. Meredith Ringel Morris (Microsoft Research) presented work on combining human and machine intelligence to describe images to people with visual impairments (slides). Walter Lasecki (University of Michigan) discussed projects using real-time crowdsourcing to rapidly and scalably generate training data for computer vision systems.

Participants also presented papers organized around three emergent themes:

Leveraging the visual capabilities of crowd workers:

  • Abdullah Alshaibani and colleagues at Purdue University presented InFocus, a system enabling untrusted workers to redact potentially sensitive content from imagery. (Best Paper Award)
  • Kyung Je Jo and colleagues at KAIST presented Exprgram (paper, video). This paper introduced a crowd workflow that supports language learning while annotating and searching videos. (Best Paper Runner-Up Award)
  • GroundTruth (paper, video), a system by Rachel Kohler and colleagues at Virginia Tech, combined expert investigators and novice crowds to identify the precise geographic location where images and videos were created.

Kurt Luther hands the best paper award to Alex Quinn.

Creating synergies between crowdsourced human visual analysis and computer vision:

  • Steven Gutstein and colleagues from the U.S. Army Research Laboratory presented a system that integrated a brain-computer interface with computer vision techniques to support rapid triage of images.
  • Divya Ramesh and colleagues from CloudSight presented an approach for real-time captioning of images by combining crowdsourcing and computer vision.

Improving methods for aggregating results from crowdsourced image analysis (a simple aggregation baseline is sketched after this list):

  • Jean Song and colleagues at the University of Michigan presented research showing that tool diversity can improve aggregate crowd performance on image segmentation tasks.
  • Anuparna Banerjee and colleagues at UT Austin presented an analysis of ways that crowd workers disagree in visual question answering tasks.
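
To make this theme concrete, below is a minimal sketch of one standard aggregation baseline: majority voting over redundant crowd labels, with a per-item agreement score that can surface the kinds of worker disagreement studied in the papers above. This is a generic illustration with invented toy data, not the method of any paper presented at the workshop.

```python
# Minimal sketch of a common crowd-aggregation baseline: majority voting
# over redundant labels, plus a per-item agreement score. The responses
# below are hypothetical examples for demonstration only.
from collections import Counter

def majority_vote(labels):
    """Return the most common label and the fraction of workers who chose it."""
    counts = Counter(labels)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(labels)

# Hypothetical data: three workers answer the same visual question per image.
responses = {
    "img_001": ["cat", "cat", "dog"],
    "img_002": ["red", "red", "red"],
}

for item, labels in responses.items():
    label, agreement = majority_vote(labels)
    # Low agreement flags items worth routing to more workers or an expert.
    print(f"{item}: label={label}, agreement={agreement:.2f}")
```

Items with low agreement are natural candidates for added redundancy or expert review, the kind of cost/quality trade-off discussed throughout the workshop.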

The workshop also had break-out groups where participants used a bottom-up approach to identify topical clusters of common research interests and open problems. These clusters included real-time crowdsourcing, worker abilities, applications (to computer vision and in general), and crowdsourcing ethics.

A group of researchers talking and seated around a poster board covered in sticky notes.

For more, including keynote slides and papers, check out the workshop website: https://groupsight.github.io/

Danna Gurari, UT Austin
Kurt Luther, Virginia Tech
Genevieve Patterson, Brown University and Microsoft Research New England
Steve Branson, Caltech
James Hays, Georgia Tech
Pietro Perona, Caltech
Serge Belongie, Cornell Tech

Call for Participation: GroupSight 2017

The Second Workshop on Human Computation for Image and Video Analysis (GroupSight) will be held on October 24, 2017, at AAAI HCOMP 2017 in Québec City, Canada. It promises an exciting mix of people and papers at the intersection of HCI, crowdsourcing, and computer vision.

The aim of this workshop is to promote greater interaction among the diverse researchers and practitioners who examine how to combine human and computer efforts to convert visual data into discoveries and innovations that benefit society at large. It will foster in-depth discussion of technical and application issues around how to engage humans with computers to optimize cost/quality trade-offs. It will also serve as an introduction for researchers and students curious about this important, emerging field at the intersection of crowdsourced human computation and image/video analysis.

Topics of Interest

Crowdsourcing image and video annotations (e.g., labeling methods, quality control, etc.)
Humans in the loop for visual tasks (e.g., recognition, segmentation, tracking, counting, etc.)
Richer modalities of communication between humans and visual information (e.g., language, 3D pose, attributes, etc.)
Semi-automated computer vision algorithms
Active visual learning
Studies of crowdsourced image/video analysis in the wild

Submission Details

Submissions are requested in the following two categories: Original Work (not published elsewhere) and Demo (describing new systems, architectures, interaction techniques, etc.). Papers should be submitted as 4-page extended abstracts (including references) using the provided author kit. Demos should also include a URL to a video (max 6 min). Multiple submissions are not allowed, and reviewing will be double-blind.
Previously published work from a recent conference or journal can also be considered; the authors should submit an unrevised copy of their published work, and reviewing will be single-blind. Email submissions to groupsight@outlook.com.

Important Dates

August 23, 2017 (extended from August 14): Deadline for paper submission (5:59 pm EDT)
August 25, 2017: Notification of decision
October 24, 2017: Workshop (full-day)

Link

https://groupsight.github.io