CIDR 2013: CrowdQ – Crowdsourced Query Understanding

Understanding complex questions is characteristic of human intelligence. Question-and-Answer platforms are good examples of how complex questions are best answered by humans. Unfortunately, Google and other search engines don’t understand your queries. In this work we combine crowdsourcing with algorithms for complex query understanding.

Our proposed system can answer complex queries such as “birthdate of the mayors of all the cities in Italy”. The answers to such complex queries are typically available on the Web (or even just in Wikipedia). However, current search engines are not able to provide answers directly because they do not understand the semantics behind user requests.

The proposed system generates query templates using a combination of:

  • query log mining
  • natural language processing (NLP)
    • part-of-speech tagging
    • entity extraction
  • crowdsourcing

These query templates can then be used to answer whole classes of different questions rather than focusing on just a specific question and answer.
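
To illustrate how one template can cover a whole class of questions, here is a minimal sketch in Python. It is not the system's actual template generator: the ENTITY_TYPES lexicon, the <COUNTRY> slot syntax, and the helper names to_template and matches are all assumptions made for this example, and a real system would rely on entity extraction rather than a hand-written dictionary.

```python
# Hypothetical entity-type lexicon (an assumption for this sketch);
# a real system would use an entity extractor instead.
ENTITY_TYPES = {
    "italy": "COUNTRY",
    "france": "COUNTRY",
    "switzerland": "COUNTRY",
}


def to_template(query: str) -> str:
    """Replace each known entity mention with a typed slot such as <COUNTRY>."""
    tokens = query.lower().split()
    return " ".join(
        f"<{ENTITY_TYPES[token]}>" if token in ENTITY_TYPES else token
        for token in tokens
    )


def matches(template: str, query: str) -> bool:
    """Check whether a query generalizes to the given template."""
    return to_template(query) == template


template = to_template("birthdate of the mayors of all the cities in Italy")
```

Under these assumptions, the same template also matches “birthdate of the mayors of all the cities in France”, so the structure learned for one question can be reused for the other.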

Our proposed approach first transforms the user request into a structured query, then answers the query with machine-readable data publicly available on the Web (i.e., Linked Open Data).

Human input is used to detect the structure of a user request expressed in natural language:

  • which entities are mentioned
  • which relations exist among the entities
  • what is the type of the desired answer
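
The three kinds of structure listed above could be recorded in a small data structure like the one sketched below. The class names, field names, and the particular annotation of the example query are illustrative assumptions, not the representation used in the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Entity:
    mention: str      # text span as it appears in the request
    entity_type: str  # e.g. "Country", "City", "Person"


@dataclass(frozen=True)
class Relation:
    subject: str    # variable or entity, e.g. "?mayor"
    predicate: str  # hypothetical relation name, e.g. "mayorOf"
    obj: str


@dataclass
class QueryAnnotation:
    text: str
    entities: List[Entity] = field(default_factory=list)
    relations: List[Relation] = field(default_factory=list)
    answer_type: str = ""


# One possible annotation of the example request (an assumption for illustration).
annotation = QueryAnnotation(
    text="birthdate of the mayors of all the cities in Italy",
    entities=[
        Entity("Italy", "Country"),
        Entity("cities", "City"),
        Entity("mayors", "Person"),
    ],
    relations=[
        Relation("?city", "locatedIn", "Italy"),
        Relation("?mayor", "mayorOf", "?city"),
        Relation("?mayor", "birthDate", "?date"),
    ],
    answer_type="Date",
)
```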

The crowd is also asked to verify the correctness of automatic annotations in uncertain cases.
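
One simple way to route only uncertain annotations to the crowd is sketched below. The confidence threshold of 0.8, the majority-vote aggregation, and the fallback to the automatic label on a tie are assumptions chosen for this example; the text above does not say how uncertainty is measured or how crowd answers are aggregated.

```python
from collections import Counter
from typing import List, Optional

CONFIDENCE_THRESHOLD = 0.8  # assumed cut-off, not taken from the paper


def needs_crowd(confidence: float, threshold: float = CONFIDENCE_THRESHOLD) -> bool:
    """An automatic annotation is 'uncertain' when its confidence is below the threshold."""
    return confidence < threshold


def majority_vote(votes: List[str]) -> Optional[str]:
    """Return the most common label, or None if there are no votes or a tie for first place."""
    if not votes:
        return None
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def resolve(auto_label: str, confidence: float, crowd_votes: List[str]) -> str:
    """Keep confident automatic labels; otherwise defer to the crowd's majority."""
    if not needs_crowd(confidence):
        return auto_label
    winner = majority_vote(crowd_votes)
    # Fall back to the automatic label when the crowd is undecided.
    return winner if winner is not None else auto_label
```

For example, an automatic label "City" with confidence 0.4 and crowd votes "Country", "Country", "City" would resolve to "Country", while a label with confidence 0.95 would be kept without consulting the crowd.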

The result of this process is an SQL-like query that can be answered automatically by standard database technologies.
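
As a rough sketch of what such a query might look like for the mayors example, the snippet below runs a SQL query against an in-memory SQLite database. The two-table schema and all rows are placeholders invented for this illustration (they are not real data), and the actual system targets Linked Open Data rather than a local relational database.

```python
import sqlite3

# SQL counterpart of "birthdate of the mayors of all the cities in <COUNTRY>".
# Table and column names are assumptions made for this sketch.
QUERY = """
SELECT person.name, person.birthdate
FROM city
JOIN person ON person.name = city.mayor
WHERE city.country = ?
ORDER BY person.name
"""


def answer(conn: sqlite3.Connection, country: str):
    """Run the structured query for one country and return (mayor, birthdate) rows."""
    return conn.execute(QUERY, (country,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE city (name TEXT, country TEXT, mayor TEXT);
    CREATE TABLE person (name TEXT, birthdate TEXT);
    """
)
# Placeholder rows only; these are not real cities, mayors, or dates.
conn.executemany(
    "INSERT INTO city VALUES (?, ?, ?)",
    [
        ("City A", "Italy", "Mayor A"),
        ("City B", "Italy", "Mayor B"),
        ("City C", "France", "Mayor C"),
    ],
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [
        ("Mayor A", "1970-01-01"),
        ("Mayor B", "1980-02-02"),
        ("Mayor C", "1990-03-03"),
    ],
)

rows = answer(conn, "Italy")
```

Because the country is a query parameter, the same statement answers the question for any country in the data, which mirrors the idea of reusing one template for a class of questions.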

For more, see our full paper, CrowdQ – Crowdsourced Query Understanding.

Gianluca Demartini, eXascale Infolab, University of Fribourg, Switzerland
Beth Trushkowsky, AMPLab, UC Berkeley, USA
Tim Kraska, Brown University, USA
Michael J. Franklin, AMPLab, UC Berkeley, USA
Daniel Bruckner, UC Berkeley
Daniel Haas, UC Berkeley
Jonathan Harper, UC Berkeley