What do users really want in an event summarization system?

The wide usage of social media means that users now have to keep up with a large amount of incoming content, motivating the development of several stream monitoring tools, such as Palanteer, Topsy, Tweet Archivist, etc. Such tools could be used to aid sensemaking about real-life events by detecting and summarizing social media content about those events. Given the large amount of content being shared and the limited attention of users, what information should we provide to users about events as they are detected in social media?

In our study, we analyzed tweets related to four diverse events:

  1. Facebook IPO
  2. Obamacare
  3. Japan Earthquake
  4. BP Oil Spill

The figure below shows the temporal patterns of usage for words related to the Facebook IPO launch price. By exploiting the content similarity between tweets written around the same time, we can discover various aspects (topics) of an event.

[Figure: Facebook IPO launch price]
These plots show the frequency of usage over time for various words related to the Facebook IPO. We can see similarities and differences in the temporal profiles of each word's usage.
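To make this idea concrete, here is a minimal sketch of how tweets posted around the same time could be grouped into aspects by content similarity. This is an illustrative simplification, not the model from our paper; the function name, window size, and the use of TF-IDF with k-means are all assumptions.

```python
# Illustrative sketch: group tweets into time windows, then cluster each
# window's tweets by content similarity. Not the model from the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def discover_aspects(tweets, window_seconds=3600, n_aspects=5):
    """tweets: list of (timestamp, text) pairs.
    Returns {window_index: cluster label per tweet in that window}."""
    windows = {}
    for ts, text in tweets:
        windows.setdefault(int(ts // window_seconds), []).append(text)
    labels = {}
    for window, texts in windows.items():
        X = TfidfVectorizer(stop_words="english").fit_transform(texts)
        k = min(n_aspects, X.shape[0])  # no more clusters than tweets
        labels[window] = KMeans(n_clusters=k, n_init=10).fit_predict(X)
    return labels
```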

The figure below shows how the volume of content related to various aspects (topics) of an event changes over time, as the event unfolds. Notice that some aspects have a longer lifespan of attention from tweeters, while others peak and die off quickly.

[Figure: Topics through time]
These two figures show how the topics within an event change over time. The figure on the left shows raw volumes, while the figure on the right shows the underlying patterns used in our model. Notice how topics spike at different times and with different degrees of concentration over time.
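As a rough illustration of how the raw-volume curves on the left could be produced, the sketch below bins tweets into hourly counts per aspect. It assumes each tweet has already been assigned an aspect label (for example, by a topic model); the bin size and function name are illustrative assumptions.

```python
# Illustrative sketch: per-aspect volume time series via hourly binning.
from collections import Counter, defaultdict

def aspect_volume_series(labeled_tweets, bin_seconds=3600):
    """labeled_tweets: list of (timestamp, aspect) pairs.
    Returns {aspect: Counter mapping time-bin index -> tweet count}."""
    series = defaultdict(Counter)
    for ts, aspect in labeled_tweets:
        series[aspect][int(ts // bin_seconds)] += 1
    return series  # series[aspect][bin] = number of tweets in that hour
```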

We used our model to generate summaries and hired workers on Amazon Mechanical Turk to provide feedback. Please refer to this link for the summaries we showed to our workers. Which summary do you like best? This is what some of our respondents had to say:

  1. Number 3 has the most facts.
  2. Summary 2 is more straight forward information & not personal appeal pieces like live chats and other stuff with people who are unqualified to speak about the issue.
  3. None. All too partisan
  4. Summary 3 has most news with less personal commentary than the others.
  5. I believe that summary 1 and 2 had a large amount of personal opinion and not fact.
  6. I think summary 3 best summarize Facebook IPO because it shows a broad range of information related to the event.
  7. Summary 3 is more comprehensive and offers better overall summary.

Overall, we received feedback from users that they want summaries that are comprehensive, covering a broad range of information. Furthermore, they want summaries to be objective, factual, and non-partisan. While we believe we have done well in giving users comprehensive, broad-ranging information, we think that future work in summarization will reduce the gap between what researchers are doing and what users really want.

For more, see our full paper, Automatic Summarization of Events from Social Media.
Freddy Chua, Living Analytics Research Centre, Singapore Management University
Sitaram Asur, Social Computing Research Group, Hewlett Packard Research Labs

About the author

Freddy Chua

Freddy Chua is a PhD student at Singapore Management University, with Professor Ee-Peng Lim as his dissertation advisor. His main research interests are in the modeling of social networks. In 2011, he visited Research Professor William W. Cohen at Carnegie Mellon University to work on information extraction in Twitter. In the summer of 2012, he worked as an intern in the Social Computing Group at Hewlett Packard Research Labs with Sitaram (Ram) Asur and Bernardo A. Huberman.


2 Comments

  • Nice work =) A few thoughts that came to my mind:

    1) How do you envision people using this? Typing in a search query and getting back the list of the most relevant tweets? One of the things that worries me about this sort of system is that whatever tweets a summarization system returns will become the de facto most “relevant” tweets simply because they are returned by a summarization system (i.e., they may become the most seen and thus the most retweeted, etc.). He who controls the summarization model then controls information flow. Do you have any thoughts on this?

    2) Strawman: How do you think your models would compare to just looking at the tweets that get retweeted the most as the “summary” tweets?

    3) This one is actually just a thought. Certainly people say they want ‘objective’, ‘factual’ and ‘non-partisan’ information, but I wonder about that. It’s sort of like how people claim to care a lot about their privacy and the security of their data when asked directly, but their actions seem to suggest otherwise. It sounds good to say you want ‘objective’ and ‘factual’ information, even though what you’re really looking for is information that confirms your beliefs.

  • Thanks for your comments; my answers to your questions are as follows:

    1) Your concerns are valid. This is the reason why we observe power-law distributions in many social data sets. The same problem exists for many other kinds of systems, e.g. online shopping: the item that gets purchased the most has the most reviews, is ranked high on popularity, is seen by the most people, and gets purchased even more. One way to overcome this problem is to decay the importance of tweets according to the age of the information. We do propose a model that considers the time factor and decays information accordingly (a rough sketch of this decay idea appears after this reply).

    2) I am pretty sure our model will do better. When I went over the tweets, there were many tweets that provided the same information but were written in slightly different ways, and they were all retweeted multiple times. If I summarized the event using the most-retweeted tweets, the summary would likely have poor coverage of information.

    3) That’s true as well. There could be work done on “personalized” summarization, where systems provide different summaries for different users, tailored to their own community (political) affiliation. But it is hard to evaluate the “accuracy” of such a system. It would be good work for startup companies building a real system, but not for academic research or publication :p.
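
    As a rough sketch of the decay idea in point 1, one could weight each tweet by an exponentially decreasing function of its age. The half-life value and function name below are illustrative assumptions, not details from the paper.

    ```python
    # Illustrative sketch: a tweet's weight halves every half-life of age.
    import math

    def decayed_weight(age_seconds, half_life_seconds=6 * 3600):
        """Exponential decay of a tweet's importance with its age."""
        return math.exp(-math.log(2) * age_seconds / half_life_seconds)

    # Example: a 12-hour-old tweet gets a quarter of a fresh tweet's weight.
    print(decayed_weight(12 * 3600))  # ~0.25
    ```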