More than Liking and Bookmarking? Towards Understanding Twitter Favouriting Behaviour

Twitter is a widely used micro-blogging platform that offers its users a variety of features to engage with contacts in their social network and the content they produce. One of these features is the favouriting function, a small, star-shaped icon displayed at the bottom of every tweet.

The usage of favouriting has strongly increased over the years, but in contrast to other Twitter features, such as retweeting or hashtags, favouriting has not been, to date, the focus of any rigorous scientific investigation.

Our work presents an initial study of favouriting behaviour. In particular, we focus on the motivations people have for favouriting a tweet. We approach this question via a large-scale survey, which queried 606 Twitter users on the frequency with which they exhibit particular behaviours, including how often they make use of the favourite button. Moreover, two free-form questions asked users about the reasons why they use this function and what they hope to achieve when doing so.

Interestingly, only 65% (395 participants) of our respondents reported knowing about the favouriting feature. On the one hand, 26.8% of these participants stated that they never favourite a tweet. On the other hand, 36.1% reported favouriting regularly, and 5% of participants even reported doing so multiple times per day.

The main result of our study is a coding scheme, or classification, of 25 heterogeneous reasons for using the favouriting feature. The table below shows the complete coding scheme along with frequency information, detailing how often each code appeared in the participants’ answers.

CodingScheme

Our findings show that motivations behind favouriting can be grouped into two major use cases:

  • (A) favouriting is used as a response or reaction to the tweet or its metadata, e.g., by liking it [A3]. Another prominent example is the ego favouriter [A4.2], who favourites a tweet when he or she is mentioned in it.
  • (B) favouriting is used for a specific purpose or to fulfill a function, e.g., by bookmarking the tweet in the favourites list [B1]. Another example would be agreeing with the author [B2.1], which can be interpreted as a digital fist bump or nod, a form of unwritten communication [B2].

All in all, we can see that the favouriting feature is heavily re-purposed, revealing unsupported user needs and interesting behaviour.

For a more detailed explanation of codes and example statements, see our full paper, More than Liking and Bookmarking? Towards Understanding Twitter Favouriting Behaviour.

Florian Meier, Chair for Information Science, University of Regensburg, Germany
David Elsweiler, Chair for Information Science, University of Regensburg, Germany
Max L. Wilson, Mixed Reality Lab, University of Nottingham, United Kingdom

Information Overload in Social Media and its Impact on Social Contagion

Since Alvin Toffler popularized the term “Information overload” in his bestselling 1970 book Future Shock, it has become ubiquitous in modern society. The advent of social media and online social networking has led to a dramatic increase in the amount of information a user is exposed to, greatly increasing the chances of the user experiencing an information overload. Surveys show that two thirds of Twitter users have felt that they receive too many posts, and over half of Twitter users have felt the need for a tool to filter out the irrelevant posts.

Our goal is to quantitatively characterize the phenomenon of information overload and its impact on information propagation in a social network. To this end, we perform a large-scale quantitative study of information overload experienced by users in Twitter. The key insight that enables our study is that users’ information processing behaviors can be reverse engineered through a careful analysis of the times when they receive a piece of information and when they choose to forward it to other users.

We found several insights that not only reveal the extent to which users in social media are overloaded with information, but also help us in understanding how information overload influences users’ decisions to forward and disseminate information to other users:

  • We find empirical evidence of a limit on the amount of information a Twitter user produces per day; very few Twitter users produce more than ∼40 tweets/day.
  • We find no limit on the information received by Twitter users; many Twitter users follow several hundreds to thousands of other users.
  • We find a threshold rate of incoming information (∼30 tweets/hour), below which the probability that a user forwards any received tweet holds nearly constant, but above which the probability that a user forwards any received tweet begins to drop substantially. We argue that the threshold rate roughly approximates the limit on information processing capacity of users and it allows us to identify overloaded users.
  • We observe that if a user is overloaded, the higher the rate at which she receives information, the longer the time she takes to process and forward the information. Further, overloaded users tend to prioritize tweets from a subset of sources.
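The threshold analysis described in the list above can be illustrated with a small sketch. It is not the authors' pipeline: the per-user tuples, the bin edges, and the function names are hypothetical, and only the roughly 30 tweets/hour threshold comes from the text. The idea is to group users by their incoming tweet rate and compare the share of received tweets that each group forwards.

```python
import bisect

# Threshold reported in the text (roughly 30 tweets/hour); everything else
# in this sketch is an illustrative assumption.
OVERLOAD_THRESHOLD_PER_HOUR = 30.0


def forwarding_probability_by_rate(users, edges):
    """Estimate the forwarding probability per in-flow rate bin.

    users: iterable of (inflow_rate_per_hour, tweets_received, tweets_forwarded)
    edges: sorted list of bin boundaries (tweets/hour)
    Returns one probability per bin, or None for an empty bin.
    """
    received = [0] * (len(edges) + 1)
    forwarded = [0] * (len(edges) + 1)
    for rate, n_received, n_forwarded in users:
        index = bisect.bisect_right(edges, rate)
        received[index] += n_received
        forwarded[index] += n_forwarded
    return [f / r if r else None for f, r in zip(forwarded, received)]


def is_overloaded(inflow_rate_per_hour, threshold=OVERLOAD_THRESHOLD_PER_HOUR):
    """Flag a user whose in-flow rate exceeds the threshold."""
    return inflow_rate_per_hour > threshold


# Hypothetical users: (in-flow rate, tweets received, tweets forwarded).
users = [(5, 100, 2), (20, 200, 4), (50, 400, 4), (200, 1000, 5)]
probabilities = forwarding_probability_by_rate(users, edges=[10, 30, 100])
```

In this made-up data the forwarding probability is flat in the two bins below 30 tweets/hour and falls in the bins above it, which is the shape the finding describes.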

For more details, see our full paper Quantifying Information Overload in Social Media and its Impact on Social Contagions.

Manuel Gomez-Rodriguez, MPI for Intelligent Systems and MPI for Software Systems
Krishna Gummadi, MPI for Software Systems
Bernhard Schölkopf, MPI for Intelligent Systems

The Tweets They are a-Changin’: Evolution of Twitter Users and Behavior

Over the years, we have seen significant amounts of research on Twitter, due to the ease of access to large amounts of data. However, most studies focus on data from a small period of time, generally ranging from a few weeks to a few months. Given that Twitter has evolved significantly since its founding in 2006, this situation makes it hard to interpret prior results or make projections of where Twitter is headed.

Our work aims to quantify the evolution of Twitter itself, focusing on the public Twitter ecosystem. There are two main contributions of our work: First, we collect a dataset of over 37 billion tweets spanning over seven years. Second, we quantify how the users, their behavior, and the site as a whole have evolved. Below, we highlight a few of our results; the paper contains many more results as well as details on the datasets that we use.

final_users_account

  • While Twitter has grown significantly, it has also seen a large number of users leave the platform. Today, we see that almost 33% of the user population is inactive, over 6% has been suspended, and 2% of users have deleted their accounts.

final_geo

  • We observe Twitter spreading over the globe; the fraction of tweets from the U.S. and Canada has dropped from over 80% to 32% today. Additionally, there has been a massive increase in the diversity of languages used on the platform. The figure above shows this evolution for both user-provided locations and tweet geo-tags.

final_tweet_content

  • We can quantify the rise of malicious activity on Twitter, including both follower spam (we see a massive increase in follower counts in 2011 and 2012) and trending-topic hashtag spam (we see a spike in tweets with many hashtags in 2009).

final_tweet_type

  • We can observe users quickly adopting platform enhancements by Twitter. Before Twitter introduced native retweets, only 5% of tweets were retweets; today, it is over 27%.

final_source

  • Twitter has shifted from a primarily-mobile system (based on SMS) to a primarily-desktop system (based on the web site) and back to a primarily-mobile system (based on smartphone apps). Today, over 50% of tweets come from mobile devices.

We hope that our findings will help researchers to better understand the Twitter platform and to more clearly interpret prior results. We make all of our analysis available to the research community (to the extent allowed by Twitter’s Terms of Service) at http://twitter-research.ccs.neu.edu/.

Yabing Liu, Northeastern University
Chloe Kliman-Silver, Brown University
Alan Mislove, Northeastern University

Early adopters of Twitter and Google+: Validation of a theoretical model of early adopter personality and social network site influence

The widespread adoption of social media is transforming the consumer-brand relationship. Social media allows consumers to connect with other users and to create, consume and control access to content (Hoffman and Novak, 2012). Research suggests that social media increases brand relationship depth and loyalty, and generates incremental purchase behaviour (Laroche et al., 2012; Kim and Ko, 2012; Pooja et al., 2012). It is not surprising, therefore, that commentators suggest that marketers should target social media users who are more likely to exert an influence on their network in order to facilitate brand recommendations (Iyengar, Han, & Gupta, 2009). But who are these influentials? Goldenberg et al. (2009) suggest that there are only two types of influentials that impact information diffusion: innovators and followers.

Our study looks at early users, or in Goldenberg et al.’s terminology, innovators, of two social networking sites, Twitter and Google+, and the effects of personality and mode of information sharing on social influence scoring. Specifically, we look at:

1. How do (i) extraversion, (ii) openness and (iii) conscientiousness influence:

  • Information sharing behaviour
  • Rumour sharing behaviour

2. How do (i) information sharing behaviour and (ii) rumour sharing behaviour impact social network site influence scores?

Early Twitter users were identified through a public list and through the joining date listed on users’ public profiles. As the study occurred during the Google+ closed field test, all Google+ users were deemed early users. Two discrete survey instruments were designed, one for Twitter and one for Google+, to provide for different SNS validation checks. To assess the personality traits of respondents, we measured extraversion, openness and conscientiousness with the scale of Gosling et al. (2003), while the information and rumour sharing scales were taken from Marett and Joshi (2009). The SNS influence score was the dependent variable in our model and was measured using two commercial SNS influence score providers, PeerIndex and Klout.

Our study hypothesized that Extraversion and Openness were two personality traits that should positively influence both Information and Rumor sharing behavior (H1 and H2), while Conscientiousness would have opposite effects on Information (+) and Rumor (-) sharing behavior (H3 and H4). We also hypothesized that both Information and Rumor sharing behavior should positively influence social network influence scoring. A structural equation model using AMOS was used to test these hypotheses.

Results of Structural Equation Model - Standardised Regression Weights and Summary Findings

The model suggests:

  • Early users of social network sites who are more extroverted, more open or more conscientious are more likely to share information.
  • Information sharing and rumor sharing should be treated as two distinct constructs in the discussion of social network influence.
  • All three traits were negatively related to rumor sharing. Only the effects of extroversion and conscientiousness were significant.
  • Both information sharing and rumor sharing impacted positively and significantly on social network site influence scores.

While previous literature has suggested that it is difficult to identify market mavens (Goldsmith et al., 2006), early users of social media can be identified easily and conveniently. This may provide firms with the opportunity to target potential innovators and early adopters much more efficiently and thus accelerate diffusion of marketing messages. Our study suggests that filtering these adopters by messaging behaviour may also be of assistance, with a greater emphasis of resources being placed on those social network users who share information rather than rumor. While identifying these potential influencers would seem to be more efficient than identifying mavens, further research is required to understand the most effective way and time to engage with them. Finally, it would seem that social network influence scores provide useful signals for identifying social media users likely to share information. Social media users characterised by a combination of high influence scores and a propensity for information sharing are powerful assets for firms, particularly if they have relatively large social networks. Engaging with these influencers represents a relatively low-cost mechanism for indirectly reaching target markets through word of mouth on social networks.

The research was conducted by Dr Theo Lynn (DCU Business School), Dr Laurent Muzellec (UCD), Dr Barbara Caemerrer (ESSCA), Prof. Darach Turley (DCU Business School) and Bettina Wuerdinger (DCU Business School).

A More Paradoxical Paradox

Have you ever checked your Facebook and Instagram and felt that your friends have more interesting lives? You’re not alone! In fact, that’s one of the consequences of the Friendship Paradox, which states that on average, your friends have more friends than you do. Recently, researchers demonstrated that network paradoxes hold not only for popularity, but for other traits as well, such as activity and virality of content received.

Beach
A variety of paradoxes exist in online social networks such as Twitter and Facebook: Your friends, on average, have more friends, are more active, and post more popular/interesting content compared to you. Image source: https://flic.kr/p/5QXd9M

We recently showed that the standard version of the paradox, using the mean of friends’ values of the trait, arises trivially from the properties of statistical sampling from a heavy-tailed distribution. Social traits, such as popularity or activity (e.g., number of posts made), often have a “heavy tail”, where extremely large values, e.g., very popular people, appear much more frequently than would be expected under a normal distribution. When sampling randomly from such a distribution, the mean of the sample (i.e., mean of friends’ values) will grow with sample size, resulting in a paradox. In contrast, the median of the sample does not behave this way and is a more robust measure of the paradox.

Surprisingly, paradoxes persist when the median is used: i.e., most of your friends (and followers) have more friends (followers) than you do, and also post and receive more viral and diverse content than you do. In other words, the paradox holds not only for the mean, where a single very popular (or active) friend could skew the average, but also for most friends.

Why do strong paradoxes exist in networks? Since they are not a consequence of sampling, they must have a behavioral origin. We hypothesize that they arise due to correlations between an individual’s traits and popularity or between traits of connected people (homophily). To test this hypothesis, we performed the shuffle test: we kept the network topology fixed, but permuted traits between nodes in the network. This keeps the distribution of the traits intact, but destroys correlations between people. As expected, we still observe a paradox for the mean in the shuffled network, but not the strong paradox that uses the median.
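The shuffle test can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used in the paper: the adjacency dictionary, the toy star network, and the function names are hypothetical. It measures the share of nodes whose friends' mean trait (the standard paradox) or median trait (the strong paradox) exceeds their own, then permutes traits across nodes while leaving the edges untouched.

```python
import random
from statistics import mean, median


def paradox_fractions(neighbors, trait):
    """Return (weak, strong): the share of nodes whose friends' mean trait,
    respectively median trait, is larger than the node's own trait."""
    weak = strong = counted = 0
    for node, friends in neighbors.items():
        if not friends:
            continue  # a node without friends cannot be compared
        values = [trait[f] for f in friends]
        counted += 1
        weak += mean(values) > trait[node]
        strong += median(values) > trait[node]
    if counted == 0:
        return 0.0, 0.0
    return weak / counted, strong / counted


def shuffle_traits(trait, rng):
    """Permute trait values across nodes; the network itself is unchanged."""
    nodes = list(trait)
    values = [trait[n] for n in nodes]
    rng.shuffle(values)
    return dict(zip(nodes, values))


# Toy star network (hypothetical): node 0 is connected to five leaves.
neighbors = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
trait = {node: len(friends) for node, friends in neighbors.items()}

weak, strong = paradox_fractions(neighbors, trait)
shuffled = shuffle_traits(trait, random.Random(0))
weak_shuffled, strong_shuffled = paradox_fractions(neighbors, shuffled)
```

On a real network one would repeat the shuffle many times and compare the observed fractions against the shuffled ones; the toy network here is far too small to show the effect and only demonstrates the mechanics.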

In short, the main findings of our work are:

  • We found “strong” paradoxes where most of your friends have more friends than you do, etc.
  • We showed that the paradoxes have a behavioral origin, and are not simply the result of statistical properties of sampling from the network.
  • The origin of the paradoxes is in the correlations between traits of nodes and their degree or homophily.

For details, please see our paper “Network Weirdness: Exploring the Origins of Network Paradoxes” http://arxiv.org/abs/1403.7242

Farshad Kooti, University of Southern California
Nathan O. Hodas, USC Information Sciences Institute
Kristina Lerman, USC Information Sciences Institute

The Bechdel Test of Social Media

The Bechdel test is a popular tool to analyze the role of women in movies, defining three conditions for a movie to pass the test:

  1. It contains two female characters
  2. Who talk to each other
  3. About something besides a man

The wide application of this test to discussions of gender roles in fiction has led to many controversies. First, whether a movie passes the test or not is just an anecdote, leading to heated discussions about subjective criteria, like which characters to take into account. Second, the test lacks control groups: it is not clear how it would apply to real people, and what the results would be if we applied a reversed test for male characters.

Star wars network
Network of characters and dialogues in Star Wars: A New Hope (1977)

We designed a computational extension of the Bechdel test, calculating precise male and female Bechdel scores for movies. Processing the script of a movie, we can find the percentage of all dialogues that happen between characters of the same gender, and that do not mention the opposite gender.
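A minimal sketch of such a score, following the definition in the paragraph above: the share of all dialogues that take place between two characters of the same gender and do not mention the opposite gender. The tuple format, the tiny word lists, and the function names are illustrative assumptions; the actual study relies on processed scripts and a more careful detection of gender references.

```python
import re

# Hypothetical, deliberately tiny vocabularies of gendered references.
MALE_WORDS = {"he", "him", "his", "man", "men", "boy", "boys", "father", "husband"}
FEMALE_WORDS = {"she", "her", "hers", "woman", "women", "girl", "girls", "mother", "wife"}


def mentions(text, vocabulary):
    """True if any whole word of the text is in the vocabulary."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return any(token in vocabulary for token in tokens)


def bechdel_scores(dialogues):
    """dialogues: list of (gender_a, gender_b, text), with genders "F" or "M".

    Returns the female and male scores as fractions of all dialogues.
    """
    total = len(dialogues)
    if total == 0:
        return {"female": 0.0, "male": 0.0}
    female = sum(
        1 for a, b, text in dialogues
        if a == b == "F" and not mentions(text, MALE_WORDS)
    )
    male = sum(
        1 for a, b, text in dialogues
        if a == b == "M" and not mentions(text, FEMALE_WORDS)
    )
    return {"female": female / total, "male": male / total}


# Hypothetical dialogues for illustration only.
dialogues = [
    ("F", "F", "Did you finish the report?"),
    ("F", "F", "He never called me back."),
    ("M", "M", "The ship is ready."),
    ("M", "F", "Hello there."),
]
scores = bechdel_scores(dialogues)
```

Matching on whole words rather than substrings matters here: "the" and "there" contain "he" and "her" but are not gender references.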

We downloaded and processed 493 scripts from imsdb.com, creating a dialogue network for each movie like the one shown in the image. We found that male Bechdel scores are significantly larger than their female counterparts, showing quantitative evidence that movies in English have a gender bias towards males.

To understand how these scores would extend to the population at large, we used two datasets of online dialogues: one including more than two million public dialogues between Twitter users, and another composed of more than 3000 dialogues on the walls of MySpace users. We divided these datasets into smaller subsets of a size similar to a movie, and computed male and female Bechdel scores on them. The following figure shows the average scores for both social media, compared with those of movies that pass the Bechdel test according to bechdeltest.com, and those that do not.

Bechdel Score Ellipses
Average values of Bechdel scores for movies and social media

While MySpace shows balanced male and female scores, Twitter shows a very clear male bias that makes it closer to movies that do not pass the test than to those that pass it. This bias is also present when we correct for the larger male population of Twitter, which amounts to 64% of the users in the dataset. To better understand the possible origin of this gender bias in Twitter, we analyzed certain demographic factors and the relation between movies and user behavior, finding that:

  • The trailers of movies that pass the Bechdel test receive fewer views and likes on YouTube, but are more likely to be shared by women on Twitter.
  • Male Twitter users who declared themselves to be fathers are less likely to talk about women in their discussions with other men.
  • Twitter users who declared themselves to be students do not show any statistically significant difference in their Bechdel scores.
  • Male Twitter users in northern US states are less likely to mention women than those in southern states, and female users in states with higher mean income and from rural areas are more likely to talk about men.

These results show that online interaction through public dialogues on Twitter is not free from the influence of gender asymmetries, similar to those that we find in the fiction of Hollywood movies. We will present this work in the ICWSM plenary session “You are what you speak” on June 2nd, 2014. This work is a result of the collaboration between David Garcia from ETH Zurich and Kiran Garimella and Ingmar Weber from the Qatar Computing Research Institute. Feel free to come to our session and read our article!

What do your food & drink habits tell about your culture?

Traditional ways to study cross-cultural differences depend on surveys, which are costly and do not scale up. We reveal another way to obtain similar data that could revolutionize the study of global culture.

We propose the use of publicly available data from location-based social networks (LBSNs) to map individual preferences. This is interesting because an LBSN check-in expresses the preference of a user for a certain type of place. LBSNs are also accessible (almost) everywhere by anyone, which solves the scalability problem and allows data from the entire world to be collected at a much lower cost than traditional surveys.

Users expressing preferences
Users expressing their preferences in LBSNs.

Our goal is to propose a new methodology for identifying cultural boundaries and similarities across populations using data collected from LBSNs. Since we know that food and drink habits are able to describe strong differences among people, we use Foursquare check-ins in such locations to represent user preferences for specific types of food and drink. We studied how these preferences change according to time of day and geographical locations. We have found that:

  • The eating and drinking choices in different countries, cities, or neighborhoods of a city reveal fascinating insights into differing habits of human beings. For instance, preferences among people in cities located in the same country tend to be very similar;
  • The times at which check-ins are performed in food and drink places also provide valuable insights into the cultural aspects of a particular region. For example, whereas Americans and English people tend to have their main meal at dinner time, Brazilians have it at lunch time.

Given those observations, we consider spatio-temporal dimensions of food and drink check-ins as users’ cultural preferences. We then apply a simple clustering technique to show the “cultural distance” between countries, cities or even regions within a city. We found that:

  • Our results often strongly agree with common knowledge;

  • Comparing our results with the World Values Surveys (a very large study based on many years of survey data), the similarities are striking.
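One way to picture a "cultural distance" is to turn each region's check-ins into a normalised preference vector over (place category, time of day) features and compare vectors pairwise. The sketch below assumes cosine distance and made-up check-in counts; the text only says that a simple clustering technique was applied, so the distance measure, the feature names, and the data are all assumptions.

```python
import math


def preference_vector(checkins, features):
    """Normalise raw check-in counts into shares over a fixed feature list."""
    total = sum(checkins.get(feature, 0) for feature in features)
    if total == 0:
        return [0.0] * len(features)
    return [checkins.get(feature, 0) / total for feature in features]


def cosine_distance(u, v):
    """1 - cosine similarity; 0 means identical preference profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical features and check-in counts for three cities.
features = [("coffee", "morning"), ("pizza", "dinner"), ("tea", "afternoon")]
city_a = {("coffee", "morning"): 10, ("pizza", "dinner"): 30}
city_b = {("coffee", "morning"): 20, ("pizza", "dinner"): 60}
city_c = {("tea", "afternoon"): 5}

vec_a = preference_vector(city_a, features)
vec_b = preference_vector(city_b, features)
vec_c = preference_vector(city_c, features)

distance_ab = cosine_distance(vec_a, vec_b)  # same habits, different volume
distance_ac = cosine_distance(vec_a, vec_c)  # no habits in common
```

A matrix of such pairwise distances could then be fed to any standard clustering method to group countries, cities or neighbourhoods.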

Clusters by cities
Clustering cities.

Yet, unlike traditional survey-based empirical studies, such as the aforementioned one, our methodology allows the identification of cultural dynamics much faster, capturing current cultural expressions at nearly real time, and at a much lower cost.

For more, see our full paper, You are What you Eat (and Drink): Identifying Cultural Boundaries by Analyzing Food & Drink Habits in Foursquare.

Thiago H Silva, Universidade Federal de Minas Gerais, Brazil
Pedro O S Vaz de Melo, Universidade Federal de Minas Gerais, Brazil
Jussara M Almeida, Universidade Federal de Minas Gerais, Brazil
Mirco Musolesi, University of Birmingham, UK
Antonio A F Loureiro, Universidade Federal de Minas Gerais, Brazil

That’s What Friends Are For: Inferring Location in Online Social Media Platforms Based on Social Relationships

People post millions of updates to social media sites like Facebook and Twitter every day. When it comes to understanding what groups of people are experiencing, knowing the area where these messages originate can make a huge difference: are “I feel sick” posts pointing to a breaking epidemic or just run-of-the-mill flu? Do the posts promoting a political candidate reveal widespread support or just a loud minority from their home town?

However, one of the big challenges in doing geographic analyses is estimating where people are. For example, only 0.7% of all Twitter messages come with some kind of GPS data. Our work starts from this data and then uses an old social principle: People are often friends with others who live nearby. If we know the locations of only a small number of people, we can look at a person’s social network and try to infer their location based on where their friends are.
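The principle can be sketched as follows. This is an assumption-laden illustration rather than the method from the paper: it simply picks, among a person's friends with known coordinates, the location that minimises the total great-circle distance to all the others (a medoid). The coordinates and function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def infer_location(friend_locations):
    """Return the friend location closest, in total, to all other friends.

    Returns None when no friend has a known location.
    """
    if not friend_locations:
        return None
    return min(
        friend_locations,
        key=lambda candidate: sum(
            haversine_km(candidate, other) for other in friend_locations
        ),
    )


# Hypothetical friends: two near Rome, one in Paris.
friends = [(41.90, 12.50), (41.80, 12.60), (48.85, 2.35)]
estimate = infer_location(friends)
```

Choosing a medoid rather than averaging coordinates keeps a single far-away friend from dragging the estimate to a place where nobody lives.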

Here, we took 20 million friendships on Twitter where we know the location of both people and plotted the relative geographic offset of each person’s closest friend. The giant spike highlights that most people have a very close friend!

We looked at a Twitter social network of 47.7 million people, where two people are connected if they’ve both talked to each other at least once. We found that we could estimate a location for most people in the network (95%) and that our estimates were often very close to where people actually were, with over half within 10km (6mi). Moreover, our method enables geo-tagging over 77% of all Twitter messages.

The error estimates for our method at inferring users’ locations. If you pick a number on the x-axis and look at a line, the y-axis shows the probability that the error of a location estimate is less than the distance on the x-axis.

In the paper, we examined many other hypotheses and found:

  • The method works regardless of a person’s country of origin
  • Locations can be accurately inferred across a variety of social network sizes, even if a person has only one friend
  • Locations can even be inferred using data from other social networks like Foursquare, provided you can find individuals who have identities in both
  • Only a small amount of location data is needed

For more, see our full paper, That’s What Friends Are For: Inferring Location in Online Social Media Platforms Based on Social Relationships.

David Jurgens, Sapienza University of Rome

Emoticon Style: Interpreting Differences in Emoticons Across Cultures

Facial expressions can sometimes tell more about the minds of others than words. According to Mehrabian, body language and nonverbal cues are in fact essential in the communication of feelings and attitudes, as an estimated 93% of the communication is nonverbal.

In text-based communication, however, these cues are not present and their absence can result in misunderstanding and confusion. Therefore, people started to express their facial expressions pictorially through groupings of symbols, letters, and punctuation, popularly referred to as emoticons. We focused on the use of these representative nonverbal cues in online content and asked: how do people use emoticons in online social media across cultural boundaries?

Utilizing a near-complete Twitter dataset from 2006 to 2009, which contains information about 54 million users and all of their public posts, we investigated the semantic, cultural, and social aspects of emoticon usage on Twitter.

We found that:

  • There are two kinds of emoticon styles, vertical and horizontal.
  • We identified a wide range of variations, involving more than 14K facial emoticons in tweets. For example, the basic smiley “:)” had several variations (e.g., adding a nose or eyebrows), which then slightly changed the meaning. Although emoticons are generally used in positive and light contexts, the most popular ones were used with both positive (e.g., haha, smile) and negative (e.g., kill, freak) affects.
    Word clouds of the top 50 co-occurring affect words for six popular emoticons
  • Emoticons are expressed differently across cultural boundaries defined by geography and language. While Easterners employ a vertical style like “^_^”, Westerners employ a horizontal style like “:-)”. An important factor determining emoticon style is language rather than geography.
    Different emoticon rates of each country for horizontal and vertical style
  • Emoticons diffuse through the Twitter friendship network. Twitter users may influence their friends to adopt particular styles of emoticons, especially for less popular emoticons like “:P”, “^^”, and “T_T”. The diffusion occurs almost entirely between people from similar cultural backgrounds.
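As a rough illustration of how style rates per language group could be tallied, the sketch below counts horizontal and vertical emoticons in a handful of made-up tweets. The two short emoticon lists contain only the examples quoted above, and the (language, text) input format and function names are assumptions; the study itself covered more than 14K emoticon variations.

```python
from collections import Counter, defaultdict

# Only the emoticons quoted in the text; a real inventory would be far larger.
HORIZONTAL = (":-)", ":)", ":P")
VERTICAL = ("^_^", "^^", "T_T")


def style_counts(text):
    """Count horizontal and vertical emoticons in one piece of text."""
    counts = Counter()
    counts["horizontal"] = sum(text.count(e) for e in HORIZONTAL)
    counts["vertical"] = sum(text.count(e) for e in VERTICAL)
    return counts


def style_rates(tweets):
    """tweets: iterable of (group, text). Returns, per group, the share of
    emoticons in each style; groups without emoticons are omitted."""
    totals = defaultdict(Counter)
    for group, text in tweets:
        totals[group].update(style_counts(text))
    rates = {}
    for group, counts in totals.items():
        n = counts["horizontal"] + counts["vertical"]
        if n:
            rates[group] = {
                "horizontal": counts["horizontal"] / n,
                "vertical": counts["vertical"] / n,
            }
    return rates


# Hypothetical tweets tagged with a language code.
tweets = [
    ("en", "great day :) :-)"),
    ("ja", "ありがとう ^_^"),
    ("ja", "T_T so sad :P"),
]
rates = style_rates(tweets)
```

Grouping by language rather than by country mirrors the finding that language is the more important factor in determining emoticon style.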

What is your favorite emoticon? What kinds of words do you use with emoticons?

As socio-cultural norms, emoticons not only express specific emotions, but may also show your identity and cultural background.

For more detail, see our full paper, Emoticon Style: Interpreting Differences in Emoticons Across Cultures.

Jaram Park, Graduate School of Culture Technology, KAIST
Vladimir Barash, Morningside Analytics
Clay Fink, Johns Hopkins University Applied Physics Laboratory
Meeyoung Cha, Graduate School of Culture Technology, KAIST

Warning: People you know may be hazardous to your cognitive health

The friendship paradox states that your friends have more friends than you do, on average. This statistical curiosity leads to systematic biases in perception and self-assessment. In our ICWSM 2013 paper, “Friendship Paradox Redux: Your Friends are More Interesting Than You,” we reveal that, not surprisingly, this paradox also exists in the follower graph of Twitter, in a variety of incarnations. Not only are the people you follow more popular (have more followers) than you, but they are also better connected (follow more users) than you. At the same time, your followers are also more popular and better connected than you are, on average.

In addition to these, we discovered two new behavioral paradoxes on Twitter. First, people you follow receive more viral content than you, on average (virality paradox). Also, they are more active than you, meaning they tweet more often, on average, than you do (activity paradox).

These paradoxes have surprising implications for active users who rely on Twitter to keep up with friends and spread information to their followers.

Your friends see more valuable content than you do: Due to their better connectivity, people you follow tend to receive more valuable information, or at least information that ends up spreading farther, than you do.

Friendship Paradox Redux
Distribution of average popularity of information seen by overloaded and non-overloaded Twitter users. Popularity of information is defined as the number of people who have tweeted about it. Overloaded users tend to see only information that becomes popular.

Information overload: The more people you follow, the more information you will receive. Due to the activity paradox, however, the volume of new information increases ever faster as you follow more people. Because your ability to digest new information is limited, you risk becoming overloaded with content.

The last to know: Overloaded people tend to see only popular information that has been tweeted by many people they follow. They also risk missing updates from friends.

In order to absorb the content in their Twitter feeds, users will have to be more selective about whom they follow, and systematically refuse new “who to follow” suggestions Twitter makes. Conversely, to make themselves heard above the noise, users will either have to drown out the competition, exacerbating the problem, or find direct paths to users with the fewest friends—those who are most likely to see information in their feed and absorb it.

For more, see our full paper, Friendship Paradox Redux: Your Friends are More Interesting Than You.

Nathan O. Hodas, Information Sciences Institute, USC
Farshad Kooti, Information Sciences Institute, USC
Kristina Lerman, Information Sciences Institute, USC