User-generated comments in online social media have recently been gaining attention as a viable source of general-purpose descriptive annotations for social media objects such as photos or videos shared online. However, the quality of user-generated comments varies from very useful to entirely useless; comments can even be abusive or off-topic.
The most common methods for estimating the usefulness of user-generated comments simply allow all users to vote on (and possibly moderate) the contributions of others, thus avoiding an explicit definition of “useful”.
We investigate usefulness from the user’s perspective, defining a comment as USEFUL if it provides descriptive information about the media object beyond the usually very short title accompanying it.
With this definition in hand, we asked:
- What are the PROPERTIES of useful comments?
- How can we PREDICT useful comments?
- How can we estimate the PREVALENCE of useful comments?
Using text-based, semantic, topical, and author features, we characterized crowd-labeled comments on two classes of media objects (Flickr photos and YouTube videos) and trained prediction models. Furthermore, we adapted an existing Bayesian prevalence model that uses the learned prediction models to estimate the prevalence of useful comments across different platforms and topics.
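As a rough illustration of what text-based features of this kind might look like, the sketch below counts a few surface properties of a single comment. It is only a minimal example under stated assumptions: the word lists, the use of URLs as a stand-in for "references", and the function name `extract_features` are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical word lists for illustration only; not the lexicons used in the paper.
SELF_REFERENCES = {"i", "me", "my", "mine", "myself"}
AFFECTIVE_WORDS = {"love", "hate", "awesome", "terrible", "beautiful", "ugly"}

URL_PATTERN = re.compile(r"https?://\S+")
TOKEN_PATTERN = re.compile(r"[A-Za-z']+")


def extract_features(comment: str) -> dict:
    """Return simple text-based counts for one comment."""
    # Assumption: hyperlinks are treated as a proxy for "references".
    urls = URL_PATTERN.findall(comment)
    # Remove URLs before tokenizing so their parts are not counted as words.
    text = URL_PATTERN.sub(" ", comment)
    tokens = [t.lower() for t in TOKEN_PATTERN.findall(text)]
    return {
        "num_tokens": len(tokens),
        "num_urls": len(urls),
        "num_self_references": sum(t in SELF_REFERENCES for t in tokens),
        "num_affective_words": sum(t in AFFECTIVE_WORDS for t in tokens),
    }


if __name__ == "__main__":
    print(extract_features("I love my cat! See https://example.com/cat"))
```

Feature dictionaries like this could then be fed to any standard classifier; semantic features such as named entities would need an additional NLP tool and are left out here.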
We found that:
- Properties of USEFUL comments vary slightly according to the platform’s commenting culture and the topic of the media object. Comments that contain more references, more named entities, fewer self-references, and less affective language are more likely to be inferred as USEFUL. Moreover, users express more emotion and may use more offensive language when commenting on person- and event-related topics.
- Prediction performance is better when the classifier is trained on comments about a single topic (type-specific) and worse when the topic is ignored (type-neutral). Thus, for more accurate prediction, the model should be trained with the topic of the media objects taken into account.
- Prevalence of USEFUL comments is influenced by:
  - The time period of the topic of the media object being commented on. The closer that time period is to the present, the lower the prevalence of useful comments.
  - The degree of polarization of topics among commenters.
  - The topic of the media object being commented on and the platform’s commenting culture.
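To make the idea of estimating prevalence from an imperfect classifier concrete, here is a minimal sketch. It does not reproduce the Bayesian prevalence model adapted in the paper; instead it uses a simpler, well-known stand-in, the adjusted classify-and-count correction, which rescales the observed rate of positive predictions by the classifier's true-positive and false-positive rates. The function name and the example rates are assumptions chosen for illustration.

```python
def adjusted_prevalence(predictions, tpr, fpr):
    """Estimate the share of useful comments from binary classifier output.

    predictions: iterable of 0/1 labels predicted for unlabeled comments.
    tpr, fpr: true-positive and false-positive rates of the classifier,
              assumed to have been measured on held-out labeled data.
    """
    predictions = list(predictions)
    if not predictions:
        raise ValueError("predictions must not be empty")
    if tpr <= fpr:
        raise ValueError("tpr must be greater than fpr for the correction to apply")
    # Raw share of comments the classifier marked as useful.
    observed = sum(predictions) / len(predictions)
    # Invert observed = tpr * p + fpr * (1 - p) to recover the prevalence p.
    corrected = (observed - fpr) / (tpr - fpr)
    # Clip to the valid range, since sampling noise can push the estimate outside it.
    return min(1.0, max(0.0, corrected))


if __name__ == "__main__":
    predicted = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
    print(adjusted_prevalence(predicted, tpr=0.8, fpr=0.1))
```

Running the correction separately per platform or per topic would give comparable prevalence estimates even when the classifier's error rates differ between groups.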
Want to learn more? See our full paper: Properties, Prediction, and Prevalence of Useful User-generated Comments for Descriptive Annotation of Social Media Objects