Many consumer brands hire customer agents to engage customers on social media services such as Twitter; these agents solicit opinions, respond to questions and requests, and thank or apologize to customers when necessary.
The usual method of filtering relevant customer opinions using simple keywords is often insufficient, because coming up with the right keywords is not easy. For instance, a representative at Delta Airlines filtering for “delta” would also collect posts about “alpha delta phi” and “Nile delta”. A more restrictive query requiring both “delta” and “airline” would miss posts such as “I flew to Seattle on Delta.” Furthermore, even among posts that do refer to the brand, many are side comments with no brand-relevant opinion, and are therefore usually not worth an agent’s attention.
As a result, agents end up wasting tons of time reviewing irrelevant content.
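To make the keyword problem concrete, here is a minimal sketch of a keyword filter in Python. The function names and the sample posts are made up for illustration (the posts are built around the phrases quoted above); this is not how any production filter is implemented.

```python
import re

def tokens(text):
    """Lowercase a post and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def keyword_filter(posts, required):
    """Keep only the posts that contain every keyword in `required`."""
    required = {k.lower() for k in required}
    return [p for p in posts if required <= tokens(p)]

# Hypothetical sample posts, for illustration only.
posts = [
    "I flew to Seattle on Delta.",
    "Pledging alpha delta phi this fall",
    "Sunset over the Nile delta",
    "Delta is my favorite airline",
]

# Broad query: matches every post, including the fraternity and the river delta.
broad = keyword_filter(posts, ["delta"])

# Strict query: drops the noise, but also drops the Seattle flight post.
strict = keyword_filter(posts, ["delta", "airline"])
```

The broad query keeps all four posts, while the strict query keeps only the last one, so neither query captures exactly the two airline-related posts.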
What’s the solution?

We produced CrowdE, an intelligent filtering system that helps brand agents filter tweets. We designed a common reusable filter creation process, where we ask crowd workers to label tweets for a brand and then extract insights through machine learning. The resulting filtering system has a number of nice properties:
- It can be customized for any particular consumer brand with minimal cost and design effort.
- It supports filtering by relevance to the brand and by presence of brand-related opinion.
- Filtering accuracy is on par with expert-crafted filter rules for the given brand.
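As a rough illustration of the "label, then learn" idea, the sketch below trains a tiny bag-of-words classifier on labeled tweets. Everything here is an assumption made for illustration: the choice of naive Bayes, the `NaiveBayesFilter` class, and the six hand-written training tweets standing in for crowd labels. It is not the actual CrowdE implementation.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Count lowercase word tokens in a tweet."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

class NaiveBayesFilter:
    """Multinomial naive Bayes over bag-of-words counts (illustrative only)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(bag_of_words(text))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)
            for word, n in bag_of_words(text).items():
                if word in self.vocab:  # ignore words never seen in training
                    # Add-one smoothing avoids zero probabilities.
                    p = (self.word_counts[c][word] + 1) / (total_words + len(self.vocab))
                    score += n * math.log(p)
            scores[c] = score
        return max(scores, key=scores.get)

# Hypothetical stand-in for crowd labels: True means "relevant to the airline".
labeled = [
    ("I flew to Seattle on Delta", True),
    ("Delta lost my luggage again", True),
    ("My Delta flight was delayed two hours", True),
    ("Pledging alpha delta phi this fall", False),
    ("Sunset over the Nile delta", False),
    ("River delta floods after heavy rain", False),
]
texts, labels = zip(*labeled)
relevance = NaiveBayesFilter().fit(texts, labels)

is_flight_relevant = relevance.predict("Delta flight to Boston delayed")
is_river_relevant = relevance.predict("Camping on the Nile delta")
```

With this toy data the flight tweet is classified as relevant and the Nile tweet as not relevant. A second instance of the same class, trained on labels for "contains a brand-related opinion", would give the opinion filter in the same way.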
Using the CrowdE system, agents can filter the live Twitter stream at will, and mark relevant follow-up actions for each tweet. In user studies, both experienced and novice users preferred CrowdE to a traditional keyword-based filter. Users considered CrowdE-based filtering to be more efficient, more complete, less difficult, and less tedious. CrowdE also gave users more confidence in their filtering. Users performed better, as well, correctly marking more follow-up actions in the same amount of time.
For more details, see our ICWSM 2013 paper, CrowdE: Filtering Tweets for Direct Customer Engagements.
Jilin Chen, IBM Almaden Research Center
Allen Cypher, IBM Almaden Research Center
Clemens Drews, IBM Almaden Research Center
Jeffrey Nichols, IBM Almaden Research Center
This paper addresses 3 problems. 1) You want to retrieve relevant tweets for a particular brand, which is an Information Retrieval problem. 2) You employ crowd workers to provide labels of the tweets’ relevance to brands. This transforms the IR problem into a machine learning classification problem. 3) But you only want tweets that express an opinion. The 3rd problem is what makes this interesting and challenging. But not much is mentioned about what features are used for the machine learning portion.
I wonder how scalable your approach is, since you require crowd workers to provide labels. I am thinking that it could be made scalable via some form of bootstrapping or semi-supervised learning.
Considering that we have so many kinds of brands in the world, is the approach generalizable for most kinds of consumer brands?
Could we improve the performance by adding industry-specific considerations into the system?
Thanks for the comment!
We did state the feature set in the paper – it is simply a bag of words for both relevance and opinion. The tricky part is knowing which words actually matter, and that’s why we used crowd worker input to find that out.
Not sure what you mean by “scalable” – this isn’t a big data problem. I guess you mean the effort for creating the filters? Surely requiring crowd input for every brand is a burden; however, it is our experience that without this data we simply cannot create effective filters for even the two brands we considered.
As a result, we tried our best to make the filter creation process as painless as possible while maintaining the quality of the resulting filter. I guess that’s our main contribution.
And of course, it is always possible to improve – by incorporating semi-supervised learning, by adding industry-specific knowledge. We’d love to see some follow-up work.