This post is the third of a multi-part series with our partner Brandwatch, in which Will McInnes, CMO of Brandwatch, examines how brands can unlock the power of social data and social listening for business.
Here’s an understatement: Sentiment, in life and in data, is a sensitive topic.
In the last six months, more than ever before, the broader culture, and data scientists in particular, have cast a sharp light on social sentiment. Questions mount and answers shift as the validity of this type of data is scrutinized from every angle.
A prime example was the #Gamergate controversy. Social reactions spurred social data analysts to dissect and debate the insights and stats put forth by a variety of sources, causing an uproar not just in the gaming community but in the world of data analysis as well.
The fact of the matter is that social sentiment data is a tricky subject.
We asked the experts from social strategy and analytics company Converseon and Brandwatch to answer some hard-hitting questions about social sentiment data.
Let’s see if we can make some sense of all this sentiment with Converseon partner Erin Tavgac and Brandwatch data scientist Dr. Mike Williams.
Q: What are the different automated strategies social sentiment analysis companies use to determine sentiment? How is sentiment extracted from social mentions?
Converseon: Traditional approaches have used a combination of rules-based and dictionary-based methods, which essentially define a set of rules or words that correlate with a specific sentiment (e.g. treating the word “small” as negative). These approaches have limitations, as it is virtually impossible to distill varied usage across different contexts into a strict set of rules (e.g. the word “small” may be negative when referring to hotel rooms but positive when referring to smartphones).
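To make that limitation concrete, here is a minimal sketch of a dictionary-based scorer in Python. The lexicon and its scores are invented for illustration, not drawn from any vendor’s actual word list:

```python
# A toy dictionary-based sentiment scorer. The words and scores below
# are illustrative assumptions, not a real production lexicon.
SENTIMENT_LEXICON = {
    "great": 1.0,
    "love": 1.0,
    "terrible": -1.0,
    "small": -0.5,  # one fixed score, regardless of context
}

def dictionary_sentiment(text: str) -> float:
    """Sum the lexicon scores of every word in the text."""
    return sum(SENTIMENT_LEXICON.get(word, 0.0) for word in text.lower().split())

# The context problem in action: both sentences get the same negative score,
# even though "small" is arguably a compliment for a smartphone.
print(dictionary_sentiment("the hotel room was small"))     # -0.5
print(dictionary_sentiment("the new smartphone is small"))  # -0.5
```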
There is now an emerging class of automated strategies that utilize a combination of machine learning, Natural Language Processing, and statistical methods to capture the nuance and varied usage within human language. These algorithms are trained by humans (to mimic the advanced interpretation that we as humans instinctively use in interpreting language) and are significantly more effective at extracting true sentiment from social mentions.
Brandwatch: Determining the sentiment of a document (a tweet, a news article, a transcription of spoken language) pretty much always boils down to identifying words or phrases that indicate positive or negative emotions, or their absence.
The two main approaches differ fundamentally in how the lists of positive or negative emotions are generated. In a rules-based approach, experts write, curate and maintain a manually constructed list of words and phrases (along with a score indicating whether each is positive, negative or something else). In a machine learning approach, a machine is shown examples of positive, negative or neutral documents, and, if it is shown enough of these training examples, it learns the words and phrases that indicate sentiment. In a sense, it generates rules that are in some ways similar to those in a rules-based approach, but it does so automatically.
The upside is that, given large amounts of high-quality, relevant data, the rules a machine learns can be more general, more specific and more complete than those of a human-curated rules-based approach. One of the downsides is that it can only learn linguistic patterns that exist in the training data, so if a new phrase indicating sentiment is coined (e.g. “on fleek”) or is simply absent from the training data, the machine may struggle to learn what it means.
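As a rough illustration of the machine learning approach, here is a minimal sketch using Python and scikit-learn. The library choice and the tiny hand-labelled training set are our assumptions, not Brandwatch’s actual system; real classifiers train on far larger annotated corpora:

```python
# A toy machine-learning sentiment classifier: it infers which words
# signal sentiment from human-labelled examples, rather than from a
# hand-written rule list. Training data here is invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = [
    "love this phone, the battery is amazing",
    "great service, very happy",
    "screen cracked after a day, terrible",
    "worst purchase I have ever made",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

print(model.predict(["this update is terrible"]))  # ['negative']
# Phrases absent from the training data (e.g. "on fleek") carry no
# learned signal, so the prediction below is little more than a guess.
print(model.predict(["my look is on fleek"]))
```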
Q: Doesn’t society’s propensity for sarcasm make it impossible to determine what is actually sincere and positive, and what is negative?
Converseon: Not entirely. While this is a difficult problem encountered with all sentiment algorithms, the more advanced approaches involving machine learning enable a sentiment algorithm to learn to interpret sarcasm as a human does. A successful approach simply requires robust training by qualified human experts to capture enough data and all the nuanced ways in which sarcasm is employed by authors.
Brandwatch: Sarcasm is often cited as the biggest challenge facing automated sentiment analysis. I have two hunches here. The first is that it is not, in principle, impossible for an algorithm to detect sarcasm. The second is that sarcasm is less of a practical concern — and simply less prevalent — than you might imagine.
I have a related concern, though: not all sentiments are simply positive, negative or neutral. Indeed, it’s not obvious what it means to say a tweet about a controversial subject (like #Gamergate) is negative.
Q: What benefit does social sentiment analysis provide for me? How can it help my bottom line and how can I utilize it in my activities – marketing or otherwise?
Converseon: Social sentiment analysis impacts a company’s bottom line in a tangible way across many functional areas of the business. Within marketing it can be used to better understand what marketing spend / campaigns / initiatives are driving sales (or not) and how overall marketing budget can be optimized to maximize ROI. For customer service and product development it is hugely helpful for diagnosing aspects of a company’s products or services that consumers like and should be emphasized to help grow revenue. The PR function can leverage sentiment analysis to better detect and respond to emerging reputational threats before they impact sales or stock price, while HR can use sentiment to manage employer reputation (thereby lowering recruiting costs and attracting more productive talent to grow the business).
Brandwatch: Sentiment analysis isn’t perfect, especially when applied to short documents like tweets. Most technologies that offer automated sentiment analysis openly acknowledge a 60-80% accuracy rate. Despite that, sentiment analysis does provide one way to “take the temperature of a room” and, in particular, to compare the temperature of one room to another, or of one day to another. It’s a way to get a handle on questions like: are things getting better? Are people talking about X happier than people talking about Y?
Once these questions have been answered by sentiment analysis, you can go ahead and pinpoint the reason (e.g. celebrity tweet, trending hashtag, etc.) that caused that shift in sentiment. One great use case of this is to get ahead of a damaging social media rumor, or address high-profile customer concerns before they become trending mishaps.
Q: Are there problems with sentiment analysis? Can it be overinterpreted or misused?
Converseon: Yes, the main problems with sentiment analysis stem from the quality of the input data and the misuse of popular metrics. When sentiment analysis is applied to highly irrelevant data that hasn’t been cleaned, there is a high degree of neutrality in the data, and the signal gets lost in the noise that comes with social data. Sentiment is also oftentimes viewed as a basic, topline metric (e.g. positive or negative), which can be overly simplistic and misleading. Truly actionable sentiment metrics (which drive bottom-line results) go beyond this veneer and look at more granular cuts / shades of sentiment and cross-variables to diagnose the true drivers behind the sentiment.
Brandwatch: Yes and yes! Machines make mistakes! You should dismiss anyone who says otherwise as a charlatan. Neither precision nor recall is perfect in any automated sentiment analysis system. To say precision and recall are problems in the context of sentiment is to say, firstly, that algorithms sometimes misclassify positive tweets as negative (and vice versa), and secondly, that algorithms sometimes misclassify positive or negative tweets as neutral.
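For readers who want the two terms pinned down, here is a minimal sketch in Python of how precision and recall are computed for a sentiment class, using made-up gold labels and predictions:

```python
# Toy gold labels and system predictions, invented for illustration.
gold      = ["pos", "pos", "neg", "neu", "pos", "neg"]
predicted = ["pos", "neg", "neg", "neu", "neu", "neg"]

def precision_recall(label: str) -> tuple[float, float]:
    tp = sum(g == label and p == label for g, p in zip(gold, predicted))
    fp = sum(g != label and p == label for g, p in zip(gold, predicted))
    fn = sum(g == label and p != label for g, p in zip(gold, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0  # of tweets called "pos", how many were?
    recall = tp / (tp + fn) if tp + fn else 0.0     # of truly "pos" tweets, how many were found?
    return precision, recall

# One positive tweet was misread as negative and one as neutral,
# so recall suffers even though precision is perfect.
print(precision_recall("pos"))  # (1.0, 0.3333...)
```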
What does that mean for users of sentiment analysis? Firstly, you should become used to the idea that individual mentions may occasionally be misclassified. If this happens, it’s not necessarily a sign that the whole system is flawed. Secondly, you should focus on making fair comparisons between the sentiments of sets of tweets according to your system (today vs. yesterday, brand A vs. brand B), rather than treating the absolute numerical score assigned by a system as the main KPI yielded by your social listening product.
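In the same spirit, here is a minimal sketch of the “fair comparison” advice, with invented labels. Because both sets pass through the same imperfect classifier, the relative difference is more trustworthy than either absolute number:

```python
# Compare the share of positive mentions across two sets scored by the
# same system, rather than treating either absolute figure as ground truth.
def positive_share(labels: list[str]) -> float:
    return labels.count("positive") / len(labels)

yesterday = ["positive", "neutral", "negative", "positive", "neutral"]
today     = ["positive", "positive", "positive", "neutral", "negative"]

print(f"yesterday: {positive_share(yesterday):.0%}")  # 40%
print(f"today:     {positive_share(today):.0%}")      # 60%
```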
Q: How are companies like Converseon and Brandwatch working to make this type of emotion-driven data more reliable?
Converseon: We are working on a couple of key areas:
- Increasing the relevancy of the data on which sentiment analysis is run
- Using training data annotated for sentiment by human experts to train algorithms and improve the precision of automated sentiment scoring versus traditional rules-based approaches
- Improving the recall of sentiment to capture not just overall sentiment for a social media post but also multiple instances of sentiment expression within a post
- Tuning / training sentiment algorithms to specific verticals and brands to better capture nuances in sentiment (as opposed to using a one size fits all approach)
Brandwatch: There are two main strands of research. The first is to improve the rules. This is done by expanding and refining the ruleset used in a human-curated rules-based approach, or by showing the machine more, and more relevant, examples of the “right answer”, i.e. tweets with human-assigned sentiment scores.
The second strand of research, which is more ambitious, is to get away from the two crude buckets, “positive” and “negative”. There are more human emotions than positive and negative, and most of them do not lie on a simple continuum from very positive to very negative.
What new possibilities open up now that business is social? To learn more, join Will McInnes at Social Media Week New York on Thursday, February 26, where we’ll learn how to strategically use social listening for business.
Social Media Week New York begins on February 23. For the full event schedule and how you can join us, visit here.