On Thursday, I got a call from a senior colleague about a sentiment analysis project. Honestly, I was half asleep during the call, and thanks to the call recorder I could listen to the requirements and deadline again. The conclusion: I had to analyze review data, categorize it as negative, positive, or neutral, and turn it in the next Monday. I have a strong interest in Natural Language Processing, and sentiment analysis is a major part of it. I had heard of lexicon-based sentiment analyzers, especially TextBlob, and chose to stick with one because there was no time to try out new algorithms. TextBlob, developed by Steven Loria, is a Python library that uses natural language processing tools to perform such tasks. I have seen many projects use TextBlob as a sentiment analyzer, mostly on Twitter data or movie review data.
Why is sentiment analysis important?
Reviews of anything (a product, a movie, a person, etc.) matter a great deal these days for getting a clear picture of what end users think. Enthusiasts post all kinds of reviews in a particular domain, and a consumer who has a bad experience leaves negative reviews on the product they bought or used. We have given “The Shawshank Redemption” a rating of 9.3 and “Student of the Year 2” a rating of 2.2 on IMDb. For almost any product, there is plenty of user review data if we want to look at a specific topic. The sources can be shopping portals like Amazon, Flipkart, Alibaba, Myntra, etc., as well as social media platforms like Twitter and Facebook.
The field I was about to work in was a bit different from movie reviews: mobile phone reviews from various shopping portals and social media. The data mining part was traditional, i.e. scraping the reviews from those portals, which I did with web scraping. Then came data cleaning and sentiment analysis. Cleaning was time consuming because the reviews included sentences with irrelevant content that was not helpful for my purpose. To clean the data, I used regular expressions and some ready-made Python libraries. The dataset was then ready for analysis.
Getting started with TextBlob
Let’s look at some TextBlob basics. TextBlob builds on NLTK (Natural Language Toolkit). Given a sentence as input, TextBlob outputs two values: polarity and subjectivity. Polarity lies in the range (-1, 1), where -1 marks the most negative words such as “disgusting”, “appalling”, and “pathetic”, and 1 marks the most positive words like “excellent” and “best”. Subjectivity lies in the range (0, 1) and measures the amount of subjective opinion: if a sentence is highly subjective, i.e. close to 1, the text contains more personal opinion than factual information. I was mostly concerned with polarity, since my goal was not to identify factual information, so I skipped subjectivity in my project.
To start working with TextBlob, you need Python installed and pip configured. The pip install command for TextBlob is
pip install textblob
To import TextBlob, we write
from textblob import TextBlob
TextBlob syntax for the polarity score:
Since TextBlob is a lexicon-based sentiment analyzer, it comes with predefined rules, or, we could say, a dictionary of words and weights, whose scores are used to calculate a sentence’s polarity. This is why lexicon-based sentiment analyzers are also called “rule-based sentiment analyzers”.
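As a rough illustration of the idea (not TextBlob’s actual implementation), a lexicon-based scorer can be sketched as a dictionary of hand-picked word weights averaged over a sentence; the lexicon below is a made-up toy example:

```python
# Toy lexicon of word weights; TextBlob's real dictionary is far larger
# and carries part-of-speech and intensity information as well.
LEXICON = {"excellent": 1.0, "best": 1.0, "nice": 0.6,
           "slow": -0.3, "pathetic": -1.0, "disgusting": -1.0}

def polarity(sentence):
    """Average the weights of all lexicon words found in the sentence."""
    words = sentence.lower().replace(".", "").split()
    scores = [LEXICON[w] for w in words if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

print(polarity("The camera is excellent but slow."))  # (1.0 - 0.3) / 2 = 0.35
```

Sentences with no lexicon words simply score 0.0, i.e. neutral.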
Let’s check the polarity of some random sentences with TextBlob; the beauty of TextBlob is its very simple syntax.
- It’s a beautiful day.
- This movie is poorly directed.
- The weather today is nice.
We get polarity values of 0.85, -0.69, and 0.73, respectively. Among these, the negative sentence “This movie is poorly directed.” has a polarity of -0.69, close to that of the most negative sentences.
Let’s change the word “poorly” to “impressively”. The output becomes 0.6000000000000001. Here, TextBlob works impressively as a sentiment analyzer. I submitted my project the next Monday and got appreciation from my colleagues.
The next day I was looking through the result files, and a few particular sentences caught my eye.
No slow motion camera
As I mentioned, my domain was mobile phone review analysis, so anyone writing this sentence means it negatively, yet TextBlob rated it positive with a polarity score of 0.15. That made me curious and pushed me to explore how TextBlob works. The result: when a negation is attached to a word, TextBlob simply multiplies that word’s polarity by -0.5. In my case, “slow” is a negative word with a polarity of -0.3, so multiplying by -0.5 yields a positive polarity of 0.15.
Another problem I had with TextBlob arose when the negation word appears somewhere in between, i.e. not adjacent to the word carrying a non-zero polarity.
- This is the best face recognition at this price. (polarity: 1.0)
- This is not the best face recognition at this price. (polarity: 1.0)
In the above example, the word “best” has a polarity of 1.0, so in the second sentence TextBlob should multiply 1.0 by -0.5 and report -0.5, but that is not what happens. The reason is that TextBlob treats “not best” differently from “not the best”, and that creates the problem. These things had to change because they were skewing the overall sentiment for the product.
I started exploring sentiment analyzers again and found a research paper by C.J. Hutto and Eric Gilbert on VADER (Valence Aware Dictionary and sEntiment Reasoner). VADER is another lexicon-based sentiment analyzer with predefined rules for words, or lexicons. VADER not only tells whether a lexicon entry is positive, negative, or neutral, but also how positive, negative, or neutral the sentence is. VADER’s output is a Python dictionary with four keys and their corresponding values: “neg”, “neu”, “pos”, and “compound”, meaning negative, neutral, positive, and compound respectively. The compound score is a single score computed from the valence of each word and normalized to lie between -1 and +1. The decision criteria are similar to TextBlob’s: -1 for most negative and +1 for most positive.
VADER works differently from TextBlob. I took some of the problem sentences and ran them through VADER, and the results were correct.
To start working with VADER, we need to install it with pip.
pip install vaderSentiment
We import it and create an analyzer instance as
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
sid_obj = SentimentIntensityAnalyzer()
Then I checked my problem sentence:
print(sid_obj.polarity_scores("no slow motion camera"))
The output for the above sentence has a compound score of -0.296.
I analyzed the entire batch using both VADER and TextBlob. The output led me to the conclusion that TextBlob was struggling with negative sentences, especially negations.
The graph above is a scatter plot comparing the two algorithms, VADER and TextBlob, along with their Pearson correlation coefficient. We can see that the sentences VADER considered negative were mostly identified as positive by TextBlob. In the first and third quadrants, the algorithms agree. In the second and fourth quadrants there is a mismatch, especially in the fourth quadrant, which holds the most contradictory data: positive according to TextBlob and negative according to VADER.
To get rid of my bias toward TextBlob, I needed more evidence to be convinced that VADER does the job better for my project. To settle it, more testing was needed.
As Richard Feynman said, “It doesn’t matter how beautiful your theory is, it doesn’t matter how smart you are. If it doesn’t agree with experiment, it’s wrong.”
The best way was to compare the two algorithms head to head, but the main problem was: compare against what? I wanted a real benchmark, but who decides the right sentiment? We human beings do. At first I thought I would label all the correct sentiments myself, but after a bit of research I came across the “wisdom of the crowd”. In The Wisdom of Crowds, James Surowiecki writes that “the collective knowledge of a group of people, as expressed through their aggregated opinions, can be trusted as a substitute for expert knowledge”. I decided to go with the wisdom of the crowd to get the correct sentiments.
I selected 20 people for this task, of whom 10 had experience with mobile phones while the rest did not. I gave them 150 random sentences to label as positive, negative, or neutral. From the labels given by each individual, I aggregated the answers of all 20 people into a final correct sentiment for each sentence. This was the gold standard. Now TextBlob and VADER could be compared. To measure each algorithm’s accuracy against the human-labeled sentences, I created confusion matrices of each algorithm’s output versus the crowdsourced labels.
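The evaluation step can be sketched as follows; the label lists here are made-up placeholders, not the real crowd data, and the matrix layout is gold labels as rows and predicted labels as columns:

```python
from collections import Counter

LABELS = ["negative", "neutral", "positive"]

def confusion_matrix(gold, predicted):
    """Rows = gold labels, columns = predicted labels, in LABELS order."""
    counts = Counter(zip(gold, predicted))
    return [[counts[(g, p)] for p in LABELS] for g in LABELS]

def accuracy(gold, predicted):
    """Fraction of sentences where the algorithm agrees with the crowd."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)

gold = ["negative", "negative", "neutral", "positive", "positive"]
pred = ["positive", "negative", "neutral", "positive", "negative"]
print(confusion_matrix(gold, pred))  # [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
print(accuracy(gold, pred))          # 3 of 5 labels match -> 0.6
```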
The result is convincing: VADER outperforms TextBlob when it comes to negative polarity detection. In the confusion matrices above, VADER achieves an overall accuracy of 63.3% while TextBlob achieves only 41.3%.
So, is VADER better than TextBlob? It depends on the user’s requirements. My answer is no, VADER is not better than TextBlob across the board. However, I can say that VADER works better when it comes to rating negative sentiment.
In the table above, the F1 score for VADER is 0.80 for negative polarity detection, while for TextBlob it is 0.56. From this we can conclude that VADER does better sentiment analysis when it comes to detecting negative polarity.
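For completeness, the per-class F1 score behind those numbers can be computed like this (toy labels, not the study data; the 0.80 and 0.56 figures above come from the actual 150-sentence evaluation):

```python
def f1_for_class(gold, predicted, cls):
    """Harmonic mean of precision and recall for a single class."""
    tp = sum(g == cls and p == cls for g, p in zip(gold, predicted))
    fp = sum(g != cls and p == cls for g, p in zip(gold, predicted))
    fn = sum(g == cls and p != cls for g, p in zip(gold, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

gold = ["negative", "negative", "negative", "positive"]
pred = ["negative", "negative", "positive", "negative"]
print(f1_for_class(gold, pred, "negative"))  # precision 2/3, recall 2/3 -> F1 = 2/3
```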