
Sentiment analysis, or opinion mining, is the process of analyzing large volumes of text to determine whether it expresses a positive sentiment, a negative sentiment or a neutral sentiment.

Companies now have access to more data about their customers than ever before, presenting both an opportunity and a challenge: analyzing the vast amounts of textual data available and extracting meaningful insights to guide their business decisions.

From emails and tweets to online survey responses, chats with customer service representatives and reviews, the sources available to gauge customer sentiment are seemingly endless. Sentiment analysis systems help companies better understand their customers, deliver stronger customer experiences and improve their brand reputation.


With more ways than ever for people to express their feelings online, organizations need powerful tools to monitor what’s being said about them and their products and services in near real time. As companies adopt sentiment analysis and begin using it to analyze more conversations and interactions, it will become easier to identify customer friction points at every stage of the customer journey.

Deliver more objective results from customer reviews

The latest artificial intelligence (AI) sentiment analysis tools help companies filter reviews and net promoter scores (NPS) for personal bias and get more objective opinions about their brand, products and services. For example, if a customer expresses a negative opinion along with a positive opinion in a review, a human assessing the review might label it negative before reaching the positive words. AI-enhanced sentiment classification helps sort and classify text in an objective manner, so this doesn’t happen, and both sentiments are reflected.  

Achieve greater scalability of business intelligence programs

Sentiment analysis enables companies with vast troves of unstructured data to analyze and extract meaningful insights from it quickly and efficiently. With the amount of text generated by customers across digital channels, it’s easy for human teams to get overwhelmed with information. Strong, cloud-based, AI-enhanced customer sentiment analysis tools help organizations deliver business intelligence from their customer data at scale, without expending unnecessary resources.

Perform real-time brand reputation monitoring

Modern enterprises need to respond quickly in a crisis. Opinions expressed on social media, whether true or not, can destroy a brand reputation that took years to build. Robust, AI-enhanced sentiment analysis tools help executives monitor the overall sentiment surrounding their brand so they can spot potential problems and address them swiftly.

Sentiment analysis uses natural language processing (NLP) and machine learning (ML) technologies to train computer software to analyze and interpret text in a way similar to humans. The software uses one of two approaches, rule-based or ML, or a combination of the two known as hybrid. Each approach has its strengths and weaknesses: a rule-based approach can deliver results in near real time, while ML-based approaches are more adaptable and can typically handle more complex scenarios.

Rule-based sentiment analysis

In the rule-based approach, software is programmed to classify certain keywords in a block of text based on groups of words, or lexicons, that describe the author’s intent. For example, words in a positive lexicon might include “affordable,” “fast” and “well-made,” while words in a negative lexicon might feature “expensive,” “slow” and “poorly made”. The software then scans the text for the words in either the positive or negative lexicon and tallies up a total sentiment score based on the volume of words used and the sentiment score of each category.
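As a rough illustration of the tallying step, here is a minimal sketch that counts lexicon matches in a piece of text. The two lexicons contain only the example words above, and the function names (`count_term`, `tally_sentiment`, `label`) are made up for this example; a production lexicon would be far larger and usually weighted.

```python
import re

# Hypothetical lexicons built from the example words in the text.
POSITIVE = {"affordable", "fast", "well-made"}
NEGATIVE = {"expensive", "slow", "poorly made"}


def count_term(term: str, text: str) -> int:
    """Count whole-term matches, so "fast" does not match inside "breakfast"."""
    pattern = rf"(?<![\w-]){re.escape(term)}(?![\w-])"
    return len(re.findall(pattern, text))


def tally_sentiment(text: str) -> int:
    """Return positive matches minus negative matches."""
    lowered = text.lower()
    positives = sum(count_term(term, lowered) for term in POSITIVE)
    negatives = sum(count_term(term, lowered) for term in NEGATIVE)
    return positives - negatives


def label(score: int) -> str:
    """Turn the tally into a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


review = "The laptop is fast and affordable but the charger is poorly made."
print(tally_sentiment(review), label(tally_sentiment(review)))
```

Because the tally is a plain count, two positive words outweigh one negative phrase regardless of how strongly each is felt, which is one reason weighted lexicons are common.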

Machine learning sentiment analysis

With a machine learning (ML) approach, an algorithm is used to train software to gauge sentiment in a block of text using words that appear in the text as well as the order in which they appear. Developers use sentiment analysis algorithms to teach software how to identify emotion in text similarly to the way humans do. ML models continue to “learn” from the data they are fed, hence the name “machine learning”. Here are a few of the most commonly used classification algorithms:

Linear regression: A statistical algorithm that predicts a value (Y) from a set of features (X).

Naive Bayes: An algorithm that uses Bayes’ theorem to categorize words in a block of text.

Support vector machines: A fast and efficient classification algorithm used to solve two-group classification problems.

Deep learning (DL): An advanced machine learning technique built on multilayered artificial neural networks, which link together many simple computations in a way loosely inspired by human brain function.
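To make one of these algorithms concrete, the following is a minimal sketch of a Naive Bayes text classifier written with only the Python standard library. The class name, the whitespace tokenization, the add-one smoothing and the four training sentences are all assumptions chosen for illustration; real systems train on thousands of labeled examples and typically use an established ML library.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def predict(self, text):
        words = text.lower().split()
        total_docs = sum(self.label_counts.values())
        best_label, best_logp = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # log P(label) plus the sum of log P(word | label)
            logp = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word] + 1  # add-one smoothing
                logp += math.log(count / (total_words + len(self.vocab)))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Tiny made-up training set, for illustration only.
train_texts = [
    "fast delivery and great price",
    "great quality well made",
    "slow shipping and poor quality",
    "too expensive and slow",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayesSentiment().fit(train_texts, train_labels)
print(model.predict("great price"))
print(model.predict("slow and expensive"))
```

Unlike a fixed lexicon, the word probabilities here come entirely from the labeled examples, so the model's behavior changes as the training data changes.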

The hybrid approach

A hybrid approach to text analysis combines both ML and rule-based capabilities to optimize accuracy and speed. While highly accurate, this approach requires more resources, such as time and technical capacity, than the other two.
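There are many ways to combine the two approaches. One possible arrangement, sketched below purely as an assumption, is to trust the fast rule-based score when it is decisive and hand ambiguous text to an ML model. The function names, the threshold of 2 and both stand-in components are hypothetical.

```python
from typing import Callable


def hybrid_classify(
    text: str,
    rule_score: Callable[[str], float],
    ml_predict: Callable[[str], str],
    threshold: float = 2.0,
) -> str:
    """Use the rule-based score when it is decisive, otherwise defer to ML."""
    score = rule_score(text)
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return ml_predict(text)


# Stand-in components so the sketch runs on its own (both are hypothetical).
def toy_rule_score(text: str) -> float:
    words = text.lower().split()
    positives = sum(w in {"fast", "affordable"} for w in words)
    negatives = sum(w in {"slow", "expensive"} for w in words)
    return positives - negatives


def toy_ml_predict(text: str) -> str:
    # Placeholder for a trained model's prediction.
    return "neutral"


print(hybrid_classify("fast and affordable", toy_rule_score, toy_ml_predict))
print(hybrid_classify("fast but odd", toy_rule_score, toy_ml_predict))
```

The extra resources mentioned above show up here as two components to build and maintain instead of one.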

In addition to the different approaches used to build sentiment analysis tools, there are also different types of sentiment analysis that organizations turn to depending on their needs. The three most popular types, emotion-based, fine-grained and aspect-based sentiment analysis (ABSA), all rely on the underlying software’s capacity to gauge something called polarity: the overall feeling that is conveyed by a piece of text.

Generally speaking, a text’s polarity can be described as either positive, negative or neutral, but by categorizing the text even further, for example into subgroups such as “extremely positive” or “extremely negative,” some sentiment analysis models can identify more subtle and complex emotions. The polarity of a text is the most commonly used metric for gauging textual emotion and is expressed by the software as a numerical rating on a scale of zero to 100. Zero represents a neutral sentiment and 100 represents the most extreme sentiment.
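The mapping from a numerical rating to a graded label can be sketched as follows. The function name and the cut-off values (10 and 70) are illustrative assumptions, not a standard; each tool defines its own bands.

```python
def describe_polarity(direction: str, intensity: int) -> str:
    """Map a direction and a 0-100 intensity to a graded label.

    The cut-offs (10 and 70) are assumptions chosen for illustration.
    """
    if not 0 <= intensity <= 100:
        raise ValueError("intensity must be between 0 and 100")
    if direction not in ("positive", "negative"):
        raise ValueError("direction must be 'positive' or 'negative'")
    if intensity < 10:
        return "neutral"
    if intensity >= 70:
        return f"extremely {direction}"
    return direction


print(describe_polarity("positive", 85))
print(describe_polarity("negative", 40))
print(describe_polarity("positive", 3))
```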

Here are the three most widely used types of sentiment analysis:

Fine-grained (graded)

Fine-grained, or graded, sentiment analysis is a type of sentiment analysis that groups text into different emotions and the level of emotion being expressed. The emotion is then graded on a scale of zero to 100, similar to the way consumer websites deploy star-ratings to measure customer satisfaction.

Aspect-based (ABSA)

Aspect-based sentiment analysis (ABSA) narrows the scope of what’s being examined in a body of text to a singular aspect of a product, service or customer experience a business wishes to analyze. For example, a budget travel app might use ABSA to understand how intuitive a new user interface is or to gauge the effectiveness of a customer service chatbot. ABSA can help organizations better understand how their products are succeeding or falling short of customer expectations.
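A very simplified way to picture ABSA is to score each sentence and credit that score only to the aspects the sentence mentions. In the sketch below, the aspect keywords, the small lexicon and the function name are all hypothetical, and real ABSA systems use far more sophisticated models to link opinions to aspects.

```python
import re

# Hypothetical aspect keywords and sentiment lexicon for a travel app.
ASPECTS = {
    "interface": {"interface", "layout", "menu"},
    "chatbot": {"chatbot", "bot"},
}
LEXICON = {"intuitive": 1, "helpful": 1, "clean": 1, "confusing": -1, "useless": -1}


def aspect_sentiment(review: str) -> dict:
    """Score each sentence, then add that score to every aspect it mentions."""
    totals = {}
    for sentence in re.split(r"[.!?]+", review.lower()):
        words = set(re.findall(r"[a-z']+", sentence))  # distinct words only
        score = sum(LEXICON.get(word, 0) for word in words)
        for aspect, keywords in ASPECTS.items():
            if words & keywords:
                totals[aspect] = totals.get(aspect, 0) + score
    return totals


print(aspect_sentiment("The new interface is clean and intuitive. The chatbot was useless."))
```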

Emotional detection

Emotional detection sentiment analysis seeks to understand the psychological state of the individual behind a body of text, including their frame of mind when they were writing it and their intentions. It is more complex than either fine-grained or ABSA and is typically used to gain a deeper understanding of a person’s motivation or emotional state. Rather than using polarities, like positive, negative or neutral, emotional detection can identify specific emotions in a body of text such as frustration, indifference, restlessness and shock.

Organizations conduct sentiment analysis for a variety of reasons. Here are some of the most popular use cases.  

Support teams use sentiment analysis to deliver more personalized responses to customers that accurately reflect the mood of an interaction. AI-based chatbots that use sentiment analysis can spot problems that need to be escalated quickly and prioritize customers in need of urgent attention. ML algorithms deployed on customer support forums help rank topics by level-of-urgency and can even identify customer feedback that indicates frustration with a particular product or feature. These capabilities help customer support teams process requests faster and more efficiently and improve customer experience.

By using sentiment analysis to conduct social media monitoring, brands can better understand what is being said about them online and why. For example, is a new product launch going well? Monitoring sales is one way to know, but will only show stakeholders part of the picture. Using sentiment analysis on customer review sites and social media to identify the emotions being expressed about the product will enable a far deeper understanding of how it is landing with customers.

By turning sentiment analysis tools on the market in general and not just on their own products, organizations can spot trends and identify new opportunities for growth. Maybe a competitor’s new campaign isn’t connecting with its audience the way they expected, or perhaps someone famous has used a product in a social media post, increasing demand. Sentiment analysis tools can help spot trends in news articles, online reviews and on social media platforms, and alert decision makers in real time so they can take action.

While sentiment analysis and the technologies underpinning it are growing rapidly, it is still a relatively new field. According to “Sentiment Analysis,” by Bing Liu (2020), the term has only been widely used since 2003. [1] There is still much to be learned and refined. Here are some of the most common drawbacks and challenges.

Lack of context

Context is a critical component for understanding what emotion is being expressed in a block of text and one that frequently causes sentiment analysis tools to make mistakes. On a customer survey, for example, a customer might give two answers to the question: “What did you like about our app?” The first answer might be “functionality” and the second, “UX”. If the question being asked was different, for example, “What didn’t you like about our app?” it changes the meaning of the customer’s response without changing the words themselves. To correct this problem, the algorithm would need to be given the original context of the question the customer was responding to, a time-consuming tactic known as pre- or post-processing.
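The pre-processing idea can be sketched by carrying the question along with each answer and letting the question's wording decide the label. The regular expression and the function name below are assumptions that cover only the two example questions; a real pipeline would need a much more robust way to read the question.

```python
import re

# Matches wording such as "didn't you like" or "dislike" (illustrative only).
NEGATIVE_QUESTION = re.compile(
    r"\b(?:didn't|did not|don't|do not)\b.*\blike\b|\bdislike\b"
)


def label_answer(question: str, answer: str) -> tuple:
    """Label a short survey answer using the polarity of its question."""
    q = question.lower().replace("\u2019", "'")  # normalize curly apostrophes
    if NEGATIVE_QUESTION.search(q):
        label = "negative"
    elif re.search(r"\blike\b", q):
        label = "positive"
    else:
        label = "unknown"
    return answer, label


print(label_answer("What did you like about our app?", "UX"))
print(label_answer("What didn't you like about our app?", "UX"))
```

The same one-word answer receives opposite labels depending on the question it is paired with, which is exactly the information a text-only classifier lacks.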

Use of irony and sarcasm

Regardless of the level or extent of its training, software has a hard time correctly identifying irony and sarcasm in a body of text. This is because often when someone is being sarcastic or ironic it’s conveyed through their tone of voice or facial expression and there is no discernable difference in the words they’re using. For example, when analyzing the phrase, “Awesome, another thousand-dollar parking ticket—just what I need,” a sentiment analysis tool would likely mistake the nature of the emotion being expressed and label it as positive because of the use of the word “awesome”.

Negation

Negation is when a negative word is used to convey a reversal of meaning in a sentence. For example, consider the sentence, “I wouldn’t say the shoes were cheap.” What’s being expressed is that the shoes were probably expensive, or at least moderately priced, but a sentiment analysis tool would likely miss this subtlety.
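One common mitigation is to flip the score of a sentiment word when a negator appears shortly before it. The sketch below assumes a five-token window, a tiny lexicon and a short negator list, and it treats “cheap” as a positive (affordable) word; all of these are illustrative choices rather than a standard.

```python
import re

# Illustrative lexicon; treating "cheap" as positive (affordable) is an assumption.
LEXICON = {"cheap": 1, "happy": 1, "slow": -1}
NEGATORS = {"not", "no", "never", "wouldn't", "isn't", "don't", "didn't"}


def score_with_negation(text: str, window: int = 5) -> int:
    """Sum lexicon scores, flipping any that follow a negator within `window` tokens."""
    normalized = text.lower().replace("\u2019", "'")  # normalize curly apostrophes
    tokens = re.findall(r"[a-z']+", normalized)
    total = 0
    last_negator = None  # index of the most recent negator seen
    for i, token in enumerate(tokens):
        if token in NEGATORS:
            last_negator = i
        elif token in LEXICON:
            negated = last_negator is not None and i - last_negator <= window
            total += -LEXICON[token] if negated else LEXICON[token]
    return total


print(score_with_negation("I wouldn't say the shoes were cheap."))
print(score_with_negation("The shoes were cheap."))
```

This simple window does not reset at sentence boundaries and cannot tell how far a negation really extends, so it reduces the problem rather than solving it.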

Idiomatic language

Idiomatic language, such as the common English phrases “Let’s not beat around the bush” or “Break a leg,” frequently confounds sentiment analysis tools and the ML algorithms that they’re built on. When phrases like these are used on social media channels or in product reviews, sentiment analysis tools will either misidentify them (“break a leg,” for example, could be read as something painful or sad) or miss them completely.
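A simple workaround is to look for known idioms as whole phrases before scoring individual words. In this sketch the idiom table, the word lexicon and their scores are hypothetical, and any idiom missing from the table would still be misread.

```python
# Hypothetical idiom table: phrase -> score of its intended meaning.
IDIOMS = {"break a leg": 1, "beat around the bush": 0}
# Hypothetical word-level lexicon that would misread those idioms.
WORD_LEXICON = {"break": -1, "beat": -1, "great": 1}


def score_with_idioms(text: str) -> int:
    """Score known idioms as whole phrases, then score the remaining words."""
    remaining = text.lower()
    total = 0
    for idiom, score in IDIOMS.items():
        total += score * remaining.count(idiom)
        remaining = remaining.replace(idiom, " ")  # hide the idiom's words
    for word in remaining.split():
        total += WORD_LEXICON.get(word.strip(".,!?\"'"), 0)
    return total


print(score_with_idioms("Break a leg tonight!"))
print(score_with_idioms("Let's not beat around the bush."))
```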

Organizations that decide they want to deploy sentiment analysis to better understand their customers have two options for how they can go about it: either purchase an existing tool or build one of their own.

Businesses opting to build their own tool typically use an open-source library in a common coding language such as Python or Java. These libraries are useful because their communities are steeped in data science. Still, organizations looking to take this approach will need to make a considerable investment in hiring a team of engineers and data scientists.

Acquiring an existing software as a service (SaaS) sentiment analysis tool requires less initial investment and allows businesses to deploy a pre-trained machine learning model rather than create one from scratch. SaaS sentiment analysis tools can be up and running with just a few simple steps and are a good option for businesses that aren’t ready to make the investment necessary to build their own.

Today’s most effective customer support sentiment analysis solutions use the power of AI and ML to improve customer experiences. IBM watsonx Assistant is a market-leading conversational artificial intelligence platform powered by large language models (LLMs) that enables organizations to build AI-powered voice agents and chatbots that deliver superior automated self-service support to their customers on a simple, easy-to-use interface.


[1] “Sentiment Analysis (Second edition),” Liu, Bing, Cambridge University Press, September 23, 2020


What is sentiment analysis?


From survey results and customer reviews to social media mentions and chat conversations, today’s businesses have access to data from numerous sources. But how can teams turn all of that data into meaningful insights? Find out how sentiment analysis can help.

When it comes to branding, simply having a great product or service is not enough.  In order to determine the true impact of a brand, organizations must leverage data from across customer feedback channels to fully understand the market perception of their offerings.

Quantitative feedback available via metrics such as net promoter scores can provide some information about brand performance, but qualitative feedback in the form of unstructured data provides more nuanced insight into how people actually “feel” about your brand.

Sifting through textual data, however, can be extremely time-consuming. Whether analyzing solicited feedback via channels such as surveys or examining unsolicited feedback found on social media, online forums, and more, it’s impossible to comprehensively identify and integrate data on brand sentiment when relying solely on manual processes.

Leveraging an omnichannel analytics platform allows teams to collect all of this information and aggregate it into a complete view. Once obtained, there are many ways to analyze and enrich the data, one of which involves conducting sentiment analysis. Sentiment analysis can be used to improve customer experience through direct and indirect interactions with your brand. Let’s consider the definition of sentiment analysis, how it works and when to use it.


Sentiment refers to the positivity or negativity expressed in text. Sentiment analysis provides an effective way to evaluate written or spoken language to determine if the expression is favorable, unfavorable, or neutral, and to what degree. Because of this, it gives a useful indication of how the customer felt about their experience.

If you’ve ever left an online review, made a comment about a brand or product online, or answered a large-scale market research survey, there’s a chance your responses have been through sentiment analysis.

Sentiment analysis is part of the greater umbrella of text mining, also known as text analysis. This type of analysis extracts meaning from many sources of text, such as surveys, reviews, public social media, and even articles on the Web. A score is then assigned to each clause based on the sentiment expressed in the text: for example, -1 for negative sentiment and +1 for positive sentiment. This is done using natural language processing (NLP).

[Figure: Positive, neutral and negative sentiment chart]

Today’s algorithm-based sentiment analysis tools can handle huge volumes of customer feedback consistently and accurately. As a type of text analysis, sentiment analysis reveals how positive or negative customers feel about topics ranging from your products and services to your location, your advertisements, or even your competitors.

Accurate sentiment analysis can be difficult to conduct, so what’s the benefit? Why do we use an AI-powered tool to categorize natural language feedback rather than our human brains?

Mostly, it’s a question of scale. Sentiment analysis is helpful when you have a large volume of text-based information that you need to generalize from.

For example, let’s say you work on the marketing team at a major motion picture studio, and you just released a trailer for a movie that got a huge volume of comments on Twitter.

You can read some – or even a lot – of the comments, but you won’t be able to get an accurate picture of how many people liked or disliked it unless you look at every last one and make a note of whether it was positive, negative or neutral. That would be prohibitively expensive and time-consuming, and the results would be prone to a degree of human error.

On top of that, you’d have a risk of bias coming from the person or people going through the comments. They might have certain views or perceptions that color the way they interpret the data, and their judgment may change from time to time depending on their mood, energy levels, and other normal human variations.

On the other hand, sentiment analysis tools provide a comprehensive, consistent overall verdict with a simple button press.

From there, it’s up to the business to determine how they’ll put that sentiment into action.

Sentiment analysis is critical because it helps provide insight into how customers perceive your brand.

Customer feedback – whether that’s via social media, the website, conversations with service agents, or any other source – contains a treasure trove of useful business information, but it isn’t enough to know what customers are talking about. Knowing how they feel will give you the most insight into how their experience was. Sentiment analysis is one way to understand those experiences.

Sometimes known as “opinion mining,” sentiment analysis can let you know if there has been a change in public opinion toward any aspect of your business. Peaks or valleys in sentiment scores give you a place to start if you want to make product improvements, train sales reps or customer care agents, or create new marketing campaigns.
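Spotting such peaks and valleys can be as simple as comparing each period's average score with the periods just before it. The sketch below assumes daily average scores between -1 and +1, a three-day baseline and a change threshold of 0.3; the function name and all three values are hypothetical.

```python
def flag_swings(daily_scores, window=3, threshold=0.3):
    """Return the indexes of days that depart sharply from the prior window's mean."""
    flagged = []
    for i in range(window, len(daily_scores)):
        baseline = sum(daily_scores[i - window:i]) / window
        if abs(daily_scores[i] - baseline) >= threshold:
            flagged.append(i)
    return flagged


# Made-up daily average sentiment scores for one product.
scores = [0.5, 0.5, 0.5, 0.5, -0.2, 0.5]
print(flag_swings(scores))
```

A flagged day is only a starting point; someone still needs to read the underlying feedback to learn what changed.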

We live in a world where huge amounts of written information are produced and published every moment, thanks to the internet, news articles, social media, and digital communications. Sentiment analysis can help companies keep track of how their brands and products are perceived, both at key moments and over a period of time.

It can also be used in market research, PR, marketing analysis, reputation management, stock analysis and financial trading, customer experience, product design, and many more fields.

Here are a few scenarios where sentiment analysis can save time and add value:

  • Social media listening – in day-to-day monitoring, or around a specific event such as a product launch
  • Analyzing survey responses for a large-scale research program
  • Processing employee feedback in a large organization
  • Identifying very unhappy customers so you can offer closed-loop follow up
  • Seeing where sentiment trends are clustered in particular groups or regions
  • Competitor research – checking your approval levels against comparable businesses

[Figure: Airline onboard experience sentiment by category]

Not all sentiment analysis is done the same way. There are different ways to approach it and a range of different algorithms and processes that can be used to do the job depending on the context of use and the desired outcome.

Basic sub-types of sentiment analysis include:

  • Detecting sentiment: This means parsing through text and sorting opinionated data (such as “I love this!”) from objective data (like “the restaurant is located downtown”).
  • Categorizing sentiment: This means detecting whether the sentiment is positive, negative, or neutral. Your tools may also add weighting to these categories, e.g., very positive, positive, neutral, somewhat negative, negative.
  • Clause-level analysis: Sometimes, the text contains mixed or ambivalent opinions, for example, “staff was very friendly but we waited too long to be served”. Scoring feedback at the clause level shows when both good and bad opinions are expressed in one place, which is useful when the positives and negatives within a text would otherwise cancel each other out and return a misleading neutral sentiment.
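Clause-level scoring can be sketched by splitting text on contrast words and punctuation and scoring each piece separately. The splitting rule, the small lexicon and the function name below are assumptions for illustration; real tools rely on grammatical parsing rather than a keyword split.

```python
import re

# Hypothetical lexicon for a restaurant review.
LEXICON = {"friendly": 1, "great": 1, "waited": -1, "long": -1, "rude": -1}


def clause_scores(text: str) -> list:
    """Split on contrast words and end punctuation, then score each clause."""
    clauses = re.split(r"\bbut\b|\bhowever\b|\balthough\b|[.;!?]", text.lower())
    results = []
    for clause in clauses:
        clause = clause.strip(" ,")
        if clause:
            words = re.findall(r"[a-z']+", clause)
            results.append((clause, sum(LEXICON.get(w, 0) for w in words)))
    return results


for clause, score in clause_scores(
    "Staff was very friendly but we waited too long to be served"
):
    print(score, clause)
```

Keeping the two scores separate preserves both opinions instead of blending them into a single figure for the whole comment.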

In addition, you can choose whether to view the results of sentiment analysis at:

  • Document-level (useful for professional reviews or press coverage)
  • Sentence level (for short comments and evaluations)
  • Sub-sentence level (for picking out the meaning in phrases or short clauses within a sentence)

Sentiment analysis is a powerful tool that offers a number of advantages, but like any research method, it has some limitations.

Advantages of sentiment analysis:

  • Accurate, unbiased results
  • Enhanced insights
  • More time and energy available for staff to do higher-level tasks
  • Consistent measures you can use to track sentiment over time

Disadvantages of sentiment analysis:

  • Best for large and numerous data sets. To get real value out of sentiment analysis tools, you need to be analyzing large quantities of textual data on a regular basis.
  • Sentiment analysis is still a developing field, and the results are not always perfect. You may still need to sense-check and manually correct results occasionally.

Sentiment analysis uses machine learning, statistics, and natural language processing (NLP) to find out how people think and feel on a macro scale. Sentiment analysis tools take written content and process it to unearth the positivity or negativity of the expression.

This is done in a couple of ways:

  • Rule-based sentiment analysis: This method uses a lexicon, or word-list, where each word is given a score for sentiment, for example “great” = 0.9, “lame” = -0.7, “okay” = 0.1. Sentences are assessed for overall positivity or negativity using these weightings. Rule-based systems usually require additional finessing to account for sarcasm, idioms, and other verbal anomalies.
  • Machine learning-based sentiment analysis: A computer model is given a training set of natural language feedback, manually tagged with sentiment labels. It learns which words and phrases have a positive sentiment or a negative sentiment. Once trained, it can then be used on new data sets.
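The rule-based weighting can be illustrated with the three example scores given above. How the individual weights are combined varies by tool; averaging them, as this sketch does, is an assumption, as is the function name.

```python
import re

# Weights taken from the example in the text; real lexicons are far larger.
WEIGHTS = {"great": 0.9, "lame": -0.7, "okay": 0.1}


def sentence_score(sentence: str) -> float:
    """Average the weights of the lexicon words found in the sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    hits = [WEIGHTS[word] for word in words if word in WEIGHTS]
    return sum(hits) / len(hits) if hits else 0.0


print(round(sentence_score("The food was great but the music was lame."), 2))
print(sentence_score("It opens at nine."))
```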

In some cases, the best results come from combining the two methods.

[Figure: Sentiment analysis of client feedback]

Developing sentiment analysis tools is technically an impressive feat, since human language is grammatically intricate, heavily context-dependent, and varies a lot from person to person. If you say “I loved it,” another person might say “I’ve never seen better,” or “Leaves its rivals in the dust”. The challenge for an AI tool is to recognize that all these sentences mean the same thing.

Another challenge is to decide how language is interpreted since this is very subjective and varies between individuals. What sounds positive to one person might sound negative or even neutral to someone else. In designing algorithms for sentiment analysis, data scientists must think creatively in order to build useful and reliable tools.

Getting the correct sentiment classification

Sentiment classification requires your sentiment analysis tools to be sophisticated enough to understand not only when a data snippet is positive or negative, but how to extrapolate sentiment even when both positive and negative words are used. On top of that, it needs to be able to understand context and complications such as sarcasm or irony.

Human beings are complicated, and how we express ourselves can be similarly complex. Many types of sentiment analysis tools use a simple view of polarity (positive/neutral/negative), which means much of the meaning behind the data is lost.

Let’s see an example:

“I hated the setup process, but the product was easy to use so in the end, I think my purchase was worth it.”

A less sophisticated sentiment analysis tool might see the sentiment expressed here as “neutral” because the positive – “the product was easy to use so, in the end, I think my purchase was worth it” – and negative-tagged sentiments – “I hated the setup process” – cancel each other out.

However, polarity isn’t as cut-and-dried as being one or the other here. The final part – “in the end, I think my purchase was worth it” – means that as a human analyzing the text, we can see that generally, this customer felt mostly positive about the experience. That’s why a scale from positive to negative is needed, and why a sentiment analysis tool adds weighting along a scale of 1-11.

[Figure: Likert scale question, “How satisfied are you with our service?”]

Scores are assigned with attention to grammar, context, industry, and source, and Qualtrics gives users the ability to adjust the sentiment scores to be even more business-specific.

Understanding context

Context is key for a sentiment analysis model to be correct. This means you need to make sure that your sentiment scoring tool not only knows that “happy” is positive and that “not happy” is not, but also interprets context-dependent words correctly.

As human beings, we know customers are pleased when they mention how “thin” their new laptop is, but that they’re complaining when they talk about the “thin” walls in your hotel. We understand that context.

Obviously, a tool that flags “thin” as negative sentiment in all circumstances is going to lose accuracy in its sentiment scores. The context is important.

This is where training natural language processing (NLP) algorithms comes in. Natural language processing is a way of mimicking the human understanding of language, meaning context becomes more readily understood by your sentiment analysis tool.

Sentiment analysis algorithms are trained using this system over time, using deep learning to understand instances with context and apply that learning to future data. This is why a sophisticated sentiment analysis tool can help you to not only analyze vast volumes of data more quickly but also discern what context is common or important to your customers.
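A trained model learns distinctions like the two meanings of “thin” from data. As a much simpler, hand-built illustration of the same idea, the sketch below layers per-domain overrides on top of a general lexicon. The domain names, scores and function name are hypothetical and do not describe how any particular product works.

```python
# Hypothetical general-purpose lexicon and per-domain overrides.
BASE_LEXICON = {"thin": 0.0, "great": 0.9}
DOMAIN_OVERRIDES = {
    "laptops": {"thin": 0.6},   # a thin laptop is usually praise
    "hotels": {"thin": -0.6},   # thin walls are usually a complaint
}


def word_score(word: str, domain: str) -> float:
    """Look up a word's score, preferring the domain-specific value."""
    overrides = DOMAIN_OVERRIDES.get(domain, {})
    return overrides.get(word, BASE_LEXICON.get(word, 0.0))


print(word_score("thin", "laptops"))
print(word_score("thin", "hotels"))
print(word_score("great", "hotels"))
```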

In a world of endless opinions on the Web, how people “feel” about your brand can be important for measuring the customer experience.

Consumers desire likable brands that understand them; brands that provide memorable online and offline experiences. The more in tune a consumer feels with your brand, the more likely they’ll share feedback, and the more likely they’ll buy from you too. According to our consumer trends research, 62% of consumers said that businesses need to care more about them, and 60% would buy more as a result.

But the opposite is true as well. As a matter of fact, 71 percent of Twitter users will take to the social media platform to voice their frustrations with a brand.

These conversations, both positive and negative, should be captured and analyzed to improve the customer experience. Sentiment analysis can help.

1. Text analysis for surveys

Surveys are a great way to connect with customers directly, and they’re also ripe with constructive feedback. The feedback within survey responses can be quickly analyzed for sentiment scores.

For the survey itself, consider questions that will generate qualitative customer experience metrics. Some examples include:

  • What was your most recent experience like?
  • How much better (or worse) was your experience compared to your expectations?
  • What is something you would have changed about your experience?

Remember, the goal here is to acquire honest textual responses from your customers so the sentiment within them can be analyzed. Another tip is to avoid closed-ended questions that only generate “yes” or “no” responses. These types of questions won’t serve your analysis well.

Next, use a text analysis tool to break down the nuances of the responses. Text iQ is a tool that will not only provide sentiment scores but also extract key themes from the responses.

After the sentiment is scored from survey responses, you’ll be able to address some of the more immediate concerns your customers have during their experiences.

Another great place to find text feedback is through customer reviews.

2. Text analysis for customer reviews

Did you know that 72 percent of customers will not take action until they’ve read reviews on a product or service? An astonishing 95 percent of customers read reviews prior to making a purchase. In today’s feedback-driven world, the power of customer reviews and peer insight is undeniable.

Review sites like G2 are common first-stops for customers looking for honest feedback on products and services. This feedback, like that in surveys, can be analyzed.

The benefit of customer reviews compared to surveys is that they’re unsolicited, which often leads to more honest and in-depth feedback.

To improve the customer experience, you can take the sentiment scores from customer reviews – positive, negative, and neutral – and identify gaps and pain points that may not have been addressed in the surveys. Remember, negative feedback is just as beneficial to your business as positive feedback, if not more so.

3. Text analysis for social media

Another way to acquire textual data is through social media analysis.

Monitoring tools ingest publicly available social media data on platforms such as Twitter and Facebook for brand mentions and assign sentiment scores accordingly. This has its upsides as well, considering users are highly likely to take their uninhibited feedback to social media.

Even so, a staggering 70 percent of brands don’t bother with feedback on social media. Because social media is an ocean of big data just waiting to be analyzed, brands could be missing out on some important information.

When choosing sentiment analysis technologies, bear in mind how you will use them. There are a number of options out there, from open-source solutions to in-built features within social listening tools. Some of them are limited in scope, while others are more powerful but require a high level of user knowledge.

Text iQ is a natural language processing tool within the Experience Management Platform™ that allows you to carry out sentiment analysis online using just your browser. It’s fully integrated, meaning that you can view and analyze your sentiment analysis results in the context of other data and metrics, including those from third-party platforms.

Like all our tools, it’s designed to be straightforward, clear, and accessible to those without specialized skills or experience, so there’s no barrier between you and the results you want to achieve.

When it comes to understanding the customer experience, the key is to always be on the lookout for customer feedback. Sentiment analysis is not a one-and-done effort and requires continuous monitoring. By reviewing your customers’ feedback on your business regularly, you can proactively get ahead of emerging trends and fix problems before it’s too late.

Acquiring feedback and analyzing sentiment can provide businesses with a deep understanding of how customers truly “feel” about their brand. When you’re able to understand your customers, you’re able to provide a more robust customer experience.

Sentiment Analysis: Decoding Emotions for Research

Introduction

Sentiment analysis is the process of determining whether textual data contains a positive sentiment or a negative sentiment. Researchers use sentiment analysis tools to provide additional clarity and context to the messages conveyed in words to deliver more meaningful insights.

In this article, we'll look at the importance of sentiments, how researchers analyze sentiments, and what strategies and tools can help you in your research.

Sentiment analysis is a subset of natural language processing (NLP) that focuses on extracting and understanding the emotional content from data. The primary objective is to classify the polarity of a text as positive, negative, or neutral. This classification is essential for understanding customer sentiment, gauging public opinion, and conducting in-depth research on various topics.

At its core, a sentiment analysis system employs machine learning techniques and algorithms to dissect the language used in text data from many sources, such as:

  • written feedback
  • news articles
  • survey records
  • social media posts

One of the most refined forms of this method is aspect-based sentiment analysis. Rather than merely classifying the overall sentiment of a document, this kind of analysis pinpoints specific topics or aspects within the text and evaluates the sentiment towards each. Such sentiment analysis technologies with natural language processing can also be used for opinion mining.

A simple example

Consider a product review that states, "The camera on this phone is excellent, but the battery life is short." A sentiment analysis model would recognize the positive sentiment towards the camera and the negative sentiment towards the battery life, rather than giving a blanket sentiment score.
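As a rough illustration of the idea, the sketch below splits the review into clauses and scores each clause against a tiny hand-made word list. The lexicons, the aspect list, and the function name `aspect_sentiment` are assumptions made for this example only; real aspect-based systems rely on trained models rather than a handful of keywords.

```python
import re

# Tiny illustrative lexicons (assumed for this example, not from any real tool).
# Note that "short" is only negative in some contexts, such as battery life.
POSITIVE = {"excellent", "great", "good"}
NEGATIVE = {"short", "poor", "bad"}
ASPECTS = ["camera", "battery life"]


def aspect_sentiment(review: str) -> dict:
    """Return a sentiment label for each known aspect mentioned in the review."""
    results = {}
    # Split into clauses on commas, semicolons, and the contrastive word "but".
    clauses = re.split(r",|;|\bbut\b", review.lower())
    for clause in clauses:
        words = set(re.findall(r"[a-z']+", clause))
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        # Attach the clause's label to every aspect named in that clause.
        for aspect in ASPECTS:
            if aspect in clause:
                results[aspect] = label
    return results


review = "The camera on this phone is excellent, but the battery life is short."
print(aspect_sentiment(review))
```

Scoring each clause separately is what keeps the praise for the camera from cancelling out the complaint about the battery.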

Sentiment analysis tools are varied, ranging from simple models that identify positive and negative terms to sophisticated sentiment analysis models that rely on machine learning and data scientists for insightful sentiment analysis. Such tools work by assigning a sentiment score to words or phrases, often based on their context. The result? A sentiment analysis solution that deciphers the nuances of human language, turning unstructured data into actionable insights.

Ultimately, an accurate sentiment analysis bridges the gap between the vast world of text-based data and the need to understand the underlying emotions and opinions it contains. Whether you're a researcher looking to perform sentiment analysis on news articles or a business keen on understanding customer feedback, sentiment analysis is a pivotal tool in today's data-driven world.

Sentiment analysis offers tangible examples of its applications across diverse fields. From businesses striving to enhance their products to researchers aiming to grasp public sentiment on various issues, the power of sentiment analysis is evident.

By examining specific sectors, we can better understand the profound impact this analysis has on our decision-making processes and the vast potential it holds in shaping perceptions.

Market research

Conducting market research often consists of analyzing sentiment to gauge public reactions to a product or service. Using sentiment analysis tools, companies can sift through survey responses and online reviews, identifying patterns that might not be immediately apparent.

For example, if a new beverage receives predominantly positive reviews for its taste but negative comments about its packaging, this analytical approach can highlight these specific sentiments, guiding the company in refining its offering.

Customer feedback

Customer feedback is a goldmine of sentiment analysis datasets for businesses aiming to improve their services. By implementing a sentiment analysis system, companies can categorize feedback as positive, negative, or neutral, making it easier to prioritize areas for improvement.

Suppose a hotel chain discovers that a significant number of negative words in customer reviews pertain to room cleanliness. In that case, they can take immediate measures to address this concern, enhancing the overall guest experience.

Social media platforms

Social media is awash with opinions and feedback. By employing models for the analysis of sentiments, businesses and researchers can tap into real-time feelings of the masses.

For instance, if a celebrity endorses a brand and sentiment analysis reflects a surge in positive words associated with that brand, it can be concluded that the endorsement had a favorable impact. Conversely, if a political figure makes a statement and the analysis indicates a spike in negative words related to the topic, it provides insights into public opinion.

Sentiment analysis has rapidly become a crucial tool in today's digital age, helping businesses, researchers, and individuals decode the emotions hidden within vast amounts of data. But why has it garnered such significance?

The reasons are manifold, but they all converge on the idea that understanding sentiment offers a deeper, more nuanced view of human reactions and opinions.

Sentiment analysis use cases & applications

The applications of sentiment analysis are diverse and expansive. For instance, in the realm of politics, sentiment analysis can be used to gauge public opinion on policies or candidates, offering insights that can guide campaign strategies.

In the healthcare sector, sentiment analysis can capture patient feedback, allowing providers to fine-tune their services and improve patient experiences.

Moreover, educators can use sentiment analysis to understand student feedback, making curriculum adjustments that align with student needs and preferences.

Benefits of sentiment analysis

Beyond its various applications, the benefits of sentiment analysis are profound. Firstly, it offers an efficient way to process large volumes of unstructured data, turning it into actionable insights. Businesses, for example, can use sentiment analysis to get ahead of potential public relations crises by identifying negative sentiments early.

Furthermore, automated systems (rule-based ones included) can take over the time-consuming task of manually reviewing each piece of feedback. This not only saves time but also reduces the risk of human bias.

Most significantly, by understanding both positive and negative phrases and their context, organizations can better align their strategies and offerings with their audience's true feelings and needs.

Collecting sentiments involves gathering data from various sources to be analyzed for emotional content. This task, while seemingly straightforward, requires a systematic approach to ensure that the data obtained is both relevant and of high quality.

One of the primary sources for sentiment collection is social media platforms. Platforms like Twitter, Facebook, and Instagram are brimming with user-generated content that reflects public opinion on a vast array of topics. By utilizing specialized web scraping tools or APIs provided by these platforms, one can amass large datasets of posts, comments, and reviews to analyze.

Customer reviews on e-commerce and review websites, such as Amazon or Yelp, are another treasure trove of sentiments. These reviews often provide detailed insights into customer sentiment about products, services, and overall brand perception. Similarly, survey responses, when designed with open-ended questions, can provide valuable data that captures the sentiments of the respondents.

In the news and media sector, news articles and op-eds are rich sources of sentiment. Collecting sentiments from these sources can help gauge public sentiment on current events, governmental decisions, or societal issues.

Forums and online communities, like Reddit or specialized industry forums, offer another avenue. Here, users often engage in in-depth discussions, providing nuanced views that are ripe for sentiment analysis.

However, while collecting sentiments, it's essential to consider privacy and ethical guidelines. Ensuring that data is anonymized and devoid of personally identifiable information is crucial. Moreover, always be aware of terms of service when extracting data from online platforms, as some might have restrictions on data scraping.

Analyzing sentiments is a multifaceted process that goes beyond merely identifying positive or negative words. It examines the context, nuances, and the intricate elements of human language. With advancements in machine learning and data science, this analysis has become more refined and precise.

Sentiment scores

At the foundation of this analytical approach lies the sentiment score. This score is usually a numerical value assigned to a piece of text, indicating its overall sentiment. For instance, a system to analyze sentiment might assign values on a scale from -1 (negative) to 1 (positive), with 0 representing a neutral sentiment. Sentiment scores provide a quick overview, enabling researchers and businesses to categorize large datasets swiftly.
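To make the scale concrete, here is a minimal sketch of how a score between -1 and 1 might be turned into a category. The function name `label_from_score` and the width of the neutral band (0.05) are assumptions chosen for illustration; tools differ in where they draw these boundaries.

```python
from collections import Counter


def label_from_score(score: float, neutral_band: float = 0.05) -> str:
    """Map a sentiment score in [-1, 1] to a coarse category."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be between -1 and 1")
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    # Scores very close to 0 are treated as neutral.
    return "neutral"


# Hypothetical scores for a small batch of texts.
scores = [0.8, -0.6, 0.0, 0.3, -0.02]
labels = [label_from_score(s) for s in scores]
print(labels)
print(Counter(labels))
```

Counting the labels, as in the last line, is the kind of quick overview that lets a large dataset be categorized at a glance.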

Sentiment analysis algorithms

Machine learning algorithms, natural language toolkits, or artificial neural networks can power sentiment analysis. The approaches range from simple rule-based algorithms, which identify sentiments based on predefined lists of positive and negative words, to more complex machine learning techniques. Machine learning-based sentiment analysis models, especially those utilizing deep learning, consider the broader context in which words are used, leading to more advanced sentiment analysis.

Sentiment analysis tools

There's a plethora of tools available, each tailored for different requirements. Some tools are designed for specific industries, while others are more general-purpose. Many of these tools leverage advanced models, making it easier for users without a deep technical background to extract meaningful insights from textual data. The qualitative data analysis software ATLAS.ti, for example, includes a sentiment analysis tool to automatically code data.

Sentiment analysis, despite its transformative potential and growing adoption, is not without its share of challenges. The intricacies of language and emotion often pose complexities that even the most advanced systems can find challenging to navigate.

Sarcasm and irony: One of the most significant challenges is detecting sarcasm and irony. A statement like "Oh, great! Another flat tire!" may be classified as positive by rudimentary analysis models because of the word "great." However, the context clearly indicates a negative sentiment.

Cultural nuances: Cultural and regional variations in language can affect sentiment interpretation. A word or phrase that's considered positive in one culture might be neutral or even negative in another. Without a culturally-aware model, these nuances can easily be missed.

Short and ambiguous texts: Platforms like Twitter, with their character limitations, often contain short and sometimes ambiguous messages. Without ample context, determining the sentiment of such messages can be tricky.

Polysemy: Words with multiple meanings, based on context, can pose challenges. For instance, the word "light" can be positive when referring to a "light meal" but negative when talking about "light rain" during a planned outdoor event.

Emotionally complex statements: Some statements might contain mixed emotions, making them hard to classify. For example, "I love how this camera captures colors, but its weight is a bit much for me." This statement contains both positive and negative sentiments about the same product.

Evolution of language: Language is dynamic. New words, slang, and expressions constantly emerge, especially on digital platforms. Keeping sentiment analysis tools updated to recognize and correctly interpret these new terms is a continual challenge.

Addressing these challenges requires a combination of improved algorithms, larger and more diverse training datasets, and a deeper understanding of linguistics and cultural contexts. As technology advances and sentiment analysis solutions become more sophisticated, the hope is that these challenges will diminish, leading to even more accurate and insightful outcomes.

Sentiment Analysis: A Definitive Guide

Emojis as representations of sentiments: positive, neutral, and negative to show how sentiment analysis works. Text reads 'my experience so far has been fantastic!' (positive), 'The product is ok, I guess' (neutral), and 'your support team is useless' (negative).

Sentiment analysis (or opinion mining) is a natural language processing (NLP) technique used to determine whether data is positive, negative or neutral. Sentiment analysis is often performed on textual data to help businesses monitor brand and product sentiment in customer feedback, and understand customer needs.

Learn more about how sentiment analysis works, its challenges, and how you can use sentiment analysis to improve processes, decision-making, customer satisfaction and more.

Once you’re familiar with the basics, get started with easy-to-use sentiment analysis tools that are ready to use right off the bat.

The Basics of Sentiment Analysis

Sentiment analysis is the process of detecting positive or negative sentiment in text. It’s often used by businesses to detect sentiment in social data, gauge brand reputation, and understand customers.

Sentiment analysis focuses on the polarity of a text (positive, negative, neutral), but it can also go beyond polarity to detect specific feelings and emotions (angry, happy, sad, etc.), urgency (urgent, not urgent) and even intentions (interested vs. not interested).

Depending on how you want to interpret customer feedback and queries, you can define and tailor your categories to meet your sentiment analysis needs. In the meantime, here are some of the most popular types of sentiment analysis:

Graded Sentiment Analysis

If polarity precision is important to your business, you might consider expanding your polarity categories to include different levels of positive and negative:

  • Very positive
  • Positive
  • Neutral
  • Negative
  • Very negative

This is usually referred to as graded or fine-grained sentiment analysis, and could be used to interpret 5-star ratings in a review, for example:

  • Very Positive = 5 stars
  • Very Negative = 1 star

Emotion detection

Emotion detection sentiment analysis allows you to go beyond polarity to detect emotions, like happiness, frustration, anger, and sadness.

Many emotion detection systems use lexicons (i.e. lists of words and the emotions they convey) or complex machine learning algorithms.

One of the downsides of using lexicons is that people express emotions in different ways. Some words that typically express anger, like "bad" or "kill" (e.g. "your product is so bad" or "your customer support is killing me"), might also express happiness (e.g. "this is badass" or "you are killing it").

Aspect-based Sentiment Analysis

Usually, when analyzing sentiments of texts you’ll want to know which particular aspects or features people are mentioning in a positive, neutral, or negative way.

That's where aspect-based sentiment analysis can help. For example, in the product review "The battery life of this camera is too short", an aspect-based classifier would be able to determine that the sentence expresses a negative opinion about the battery life of the product in question.

Multilingual sentiment analysis

Multilingual sentiment analysis can be difficult. It involves a lot of preprocessing and resources. Most of these resources are available online (e.g. sentiment lexicons), while others need to be created (e.g. translated corpora or noise detection algorithms), but you’ll need to know how to code to use them.

Alternatively, you could detect language in texts automatically with a language classifier, then train a custom sentiment analysis model to classify texts in the language of your choice.

Since humans express their thoughts and feelings more openly than ever before, sentiment analysis is fast becoming an essential tool to monitor and understand sentiment in all types of data.

Automatically analyzing customer feedback, such as opinions in survey responses and social media conversations, allows brands to learn what makes customers happy or frustrated, so that they can tailor products and services to meet their customers’ needs.

For example, using sentiment analysis to automatically analyze 4,000+ open-ended responses in your customer satisfaction surveys could help you discover why customers are happy or unhappy at each stage of the customer journey.

Maybe you want to track brand sentiment so you can detect disgruntled customers immediately and respond as soon as possible. Maybe you want to compare sentiment from one quarter to the next to see if you need to take action. Then you could dig deeper into your qualitative data to see why sentiment is falling or rising.

The overall benefits of sentiment analysis include:

  • Sorting Data at Scale

Can you imagine manually sorting through thousands of tweets, customer support conversations, or surveys? There’s just too much business data to process manually. Sentiment analysis helps businesses process huge amounts of unstructured data in an efficient and cost-effective way.

  • Real-Time Analysis

Sentiment analysis can identify critical issues in real time. For example, is a PR crisis on social media escalating? Is an angry customer about to churn? Sentiment analysis models can help you immediately identify these kinds of situations, so you can take action right away.

  • Consistent criteria

It’s estimated that people only agree around 60-65% of the time when determining the sentiment of a particular text. Tagging text by sentiment is highly subjective, influenced by personal experiences, thoughts, and beliefs.

By using a centralized sentiment analysis system, companies can apply the same criteria to all of their data, helping them improve accuracy and gain better insights.
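An agreement figure like the one above is simply the share of texts on which two people chose the same tag. The sketch below shows the arithmetic; the function name `percent_agreement` and the two annotators' labels are made up for illustration.

```python
def percent_agreement(labels_a, labels_b):
    """Share of items on which two annotators assigned the same sentiment tag."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same items")
    if not labels_a:
        raise ValueError("at least one labeled item is required")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Hypothetical tags from two people reading the same five texts.
annotator_1 = ["positive", "negative", "neutral", "positive", "negative"]
annotator_2 = ["positive", "neutral", "neutral", "negative", "negative"]

# The annotators match on 3 of 5 texts, i.e. 60% agreement.
print(f"{percent_agreement(annotator_1, annotator_2):.0%}")
```

A single automated model tags every text with the same criteria, which is the consistency benefit described here, although consistent does not by itself mean correct.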

The applications of sentiment analysis are endless. So, to help you understand how sentiment analysis could benefit your business, let’s take a look at some examples of texts that you could analyze using sentiment analysis.

Then, we’ll jump into a real-world example of how Chewy, a pet supplies company, was able to gain a much more nuanced (and useful!) understanding of their reviews through the application of sentiment analysis.

To understand the goal and challenges of sentiment analysis, here are some examples:

Basic examples of sentiment analysis data

  • Netflix has the best selection of films
  • Hulu has a great UI
  • I dislike the new crime series
  • I hate waiting for the next series to come out

More challenging examples of sentiment analysis

  • I do not dislike horror movies. (phrase with negation)
  • Disliking horror movies is not uncommon. (negation, inverted word order)
  • Sometimes I really hate the show. (adverbial modifies the sentiment)
  • I love having to wait two months for the next series to come out! (sarcasm)
  • The final episode was surprising with a terrible twist at the end (negative term used in a positive way)
  • The film was easy to watch but I would not recommend it to my friends. (difficult to categorize)
  • I LOL’d at the end of the cake scene (often hard to understand new terms)
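To see why the second group of examples is harder, consider a very simple word-list scorer with one extra rule for negation. This is only a sketch: the word lists and the function name `score_with_negation` are assumptions for this example, and the negation rule (a negator flips the next sentiment word that follows it) is deliberately crude.

```python
import re

# Illustrative word lists (assumed for this example).
POSITIVE = {"love", "like", "great", "best"}
NEGATIVE = {"hate", "dislike", "terrible"}
NEGATORS = {"not", "never", "no"}


def score_with_negation(text: str) -> int:
    """Sum word polarities, flipping the first sentiment word after a negator."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    negate = False
    for token in tokens:
        if token in NEGATORS:
            negate = True
            continue
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        if polarity:
            score += -polarity if negate else polarity
            negate = False
    return score


# Negation is handled: "not dislike" counts as mildly positive.
print(score_with_negation("I do not dislike horror movies."))
# Sarcasm is not: the word "love" still makes this look positive.
print(score_with_negation("I love having to wait two months for the next series to come out!"))
```

The negation example comes out positive as intended, but the sarcastic sentence is also scored as positive, which is exactly the kind of error that word-level rules cannot avoid.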

Now, let’s take a look at some real reviews on Trustpilot and see how MonkeyLearn’s sentiment analysis tools fare when it comes to recognizing and categorizing sentiment.

Case Study: Sentiment analysis on TrustPilot Reviews

Chewy is a pet supplies company – an industry with no shortage of competition, so providing a superior customer experience (CX) to their customers can be a massive difference maker.

For this reason, online reviews can be an extremely valuable source of customer insights for improving their CX. Chewy has thousands of reviews on Trustpilot; this is what their review archive looks like:

The overall star rating of Chewy reviews on Trustpilot: 3.6 star rating from over 9,000 reviews, with 82% leaving a score of 'excellent'.

Via TrustPilot

It is easy to draw a general conclusion about Chewy’s relative success from this alone - 82% of responses being excellent is a great starting place.

But Trustpilot’s results alone fall short if Chewy’s goal is to improve its services. This perfunctory overview fails to provide actionable insight, the cornerstone and end goal of effective sentiment analysis.

If Chewy wanted to unpack the what and why behind their reviews, in order to further improve their services, they would need to analyze each and every negative review at a granular level.

But with sentiment analysis tools, Chewy could plug in their 5,639 (at the time) Trustpilot reviews to gain instant sentiment analysis insights.

We uploaded and analyzed Chewy’s reviews to MonkeyLearn’s all-in-one data analysis and visualization studio to generate the following dashboard:

MonkeyLearn data visualization dashboard showing how reviews have been filtered by topic and sentiment. An example of aspect-based sentiment analysis in the form of graphs, pie charts, word clouds, and tagged data.

Chewy TrustPilot Reviews Sample

Feel free to click this link to peruse the results at your leisure - as this sample dashboard is a public demo, you can click through and explore the inputs and filters at work yourself.

While there is a ton more to explore, in this breakdown we are going to focus on four sentiment analysis data visualization results that the dashboard has visualized for us.

  • Overall Sentiment
  • Sentiment over Time
  • Sentiment by Rating
  • Sentiment by Topic

1. Overall sentiment

We’ll begin by pulling the relevant graphic from the above dashboard. 

Overall sentiment of Chewy's reviews, split by positive (38.2%), negative (40.8%), and neutral (21%) sentiment in a pie chart.

You’ll notice that these results are very different from TrustPilot’s overview (82% excellent, etc). This is because MonkeyLearn’s sentiment analysis AI performs advanced sentiment analysis, parsing through each review sentence by sentence, word by word. 

What you are left with is an accurate assessment of everything customers have written, rather than a simple tabulation of stars. This analysis can point you towards friction points much more accurately and in much more detail. 

Read up on the mechanics of how sentiment analysis works below.

2. Sentiment over time

Here’s our handy-dandy sentiment over time graph, blown up:

This data visualization sample is a classic temporal visualization, a type that tracks results and plots them over a period of time.

This graph expands on our Overall Sentiment data - it tracks the overall proportion of positive, neutral, and negative sentiment in the reviews from 2016 to 2021.

This graph shows the gradual change in the content of their written reviews over this five-year period. For instance, negative responses went down from 2019 to 2020, then jumped back up to previous levels in 2021.
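The numbers behind a chart like this are just per-year proportions of each sentiment label. Here is a minimal sketch of that calculation; the function name `sentiment_share_by_year` and the sample data are invented for illustration and are not Chewy's actual figures.

```python
from collections import Counter, defaultdict

LABELS = ("positive", "neutral", "negative")


def sentiment_share_by_year(reviews):
    """Given (year, label) pairs, return each label's share of reviews per year."""
    counts = defaultdict(Counter)
    for year, label in reviews:
        counts[year][label] += 1

    shares = {}
    for year, counter in sorted(counts.items()):
        total = sum(counter.values())
        shares[year] = {label: counter[label] / total for label in LABELS}
    return shares


# Made-up sample: each tuple is (review year, predicted sentiment).
sample = [
    (2019, "negative"), (2019, "positive"),
    (2020, "positive"), (2020, "positive"), (2020, "neutral"), (2020, "negative"),
]

for year, share in sentiment_share_by_year(sample).items():
    print(year, share)
```

Plotting these shares year by year gives the same kind of stacked view as the graph described here.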

3. Sentiment by rating

The number of reviews and proportion of sentiment broken down by rating.

Now we jump to something that anchors our text-based sentiment to TrustPilot’s earlier results.

By taking each Trustpilot category from 1-Bad to 5-Excellent and breaking down the text of the written reviews by score, you can derive the above graphic.

Looking at the results, and courtesy of taking a deeper look at the reviews via sentiment analysis, we can draw a couple interesting conclusions right off the bat.

  • Trustpilot’s results aren’t useless: the better reviews have higher proportions of positive sentiment and the worse reviews have more negative sentiment. But all reviews contain a little bit of every type of sentiment, which tells us that our reviews are nuanced and thus likely hold even more hidden insight for us!
  • Our reviews are polarized. By count, they skew towards ratings of 5 and 1.

These quick takeaways point us towards goldmines for future analysis: namely, the positive sentiment in negative reviews, the negative sentiment in positive ones, and the reviews rated 2 to 4 (why do these customers feel the way they do, and how could we improve their scores?).

4. Sentiment by Topic

Number of reviews and proportion of sentiment broken down by topics: customer support, shipping, product, pricing, and website

Finally, we can take a look at Sentiment by Topic to begin to illustrate how sentiment analysis can take us even further into our data.

The above chart applies product-linked text classification in addition to sentiment analysis to pair each sentiment with specific product or service features. This is known as aspect-based sentiment analysis.

This means we can know how our customers feel about what, helping us zero in and fix specific pain points or issues. 

These are all great jumping off points designed to visually demonstrate the value of sentiment analysis - but they only scratch the surface of its true power.

Read on for a step-by-step walkthrough of how sentiment analysis works.

How Does Sentiment Analysis Work?

Sentiment analysis, otherwise known as opinion mining, uses natural language processing (NLP) and machine learning algorithms to automatically determine the emotional tone behind online conversations.

There are different algorithms you can implement in sentiment analysis models, depending on how much data you need to analyze, and how accurate you need your model to be. We’ll go over some of these in more detail, below.

Sentiment analysis algorithms fall into one of three buckets:

  • Rule-based: these systems automatically perform sentiment analysis based on a set of manually crafted rules.
  • Automatic: these systems rely on machine learning techniques to learn from data.
  • Hybrid: these systems combine both rule-based and automatic approaches.

Rule-based Approaches

Usually, a rule-based system uses a set of human-crafted rules to help identify subjectivity, polarity, or the subject of an opinion.

These rules may include various NLP techniques developed in computational linguistics, such as:

  • Stemming, tokenization, part-of-speech tagging and parsing.
  • Lexicons (i.e. lists of words and expressions).

Here’s a basic example of how a rule-based system works:

  • Defines two lists of polarized words (e.g. negative words such as bad, worst, ugly, etc. and positive words such as good, best, beautiful, etc.).
  • Counts the number of positive and negative words that appear in a given text.
  • If the number of positive word appearances is greater than the number of negative word appearances, the system returns a positive sentiment, and vice versa. If the numbers are even, the system will return a neutral sentiment.
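The three steps above can be sketched in a few lines. The word lists reuse the examples from the first step; the function name `rule_based_sentiment` and the simple tokenizer are assumptions made for this illustration.

```python
import re

# Step 1: two lists of polarized words (taken from the examples above).
NEGATIVE_WORDS = {"bad", "worst", "ugly"}
POSITIVE_WORDS = {"good", "best", "beautiful"}


def rule_based_sentiment(text: str) -> str:
    """Classify text by comparing counts of positive and negative words."""
    tokens = re.findall(r"[a-z']+", text.lower())

    # Step 2: count how many positive and negative words appear.
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)

    # Step 3: the larger count wins; a tie is neutral.
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"


print(rule_based_sentiment("The best and most beautiful design"))  # positive
print(rule_based_sentiment("An ugly interface"))                   # negative
print(rule_based_sentiment("Good screen, bad battery"))            # neutral
```

Because the function only counts words, "not good" would still be scored as positive, which illustrates the weakness discussed next.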

Rule-based systems are very naive since they don't take into account how words are combined in a sequence. Of course, more advanced processing techniques can be used, and new rules added to support new expressions and vocabulary. However, adding new rules may affect previous results, and the whole system can get very complex. Since rule-based systems often require fine-tuning and maintenance, they’ll also need regular investments.

Automatic Approaches

Automatic methods, contrary to rule-based systems, don't rely on manually crafted rules, but on machine learning techniques. A sentiment analysis task is usually modeled as a classification problem, whereby a classifier is fed a text and returns a category, e.g. positive, negative, or neutral.

Here’s how a machine learning classifier can be implemented:

The Training and Prediction Processes

In the training process (a), our model learns to associate a particular input (i.e. a text) with the corresponding output (tag) based on the samples used for training. The feature extractor transforms the text input into a feature vector. Pairs of feature vectors and tags (e.g. positive, negative, or neutral) are fed into the machine learning algorithm to generate a model.

In the prediction process (b), the feature extractor is used to transform unseen text inputs into feature vectors. These feature vectors are then fed into the model, which generates predicted tags (again, positive, negative, or neutral).
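To make the two processes concrete, here is a small sketch in plain Python. Everything in it is an assumption made for illustration: the function names, the toy training samples, and the deliberately simple "model", which just sums the feature vectors seen for each tag and predicts the tag whose sum overlaps most with the new text. A real system would swap in one of the classification algorithms discussed further down.

```python
from collections import Counter, defaultdict


def extract_features(text):
    """Feature extractor: turn a text into a vector of lowercase word counts."""
    return Counter(text.lower().split())


def train(samples):
    """Training: feed (feature vector, tag) pairs to the learner to build a model."""
    model = defaultdict(Counter)
    for text, tag in samples:
        model[tag].update(extract_features(text))
    return model


def predict(model, text):
    """Prediction: extract features from unseen text and let the model pick a tag."""
    features = extract_features(text)
    scores = {
        tag: sum(count * totals[word] for word, count in features.items())
        for tag, totals in model.items()
    }
    return max(scores, key=scores.get)


# Toy training samples, invented for this example.
training_samples = [
    ("I love this product", "positive"),
    ("great service and great price", "positive"),
    ("I hate the delays", "negative"),
    ("terrible support and bad quality", "negative"),
    ("the package arrived on Monday", "neutral"),
]

model = train(training_samples)
print(predict(model, "great product"))
print(predict(model, "terrible delays"))
```

The important point is that the same extract_features function is used in both phases, so unseen texts end up in the same feature space the model was trained on.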

Feature Extraction from Text

The first step in a machine learning text classifier is to transform the text into a numerical representation, a step known as feature extraction or text vectorization. The classical approach has been bag-of-words or bag-of-ngrams with their frequencies.
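As a rough sketch of what a bag-of-ngrams looks like, the snippet below counts unigrams and bigrams with the standard library. The helper names (ngrams, bag_of_ngrams, vectorize) and the whitespace tokenization are simplifying assumptions for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_ngrams(text, max_n=2):
    """Count every n-gram of length 1..max_n in the text."""
    tokens = text.lower().split()
    bag = Counter()
    for n in range(1, max_n + 1):
        bag.update(ngrams(tokens, n))
    return bag


def vectorize(bag, vocabulary):
    """Turn a bag into a fixed-length frequency vector over a known vocabulary."""
    return [bag[term] for term in vocabulary]


bag = bag_of_ngrams("not good not bad")
print(bag)
print(vectorize(bag, ["not", "good", "great", "not bad"]))
```

Including bigrams such as "not good" lets a classifier see some word order, which plain single-word counts throw away.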

More recently, new feature extraction techniques have been applied based on word embeddings (also known as word vectors). This kind of representation makes it possible for words with similar meaning to have a similar representation, which can improve the performance of classifiers.
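"Similar representation" is usually measured with cosine similarity between vectors. The sketch below uses made-up three-dimensional vectors purely for illustration; real embeddings are learned from large corpora and typically have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, invented for this example.
TOY_EMBEDDINGS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "terrible": [-0.7, 0.3, 0.1],
}

similar = cosine_similarity(TOY_EMBEDDINGS["good"], TOY_EMBEDDINGS["great"])
opposite = cosine_similarity(TOY_EMBEDDINGS["good"], TOY_EMBEDDINGS["terrible"])
print(round(similar, 3), round(opposite, 3))
```

With these toy values, "good" sits much closer to "great" than to "terrible", which is the property a classifier can exploit when it meets a word it rarely saw during training.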

Classification Algorithms

The classification step usually involves a statistical model like Naïve Bayes, Logistic Regression, Support Vector Machines, or Neural Networks:

  • Naïve Bayes: a family of probabilistic algorithms that uses Bayes’s Theorem to predict the category of a text.
  • Logistic Regression: a very well-known algorithm in statistics used to predict the probability of a category (Y) given a set of features (X).
  • Support Vector Machines: a non-probabilistic model which uses a representation of text examples as points in a multidimensional space. Examples of different categories (sentiments) are mapped to distinct regions within that space. Then, new texts are assigned a category based on similarities with existing texts and the regions they’re mapped to.
  • Deep Learning: a diverse set of algorithms that attempt to mimic the human brain, by employing artificial neural networks to process data.
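To give a feel for the first item in the list, here is a compact multinomial Naïve Bayes classifier written from scratch. It is a teaching sketch under our own assumptions (whitespace tokenization, add-one smoothing, a handful of invented training texts), not how any particular library implements it.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # tag -> word -> count
        self.doc_counts = Counter()              # tag -> number of training texts
        self.vocabulary = set()

    def fit(self, samples):
        for text, tag in samples:
            tokens = text.lower().split()
            self.word_counts[tag].update(tokens)
            self.doc_counts[tag] += 1
            self.vocabulary.update(tokens)

    def predict(self, text):
        tokens = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_tag, best_score = None, -math.inf
        for tag in self.doc_counts:
            # log P(tag) + sum of log P(word | tag), per Bayes's Theorem
            score = math.log(self.doc_counts[tag] / total_docs)
            total_words = sum(self.word_counts[tag].values())
            for token in tokens:
                count = self.word_counts[tag][token] + 1
                score += math.log(count / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_tag, best_score = tag, score
        return best_tag


classifier = NaiveBayesSentiment()
classifier.fit([
    ("good best beautiful", "positive"),
    ("good service", "positive"),
    ("bad worst ugly", "negative"),
    ("bad service", "negative"),
])
print(classifier.predict("good phone"))
print(classifier.predict("ugly and bad"))
```

Working with log probabilities avoids numeric underflow on long texts, and the smoothing keeps a single unseen word from zeroing out a whole category.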

Hybrid Approaches

Hybrid systems combine the desirable elements of rule-based and automatic techniques into one system. One huge benefit of these systems is that results are often more accurate.

Sentiment analysis is one of the hardest tasks in natural language processing because even humans struggle to analyze sentiments accurately.

Data scientists are getting better at creating more accurate sentiment classifiers, but there’s still a long way to go. Let’s take a closer look at some of the main challenges of machine-based sentiment analysis:

  • Subjectivity and tone
  • Context and polarity
  • Irony and sarcasm
  • Comparisons
  • Defining neutral
  • Human annotator accuracy

Subjectivity and Tone

There are two types of text: subjective and objective. Objective texts do not contain explicit sentiments, whereas subjective texts do. Say, for example, you intend to analyze the sentiment of the following two texts:

The package is nice.

The package is red.

Most people would say that sentiment is positive for the first one and neutral for the second one, right? Not all predicates (adjectives, verbs, and some nouns) should be treated the same with respect to how they create sentiment. In the examples above, nice is more subjective than red.

Context and Polarity

All utterances are uttered at some point in time, in some place, by and to some people. In other words, all utterances are uttered in context. Analyzing sentiment without context gets pretty difficult. However, machines cannot learn about contexts if they are not mentioned explicitly. One of the problems that arise from context is changes in polarity. Look at the following responses to a survey:

Everything about it.

Absolutely nothing!

Imagine the responses above come from answers to the question “What did you like about the event?” The first response would be positive and the second one would be negative, right? Now, imagine the responses come from answers to the question “What did you DISlike about the event?” The negative in the question changes the sentiment of both responses altogether.

A good deal of preprocessing or postprocessing will be needed if we are to take into account at least part of the context in which texts were produced. However, how to preprocess or postprocess data in order to capture the bits of context that will help analyze sentiment is not straightforward.

Irony and Sarcasm

When it comes to irony and sarcasm, people express their negative sentiments using positive words, which can be difficult for machines to detect without having a thorough understanding of the context of the situation in which a feeling was expressed.

For example, look at some possible answers to the question, Did you enjoy your shopping experience with us?

Yeah, sure. So smooth!

Not one, but many!

What sentiment would you assign to the responses above? The first response with an exclamation mark could be negative, right? The problem is there is no textual cue that will help a machine learn, or at least question, that sentiment, since yeah and sure often belong to positive or neutral texts.

How about the second response? In this context, sentiment is positive, but we’re sure you can come up with many different contexts in which the same response can express negative sentiment.

How to treat comparisons in sentiment analysis is another challenge worth tackling. Look at the texts below:

This product is second to none.

This is better than older tools.

This is better than nothing.

The first comparison doesn’t need any contextual clues to be classified correctly. It’s clear that it’s positive.

The second and third texts are a little more difficult to classify, though. Would you classify them as neutral, positive, or even negative? Once again, context can make a difference. For example, if the ‘older tools’ in the second text were considered useless, then the second text is pretty similar to the third text.

There are two types of emojis according to Guibon et al. Western emojis (e.g. :D) are encoded in only one or two characters, whereas Eastern emojis (e.g. ¯\_(ツ)_/¯) are a longer combination of characters of a vertical nature. Emojis play an important role in the sentiment of texts, particularly in tweets.

You’ll need to pay special attention to the character level, as well as the word level, when performing sentiment analysis on tweets. A lot of preprocessing might also be needed. For example, you might want to preprocess social media content and transform both Western and Eastern emojis into tokens and whitelist them (i.e. always take them as a feature for classification purposes) in order to help improve sentiment analysis performance.
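Here is one way that preprocessing step might look. The emoticon table and the EMO_ token names are hypothetical choices for this sketch; in practice you would build the table from a fuller emoticon list.

```python
import re

# Hypothetical mapping from emoticons to whitelisted tokens.
EMOTICON_TOKENS = {
    ":D": "EMO_LAUGH",
    ":)": "EMO_SMILE",
    ":(": "EMO_SAD",
    "¯\\_(ツ)_/¯": "EMO_SHRUG",
}

# Match longer emoticons first so short ones cannot split them apart.
_EMOTICON_PATTERN = re.compile(
    "|".join(
        re.escape(emoticon)
        for emoticon in sorted(EMOTICON_TOKENS, key=len, reverse=True)
    )
)


def replace_emoticons(text):
    """Swap each known emoticon for its token, padded with spaces."""
    return _EMOTICON_PATTERN.sub(
        lambda match: f" {EMOTICON_TOKENS[match.group(0)]} ", text
    )


def tokenize(text):
    """Whitespace tokenizer that keeps emoticon tokens as features."""
    return replace_emoticons(text).split()


print(tokenize("Great flight :D"))
print(tokenize("Delayed again ¯\\_(ツ)_/¯"))
```

Because the replacement happens before tokenization, later cleanup steps such as punctuation stripping cannot destroy the emoticons, and the resulting tokens can always be kept as features.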

Here’s a quite comprehensive list of emojis and their Unicode characters that may come in handy when preprocessing.

Defining what we mean by neutral is another challenge to tackle in order to perform accurate sentiment analysis. As in all classification problems, defining your categories (and, in this case, the neutral tag) is one of the most important parts of the problem. What you mean by neutral, positive, or negative does matter when you train sentiment analysis models. Since tagging data requires that tagging criteria be consistent, a good definition of the problem is a must.

Here are some ideas to help you identify and define neutral texts:

  • Objective texts. So-called objective texts do not contain explicit sentiments, so you should include those texts in the neutral category.
  • Irrelevant information. If you haven’t preprocessed your data to filter out irrelevant information, you can tag it neutral. However, be careful! Only do this if you know how this could affect overall performance. Sometimes, you will be adding noise to your classifier and performance could get worse.
  • Texts containing wishes. Some wishes like “I wish the product had more integrations” are generally neutral. However, those including comparisons like “I wish the product were better” are pretty difficult to categorize.

Sentiment analysis is a tremendously difficult task even for humans. On average, inter-annotator agreement (a measure of how well two (or more) human labelers can make the same annotation decision) is pretty low when it comes to sentiment analysis. And since machines learn from labeled data, sentiment analysis classifiers might not be as precise as other types of classifiers.
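One common way to put a number on inter-annotator agreement between two labelers is Cohen's kappa, which corrects raw agreement for the agreement you would expect by chance. The sketch below is a plain-Python version; the two label lists are invented for the example, and for more than two annotators you would need a different statistic.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items where both annotators chose the same tag.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator assigned tags at their own base rates.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[tag] * counts_b[tag] for tag in counts_a) / (n * n)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented annotations for six texts.
annotator_1 = ["positive", "positive", "negative", "neutral", "negative", "positive"]
annotator_2 = ["positive", "neutral", "negative", "neutral", "positive", "positive"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

In this toy case the annotators agree on four of six texts, yet kappa comes out below 0.5 once chance agreement is removed, which illustrates why raw agreement percentages can flatter the quality of a labeled dataset.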

Still, sentiment analysis is worth the effort, even if your sentiment analysis predictions are wrong from time to time. By using MonkeyLearn’s sentiment analysis model, you can expect correct predictions about 70-80% of the time you submit your texts for classification.

If you are new to sentiment analysis, then you’ll quickly notice improvements. For typical use cases, such as ticket routing, brand monitoring, and VoC analysis, you’ll save a lot of time and money on tedious manual tasks.

Sentiment Analysis Use Cases & Applications

The applications of sentiment analysis are endless and can be applied to any industry, from finance and retail to hospitality and technology. Below, we’ve listed some of the most popular ways that sentiment analysis is being used in business:

  • Social media monitoring
  • Brand monitoring
  • Voice of customer (VoC)
  • Customer service
  • Market research

Sentiment analysis is used in social media monitoring, allowing businesses to gain insights about how customers feel about certain topics, and detect urgent issues in real time before they spiral out of control.

On the fateful evening of April 9th, 2017, United Airlines forcibly removed a passenger from an overbooked flight. The nightmarish incident was filmed by other passengers on their smartphones and posted immediately. One of the videos, posted to Facebook, was shared more than 87,000 times and viewed 6.8 million times by 6pm on Monday, just 24 hours later.

The fiasco was only magnified by the company’s dismissive response. On Monday afternoon, United’s CEO tweeted a statement apologizing for “having to re-accommodate customers.”

This is exactly the kind of PR catastrophe you can avoid with sentiment analysis. It’s an example of why it’s important to care, not only about whether people are talking about your brand, but how they’re talking about it. More mentions don't equal positive mentions.

Brands of all shapes and sizes have meaningful interactions with customers, leads, even their competition, all across social media. By monitoring these conversations you can understand customer sentiment in real time and over time, so you can detect disgruntled customers immediately and respond as soon as possible.

Most marketing departments are already tuned into online mentions as far as volume – they measure more chatter as more brand awareness. But businesses need to look beyond the numbers for deeper insights.

Not only do brands have a wealth of information available on social media, but across the internet, on news sites, blogs, forums, product reviews, and more. Again, we can look at not just the volume of mentions, but the individual and overall quality of those mentions.

In our United Airlines example, for instance, the flare-up started on the social media accounts of just a few passengers. Within hours, it was picked up by news sites and spread like wildfire across the US, then to China and Vietnam, as United was accused of racial profiling against a passenger of Chinese-Vietnamese descent. In China, the incident became the number one trending topic on Weibo, a microblogging site with almost 500 million users.

And again, this is all happening within mere hours of the incident.

Brand monitoring offers a wealth of insights from conversations happening about your brand from all over the internet. Analyze news articles, blogs, forums, and more to gauge brand sentiment, and target certain demographics or regions, as desired. Automatically categorize the urgency of all brand mentions and route them instantly to designated team members.

Get an understanding of customer feelings and opinions, beyond mere numbers and statistics. Understand how your brand image evolves over time, and compare it to that of your competition. You can tune into a specific point in time to follow product releases, marketing campaigns, IPO filings, etc., and compare them to past events.

Real-time sentiment analysis allows you to identify potential PR crises and take immediate action before they become serious issues. Or identify positive comments and respond directly, to use them to your benefit.

Example: Expedia Canada

Around Christmas time, Expedia Canada ran a classic “escape winter” marketing campaign. All was well, except for the screeching violin they chose as background music. Understandably, people took to social media, blogs, and forums. Expedia noticed right away and removed the ad.

Then, they created a series of follow-up spin-off videos: one showed the original actor smashing the violin; another invited a real negative Twitter user to rip the violin out of the actor’s hands on screen. Though their original campaign was a flop, Expedia were able to redeem themselves by listening to their customers and responding.

Sentiment analysis allows you to automatically monitor all chatter around your brand and detect and address this type of potentially-explosive scenario while you still have time to defuse it.

Voice of Customer (VoC)

Social media and brand monitoring offer us immediate, unfiltered, and invaluable information on customer sentiment, but you can also put this analysis to work on surveys and customer support interactions.

Net Promoter Score (NPS) surveys are one of the most popular ways for businesses to gain feedback with the simple question: Would you recommend this company, product, and/or service to a friend or family member? These result in a single score on a number scale.

Businesses use these scores to identify customers as promoters, passives, or detractors. The goal is to identify overall customer experience, and find ways to elevate all customers to “promoter” level, where they, theoretically, will buy more, stay longer, and refer other customers.
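For reference, the usual NPS arithmetic works like this: answers are given on a 0 to 10 scale, 9 and 10 count as promoters, 7 and 8 as passives, 0 to 6 as detractors, and the score is the percentage of promoters minus the percentage of detractors. A short sketch, with function names of our own choosing and made-up survey answers:

```python
def classify_respondent(score):
    """Map a 0-10 survey answer to promoter, passive or detractor."""
    if not 0 <= score <= 10:
        raise ValueError("NPS answers must be between 0 and 10")
    if score >= 9:
        return "promoter"
    if score >= 7:
        return "passive"
    return "detractor"


def net_promoter_score(scores):
    """NPS = percentage of promoters minus percentage of detractors."""
    if not scores:
        raise ValueError("At least one survey answer is required")
    groups = [classify_respondent(score) for score in scores]
    promoters = groups.count("promoter")
    detractors = groups.count("detractor")
    return 100 * (promoters - detractors) / len(scores)


# Invented survey answers: 5 promoters, 2 passives, 3 detractors.
answers = [10, 9, 8, 7, 6, 3, 10, 9, 5, 10]
print(net_promoter_score(answers))  # 20.0
```

The score ranges from -100 (everyone is a detractor) to 100 (everyone is a promoter), and passives only affect it by enlarging the denominator.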

Numerical (quantitative) survey data is easily aggregated and assessed. But the next question in NPS surveys, asking why survey participants left the score they did, seeks open-ended responses, or qualitative data.

Open-ended survey responses were previously much more difficult to analyze, but with sentiment analysis these texts can be classified into positive and negative (and everywhere in between), offering further insights into the Voice of Customer (VoC).

Sentiment analysis can be used on any kind of survey – quantitative and qualitative – and on customer support interactions, to understand the emotions and opinions of your customers. Tracking customer sentiment over time adds depth to help understand why NPS scores or sentiment toward individual aspects of your business may have changed.
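Tracking sentiment over time can be as simple as averaging classifier output per period. The sketch below assumes you already have (month, tag) pairs from some sentiment model; the numeric mapping of tags to +1, 0 and -1 and the sample records are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean

# Assumed numeric value for each predicted tag.
TAG_SCORE = {"positive": 1, "neutral": 0, "negative": -1}


def sentiment_trend(records):
    """Average sentiment score per period, ordered by period label."""
    by_period = defaultdict(list)
    for period, tag in records:
        by_period[period].append(TAG_SCORE[tag])
    return {period: mean(scores) for period, scores in sorted(by_period.items())}


# Invented classifier output for a handful of survey responses.
records = [
    ("2024-01", "positive"),
    ("2024-01", "negative"),
    ("2024-02", "positive"),
    ("2024-02", "positive"),
    ("2024-01", "positive"),
]

trend = sentiment_trend(records)
for period, score in trend.items():
    print(period, round(score, 2))
```

Plotting these averages next to your NPS over the same periods is one way to see whether a change in score lines up with a change in how people write about you.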

You can use it on incoming surveys and support tickets to detect customers who are ‘strongly negative’ and target them immediately to improve their service. Zero in on certain demographics to understand what works best and how you can improve.

Real-time analysis allows you to see shifts in VoC right away and understand the nuances of the customer experience over time beyond statistics and percentages.

Discover how we analyzed the sentiment of thousands of Facebook reviews, and transformed them into actionable insights.

Example: McKinsey City Voices project

In Brazil, federal public spending rose by 156% from 2007 to 2015, while satisfaction with public services steadily decreased. Unhappy with this counterproductive progress, the Urban Planning Department recruited McKinsey to help them focus on user experience, or “citizen journeys,” when delivering services. This citizen-centric style of governance has led to the rise of what we call Smart Cities.

McKinsey developed a tool called City Voices, which conducts citizen surveys across more than 150 metrics, and then runs sentiment analysis to help leaders understand how constituents live and what they need, in order to better inform public policy. By using this tool, the Brazilian government was able to uncover the most urgent needs – a safer bus system, for instance – and improve them first.

If this can be successful on a national scale, imagine what it can do for your company.

We already looked at how we can use sentiment analysis in terms of the broader VoC, so now we’ll dial in on customer service teams.

We all know the drill: stellar customer experiences mean a higher rate of returning customers. Leading companies know that how they deliver is just as important as what they deliver, if not more so. Customers expect their experience with companies to be immediate, intuitive, personal, and hassle-free. If not, they’ll leave and do business elsewhere. Did you know that one in three customers will leave a brand after just one bad experience?

You can use sentiment analysis and text classification to automatically organize incoming support queries by topic and urgency to route them to the correct department and make sure the most urgent are handled right away.
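As a rough idea of what routing by topic and urgency involves, here is a toy router. The keyword lists, department names and function names are all hypothetical, and simple keyword matching stands in for the trained topic and sentiment classifiers you would use in practice.

```python
import re

# Hypothetical keyword lists, for illustration only.
TOPIC_KEYWORDS = {
    "billing": {"invoice", "refund", "charge"},
    "technical": {"error", "crash", "bug"},
}
URGENCY_CUES = {"urgent", "asap", "immediately", "unacceptable"}


def route_ticket(text):
    """Pick a department by topic keywords and a priority by urgency cues."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    department = next(
        (name for name, keywords in TOPIC_KEYWORDS.items() if tokens & keywords),
        "general",
    )
    priority = "high" if tokens & URGENCY_CUES else "normal"
    return {"department": department, "priority": priority}


def sort_queue(tickets):
    """Put high-priority tickets first, keeping arrival order otherwise."""
    return sorted(tickets, key=lambda text: route_ticket(text)["priority"] != "high")


print(route_ticket("The app keeps showing an error, fix this ASAP!"))
print(sort_queue(["Hello there", "Urgent: refund needed"]))
```

Swapping route_ticket's keyword checks for model predictions keeps the rest of the flow unchanged: every incoming query still gets a department and a priority, and the queue is still sorted on that priority.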

Analyze customer support interactions to ensure your employees are following appropriate protocol. Increase efficiency, so customers aren’t left waiting for support. Decrease churn rates; after all, it’s less hassle to keep customers than to acquire new ones.

Discover how we analyzed customer support interactions on Twitter.

Sentiment analysis empowers all kinds of market research and competitive analysis. Whether you’re exploring a new market, anticipating future trends, or seeking an edge on the competition, sentiment analysis can make all the difference.

You can analyze online reviews of your products and compare them to your competition. Maybe your competitor released a new product that landed as a flop. Find out what aspects of the product performed most negatively and use it to your advantage.

Follow your brand and your competition in real time on social media. Locate new markets where your brand is likely to succeed. Uncover trends just as they emerge, or follow long-term market leanings through analysis of formal market reports and business journals.

You’ll tap into new sources of information and be able to quantify otherwise qualitative information. With social data analysis you can fill in gaps where public data is scarce, like emerging markets.

Discover how to analyze the sentiment of hotel reviews on TripAdvisor or perform sentiment analysis on Yelp restaurant reviews.

Sentiment Analysis Resources

Sentiment analysis is a vast topic, and it can be intimidating to get started. Luckily, there are many useful resources, from helpful tutorials to all kinds of free online tools, to help you take your first steps.

Free Online Sentiment Analysis Tools

A good start to your journey is to simply play around with a sentiment analysis tool. A little first-hand experience will help you understand how it works.

Next, to take your sentiment analysis further, you’ll want to try out MonkeyLearn’s sentiment analysis and keyword template. First, you’ll need to sign up, then walk through the following steps:

1. Choose Keyword + Sentiment Analysis template

2. Upload your data

If you don't have a CSV, you can use our sample dataset.

3. Match the CSV columns to the dashboard fields

In this template, there is only one field: text. If you have more than one column in your dataset, choose the column that has the text you would like to analyze.

4. Name your workflow

5. Wait for your data to import

6. Explore your dashboard!

  • Filter by sentiment or keyword.
  • Share via email with other coworkers.

Open Source vs SaaS (Software as a Service) Sentiment Analysis Tools

When it comes to sentiment analysis (and text analysis in general), you have two choices: build your own solution or buy a tool.

Open source libraries in languages like Python and Java are particularly well suited to building your own sentiment analysis solution because their communities lean heavily toward data science, including natural language processing and deep learning for sentiment analysis. But you’ll need a team of data scientists and engineers on board, huge upfront investments, and time to spare.

SaaS tools offer the option to implement pre-trained sentiment analysis models immediately or custom-train your own, often in just a few steps. These tools are recommended if you don’t have a data science or engineering team on board, since they can be implemented with little or no code and can save months of work and money (upwards of $100,000).

Another key advantage of SaaS tools is that you don't even need to know how to code; they provide integrations with third-party apps, like MonkeyLearn’s Zendesk, Excel and Zapier Integrations.

If you want to get started with these out-of-the-box tools, check out this guide to the best SaaS tools for sentiment analysis, which also come with APIs for seamless integration with your existing tools.

Or start learning how to perform sentiment analysis using MonkeyLearn’s API and the pre-built sentiment analysis model, with just six lines of code. Then, train your own custom sentiment analysis model using MonkeyLearn’s easy-to-use UI.

  • Tutorial on sentiment analysis in Python using MonkeyLearn’s API.

If you’re still convinced that you need to build your own sentiment analysis solution, check out these tools and tutorials in various programming languages:

Sentiment Analysis Python

  • Scikit-learn is the go-to library for machine learning and has useful tools for text vectorization. Training a classifier on top of vectorizations, like frequency or tf-idf text vectorizers, is quite straightforward. Scikit-learn has implementations for Support Vector Machines, Naïve Bayes, and Logistic Regression, among others.
  • NLTK has been the traditional NLP library for Python. It has an active community and offers the possibility to train machine learning classifiers.
  • spaCy is an NLP library with a growing community. Like NLTK, it provides a strong set of low-level functions for NLP and support for training text classifiers.
  • TensorFlow, developed by Google, provides a low-level set of tools to build and train neural networks. There's also support for text vectorization, both through traditional word frequency and through more advanced word embeddings.
  • Keras provides useful abstractions to work with multiple neural network types, like recurrent neural networks (RNNs) and convolutional neural networks (CNNs), and to easily stack layers of neurons. Keras can be run on top of TensorFlow or Theano. It also provides useful tools for text classification.
  • PyTorch is a recent deep learning framework backed by some prestigious organizations like Facebook, Twitter, Nvidia, Salesforce, Stanford University, University of Oxford, and Uber. It has quickly developed a strong community.

Tutorials to try out:

Python web scraping and sentiment analysis: this tutorial provides a step-by-step guide on how to analyze the top 100 subreddits by sentiment. It explains how to use Beautiful Soup, one of the most popular Python libraries for web scraping, to collect the names of the top subreddit web pages (subreddits like /r/funny, /r/AskReddit and /r/todayilearned).

Using the PRAW library, it demonstrates how to interact with the Reddit API and extract the comments from these subreddits. Then, learn how to use TextBlob to perform sentiment analysis on the extracted comments. Code: https://github.com/jg-fisher/redditSentiment

Twitter sentiment analysis using Python and NLTK: This step-by-step guide shows you how to train your first sentiment classifier. The author uses the Natural Language Toolkit (NLTK) to train a classifier on tweets.

Making Sentiment Analysis Easy with Scikit-learn: This tutorial explains how to train a logistic regression model for sentiment analysis.

Sentiment Analysis Java

Java is another programming language with a strong community around data science with remarkable data science libraries for NLP.

  • OpenNLP: a toolkit that supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, language detection and coreference resolution.
  • Stanford CoreNLP: a Java suite of core NLP tools provided by The Stanford NLP Group.
  • LingPipe: a Java toolkit for processing text using computational linguistics. LingPipe is often used for text classification and entity extraction.
  • Weka: a set of tools created by The University of Waikato for data pre-processing, classification, regression, clustering, association rules, and visualization.

Sentiment analysis research and courses

After learning the basics of sentiment analysis, and understanding how it can help you, you might want to delve further into the topic:

Sentiment Analysis Papers

The literature around sentiment analysis is massive; there are more than 55,700 scholarly articles, papers, theses, books, and abstracts out there.

The following are the most frequently cited and read papers in the sentiment analysis community in general:

  • Opinion mining and sentiment analysis (Pang and Lee, 2008)
  • Recognizing contextual polarity in phrase-level sentiment analysis (Wilson, Wiebe and Hoffmann, 2005)
  • A survey of opinion mining and sentiment analysis (Liu and Zhang, 2012)
  • Sentiment analysis and opinion mining (Liu, 2012)
  • How to Perform Text Mining with Sentiment Analysis

Sentiment Analysis Books

Bing Liu is a thought leader in the field of machine learning and has written a book about sentiment analysis and opinion mining.

Useful for those starting research on sentiment analysis, Liu does a wonderful job of explaining sentiment analysis in a way that is highly technical, yet understandable. In the book, he covers different aspects of sentiment analysis including applications, research, sentiment classification using supervised and unsupervised learning, sentence subjectivity, aspect-based sentiment analysis, and more.

For those who want to learn about deep-learning based approaches for sentiment analysis, a relatively new and fast-growing research area, take a look at Deep-Learning Based Approaches for Sentiment Analysis.

Sentiment Analysis Courses and Lectures

Another good way to go deeper with sentiment analysis is mastering your knowledge and skills in natural language processing (NLP), the computer science field that focuses on understanding ‘human’ language.

By combining machine learning, computational linguistics, and computer science, NLP allows a machine to understand natural language including people's sentiments, evaluations, attitudes, and emotions from written language.

There are a large number of courses, lectures, and resources available online, but the essential NLP course is the Stanford Coursera course by Dan Jurafsky and Christopher Manning. By taking this course, you will get a step-by-step introduction to the field by two of the most reputable names in the NLP community.

If you want a more hands-on course, you should enroll in the Data Science: Natural Language Processing (NLP) in Python on Udemy. This course gives you a good introduction to NLP and what it can do, but it will also make you build different projects in Python, including a spam detector, a sentiment analyzer, and an article spinner. Most of the lectures are really short (~5 minutes) and the course strikes the right balance between practical and theoretical content.

Sentiment Analysis Datasets

The key part for mastering sentiment analysis is working on different datasets and experimenting with different approaches. First, you’ll need to get your hands on data and procure a dataset which you will use to carry out your experiments.

The following are some of our favorite sentiment analysis datasets for experimenting with sentiment analysis and a machine learning approach. They’re open and free to download:

  • Product reviews: this dataset consists of a few million Amazon customer reviews with star ratings, super useful for training a sentiment analysis model.
  • Restaurant reviews: this dataset consists of 5.2 million Yelp reviews with star ratings.
  • Movie reviews: this dataset consists of 1,000 positive and 1,000 negative processed reviews. It also provides 5,331 positive and 5,331 negative processed sentences / snippets.
  • Fine food reviews: this dataset consists of ~500,000 food reviews from Amazon. It includes product and user information, ratings, and a plain text version of every review.
  • Twitter airline sentiment on Kaggle: this dataset consists of ~15,000 labeled tweets (positive, neutral, and negative) about airlines.
  • First GOP Debate Twitter Sentiment: this dataset consists of ~14,000 labeled tweets (positive, neutral, and negative) about the first GOP debate of the 2016 election cycle.

If you are interested in a rule-based approach, the following is a varied list of sentiment analysis lexicons that will come in handy. These lexicons provide a set of dictionaries of words with labels specifying their sentiments across different domains. The following lexicons are really useful to identify the sentiment of texts:

  • Sentiment Lexicons for 81 Languages: this dataset contains both positive and negative sentiment lexicons for 81 languages.
  • SentiWordNet: this dataset contains about 29,000 words with a sentiment score between 0 and 1.
  • Opinion Lexicon for Sentiment Analysis: this dataset provides a list of 4,782 negative words and 2,005 positive words in English.
  • Wordstat Sentiment Dictionary: this dataset includes ~4,800 positive and ~9,000 negative words.
  • Emoticon Sentiment Lexicon: this dataset contains a list of 477 emoticons labeled as positive, neutral, or negative.

Parting words

Sentiment analysis can be applied to countless aspects of business, from brand monitoring and product analytics, to customer service and market research. By incorporating it into their existing systems and analytics, leading brands (not to mention entire cities) are able to work faster, with more accuracy, toward more useful ends.

Sentiment analysis has moved beyond merely an interesting, high-tech whim, and will soon become an indispensable tool for all companies of the modern age. Ultimately, sentiment analysis enables us to glean new insights, better understand our customers, and empower our own teams more effectively so that they do better and more productive work.

MonkeyLearn is an online platform that makes it easy to perform text analytics with machine learning and data visualization tools.

If you need help building a sentiment analysis system for your business, visit MonkeyLearn Studio and request a demo.

MonkeyLearn Inc. All rights reserved 2024

A Complete Guide to Sentiment Analysis

“That movie was a colossal disaster… I absolutely hated it! Waste of time and money #skipit”

“Have you seen the new season of XYZ? It is so good!”

“You should really check out this new app, it’s awesome! And it makes your life so convenient.”

By reading these comments, can you figure out what the emotions behind them are?

They may seem obvious to you because we, as humans, are capable of discerning the complex emotional sentiments behind the text.

Not only have we been educated to understand the meanings, intentions, and grammar behind each of these particular sentences, but we’ve also personally felt many of these emotions before and, from our own experiences, can conjure up the deeper meaning behind these words.

Moreover, we’re also extremely familiar with the real-world objects that the text is referring to.

This doesn’t apply to machines, but they do have other ways of determining positive and negative sentiments! How do they do this, exactly? By using sentiment analysis. In this article, we will discuss how a computer can decipher emotions by using sentiment analysis methods, and what the implications of this can be. If you want to skip ahead to a certain section, simply use the clickable menu:

  • What is sentiment analysis?
  • How does sentiment analysis work?
  • Sentiment analysis use cases
  • Machine learning and sentiment analysis
  • Advantages of sentiment analysis
  • Disadvantages of sentiment analysis
  • Key takeaways and next steps

1. What is sentiment analysis?

With computers getting smarter and smarter, surely they’re able to decipher and discern between the wide range of different human emotions, right?

Wrong—while they are intelligent machines, computers can neither see nor feel any emotions, with the only input they receive being in the form of zeros and ones—or what’s more commonly known as binary code.

On the other hand, computers excel at the one thing that humans struggle with: processing large amounts of data quickly and effectively. So, theoretically, if we could teach machines how to identify the sentiments behind plain text, we could evaluate the emotional response to a certain product by analyzing hundreds of thousands of reviews or tweets.

This would, in turn, provide companies with invaluable feedback and help them tailor their next product to better suit the market’s needs. So, what kind of process is this? Sentiment analysis!

Sentiment analysis, also known as opinion mining , is the process of determining the emotions behind a piece of text. Sentiment analysis aims to categorize the given text as positive, negative, or neutral.

Furthermore, it then identifies and quantifies subjective information about those texts with the help of:

  • natural language processing (NLP)
  • text analysis
  • computational linguistics
  • machine learning

2. How does sentiment analysis work?

There are two main methods for sentiment analysis: machine learning and lexicon-based.

The machine learning method leverages human-labeled data to train the text classifier, making it a supervised learning method.

The lexicon-based approach breaks down a sentence into words and scores each word’s semantic orientation based on a dictionary. It then adds up the various scores to arrive at a conclusion.

In this example, we will look at how sentiment analysis works using a simple lexicon-based approach. We’ll take the following comment as our test data:

“That movie was a colossal disaster… I absolutely hated it! Waste of time and money #skipit”

Step 1: Cleaning

The initial step is to remove special characters and numbers from the text. In our example, we’ll remove the ellipsis, the exclamation mark, and the hash sign from the comment above.

That movie was a colossal disaster I absolutely hated it Waste of time and money skipit

Step 2: Tokenization

Tokenization is the process of breaking down a text into smaller chunks called tokens, which are either individual words or short sentences.

Breaking down a paragraph into sentences is known as sentence tokenization , and breaking down a sentence into words is known as word tokenization .

[ ‘That’, ‘movie’, ‘was’, ‘a’, ‘colossal’, ‘disaster’, ‘I’, ‘absolutely’, ‘hated’, ‘it’,  ‘Waste’, ‘of’, ‘time’, ‘and’, ‘money’, ‘skipit’ ]
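The cleaning and tokenization steps can be sketched in a few lines of Python. This is a minimal illustration rather than a production tokenizer: the helper names `clean` and `tokenize` are made up for this example, and real systems typically rely on an NLP library instead of a single regular expression.

```python
import re

comment = ("That movie was a colossal disaster… I absolutely hated it! "
           "Waste of time and money #skipit")

def clean(text):
    # Keep only ASCII letters and whitespace; punctuation, digits and
    # symbols such as "…", "!" and "#" are dropped.
    return re.sub(r"[^A-Za-z\s]", "", text)

def tokenize(text):
    # Word tokenization: split on runs of whitespace.
    return text.split()

cleaned = clean(comment)
tokens = tokenize(cleaned)
print(cleaned)
print(tokens)
```

Note that this simple character filter would also strip accented letters, so it only suits plain English text like the example comment.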

Step 3: Part-of-speech (POS) tagging

Part-of-speech tagging is the process of tagging each word with its grammatical group, categorizing it as, for example, a noun, pronoun, verb, adjective, or adverb, depending on its context.

This transforms each token into a tuple of the form (word, tag). POS tagging is used to preserve the context of a word.

[ (‘That’, ‘DT’), (‘movie’, ‘NN’), (‘was’, ‘VBD’), (‘a’, ‘DT’), (‘colossal’, ‘JJ’), (‘disaster’, ‘NN’), (‘I’, ‘PRP’), (‘absolutely’, ‘RB’), (‘hated’, ‘VBD’), (‘it’, ‘PRP’), (‘Waste’, ‘NN’), (‘of’, ‘IN’), (‘time’, ‘NN’), (‘and’, ‘CC’), (‘money’, ‘NN’), (‘skipit’, ‘NN’) ]

Step 4: Removing stop words

Stop words are words like ‘have,’ ‘but,’ ‘we,’ ‘he,’ ‘into,’ ‘just,’ and so on. These words carry little information and are generally considered noise, so they are removed from the data.

[ ‘movie’, ‘colossal’, ‘disaster’, ‘absolutely’, ‘hated’, ‘Waste’, ‘time’, ‘money’, ‘skipit’ ]
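Stop word removal amounts to filtering the token list against a set of known words. The sketch below assumes a tiny hand-picked stop list (`STOP_WORDS`) that covers only this example; NLP libraries ship much longer lists.

```python
# Tiny illustrative stop list (an assumption for this example only).
STOP_WORDS = {"that", "was", "a", "i", "it", "of", "and"}

tokens = ["That", "movie", "was", "a", "colossal", "disaster", "I",
          "absolutely", "hated", "it", "Waste", "of", "time", "and",
          "money", "skipit"]

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    # Compare case-insensitively, but keep each token's original casing.
    return [t for t in tokens if t.lower() not in stop_words]

filtered = remove_stop_words(tokens)
print(filtered)
```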

Step 5: Stemming

Stemming is a process of linguistic normalization which removes the suffix of each of these words and reduces them to their base word. For example, loved is reduced to love, wasted is reduced to waste. Here, hated is reduced to hate.

[ ‘movie’, ‘colossal’, ‘disaster’, ‘absolutely’, ‘hate’, ‘Waste’, ‘time’, ‘money’, ‘skipit’ ]

Step 6: Final Analysis

In a lexicon-based approach, the remaining words are compared against the sentiment libraries, and the scores obtained for each token are added or averaged.

Sentiment libraries are a list of predefined words and phrases which are manually scored by humans. For example, ‘worst’ is scored -3, and ‘amazing’ is scored +3. 

With a basic dictionary, our example comment will be turned into:

movie= 0, colossal= 0, disaster= -2,  absolutely=0, hate=-2, waste= -1, time= 0, money= 0, skipit= 0

This makes the overall score of the comment -5 , classifying the comment as negative.
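The scoring step can be reproduced with a small dictionary lookup. The sketch below uses only the three non-zero scores from the example above; the names `LEXICON`, `score_tokens` and `classify` are hypothetical, and words missing from the dictionary are assumed to score 0.

```python
# Scores taken from the worked example; every other word defaults to 0.
LEXICON = {"disaster": -2, "hate": -2, "waste": -1}

def score_tokens(tokens, lexicon=LEXICON):
    # Look up each token case-insensitively and keep (word, score) pairs.
    return [(t, lexicon.get(t.lower(), 0)) for t in tokens]

def classify(total):
    # Sign of the summed score decides the label.
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"

stems = ["movie", "colossal", "disaster", "absolutely", "hate",
         "Waste", "time", "money", "skipit"]
scored = score_tokens(stems)
total = sum(score for _, score in scored)
print(scored)
print(total, classify(total))
```

Summing is used here, as in the example; averaging the scores instead would make comments of different lengths easier to compare.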

3. Sentiment analysis use cases

Sentiment analysis is used to swiftly glean insights from enormous amounts of text data, with applications spanning politics, finance, retail, hospitality, and healthcare. For instance, consider its usefulness in the following scenarios:

  • Brand reputation management:  Sentiment analysis allows you to track all the online chatter about your brand and spot potential PR disasters before they become major concerns. 
  • Voice of the customer: The “voice of the customer” refers to the feedback and opinions you get from your clients all over the world. You can improve your product and meet your clients’ needs with the help of this feedback and sentiment analysis.
  • Voice of the employee:   Employee satisfaction can be measured for your company by analyzing reviews on sites like Glassdoor, allowing you to determine how to improve the work environment you have created.
  • Market research: You can analyze and monitor internet reviews of your products and those of your competitors to see how the public differentiates between them, helping you glean indispensable feedback and refine your products and marketing strategies accordingly. Furthermore, sentiment analysis in market research can help you anticipate future trends and thus gain a first-mover advantage.

Other applications for sentiment analysis could include:

  • Customer support
  • Social media monitoring
  • Voice assistants & chatbots
  • Election polls
  • Customer experience about a product
  • Stock market sentiment and market movement
  • Analyzing movie reviews

4. Machine learning and sentiment analysis

Sentiment analysis tasks are typically treated as classification problems in the machine learning approach.

Data analysts use historical textual data that has been manually labeled as positive, negative, or neutral as the training set. They then perform feature extraction on this labeled dataset and use the result to train the model to recognize the relevant patterns. Once trained, the model can predict the sentiment of a fresh piece of text.

Naive Bayes, logistic regression, support vector machines, and neural networks are some of the classification algorithms commonly used in sentiment analysis tasks. The high accuracy of prediction is one of the key advantages of the machine learning approach.

5. Advantages of sentiment analysis

Considering large amounts of data on the internet are entirely unstructured, data analysts need a way to evaluate this data.

With regards to sentiment analysis, data analysts want to extract and identify emotions, attitudes, and opinions from our sample sets. Reading and assigning a rating to a large number of reviews, tweets, and comments is not an easy task, but with the help of sentiment analysis, this can be accomplished quickly.

Another valuable feature of sentiment analysis is its ability to quickly analyze reactions to events such as new product launches or new policy proposals in real time. Thus, sentiment analysis can be a cost-effective and efficient way to gauge public opinion and manage it accordingly.

6. Disadvantages of sentiment analysis

Sentiment analysis, as fascinating as it is, is not without its flaws.

Human language is nuanced and often far from straightforward. Machines might struggle to identify the emotions behind an individual piece of text despite their extensive grasp of past data. Some situations where sentiment analysis might fail are:

  • Sarcasm, jokes, irony. These things generally don’t follow a fixed set of rules, so they might not be correctly classified by sentiment analytics systems.
  • Nuance. Words can have multiple meanings and connotations, which are entirely subject to the context they occur in.
  • Multipolarity. When the given text is positive in some parts and negative in others.
  • Negation detection. It can be challenging for the machine because the function and the scope of the word ‘not’ in a sentence is not definite; moreover, suffixes and prefixes such as ‘non-,’ ‘dis-,’ ‘-less’ etc. can change the meaning of a text.

7. Key takeaways and next steps

In this article, we examined the science and nuances of sentiment analysis. While sentiment analysis is a method that’s nowhere near perfect, as more data is generated and fed into machines, they will continue to get smarter and improve the accuracy with which they process that data.

All in all, sentiment analysis has a wide range of use cases and is an indispensable tool for companies that hope to leverage the power of data to make optimal decisions.

Sentiment Analysis: Comprehensive Beginners Guide

What is sentiment analysis?

Sentiment analysis is used to determine whether a given text contains negative, positive, or neutral emotions. It’s a form of text analytics that uses natural language processing (NLP) and machine learning. Sentiment analysis is also known as “opinion mining” or “emotion artificial intelligence”.

Sentiment Scoring

A key aspect of sentiment analysis is polarity classification. Polarity refers to the overall sentiment conveyed by a particular text, phrase or word. This polarity can be expressed as a numerical rating known as a “sentiment score”. For example, this score can be a number between -100 and 100 with 0 representing neutral sentiment. This score could be calculated for an entire text or just for an individual phrase.

Fine-grained Sentiment Analysis

Sentiment scoring can be as fine-grained as required for a specific use case. Categories can expand beyond just “positive”, “neutral” and “negative”. For example, you may choose to use five categories: very positive, positive, neutral, negative and very negative.

One easy way to do this with customer reviews is to rank 1-star reviews as “very negative”. 5-star reviews would be ranked as “very positive”.
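As a rough sketch, the star-rating shortcut is a simple lookup table. The text only fixes the two endpoints (1 star and 5 stars); the labels for 2, 3 and 4 stars below are an assumption, and `label_review` is a hypothetical helper name.

```python
# 1 and 5 stars follow the text; the middle three labels are assumed.
STAR_TO_LABEL = {
    1: "very negative",
    2: "negative",
    3: "neutral",
    4: "positive",
    5: "very positive",
}

def label_review(stars):
    # Reject anything that is not a whole star rating from 1 to 5.
    if stars not in STAR_TO_LABEL:
        raise ValueError(f"expected a rating from 1 to 5, got {stars!r}")
    return STAR_TO_LABEL[stars]

print([label_review(s) for s in (1, 3, 5)])
```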

You can also refine the sentiment further into specific emotions. For example, positive sentiment can be further refined into happy, excited, impressed, trusting and so on. This is typically done using emotion analysis, which we’ve covered in one of our previous articles .

Aspect-based Sentiment Analysis (ABSA)

Sentiment analysis is most useful when it’s tied to a specific attribute or feature described in the text. The process of discovering these attributes or features and their sentiment is called Aspect-based Sentiment Analysis, or ABSA. Here at Thematic we call these aspects “themes”. For example, for product reviews of a laptop you might be interested in processor speed. An aspect-based algorithm can be used to determine whether a sentence is negative, positive or neutral when it talks about processor speed.

ABSA and Machine Learning

Machine Learning is an area of AI that teaches computers to perform tasks by looking at data. Machine Learning algorithms are programmed to discover patterns in data, and they can be trained to analyze new text with a high degree of accuracy. This makes it possible to measure the sentiment on processor speed even when people use slightly different words. For example, “slow to load” or “speed issues” would both contribute to a negative sentiment for the “processor speed” aspect of the laptop.

Companies use Machine Learning based solutions to apply aspect-based sentiment analysis across their social media, review sites, online communities and internal customer communication channels. The results of the ABSA can then be explored in data visualizations to identify areas for improvement. These visualizations could include overall sentiment, sentiment over time, and sentiment by rating for a particular dataset.

ABSA for real-time monitoring

Aspect-based sentiment analysis can be especially useful for real-time monitoring. Businesses can immediately identify issues that customers are reporting on social media or in reviews. This can help speed up response times and improve their customer experience.

Why Is Sentiment Analysis Important?

Improving sales and retaining customers are core business goals. According to research by Apex Global Learning, every additional star in an online review leads to a 5-9% revenue bump. There’s an 18% difference in revenue between businesses with three-star ratings and those with five-star ratings.

Sentiment analysis can help you understand how people feel about your brand or product at scale . This is often not possible to do manually simply because there is too much data. Specialized SaaS tools have made it easier for businesses to gain deeper insights into their text data. This could include everything from customer reviews to employee surveys and social media posts. The sentiment data from these sources can be used to inform key business decisions.

Benefits Of Sentiment Analysis

Let’s dig deeper into the key benefits of sentiment analysis.

More trustworthy

Removes human bias through consistent analysis.

Sentiment can be highly subjective. As humans we use tone, context, and language to convey meaning. How we understand that meaning depends on our own experiences and unconscious biases. To explore this further, let’s look at a customer review about a new SaaS product:

“Gets the job done, but it’s not cheap!”

There is both negative and positive sentiment in this sentence. Negative sentiment is linked to the price. Positive sentiment is linked to the functionality of the product. But what’s the overall sentiment of the sentence?

This is where human bias and error can creep in. Human analysts might regard this sentence as positive overall since the reviewer mentions functionality in a positive sentiment. On the other hand, they may focus on the negative comment on price and tag it as negative. This is just one example of how subjectivity can influence sentiment perception.

Sentiment analysis solutions apply consistent criteria to generate more accurate insights. For example, a machine learning model can be trained to recognise that there are two aspects with two different sentiments. It would average the overall sentiment as neutral, but also keep track of the details.
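To make the “average, but keep the details” idea concrete, here is a minimal sketch. The aspect polarities are assigned by hand for the review quoted above rather than produced by a trained model, and the function name and threshold are assumptions.

```python
# Hand-assigned polarities for "Gets the job done, but it's not cheap!"
# (illustrative values, not the output of a real model).
aspect_sentiment = {"functionality": 1, "price": -1}

def overall_sentiment(aspects, threshold=0.25):
    # Average the per-aspect polarities and map the mean to a label.
    average = sum(aspects.values()) / len(aspects)
    if average > threshold:
        return "positive"
    if average < -threshold:
        return "negative"
    return "neutral"

print(overall_sentiment(aspect_sentiment))  # overall label
print(aspect_sentiment)                     # per-aspect detail is preserved
```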

More powerful

Processes data at scale.

Sentiment analysis helps businesses make sense of huge quantities of unstructured data. When you work with text, even 50 examples can already feel like Big Data, especially when you deal with people’s opinions in product reviews or on social media.

Take the example of a company that has recently launched a new product. Rather than trawling through hundreds of reviews, the company can feed the data into a feedback management solution. Its sentiment analysis model will classify incoming feedback according to sentiment. The company can understand what customers think of its new product faster and act accordingly. It can uncover features that customers like as well as areas for improvement.

This type of analysis also gives companies an idea of how many customers feel a certain way about their product. The number of people and the overall polarity of the sentiment about, let’s say “online documentation”, can inform a company’s priorities. For example, they could focus on creating better documentation to avoid customer churn and stay competitive.

Automation!

Sentiment analysis algorithms can analyze hundreds of megabytes of text in minutes. Instead of manually analyzing data in spreadsheets, you can now spend your time on more valuable activities. For example, you can validate the insight: Is this something worth acting on? You can add business context too. If there is an issue, is it seasonal? Have we seen this in other parts of the business? Ultimately, sentiment analysis just provides a signal. But if you get this signal fast and with low effort, you will have time to create the right strategy.

Sentiment analysis algorithms and approaches are continually getting better. They are improved by feeding better quality and more varied training data. Researchers also invent new algorithms that can use this data more effectively. At Thematic, we monitor your results and assess errors. If required, we add more specific training data in areas that need improvement. As a result, sentiment analysis is becoming more accurate and delivers more specific insights.

Act faster:

Real-time analysis and insights.

Sentiment analysis is automated using Machine Learning. This means that businesses can get insights in real-time. This can be very helpful when identifying issues that need to be addressed right away. For example, a negative story trending on social media can be picked up in real-time and dealt with quickly. If one customer complains about an account issue, others might have the same problem. By instantly alerting the right teams to fix this issue, companies can prevent bad experiences from happening.

Business Applications For Sentiment Analysis

Sentiment analysis is useful for making sense of qualitative data that companies continuously gather through various channels. Let’s dig into some of the most common business applications.

Voice of Customer (VoC) Programs

Understanding how your customers feel about your brand or your products is essential. This information can help you improve the customer experience or identify and fix problems with your products or services. To do this, as a business, you need to collect data from customers about their experiences with and expectations for your products or services. This feedback is known as Voice of the Customer (VoC).

Net Promoter Score (NPS) surveys are a common way to assess how customers feel. Customers are usually asked, “How likely are you to recommend us to a friend?” The answer is given as a number on a scale of 0 to 10. Customers who respond with a score of 9 or 10 are known as “promoters”. They’re the most likely to recommend the business to a friend or family member. High NPS means better customer retention. More promoters also means better word-of-mouth advertising. This means that you need to spend less on paid customer acquisition.
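For reference, the score itself is simple arithmetic. The sketch below follows the standard NPS definition, which this article does not spell out: promoters answer 9 or 10, detractors answer 0 to 6, and the score is the percentage of promoters minus the percentage of detractors. The function name is illustrative.

```python
def net_promoter_score(scores):
    # Standard definition (not stated in the text above):
    # promoters are 9-10, detractors are 0-6, passives (7-8) only
    # count towards the total number of responses.
    if not scores:
        raise ValueError("at least one survey response is required")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)

print(net_promoter_score([10, 10, 9, 3]))
```

The result ranges from -100 (all detractors) to 100 (all promoters).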

A drawback of NPS surveys is they don’t give you much information about why your customers really feel a certain way. Open-ended questions supplement the NPS rating questions. They capture why customers are likely or unlikely to recommend products and services. Sentiment analysis turns this text into the drivers of NPS.

NPS is just one of the VoC survey types. The same idea applies to any metric that you might care about: Customer Effort Score, Customer Satisfaction etc. It really doesn’t matter that much what metric is used. What’s driving the ups and downs of the metric is more important.

A great VoC program includes listening to customer feedback across all channels. You can imagine how this can quickly explode to hundreds or even thousands of pieces of feedback, even for a mid-size B2B company. Sentiment analysis is critical to make sense of this data.

Finally, companies can also quickly identify customers reporting strongly negative experiences and rectify urgent issues. Tracking your customers’ sentiment over time can help you identify and address emerging issues before they become bigger problems.

Customer Service Experience

A great customer service experience can make or break a company. Customers want to know that their query will be dealt with quickly, efficiently, and professionally. Sentiment analysis can help companies streamline and enhance their customer service experience.

Sentiment analysis and text analysis can both be applied to customer support conversations . Machine Learning algorithms can automatically rank conversations by urgency and topic. For example, let’s say you have a community where people report technical issues. A sentiment analysis algorithm can find those posts where people are particularly frustrated. These queries can be prioritized for an in-house specialist. Regular questions can be answered by other community members.

As you can see, sentiment analysis can reduce processing times and increase efficiency by directing queries to the right people. Ultimately, customers get a better support experience and you can reduce churn rates.

Product Experience

Sentiment analysis can identify how your customers feel about the features and benefits of your products. This can help uncover areas for improvement that you may not have been aware of.

For example, you could mine online product reviews for feedback on a specific product category across all competitors in this market. You can then apply sentiment analysis to reveal topics that your customers feel negatively about. This could reveal opportunities or common issues.

For example, when we analyzed the sentiment of US banking app reviews, we found that the most important feature was mobile check deposit. Interestingly, most apps had issues with this feature. Companies that have the fewest complaints about this feature could use such an insight in their marketing messaging.

Product managers can iterate on improving the feature. They can then use sentiment analysis to monitor if customers are seeing improvements in functionality and reliability of the check deposit.

Brand Sentiment Analysis

How customers feel about a brand can impact sales, churn rates, and how likely they are to recommend this brand to others. In 2004 the documentary “Super Size Me” was released, documenting a 30-day period when filmmaker Morgan Spurlock only ate McDonald’s food. The ensuing media storm combined with other negative publicity caused the company’s profits in the UK to fall to the lowest levels in 30 years. The company responded by launching a PR campaign to improve their public image.

Sentiment analysis can help brands monitor how their customers feel about them. They can analyze communities, forums and social media platforms to keep an eye on their brand reputation. Or they can conduct surveys to understand what issues their customers feel strongly about.

Companies also track their brand, product names and competitor mentions to build up an understanding of brand image over time. This helps companies assess how a PR campaign or a new product launch has impacted overall brand sentiment.

Social media sentiment analysis

Social media is a powerful way to reach new customers and engage with existing ones. Good customer reviews and posts on social media encourage other customers to buy from your company. But the reverse is also true. Negative social media posts or reviews can be very costly to your business.

Research by Convergys Corp. showed that a negative review on YouTube, Twitter or Facebook can cost a company about 30 customers. Negative social media posts about a company can also cause big financial losses. One memorable example is Elon Musk’s 2020 tweet which claimed the Tesla stock price was too high.

The viral tweet wiped $14 billion off Tesla’s valuation in a matter of hours. Sentiment analysis can help identify these types of issues in real-time before they escalate. Businesses can then respond quickly to mitigate any damage to their brand reputation and limit financial cost.

Market research

Sentiment analysis can help companies identify emerging trends, analyze competitors, and probe new markets. Companies may want to analyze reviews on competitors’ products or services. Applying sentiment analysis to this data can identify what customers like or dislike about their competitors’ products. These insights might reveal how to gain a competitive edge. For example, sentiment analysis could reveal that competitors’ customers are unhappy about the poor battery life of their laptop. The company could then highlight their superior battery life in their marketing messaging.

Sentiment analysis could also be applied to market reports and business journals to pinpoint new opportunities. For example, analyzing industry data on the real estate market could reveal a particular area is increasingly being mentioned in a positive light. This information might suggest that industry insiders see this area as a good investment opportunity. These insights could then be used to gain an early advantage by investing ahead of the rest of the market.

Sentiment Analysis Case Study

Atom bank customer feedback.

Atom bank is a newcomer to the banking scene that set out to disrupt the industry. They take customer feedback seriously. These insights are used to continuously improve their digital customer experiences.

Atom bank’s VoC programme includes a diverse range of feedback channels. They run regular surveys and focus groups and engage in online communities. This gives them a lot of unstructured and structured data.

Working with Thematic, Atom bank transformed their banking experience. Combining thematic and sentiment analysis identified what mattered most to their customers. Some themes such as “authentication” were associated with negative sentiment in Atom bank customer feedback. Other themes like “ease of use” were associated with positive sentiment.

Sentiment analysis also helped to identify specific issues like “face recognition not working”. Atom bank then used these insights to rectify these issues.

With all these customer sentiment insights, the team could prioritize the app features they knew would have the most impact. These improvements made Atom bank the highest rated bank according to Trustpilot. They also now have an App Store Rating of 4.7/5. And contact centre failure demand reduced by 30%!

How Does Sentiment Analysis Work?

Sentiment analysis uses machine learning and natural language processing (NLP) to identify whether a text is negative, positive, or neutral. The two main approaches are rule-based and automated sentiment analysis.

Rule-based Sentiment Analysis

This is the traditional way to do sentiment analysis based on a set of manually-created rules. This approach includes NLP techniques like lexicons (lists of words), stemming, tokenization and parsing.

Rule-based sentiment analysis works like this:

“Lexicons” or lists of positive and negative words are created. These are words that are used to describe sentiment. For example, positive lexicons might include “fast”, “affordable”, and “user-friendly”. Negative lexicons could include “slow”, “pricey”, and “complicated”.

Before text can be analyzed it needs to be prepared. Several processes are used to format the text in a way that a machine can understand. Tokenization breaks up text into small chunks called tokens. Sentence tokenization splits up text into sentences. Word tokenization separates words in a sentence. For example, “the best customer service” would be split into “the”, “best”, and “customer service”. Lemmatization can be used to transform words back to their root form. A lemma is the root form of a word. For example, the root form of “is, are, am, were, and been” is “be”. We also want to exclude things which are known but are not useful for sentiment analysis. So another important process is stopword removal, which takes out common words like “for, at, a, to”. These words have little or no semantic value in the sentence. Applying these processes makes it easier for computers to understand the text.

A computer counts the number of positive or negative words in a particular text. A special rule can make sure that negated words, e.g. “not easy”, are counted as opposites.

The final step is to calculate the overall sentiment score for the text . As mentioned previously, this could be based on a scale of -100 to 100. In this case a score of 100 would be the highest score possible for positive sentiment. A score of 0 would indicate neutral sentiment. The score can also be expressed as a percentage, ranging from 0% as negative and 100% as positive.
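The counting, negation and scoring steps above can be sketched as follows. The word lists reuse the examples from this section (plus “easy” from the negation example); the negator list, the rule that a negator flips only the next word, and the formula for mapping counts onto the -100 to 100 scale are all assumptions made for illustration.

```python
POSITIVE = {"fast", "affordable", "user-friendly", "easy"}
NEGATIVE = {"slow", "pricey", "complicated"}
NEGATORS = {"not", "never", "no"}  # assumed list of negation words

def rule_based_score(text):
    # Lowercase, strip basic punctuation and split into words.
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    pos = neg = 0
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True          # flip the polarity of the next word
            continue
        polarity = 1 if tok in POSITIVE else -1 if tok in NEGATIVE else 0
        if negate:
            polarity = -polarity
        negate = False
        if polarity > 0:
            pos += 1
        elif polarity < 0:
            neg += 1
    if pos + neg == 0:
        return 0.0                 # no sentiment words: neutral
    # Assumed scaling: share of positive minus share of negative hits.
    return 100.0 * (pos - neg) / (pos + neg)

print(rule_based_score("The setup is not easy and the app is slow"))
print(rule_based_score("Fast and affordable, but complicated"))
```

In this sketch “not easy” counts as a negative hit, so the first sentence scores at the bottom of the scale, while the second mixes two positive words with one negative one.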

Disadvantages of Rule-based Sentiment Analysis

Rule-based approaches are limited because they don’t consider the sentence as a whole. The complexity of human language means that it’s easy to miss complex negation and metaphors. Rule-based systems also tend to require regular updates to optimize their performance.

Automated or Machine Learning Sentiment Analysis

Automated sentiment analysis relies on machine learning (ML) techniques. In this case an ML algorithm is trained to classify sentiment based on both the words and their order. The success of this approach depends on the quality of the training data set and the algorithm.

There are also hybrid sentiment algorithms which combine both ML and rule-based approaches. They can offer greater accuracy, although they are much more complex to build.

Step 1: Feature Extraction

Before the model can classify text, the text needs to be prepared so it can be read by a computer. Tokenization, lemmatization and stopword removal can be part of this process, similarly to rule-based approaches. In addition, text is transformed into numbers using a process called vectorization. These numeric representations are known as “features”. A common way to do this is to use the bag of words or bag-of-ngrams methods. These vectorize text according to the number of times words appear.
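A bag-of-words vectorizer can be sketched in plain Python as below. The two sample documents and the helper names are made up for illustration; libraries such as scikit-learn provide more complete implementations.

```python
from collections import Counter

def build_vocabulary(docs):
    # One column per distinct word, in alphabetical order.
    return sorted({word for doc in docs for word in doc.lower().split()})

def vectorize(doc, vocabulary):
    # Count how many times each vocabulary word occurs in the document.
    counts = Counter(doc.lower().split())
    return [counts[word] for word in vocabulary]

docs = ["fast fast service", "slow service"]   # toy documents
vocabulary = build_vocabulary(docs)
vectors = [vectorize(doc, vocabulary) for doc in docs]
print(vocabulary)
print(vectors)
```

Each document becomes a list of counts with the same length as the vocabulary, which is the numeric form a classifier can consume. Word order is discarded, which is why bag-of-ngrams is sometimes used instead.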

Recently deep learning has introduced new ways of performing text vectorization. One example is the word2vec algorithm that uses a neural network model. The neural network can be taught to learn word associations from large quantities of text. Word2vec represents each distinct word as a vector, or a list of numbers. The advantage of this approach is that words with similar meanings are given similar numeric representations. This can help to improve the accuracy of sentiment analysis.
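The claim that similar words receive similar vectors is usually checked with cosine similarity. The three-dimensional vectors below are invented by hand purely to illustrate the computation; real word2vec embeddings are learned from large corpora and typically have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, NOT taken from a trained model.
EMBEDDINGS = {
    "great": [0.9, 0.8, 0.1],
    "excellent": [0.85, 0.75, 0.2],
    "terrible": [-0.8, -0.7, 0.1],
}

similar = cosine_similarity(EMBEDDINGS["great"], EMBEDDINGS["excellent"])
opposite = cosine_similarity(EMBEDDINGS["great"], EMBEDDINGS["terrible"])
print(round(similar, 3), round(opposite, 3))
```

With these made-up vectors, “great” sits much closer to “excellent” than to “terrible”, which is the property a sentiment classifier can exploit.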

Step 2: Training

In the next stage, the algorithm is fed a sentiment-labelled training set. The model then learns to associate input data with the most appropriate corresponding label. For example, this input data would include pairs of features (or numeric representations of text) and their corresponding positive, negative or neutral label. The training data can be either created manually or generated from reviews themselves.

Step 3: Predictions

The final stage is where ML sentiment analysis has the greatest advantage over rule-based approaches. New text is fed into the model. The model then predicts labels (also called classes or tags) for this unseen data using the model learned from the training data. The data can thus be labelled as positive, negative or neutral in sentiment. This eliminates the need for a pre-defined lexicon used in rule-based sentiment analysis.

Classification algorithms

Classification algorithms are used to predict the sentiment of a particular text. As detailed in the steps above, they are trained using pre-labelled training data. Classification models commonly use Naive Bayes, Logistic Regression, Support Vector Machines, Linear Regression, and Deep Learning. Let’s explore these algorithms in a bit more detail.

Naive Bayes: this type of classification is based on Bayes’ Theorem. These are probabilistic algorithms, meaning they calculate the probability of each label for a particular text. The text is then assigned the label with the highest probability. “Naive” refers to the fundamental assumption that each feature is independent: individual words make an independent and equal contribution to the overall outcome. This assumption can help the algorithm work well even where there is limited or mislabelled data.
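
The following is a minimal sketch of a multinomial Naive Bayes classifier in plain Python. The four training examples are invented, and add-one (Laplace) smoothing is an assumption added here so that unseen words do not produce a zero probability; it is one common choice, not the only one.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return label_counts, word_counts, vocabulary


def predict(text, model):
    """Return the label with the highest (log) probability for the text."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Start from the log prior: how common this label is overall.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            # Each word contributes independently (the "naive" assumption).
            # Add-one smoothing keeps unseen words from zeroing the product.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training data, for illustration only.
training_data = [
    ("great product love it", "positive"),
    ("love the fast delivery", "positive"),
    ("terrible support very slow", "negative"),
    ("slow and broken", "negative"),
]
model = train_naive_bayes(training_data)

print(predict("love this great phone", model))
print(predict("slow support", model))
```

Working in log probabilities avoids multiplying many small numbers together, which would otherwise underflow on longer texts.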

Logistic Regression: a classification algorithm that predicts a binary outcome based on independent variables. It uses the sigmoid function, which outputs a probability between 0 and 1. Words and phrases can then be classified as either positive (1) or negative (0). For example, “super slow processing speed” would be classified as 0 or negative.
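
As a rough sketch, the prediction side of logistic regression looks like this. The per-word weights below are hypothetical values picked by hand for the example; in practice they are learned from labelled training data.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical learned weights: positive values push towards "positive".
WEIGHTS = {"super": 0.2, "slow": -2.5, "fast": 2.0, "great": 1.8}
BIAS = 0.0


def positive_probability(text):
    """Probability that the text is positive under this toy model."""
    z = BIAS + sum(WEIGHTS.get(tok, 0.0) for tok in text.lower().split())
    return sigmoid(z)


def classify(text, threshold=0.5):
    """Return 1 (positive) or 0 (negative) by thresholding the probability."""
    return 1 if positive_probability(text) >= threshold else 0


print(classify("super slow processing speed"))
print(classify("great fast laptop"))
```

The 0.5 threshold is a conventional default; moving it trades off false positives against false negatives.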

Linear Regression: an algorithm that predicts polarity (Y output) based on words and phrases (X input). The objective is to learn a linear model or line which can be used to predict sentiment (Y). The accuracy of the model can be improved by reducing the error between the predicted and actual values.
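
A simple least-squares fit shows what “learning a line” and “reducing the error” mean in practice. In this sketch X is assumed to be a single numeric feature (for instance, positive word count minus negative word count) and Y a polarity rating; both the feature choice and the data points are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(xs, ys, slope, intercept):
    """Average squared gap between the line's predictions and the true values."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: net sentiment word count (X) and polarity rating (Y).
xs = [-2, -1, 0, 1, 2]
ys = [-0.9, -0.4, 0.1, 0.4, 0.8]

slope, intercept = fit_line(xs, ys)
print(round(slope, 2), round(intercept, 2))
print(mean_squared_error(xs, ys, slope, intercept))
```

The fitted line has a lower mean squared error on this data than any other straight line, which is the sense in which the model’s error is minimized.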

[Figure: example of simple linear regression.]

Support Vector Machines: a model that plots labelled data as points in a multi-dimensional space. The hyperplane or decision boundary divides the data points; in two dimensions it is simply a line. For example, anything to the left of the hyperplane would be classified as negative, and everything to the right would be classified as positive. The best hyperplane is the one where the distance to the nearest data point of each tag is the largest. Support vectors are the data points closest to the hyperplane. They influence its position and orientation. These are the points which help to build the support vector machine.
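
The geometry can be sketched in a few lines of Python. This example does not train an SVM: the hyperplane (weights and bias) is fixed by hand as an assumption, and the code only shows how points are classified by side and how the nearest points, the support vectors, define the margin.

```python
import math


def decision_value(point, weights, bias):
    """Signed value of w . x + b: its sign says which side the point is on."""
    return sum(w * x for w, x in zip(weights, point)) + bias


def side_of_hyperplane(point, weights, bias):
    return "positive" if decision_value(point, weights, bias) >= 0 else "negative"


def distance_to_hyperplane(point, weights, bias):
    """Perpendicular distance from the point to the decision boundary."""
    norm = math.sqrt(sum(w * w for w in weights))
    return abs(decision_value(point, weights, bias)) / norm


# Hypothetical 2-D data and a hand-picked vertical boundary at x = 3.
weights, bias = (1.0, 0.0), -3.0
points = [(1.0, 2.0), (2.0, 1.0), (4.0, 2.0), (5.0, 3.0)]

distances = {p: distance_to_hyperplane(p, weights, bias) for p in points}
margin = min(distances.values())
support_vectors = [p for p, d in distances.items() if d == margin]

print([side_of_hyperplane(p, weights, bias) for p in points])
print(margin, support_vectors)
```

Training an SVM amounts to searching for the weights and bias that make this margin as large as possible while keeping the two classes on opposite sides.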

Deep Learning: here, an artificial neural network performs multiple layers of processing. Deep learning is a diverse set of algorithms that imitate human brain learning through associations and abstractions. Deep learning has significant advantages over traditional classification algorithms. These neural networks can understand context, and even the mood of the writer.

Deep Learning & Sentiment Analysis

It’s worth exploring deep learning in more detail since this approach often produces the most accurate sentiment analysis. Until recently the field was dominated by traditional ML techniques, which require manual work to define classification features. They also often fail to consider the impact of word order. Deep learning and artificial neural networks have transformed NLP.

Deep learning algorithms were inspired by the structure and function of the human brain. This approach led to an increase in the accuracy and efficiency of sentiment analysis. In deep learning, the neural network can learn to correct itself when it makes an error. With traditional machine learning, errors need to be fixed via human intervention.

Long Short-Term Memory

One important Deep Learning approach is the Long Short-Term Memory or LSTM. This approach reads text sequentially and stores information relevant to the task.

The LSTM consists of three parts which are known as “gates”:

Forget Gate: This first part decides whether previous data is to be remembered. If it is irrelevant to the task, it can be forgotten.

Input Gate: In the second part the cell tries to learn new information from the new data.

Output Gate: The final part is where the cell passes updated information to the next time step.
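
The three gates above can be sketched as a single time step of an LSTM cell. To keep it readable, this version uses one number per gate instead of vectors and matrices, and the weights are hypothetical values rather than trained ones. It follows the standard LSTM formulation, in which the input gate works together with a “candidate” value holding the new information.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell.

    params maps each gate name to a tuple (w_x, w_h, b).
    Returns the new hidden state h and the new cell state c.
    """
    def pre_activation(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    forget = sigmoid(pre_activation("forget"))          # how much old memory to keep
    input_gate = sigmoid(pre_activation("input"))       # how much new information to admit
    candidate = math.tanh(pre_activation("candidate"))  # the new information itself
    output = sigmoid(pre_activation("output"))          # how much memory to pass on

    c = forget * c_prev + input_gate * candidate
    h = output * math.tanh(c)
    return h, c


# Hypothetical weights, for illustration only.
PARAMS = {
    "forget": (0.5, 0.1, 1.0),
    "input": (0.8, 0.2, 0.0),
    "candidate": (1.0, 0.3, 0.0),
    "output": (0.6, 0.1, 0.5),
}

h, c = 0.0, 0.0
for x in [0.5, -1.0, 0.8]:  # a toy sequence of already-vectorized inputs
    h, c = lstm_step(x, h, c, PARAMS)
print(h, c)
```

When the forget gate is close to 1 and the input gate is close to 0, the cell state passes through a step almost unchanged, which is how information can be carried across many words.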

For sentiment analysis it’s useful that there are cells within the LSTM which control what data is remembered or forgotten. Negation is crucial in accurate sentiment analysis. For example, it’s obvious to any human that there’s a big difference between “great” and “not great”. An LSTM is capable of learning that this distinction is important and can predict which words should be negated. The LSTM can also infer grammar rules by reading large amounts of text.

Transformer models

LSTMs have their limitations, especially when it comes to long sentences. The model can often forget the content of distant words, and the sentence has to be processed word by word.

An alternative solution is to use a transformer. This model differentially weights the significance of each part of the data. Unlike an LSTM, the transformer does not need to process the beginning of the sentence before the end. Instead it identifies the context that confers meaning to each word. This is known as an attention mechanism. Transformers have now largely replaced LSTMs as they’re better at analysing longer sentences.
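
The core of the attention mechanism is scaled dot-product attention, sketched below in plain Python. This is a simplified illustration: a real transformer first passes each token vector through learned projections to get queries, keys and values and uses several attention heads, whereas here the same toy vectors are reused for all three roles.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this token against every token in the sentence at once.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # The output is a weighted mix of all value vectors.
        mixed = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-D token vectors for a three-word sentence.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(vectors, vectors, vectors)

for row in weights:
    print([round(w, 3) for w in row])
```

Because every token is scored against every other token in one pass, no position has to wait for an earlier one to be processed, which is what makes transformers easy to parallelize.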

Pre-trained models

Pre-trained models allow you to get started with sentiment analysis right away. They are a good solution for companies that do not have the resources to obtain large datasets or train a complex model.

What Are The Current Challenges For Sentiment Analysis?

Subjectivity

Texts can be objective or subjective.

Consider the following sentences as an example:

The first sentence is clearly subjective and most people would say that the sentiment is positive. The second sentence is objective and would be classified as neutral. In this case, “good” is considered more subjective than “small”. The challenge here is that machines often struggle with subjectivity. Let’s take the example of a product review which says “the software works great, but no way that justifies the massive price-tag”. In this case the first half of the sentence is positive, but it is outweighed by the second half, which says the product is too expensive. The overall sentiment of the sentence is negative.

Large training datasets that include lots of examples of subjectivity can help algorithms to classify sentiment correctly. Deep learning can also be more accurate in this case since it’s better at taking context and tone into account.

Context is crucial when it comes to understanding sentiment. Opinion words can change their polarity depending on the context. Machines need to learn about context in order to correctly classify a text.

For example, the question “what did you like about our product” could produce the following answers:

“Versatility”

The first answer would be classified as positive. The second answer is also positive, but on its own it is ambiguous. If we changed the question to “what did you not like”, the polarity would be completely reversed. Sometimes, it’s not the question but the rating that provides the context.

The solution to this is to preprocess or postprocess the data to capture the necessary context. This can be a complex and lengthy process.

Irony & Sarcasm

Humour and sarcasm can present big challenges for machine learning techniques! Take the real life example of a complaint letter sent to LIAT Caribbean Airlines by passenger Arthur Hicks:

With irony and sarcasm, people use positive words to describe negative experiences. It can be tough for machines to understand the sentiment here without knowledge of what people expect from airlines. In the example above, words like “considerate” and “magnificent” would be classified as positive in sentiment. But for a human it’s obvious that the overall sentiment is negative.

Luckily, in a business context only a very small percentage of reviews use sarcasm.

Comparisons

Comparison is another potential stumbling block to correct sentiment classification. Consider these example online reviews:

In the first case it’s obvious that the sentiment is positive. The remaining examples are trickier since they rely on comparisons. Without knowing what the product is being compared to, it’s hard to know if these are positive, negative or neutral. In the second sentence it depends on the “alternatives”. If the person considers the other products they’ve used to be very poor, this sentence could be less positive than it seems at face value.

Speaking about Competitors:

If you are company X and your competitor is company Y, it is impossible to have one sentiment model that captures positive sentiment about Y as negative sentiment about X. Let’s say you get these comments:

I love the service that I get from company X

I love the service that I get from company Y

A general model can only say both are positive. If you want to say that a comment speaking highly of your competitor is negative, then you need to train a custom model.

Emojis and emoticons can require extensive preprocessing, especially when using data sources like social media platforms. There are two key types of emoticons, Western and Eastern. Western emoticons use only a couple of characters and are read sideways, such as :). Eastern emoticons use more characters and are read upright, such as ¯\_(ツ)_/¯, which represents a smiling shrug.
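
One simple preprocessing approach is to replace each emoticon with a placeholder token that a model or lexicon can then treat like a word. The mapping below is a tiny hypothetical table made up for this sketch; the token names and the sentiment assigned to each emoticon are assumptions, not a standard.

```python
# Hypothetical emoticon-to-token table, for illustration only.
EMOTICON_TOKENS = {
    ":)": "EMO_POSITIVE",
    ":-)": "EMO_POSITIVE",
    ":(": "EMO_NEGATIVE",
    ":-(": "EMO_NEGATIVE",
    "¯\\_(ツ)_/¯": "EMO_SHRUG",
}


def normalize_emoticons(text):
    """Replace known emoticons with placeholder tokens and tidy whitespace."""
    # Longest first, in case a shorter emoticon is a substring of a longer one.
    for emoticon in sorted(EMOTICON_TOKENS, key=len, reverse=True):
        text = text.replace(emoticon, f" {EMOTICON_TOKENS[emoticon]} ")
    return " ".join(text.split())


print(normalize_emoticons("Great phone :)"))
print(normalize_emoticons("Delivery took three weeks ¯\\_(ツ)_/¯"))
```

Doing this before tokenization matters, because many tokenizers would otherwise split “:)” into separate punctuation marks and lose the signal.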

Machine Learning algorithms struggle with idioms and phrases. An example is “not my cup of tea”. This would potentially confuse the algorithm. If a reviewer uses an idiom in product feedback it could be ignored or incorrectly classified by the algorithm. The solution is to include idioms in the training data so the algorithm is familiar with them.

For accurate sentiment analysis defining the neutral label appropriately is important. The criteria need to be consistent to generate good quality and reliable analysis. Examples of texts that should be classified as neutral include objective statements like the example we looked at above: “This laptop is black”. There are no obvious sentiments expressed in this sentence.

Irrelevant data can be classified as neutral. Another approach is to filter out any irrelevant details in the preprocessing stage.

Use of the word “wish” may indicate neutral sentiment. Consider the example, “I wish I had discovered this sooner.” However, you’ll need to be careful with this one as it can also be used to express a deficiency or problem. For example, a customer might say, “I wish the platform would update faster!” This word can express a variety of sentiments.

Negation can also create problems for sentiment analysis models. For example, if a product reviewer writes “I can’t not buy another Apple Mac” they are stating a positive intention. Machines need to be trained to recognize that two negatives in a sentence cancel out.

As mentioned earlier, a Long Short-Term Memory model is one option for dealing with negation efficiently and accurately. This is because there are cells within the LSTM which control what data is remembered or forgotten. An LSTM is capable of learning to predict which words should be negated. The LSTM can “learn” these types of grammar rules by reading large amounts of text.

Negation can also be solved by using a pre-trained transformer model and by carefully curating your training data. Pre-trained transformers have within them a representation of grammar that was obtained during pre-training. They are also well suited to parallelization, making them efficient for training using large volumes of data. Curating your data is done by ensuring that you have a sufficient number of well-varied, accurately labelled training examples of negation in your training dataset.

Audiovisual Content

Video and audio are a very different type of data to text. Audio, on its own or as part of a video, needs to be transcribed using a speech-to-text algorithm before it can be analyzed. Sentiment analysis can then be applied to the transcribed text just like any other text. There are also approaches that determine sentiment from the voice intonation itself, detecting angry voices or the sounds people make when they are frustrated. These techniques can also be applied to podcasts and other audio recordings.

Limitations Of Human Annotator Accuracy

As we mentioned above, even humans struggle to identify sentiment correctly. This can be measured using inter-annotator agreement, also called consistency, which assesses how often two or more human annotators make the same annotation decision. Since machines learn from training data, these potential errors can affect the performance of an ML model for sentiment analysis.
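
One widely used measure of inter-annotator agreement between two annotators is Cohen’s kappa, which corrects raw agreement for the agreement you would expect by chance. The guide does not say which measure it has in mind, so the choice of kappa here is an assumption, and the labels are invented for the example.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Share of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected if each annotator labelled at random
    # with their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotations of six reviews.
annotator_1 = ["pos", "pos", "neg", "neu", "neg", "pos"]
annotator_2 = ["pos", "neg", "neg", "neu", "neg", "pos"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

A kappa of 1 means perfect agreement and 0 means no better than chance. Here the annotators agree on five of six items, but kappa is lower than the raw agreement rate because some of that agreement would happen by chance anyway.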

Based on a recent test, Thematic’s sentiment analysis correctly predicts sentiment in text data 96% of the time. But we also talked extensively about the meaning of accuracy and how one should take any reports of accuracy with a grain of salt.

That said, when it comes to aspect based sentiment analysis (ABSA), as defined earlier, we did run a study where we compared aspects discovered by 4 people vs. aspects discovered by Thematic. We learned that on average, Thematic agrees with people more than they agree with each other!

How To Get Started With Sentiment Analysis

Getting started with sentiment analysis can be intimidating. Luckily there are many online resources to help you as well as automated SaaS sentiment analysis solutions. Or you might choose to build your own solution using open source tools.

Choosing A Sentiment Analysis Approach

Should you build your own or invest in existing software? The answer probably depends on how much time you have and your budget. Usually, building in-house is more expensive. Let’s dig into the details of building your own solution or buying an existing SaaS product.

Building Your Own

Building your own sentiment analysis solution can be a lengthy and complex process. The steps required to build this type of tool are:

Research: The first step is to understand which machine learning options are best for your business. You’ll need to consider the programming language to use as well.

Build: You can develop the algorithms yourself or, most likely, use an off-the-shelf model.

Model training: The model is fed a sentiment-labelled training set. The model then learns to associate input data with the most appropriate corresponding labels. This can be time-consuming as the training data needs to be curated, labelled or generated.

Integrate: Build an API or manually integrate the model with your existing tools. You may also need to construct a user-friendly interface if your tool will be used by non-technical colleagues.

Team training: Non-technical teams in particular may require detailed onboarding training on how to use the tool. You may need to create internal training manuals.

Launch: The final phase is to start using your tool within your business. Regular monitoring and tweaking may be required to optimize performance.

Pros: The tool can be customized to meet your exact business requirements.

Cons: Building your own sentiment analysis solution takes considerable time. The minimum time required to build a basic sentiment analysis solution is around 4-6 months. You may need to hire or reassign a team of data engineers and programmers. Creating custom software may take longer than you had planned. Deadlines can easily be missed if the team runs into unexpected problems. This can cause costs to increase significantly. Once the tool is built it will need to be updated and monitored. It’s a custom-built solution so only the tech team that created it will be familiar with how it all works.

Python Sentiment Analysis

Python is a popular programming language to use for sentiment analysis. An advantage of Python is that there are many open source libraries freely available to use. These make it easier to build your own sentiment analysis solution.

Here are some resources that can help you use Python for sentiment analysis:

NLTK or Natural Language Toolkit is one of the main NLP libraries for Python. It includes useful features like tokenizing, stemming and part-of-speech tagging. NLTK also has a pretrained sentiment analyzer called VADER (Valence Aware Dictionary and sEntiment Reasoner). VADER works better for shorter sentences like social media posts. It can be less accurate when rating longer and more complex sentences.

spaCy is another NLP library for Python that allows you to build your own sentiment analysis classifier. Like NLTK it offers part-of-speech tagging and named entity recognition.

PyTorch is a machine learning library primarily developed by Facebook’s AI Research lab. It is popular with developers thanks to its simplicity and easy integrations.

You might also find these tutorials helpful:

NLTK has developed a comprehensive guide to programming for language processing. It covers writing Python programs, working with corpora, categorizing text, and analyzing linguistic structure.

This beginner’s guide from Towards Data Science covers using Python for sentiment analysis.

Java Sentiment Analysis

Java is another popular language for sentiment analysis. Here’s a list of useful toolkits for Java:

OpenNLP is an Apache toolkit which uses machine learning to process natural language text. It supports tokenization, part-of-speech tagging, named entity extraction, parsing, and much more.

The Stanford CoreNLP toolkit also has a wide range of features including sentence detection, tokenization, stemming, and sentiment detection.

Another open source option for text mining and data preparation is Weka. This collection of machine learning algorithms features classification, regression, clustering and visualization tools.

Take a look at this tutorial to learn more about using Java for sentiment analysis:

This Red Hat tutorial looks at performing sentiment analysis of Twitter posts using Stanford CoreNLP.

Buying A SaaS Product

There are a variety of pre-built sentiment analysis solutions like Thematic which can save you time, money, and mental energy.

Let’s consider the pros and cons of using a SaaS solution for sentiment analysis:

Pros: SaaS products like Thematic allow you to get started with sentiment analysis straight away. You can instantly benefit from sentiment analysis models pre-trained on customer feedback.

No coding is needed. This makes SaaS solutions ideal for businesses that don’t have in-house software developers or data scientists.

Costs are a lot lower than building a custom-made sentiment analysis solution from scratch.

One-click integrations into feedback collection tools and APIs enable seamless and secure data transfer.

Access to comprehensive customer support to help you get the most out of the tool.

Cons: There are many sentiment analysis solutions on the market. It can be hard to choose the right one for your business.

Using Thematic For Powerful Sentiment Analysis Insights

For many businesses the most efficient option is to purchase a SaaS solution that has sentiment analysis built in. Thematic is a great option that makes it easy to perform sentiment analysis on your customer feedback or other types of text.

Thematic uses sentiment analysis algorithms that are trained on large volumes of data using machine learning. A unique feature of Thematic is that it combines sentiment with themes discovered during the thematic analysis process.

Thematic Analysis Vs. Sentiment Analysis

Before we dig into the benefits of combining sentiment analysis and thematic analysis, let’s quickly review these two types of analysis.

Thematic Analysis

Thematic analysis is the process of discovering repeating themes in text. A theme captures what this text is about regardless of which words and phrases express it. For example, one person could say “the food was yummy”, another could say “the dishes were delicious”. In both cases, it’s the same theme. We could call it “tasty food”.

AI researchers came up with Natural Language Understanding algorithms to automate this task. Thematic software is powered by these algorithms. You can learn more about how it works in our blog post.

Where does the Sentiment Analysis come in?

We talked earlier about Aspect Based Sentiment Analysis, ABSA. Themes capture either the aspect itself, or the aspect and the sentiment of that aspect. In addition, for every theme mentioned in text, Thematic finds the relevant sentiment.

How To Use Sentiment Analysis And Thematic Analysis Together

Let’s walk through how you can use sentiment analysis and thematic analysis in Thematic to get more out of your textual data.

Step 1: Upload Your Data

The first step is to upload your unstructured data to a feedback analytics tool like Thematic. This could include online survey feedback, chat conversations, or social media mentions. Thematic has a wide range of one-click integrations that make it really easy to connect all your channels. These include Qualtrics, Trustpilot, Amazon, Facebook, Intercom, Twitter, Tripadvisor, and many more. Thematic then automatically cleans and prepares your data so it’s ready to be analyzed.

Step 2: Analysis

Thematic Analysis: Thematic analysis can then be applied to discover themes in your unstructured data. Thematic’s AI groups themes into a 2-level taxonomy. For a given text there will be core themes and related sub-themes. For example, a core theme could be “staff behavior”. A sub-theme could be “friendly crew”. This helps you easily identify what your customers are talking about, for example, in their reviews or survey feedback.

Sentiment Analysis: Sentiment analysis builds on thematic analysis to help you understand the emotion behind a theme. Sentiment analysis scores each piece of text or theme and assigns positive, neutral or negative sentiment.

In the example above the theme “print boarding passes” has been selected within the Thematic dashboard. Here you can get an overview of the sentiment associated with this theme across your textual data. Overall this theme has negative sentiment with 61.2% of theme appearances classified as negative. You can also see that this theme appears in 0.4% of customer reviews.

Another option is to filter your themes by sentiment. This allows you to quickly identify the areas of your business where customers are not satisfied. You can then use these insights to drive your business strategy and make improvements.

Combining these two types of analysis can be very powerful. It allows you to understand how your customers feel about particular aspects of your products, services, or your company.

Step 3: Sentiment Analysis + Metrics

Combining Thematic and Sentiment analysis can also help you understand metrics like NPS or customer churn.

This example from the Thematic dashboard tracks customer sentiment by theme over time. You can see that the biggest negative contributor over the quarter was “bad update”. This makes it really easy for stakeholders to understand at a glance what is influencing key business metrics.

With Thematic you also have the option to use our Customer Goodwill metric. This score summarizes customer sentiment across all your uploaded data. It allows you to get an overall measure of how your customers are feeling about your company at any given time.

In the example below you can see the overall sentiment across several different channels. These channels all contribute to the Customer Goodwill score of 70.

Step 4: AI + Human

Thematic’s platform also allows you to go in and make manual tweaks to the analysis. Combining the power of AI and a human analyst helps ensure greater accuracy and relevance.

For example, you may want to scan through the themes and delete any which are not useful. You also have the option to merge themes together, create new themes, and switch between themes and sub-themes.

Step 5: Real-Time Monitoring

The final step in the process is continual real-time monitoring. This can help you stay on top of emerging trends and rapidly identify any PR crises or product issues before they escalate.

In the example above you can see sentiment over time for the theme “chat in landscape mode”. The visualization clearly shows that more customers have been mentioning this theme in a negative sentiment over time. Looking at the customer feedback on the right indicates that this is an emerging issue related to a recent update. Using this information the business can move quickly to rectify the problem and limit possible customer churn.

Where Can You Learn More About Sentiment Analysis?

Sentiment Analysis Books

For those who want a really detailed understanding of sentiment analysis there are some great books out there. One of the classics is “Sentiment Analysis and Opinion Mining” by Bing Liu. Liu is considered a thought-leader in machine learning. His book is great at explaining sentiment analysis in a technical yet accessible way.

If you’d like to know more about deep learning for sentiment analysis, a great option is “Deep Learning-Based Approaches for Sentiment Analysis”. It was published in 2020 and includes insights into the latest trends and advances in deep learning for sentiment analysis.

Those especially interested in social media might want to look at “Sentiment Analysis in Social Networks”. This specialist book is authored by Liu along with several other ML experts. It looks at natural language processing, big data, and statistical methodologies.

Sentiment Analysis Research Papers

The field of sentiment analysis is always evolving and there’s a constant flow of new research papers. Here’s a selection of recent papers for those who want to dig deeper into specific subtopics:

  • “Sentiment Analysis and Subjectivity” by Bing Liu.
  • “Sentiment Analysis for Social Media” by Carlos Iglesias and Antonio Moreno.
  • “Sentiment Analysis of Twitter Data” by Apoorv Agarwal et al.
  • “Machine learning based customer sentiment analysis for recommending shoppers, shops based on customers’ review” by Shanshan Yi & Xiaofang Liu.
  • “Sentiment Analysis in English Texts” by Arwa A. Al Shamsi, Reem Bayari and Said Salloum.

Sentiment Analysis Training

There are plenty of online resources to help you learn how to do sentiment analysis using NLP. Here’s a selection to help you get started:

  • For a great overview of sentiment analysis, check out this Udemy course called “Sentiment Analysis, Beginner to Expert”.
  • Udemy also has a useful course on “Natural Language Processing (NLP) in Python”. This includes how to write your own sentiment analysis code in Python.
  • Buildbypython on Youtube has put together a useful video series on using NLP for sentiment analysis.
  • Those who like a more academic approach should check out Stanford Online. They’ve released some of their lectures on Youtube like this one which focuses on sentiment analysis.

Sentiment Analysis Datasets

To get going with sentiment analysis you may need access to suitable datasets if you don’t already have your own data. Here’s a selection of freely-available datasets which you can use to experiment with sentiment analysis:

  • Amazon product reviews: this dataset features millions of Amazon reviews in fastText format.
  • Reddit comments: this interesting dataset focuses on Reddit comments on Bitcoin between 2009 and 2019.
  • Booking.com reviews: this dataset has thousands of unique hotel reviews.

Those looking at a rule-based approach will need sentiment analysis lexicons or lists of words that have been pre-labelled with sentiment. Here are some useful options:

  • “Sentiment Lexicons for 81 Languages” contains both positive and negative sentiment lexicons for 81 different languages.
  • “Loughran-McDonald Master Dictionary” includes sentiment word classifications.
  • “Emoji Sentiment Ranking v1.0” is a useful resource that explores the sentiment of popular emoticons.

Final Thoughts On Sentiment Analysis

We hope this guide has given you a good overview of sentiment analysis and how you can use it in your business. Sentiment analysis can be applied to everything from brand monitoring to market research and HR. It’s helping companies to glean deeper insights, become more competitive, and better understand their customers.

Sentiment analysis is also a fast-moving field that’s constantly evolving and developing. That’s why it’s important to stay on top of the latest trends. Another option is to work with a platform like Thematic that’s continually being upgraded and improved. For more information about how Thematic works you can request a personalized guided trial right here.

Book contents

  • Sentiment Analysis
  • Studies in Natural Language Processing
  • Copyright page
  • Acknowledgments
  • 1 Introduction
  • 2 The Problem of Sentiment Analysis
  • 3 Document Sentiment Classification
  • 4 Sentence Subjectivity and Sentiment Classification
  • 5 Aspect Sentiment Classification
  • 6 Aspect and Entity Extraction
  • 7 Sentiment Lexicon Generation
  • 8 Analysis of Comparative Opinions
  • 9 Opinion Summarization and Search
  • 10 Analysis of Debates and Comments
  • 11 Mining Intent
  • 12 Detecting Fake or Deceptive Opinions
  • 13 Quality of Reviews
  • 14 Conclusion
  • Bibliography

1 - Introduction

Published online by Cambridge University Press:  23 September 2020

Sentiment analysis, also called opinion mining, is the field of study that analyzes people’s opinions, sentiments, appraisals, attitudes, and emotions toward entities and their attributes expressed in written text. The entities can be products, services, organizations, individuals, events, issues, or topics. The field represents a large problem space. Many related names and slightly different tasks – for example, sentiment analysis, opinion mining, opinion analysis, opinion extraction, sentiment mining, subjectivity analysis, affect analysis, emotion analysis, and review mining – are now all under the umbrella of sentiment analysis. The term sentiment analysis perhaps first appeared in Nasukawa and Yi (2003), and the term opinion mining first appeared in Dave et al. (2003). However, research on sentiment and opinion began earlier (Wiebe, 2000; Das and Chen, 2001; Tong, 2001; Morinaga et al., 2002; Pang et al., 2002; Turney, 2002). Even earlier related work includes interpretation of metaphors; extraction of sentiment adjectives; affective computing; and subjectivity analysis, viewpoints, and affects (Wiebe, 1990, 1994; Hearst, 1992; Hatzivassiloglou and McKeown, 1997; Picard, 1997; Wiebe et al., 1999). An early patent on text classification included sentiment, appropriateness, humor, and many other concepts as possible class labels (Elkan, 2001).

  • Introduction
  • Bing Liu, University of Illinois, Chicago
  • Book: Sentiment Analysis
  • Online publication: 23 September 2020
  • Chapter DOI: https://doi.org/10.1017/9781108639286.002

What is Sentiment Analysis? Guide, Tools, Uses, Examples

Appinio Research · 12.03.2024 · 30min read

Have you ever wondered how businesses can understand what people really think about their products or services just by analyzing online reviews and social media comments? Sentiment analysis, also known as opinion mining, is the key to unlocking these insights. It's like having a superpower to decipher whether people are happy, frustrated, or indifferent from the words they write. Sentiment analysis uses algorithms to analyze text data and determine the sentiment or opinion expressed within it. From understanding customer feedback to tracking brand perception and even predicting election outcomes, sentiment analysis plays a crucial role in today's data-driven world. In this guide, we'll explore everything you need to get started with sentiment analysis. Whether you're a business looking to improve customer satisfaction or a researcher diving into the world of natural language processing, we will equip you with the knowledge and tools to effectively harness the power of sentiment analysis.

What is Sentiment Analysis?

Sentiment analysis, also known as opinion mining, is the process of analyzing text to determine the sentiment or opinion expressed within it. Whether it's understanding customer feedback, tracking brand perception on social media, or analyzing public sentiment toward political candidates, sentiment analysis helps uncover valuable insights from textual data.

Importance of Sentiment Analysis

  • Informed Decision Making : By analyzing sentiment, businesses can make data-driven decisions, identify trends, and adapt strategies accordingly.
  • Customer Satisfaction : Understanding customer sentiment allows businesses to address concerns, improve products/services, and enhance overall customer satisfaction.
  • Brand Monitoring : Sentiment analysis enables organizations to monitor brand perception, identify potential reputation risks, and engage with customers proactively.
  • Market Intelligence : Analyzing sentiment in market research helps businesses understand consumer preferences, competitive landscapes, and emerging trends.

Sentiment Analysis Challenges and Limitations

  • Ambiguity and Context : Sentiment analysis algorithms may struggle to understand nuances such as sarcasm, irony, and cultural context.
  • Data Quality : Poorly structured or noisy data can impact the accuracy of sentiment analysis results.
  • Subjectivity : Sentiment interpretation can be subjective and context-dependent, leading to variability in results.
  • Language and Cultural Differences : Sentiment analysis models may perform differently across languages and cultures due to linguistic variations and cultural nuances.

Fundamentals of Sentiment Analysis

Sentiment analysis relies on several fundamental techniques and concepts to accurately analyze text data and extract meaningful insights about sentiment and emotions.

Text Preprocessing Techniques

Text preprocessing is a crucial step in sentiment analysis, as it involves cleaning and transforming raw text data into a format suitable for analysis. Here are some standard text preprocessing techniques:

  • Tokenization : Tokenization involves breaking down text into individual words or tokens. This process is essential for further analysis as it allows the model to understand the structure of the text.
  • Normalization : Normalization techniques ensure consistency in text data by converting all words to lowercase and removing punctuation marks. This prevents the model from treating the same word with different capitalizations as different entities.
  • Stopword Removal : Stopwords are common words that do not carry significant meaning, such as "and", "the", and "is". Removing stopwords helps reduce noise in the data and improves the efficiency of sentiment analysis algorithms.
  • Stemming and Lemmatization : Stemming and lemmatization are techniques used to reduce words to a common base form. Stemming applies heuristic rules that strip affixes (mostly suffixes), so the result is not always a real word (for example, the Porter stemmer turns "studies" into "studi"). Lemmatization uses vocabulary and morphological analysis to map each word to its dictionary form ("studies" becomes "study"). Both reduce the dimensionality of the feature space and can improve model performance.
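
The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a production pipeline: the stopword list and the suffix rules are made up for the example (the `crude_stem` helper is hypothetical and far simpler than the Porter algorithm), and in practice you would likely use a library such as NLTK or spaCy instead.

```python
import re

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "and", "the", "is", "was", "it", "this", "to", "of"}

# Hypothetical suffix rules for a crude stemmer; this is not the Porter algorithm.
SUFFIXES = ("ing", "ed", "ly", "s")


def tokenize(text):
    """Normalize to lowercase and split into alphabetic tokens (drops punctuation)."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def crude_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize and normalize, remove stopwords, then stem each remaining token."""
    return [crude_stem(t) for t in remove_stopwords(tokenize(text))]


print(preprocess("The battery is AMAZING, and it charged quickly!"))
# ['battery', 'amaz', 'charg', 'quick']
```

Note how the stems "amaz" and "charg" are not dictionary words; that is the typical trade-off of stemming compared with lemmatization.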

Feature Extraction Methods

Feature extraction is the process of transforming raw text data into numerical features that can be used as input to machine learning models. Some common feature extraction methods used in sentiment analysis include:

  • Bag-of-Words (BoW) : The bag-of-words model represents text as a sparse matrix of word frequencies. Each document is represented as a vector where each element corresponds to the frequency of a particular word in the document. While simple and easy to implement, BoW does not consider the order of words in the text.
  • Term Frequency-Inverse Document Frequency (TF-IDF) : TF-IDF is a statistical measure used to evaluate the importance of a word in a document relative to a corpus. It assigns higher weights to words that are frequent in a document but rare in the overall corpus, making it helpful in identifying important keywords.
  • Word Embeddings : Word embeddings are dense, low-dimensional representations of words learned from large text corpora using techniques like Word2Vec, GloVe, or FastText. Word embeddings capture semantic relationships between words, allowing models to generalize better to unseen data and improve performance in sentiment analysis tasks.
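
To make the first two methods concrete, here is a small sketch that builds bag-of-words and TF-IDF vectors by hand. The three toy documents are invented for illustration, and the TF-IDF variant used here (term frequency divided by document length, times `log(N / df)`) is one common textbook form; libraries such as scikit-learn apply smoothed variants, so their numbers will differ.

```python
import math
from collections import Counter

# Three tiny, already-tokenized documents (made up for this example).
docs = [
    ["great", "phone", "great", "battery"],
    ["poor", "battery"],
    ["great", "screen"],
]

# Fixed vocabulary order so every vector has the same layout.
vocab = sorted({tok for doc in docs for tok in doc})


def bow_vector(doc):
    """Bag-of-words: raw count of each vocabulary term in the document."""
    counts = Counter(doc)
    return [counts[term] for term in vocab]


def idf(term):
    """Inverse document frequency: log(N / number of documents containing term)."""
    df = sum(1 for doc in docs if term in doc)
    return math.log(len(docs) / df)


def tfidf_vector(doc):
    """TF-IDF: relative term frequency in the document, weighted by IDF."""
    counts = Counter(doc)
    return [(counts[term] / len(doc)) * idf(term) for term in vocab]


print(vocab)                  # ['battery', 'great', 'phone', 'poor', 'screen']
print(bow_vector(docs[0]))    # [1, 2, 1, 0, 0]
print(tfidf_vector(docs[0]))
```

In the first document, "phone" and "battery" both occur once, but "phone" gets the higher TF-IDF weight because it appears in only one of the three documents.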

Types of Sentiment Analysis

Sentiment analysis can be performed at different levels of granularity, depending on the scope of analysis and the specific objectives. Some common types of sentiment analysis include:

  • Document-level Sentiment Analysis : In document-level sentiment analysis, the sentiment of an entire document, such as a review, article, or tweet, is analyzed as a whole. This approach provides an overall assessment of sentiment but may overlook nuances present at the sentence or phrase level.
  • Sentence-level Sentiment Analysis : Sentence-level sentiment analysis focuses on analyzing the sentiment expressed within individual sentences. This approach allows for more fine-grained analysis and can capture variations in sentiment within a document.
  • Aspect-based Sentiment Analysis : Aspect-based sentiment analysis aims to identify the sentiment associated with specific aspects or features of a product, service, or topic. This approach is particularly useful for product reviews, where different aspects (e.g., performance, design, price) may elicit different sentiments.
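
The difference between document-level and sentence-level analysis is easy to see with a toy example. The sketch below scores text with a tiny hand-made lexicon (the words and values are assumptions chosen for illustration, not a real sentiment resource) and compares the two levels of granularity on the same review.

```python
import re

# Toy sentiment lexicon (hypothetical values, for illustration only).
LEXICON = {"great": 1, "love": 1, "fast": 1, "poor": -1, "slow": -1, "terrible": -1}


def score(text):
    """Sum the lexicon values of all tokens in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(LEXICON.get(t, 0) for t in tokens)


def label(value):
    """Map a numeric score to a sentiment label."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


def document_level(text):
    """One label for the whole document."""
    return label(score(text))


def sentence_level(text):
    """One label per sentence, split naively on ., ! and ?."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [(s, label(score(s))) for s in sentences]


review = "The camera is great. The battery life is terrible."
print(document_level(review))   # neutral
print(sentence_level(review))
# [('The camera is great.', 'positive'), ('The battery life is terrible.', 'negative')]
```

At the document level the two opinions cancel out and the review looks neutral; at the sentence level both are visible. Aspect-based analysis goes one step further by attaching each sentiment to its target ("camera", "battery life"), which this sketch does not attempt.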

Supervised vs. Unsupervised Learning Approaches

In sentiment analysis, machine learning algorithms can be categorized into supervised and unsupervised learning approaches based on the availability of labeled training data.

  • Supervised Learning : In supervised learning, sentiment analysis models are trained on labeled datasets where each text sample is associated with a sentiment label (e.g., positive, negative, neutral). Supervised learning approaches require a significant amount of labeled data for training but often yield more accurate results, especially in well-defined sentiment classification tasks.
  • Unsupervised Learning : Unsupervised learning approaches do not rely on labeled data for training. Instead, they use sentiment lexicons (lists of words with known polarity) or techniques like clustering and dimensionality reduction to identify patterns and structures in the data without explicit supervision. Unsupervised learning can be useful in scenarios where labeled data is scarce or expensive to obtain, but it may require additional effort in interpreting the results and tuning parameters.

Understanding these fundamental concepts is essential for building effective sentiment analysis models and extracting meaningful insights from text data.

Data Collection for Sentiment Analysis

Before diving into sentiment analysis, you must ensure your data is well-prepared and structured for analysis.

Sourcing Data

Sourcing data is the first step in any data-driven analysis, including sentiment analysis. Depending on your specific application, you may collect data from various sources such as social media platforms, review websites, surveys, or feedback forms.

  • Data Relevance : Ensure that the data you collect is relevant to your analysis objectives. For example, if you're analyzing sentiment towards a particular product, collect data from sources where users discuss or review that product.
  • Data Volume : Aim to collect a sufficient volume of data to train robust sentiment analysis models. Larger datasets often result in more accurate and generalizable models.
  • Data Quality : Pay attention to the quality of the data you collect. Noisy or unstructured data can adversely affect the performance of sentiment analysis models. Consider using data validation techniques to ensure data quality.

To unlock the full potential of sentiment analysis, it's crucial to lay a solid foundation with well-prepared data. By sourcing relevant, high-quality data and following best data collection and preparation practices, you set the stage for more accurate and insightful sentiment analysis outcomes. Appinio empowers you to seamlessly collect and analyze real-time consumer insights, providing the data-driven foundation you need to make informed decisions.

Data Cleaning and Labeling

Once you've collected your data, the next step is to clean and label it for analysis. Data cleaning involves preprocessing the raw text data to remove noise and irrelevant information, while labeling involves assigning sentiment labels to the data samples. Here are some data cleaning and labeling techniques:

  • Text Preprocessing : Apply text preprocessing techniques such as tokenization, normalization, stopword removal, and stemming/lemmatization to clean the text data and standardize its format.
  • Noise Removal : Eliminate irrelevant information from the text data, such as HTML tags, special characters, or stray symbols. Be careful with emojis: they often carry sentiment, so consider mapping them to words rather than deleting them.
  • Labeling Guidelines : Define clear guidelines for labeling sentiment in your data. Decide on the sentiment categories (e.g., positive, negative, neutral) and establish criteria for assigning labels to data samples.
  • Manual Labeling : In cases where sentiment labels cannot be inferred automatically, have human annotators label the data manually. Measure inter-annotator agreement and check labeling consistency to maintain data quality.
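
One common way to quantify inter-annotator agreement between two annotators is Cohen's kappa, which corrects the raw agreement rate for the agreement you would expect by chance. The article does not prescribe a specific measure, so treat this as one reasonable option; the label lists below are invented for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Observed agreement: share of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability both pick the same label independently,
    # given each annotator's own label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


annotator_1 = ["pos", "pos", "neg", "neu", "neg", "pos"]
annotator_2 = ["pos", "neg", "neg", "neu", "neg", "pos"]
print(round(cohens_kappa(annotator_1, annotator_2), 3))  # 0.739
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance. Low values usually signal that the labeling guidelines need to be tightened before more data is annotated.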

Handling Imbalanced Data

Imbalanced datasets, where one sentiment class is significantly more prevalent than others, are common in sentiment analysis tasks. Imbalanced data can bias model predictions towards the majority class and lead to suboptimal performance. Here are some strategies for handling imbalanced data:

  • Resampling Techniques : Balance the dataset by either oversampling the minority class or undersampling the majority class. Oversampling involves duplicating samples from the minority class, while undersampling involves randomly removing samples from the majority class.
  • Synthetic Minority Over-sampling Technique (SMOTE) : SMOTE is a popular oversampling technique that generates synthetic samples from the minority class using interpolation.
  • Class Weighting : Adjust the class weights during model training to penalize misclassifications in the minority class more heavily. This helps the model prioritize learning from the minority class examples.
  • Ensemble Methods : Ensemble methods like bagging and boosting can improve model performance on imbalanced data by combining predictions from multiple base models trained on different subsets of the data.
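
Two of these strategies, random oversampling and class weighting, can be sketched with the standard library alone. The weighting formula below, `n_samples / (n_classes * class_count)`, is the widely used "balanced" heuristic; the function names and the toy dataset are assumptions made for this example.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n / (k * count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def random_oversample(samples, labels, seed=0):
    """Duplicate random minority-class samples until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, cnt in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == cls]
        for _ in range(target - cnt):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Imbalanced toy dataset: 8 positive reviews, 2 negative ones.
labels = ["pos"] * 8 + ["neg"] * 2
samples = [f"review {i}" for i in range(10)]

print(balanced_class_weights(labels))   # {'pos': 0.625, 'neg': 2.5}
new_samples, new_labels = random_oversample(samples, labels)
print(Counter(new_labels))              # Counter({'pos': 8, 'neg': 8})
```

Resample only the training portion of your data. Oversampling before splitting can put copies of the same sample in both the training and testing sets, which inflates evaluation scores.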

Splitting Data for Training and Testing

To evaluate the performance of sentiment analysis models, splitting the data into separate training and testing sets is essential. The training set is used to train the model, while the testing set is used to evaluate its performance on unseen data.

  • Training-Testing Split Ratio : Determine the ratio of data to allocate to the training and testing sets. A common split is 80% for training and 20% for testing, but this can vary depending on the size of your dataset and the complexity of your model.
  • Cross-Validation : Consider using cross-validation techniques such as k-fold cross-validation to assess model performance more robustly. Cross-validation involves partitioning the data into multiple subsets (folds) and training/testing the model on different combinations of these subsets.
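
Both ideas can be illustrated without any external library. The sketch below shows a shuffled 80/20 split and a plain k-fold index generator; the function names mirror common library conventions but are written from scratch here, and the fixed seed is an assumption to keep the example reproducible.

```python
import random


def train_test_split(items, test_ratio=0.2, seed=0):
    """Shuffle the items and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n_items, k=5):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_items))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


train, test = train_test_split(range(10))
print(len(train), len(test))   # 8 2

for train_idx, test_idx in k_fold_indices(10, k=5):
    print(test_idx)            # each item lands in exactly one test fold
```

With imbalanced sentiment classes, a stratified split (one that preserves class proportions in every fold) is usually the safer choice; this sketch does not stratify.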

By effectively sourcing, cleaning, labeling, and preparing your data, you'll set a solid foundation for building accurate and reliable sentiment analysis models.

Machine Learning Models for Sentiment Analysis

When it comes to sentiment analysis, a variety of machine learning models can be employed to analyze text data and extract sentiment. We'll explore some of the most popular models, techniques, and metrics commonly used in sentiment analysis tasks.

Traditional Models

Traditional machine learning models have been widely used in sentiment analysis tasks, offering simplicity and interpretability. Some common traditional models include:

  • Naive Bayes : Naive Bayes is a probabilistic classifier based on Bayes' theorem, often used in text classification tasks including sentiment analysis. It assumes that features (words) are conditionally independent given the class. Despite this simplification, Naive Bayes trains quickly and is often a competitive baseline.
  • Support Vector Machines (SVM) : SVM is a supervised learning algorithm that aims to find the optimal hyperplane to separate different classes. SVMs are particularly effective in high-dimensional spaces, making them suitable for sentiment analysis tasks with large feature spaces.
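
To show how little machinery a traditional model needs, here is a minimal multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written from scratch. The class name, the four training reviews, and the choice to ignore words never seen in training are all assumptions for this sketch; a real project would more likely use an existing library implementation.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over token lists, with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.word_counts = defaultdict(Counter)    # token counts per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {t for counts in self.word_counts.values() for t in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Start from the log prior, then add log likelihoods of each token.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for t in tokens:
                if t not in self.vocab:
                    continue  # skip tokens never seen during training
                count = self.word_counts[label][t]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


train_docs = [
    ["great", "phone"],
    ["love", "it"],
    ["terrible", "battery"],
    ["poor", "screen"],
]
train_labels = ["pos", "pos", "neg", "neg"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict(["great", "screen", "love"]))   # pos
print(model.predict(["terrible", "phone", "poor"])) # neg
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and smoothing keeps a single unseen word-class pair from zeroing out the whole score.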

Deep Learning Models

Deep learning models have revolutionized sentiment analysis by leveraging the power of neural networks to learn complex patterns from data. Some popular deep learning models for sentiment analysis include:

  • Recurrent Neural Networks (RNNs) : RNNs are designed to process sequential data and are well-suited for analyzing text data. They can capture dependencies between words in a sequence, making them effective for tasks like sentiment analysis. However, plain RNNs suffer from the vanishing gradient problem and struggle to capture long-range dependencies; gated variants such as LSTMs and GRUs were designed to mitigate this.
  • Convolutional Neural Networks (CNNs) : CNNs excel at capturing local patterns in data through convolutional filters. In sentiment analysis, CNNs can learn to extract relevant features from text data, such as n-grams or word sequences, and classify sentiment based on these features.

Transfer Learning Techniques

Transfer learning techniques have gained popularity in sentiment analysis, allowing models to leverage pre-trained representations of language from large corpora. Some transfer learning techniques for sentiment analysis include:

  • BERT (Bidirectional Encoder Representations from Transformers) : BERT is a pre-trained language model that can be fine-tuned for various NLP tasks, including sentiment analysis. By fine-tuning BERT on task-specific data, you can leverage its contextual understanding of language to improve sentiment analysis performance.
  • GPT (Generative Pre-trained Transformer) : GPT is another pre-trained transformer-based model that can be adapted to sentiment analysis tasks. GPT is trained to predict the next token in a sequence, so it can either be fine-tuned with a classification layer or prompted to generate the sentiment label as text.

Evaluation Metrics

To assess the performance of sentiment analysis models, various evaluation metrics can be used to measure their accuracy and effectiveness. These evaluation metrics include:

  • Accuracy : Accuracy measures the proportion of correctly classified samples out of the total samples in the dataset. While accuracy provides a general indication of model performance, it may not be suitable for imbalanced datasets.
  • Precision : Precision measures the proportion of true positive predictions among all positive predictions made by the model. It indicates the model's ability to avoid false positives.
  • Recall : Recall measures the proportion of true positive predictions among all actual positive samples in the dataset. It indicates the model's ability to capture all positive instances in the dataset.
  • F1 Score : The F1 score is the harmonic mean of precision and recall, providing a balanced measure of a model's performance. It considers both false positives and false negatives and is particularly useful for imbalanced datasets.
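
These four metrics are straightforward to compute from the counts of true positives, false positives, and false negatives. The sketch below treats one class as "positive" and everything else as "negative"; the labels and predictions are made up for the example.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive="pos"):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


y_true = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg"]
y_pred = ["pos", "pos", "neg", "pos", "neg", "neg", "neg", "neg"]

print(accuracy(y_true, y_pred))              # 0.75
print(precision_recall_f1(y_true, y_pred))   # precision, recall, and F1 are all 2/3
```

Here accuracy (0.75) looks better than the F1 score for the positive class (about 0.67), because the larger negative class is mostly predicted correctly. For multi-class sentiment (positive, negative, neutral), you would compute these per class and average them.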

By understanding and selecting appropriate machine learning models and evaluation metrics, you can build robust sentiment analysis systems that accurately capture and analyze sentiment from text data.

Sentiment Analysis Uses and Applications

Here are some examples showcasing the versatility and effectiveness of sentiment analysis. They help provide concrete illustrations of how sentiment analysis is applied in various contexts, shedding light on its practical implications and potential benefits.

Customer Feedback Analysis

Imagine you're a business owner receiving numerous online reviews about your products or services. By applying sentiment analysis to these reviews, you can categorize them into positive, negative, or neutral sentiments.

For instance, positive reviews might highlight aspects like excellent customer service or product quality, while negative reviews could point out areas for improvement, such as shipping delays or product defects. Analyzing the sentiment distribution over time allows you to track trends, identify recurring issues, and proactively address customer concerns, ultimately enhancing customer satisfaction and loyalty.

Social Media Monitoring

Social media platforms like Twitter, Facebook, and Instagram are rich sources of user-generated content, offering valuable insights into public sentiment and opinion. Sentiment analysis enables organizations to monitor brand mentions, hashtags, and comments in real time, gauging public sentiment toward their brand, products, or campaigns.

For example, during a product launch, sentiment analysis can help assess the initial reception, identify influencers or brand advocates, and respond promptly to any negative feedback or issues users raise. By actively engaging with the online community and addressing concerns, businesses can build trust, foster positive relationships, and cultivate a strong brand reputation.

Market Research and Competitive Analysis

In market research, sentiment analysis is a powerful tool for understanding consumer preferences, market trends, and competitive landscapes. For instance, analyzing sentiment in product reviews and online forums can reveal emerging trends, feature preferences, and competitor strengths and weaknesses.

By identifying market gaps and customer pain points, businesses can tailor their offerings to meet consumer needs more effectively, gain a competitive edge, and capitalize on new opportunities. Moreover, sentiment analysis can aid in benchmarking against competitors, providing valuable insights into how your brand stacks up in terms of customer sentiment and satisfaction.

Political Sentiment Analysis

During elections or political campaigns, sentiment analysis is pivotal in gauging public sentiment toward candidates, parties, and policy issues. Political analysts can track sentiment trends, identify key influencers, and assess voter sentiment in real time by analyzing social media conversations, news articles, and public forums. This information can inform campaign strategies, messaging tactics, and policy priorities, helping political candidates and parties connect with voters, address concerns, and shape public perception effectively.

These examples demonstrate the diverse applications and benefits of sentiment analysis across various domains, highlighting its potential to drive informed decision-making, enhance customer experiences, and foster positive outcomes. By leveraging sentiment analysis effectively, organizations can gain valuable insights, mitigate risks, and stay ahead in today's data-driven world.

Advanced Topics in Sentiment Analysis

As sentiment analysis continues to evolve, researchers and practitioners explore advanced topics and techniques to enhance the accuracy and effectiveness of sentiment analysis systems.

Domain Adaptation

Domain adaptation refers to the process of adapting sentiment analysis models to new domains or contexts where labeled data may be scarce or non-representative. In real-world applications, sentiment analysis models trained on one domain may not perform well when applied to another domain due to differences in vocabulary, style, or sentiment expressions.

Domain adaptation techniques aim to mitigate this domain shift by leveraging transfer learning or domain-specific features to adapt the model to the target domain. Examples of domain adaptation techniques include unsupervised domain adaptation, where the model learns domain-invariant features from unlabeled target-domain data, and adversarial training, where a domain classifier is trained to discriminate between source and target domains while the feature extractor is trained to fool it, which pushes the learned features toward domain invariance.

Multimodal Sentiment Analysis

Multimodal sentiment analysis integrates information from multiple modalities, such as text, images, audio, and video, to enhance the understanding of sentiment. In many real-world scenarios, sentiment is expressed not only through text but also through visual and auditory cues.

For example, in product reviews, sentiment can be conveyed through the tone of voice in videos or the facial expressions in images. Multimodal sentiment analysis techniques leverage deep learning architectures capable of processing multiple modalities simultaneously, enabling more comprehensive sentiment analysis. By integrating information from diverse modalities, multimodal sentiment analysis can provide richer and more nuanced insights into sentiment expressions.

Sentiment Analysis in Social Media

Social media platforms have become rich sources of user-generated content, making them valuable for sentiment analysis. However, sentiment analysis in social media poses unique challenges due to the informal language, short text lengths, and the prevalence of sarcasm and irony. Social media sentiment analysis techniques often involve preprocessing steps tailored to handle noisy and informal text, such as hashtags, emojis, and slang.

Additionally, sentiment analysis models for social media may incorporate user-specific features, such as user profiles and social connections, to improve sentiment prediction accuracy. Sentiment analysis in social media enables organizations to monitor brand perception, track public sentiment toward specific topics or events, and identify emerging trends and influencers.
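
As a rough illustration of preprocessing tailored to social media, the sketch below strips URLs and user mentions, keeps the word inside a hashtag, maps a few emojis to words, shortens elongated words, and expands a little slang. The emoji and slang tables are tiny hypothetical examples, and the rules are deliberately simple; real posts need far more coverage.

```python
import re

# Hypothetical emoji-to-word mapping (written as escapes to avoid encoding issues).
EMOJI_WORDS = {
    "\U0001F60A": " happy ",   # smiling face with smiling eyes
    "\U0001F621": " angry ",   # pouting face
    "\U0001F44D": " good ",    # thumbs up
}

# Hypothetical slang dictionary.
SLANG = {"gr8": "great", "luv": "love", "u": "you"}


def normalize_post(text):
    """Clean a social media post and return a list of normalized tokens."""
    text = re.sub(r"https?://\S+", " ", text)     # drop URLs
    text = re.sub(r"@\w+", " ", text)             # drop user mentions
    text = re.sub(r"#(\w+)", r"\1", text)         # keep the hashtag word, drop '#'
    for emoji, word in EMOJI_WORDS.items():       # turn emojis into words
        text = text.replace(emoji, word)
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)    # "soooo" -> "soo"
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [SLANG.get(t, t) for t in tokens]      # expand known slang


post = "@acme luv the new phone \U0001F60A #gr8 soooo fast https://example.com/x"
print(normalize_post(post))
# ['love', 'the', 'new', 'phone', 'happy', 'great', 'soo', 'fast']
```

Mapping emojis to words, rather than deleting them, preserves a signal that often carries the sentiment of a short post.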

Handling Context and Sarcasm

Understanding context and sarcasm is essential for accurate sentiment analysis, as text often contains subtle nuances that influence sentiment interpretation. Contextual sentiment analysis techniques aim to capture the context surrounding text snippets to infer the intended sentiment accurately. This may involve analyzing preceding or succeeding text segments, identifying sentiment modifiers, or considering the broader context of the conversation.

Sarcasm detection in sentiment analysis presents a particularly challenging task, as sarcastic expressions often convey sentiments opposite to their literal meaning. Sarcasm detection techniques leverage linguistic cues, such as lexical ambiguity, incongruity, and sentiment reversals, to identify sarcastic utterances. By effectively handling context and sarcasm, sentiment analysis systems can provide more accurate and contextually relevant sentiment predictions, leading to better decision-making and insight extraction.
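
The idea of sentiment modifiers can be shown with a small rule-based sketch: negations flip the polarity of a nearby sentiment word and intensifiers scale it. The lexicon, the modifier lists, and the two-token look-back window are all assumptions for this example. The sketch does not handle contractions such as "isn't", and it makes no attempt at sarcasm detection, which generally needs much richer context.

```python
import re

# Toy resources (hypothetical values, for illustration only).
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATIONS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "really": 1.5, "slightly": 0.5}


def contextual_score(text):
    """Score text, letting up to two preceding tokens modify each sentiment word."""
    tokens = re.findall(r"[a-z]+", text.lower())
    total = 0.0
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        value = LEXICON[tok]
        # Apply modifiers found in the two tokens before the sentiment word.
        for prev in tokens[max(0, i - 2):i]:
            if prev in INTENSIFIERS:
                value *= INTENSIFIERS[prev]
            elif prev in NEGATIONS:
                value = -value
        total += value
    return total


print(contextual_score("The service was good"))           # 1.0
print(contextual_score("The service was not very good"))  # -1.5
print(contextual_score("The food was not bad"))           # 1.0
```

Without the modifier step, "not very good" would score the same as "good", which is exactly the kind of error that contextual handling is meant to prevent.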

Sentiment Analysis Tools

When it comes to sentiment analysis, many tools and libraries are available to streamline the development process and empower analysts and developers to build robust sentiment analysis systems. Let's explore some of the key tools, libraries, and considerations for sentiment analysis.

Popular Sentiment Analysis Libraries

Several libraries and frameworks have gained popularity for their effectiveness and ease of use in sentiment analysis tasks:

  • NLTK (Natural Language Toolkit) : NLTK is a comprehensive library for natural language processing tasks, including sentiment analysis. It provides various tools and resources for text processing, such as tokenization, stemming, and part-of-speech tagging.
  • spaCy : spaCy is a fast and efficient natural language processing library known for its performance and ease of use. It offers pre-trained pipelines for tasks like part-of-speech tagging, named entity recognition, and dependency parsing, which makes it a solid foundation for preprocessing text and extracting features in sentiment analysis pipelines.
  • TensorFlow : TensorFlow is an open-source machine learning framework developed by Google for building and training deep learning models. It offers high-level APIs like Keras for building neural networks, making it suitable for sentiment analysis tasks involving deep learning architectures.
  • PyTorch : PyTorch is another popular deep learning framework known for its flexibility and dynamic computation graph. It provides a Pythonic interface for building and training neural networks, making it suitable for sentiment analysis tasks requiring flexibility and customization.

Sentiment Analysis APIs

For developers looking to integrate sentiment analysis into their applications quickly, sentiment analysis APIs offer a convenient solution. These APIs provide pre-trained sentiment analysis models accessible via simple HTTP requests, allowing developers to analyze text data with minimal setup. Some popular sentiment analysis APIs include:

  • Google Cloud Natural Language API : offers sentiment analysis capabilities, along with other natural language processing features such as entity recognition and syntax analysis.
  • Microsoft Azure Text Analytics API : provides sentiment analysis, key phrase extraction, and language detection capabilities, enabling developers to extract insights from text data effortlessly.
  • IBM Watson Natural Language Understanding : offers sentiment analysis, emotion detection, and entity extraction capabilities, empowering developers to analyze text data comprehensively.

Custom Implementation Considerations

While pre-built libraries and APIs offer convenience, custom implementation of sentiment analysis models provides flexibility and control over the entire pipeline. Here are some key factors to consider:

  • Data Availability : Ensure you have access to sufficient labeled data for training your custom sentiment analysis model. High-quality labeled data is crucial for building accurate and robust models.
  • Model Selection : Choose the appropriate machine learning or deep learning model based on your data characteristics, task requirements, and computational resources. Experiment with different architectures and hyperparameters to find the optimal model for your sentiment analysis task.
  • Feature Engineering : Explore various feature extraction techniques to represent text data effectively for sentiment analysis. Consider using word embeddings, TF-IDF vectors, or domain-specific features to capture meaningful information from text.
  • Model Evaluation : Evaluate your custom sentiment analysis model using appropriate evaluation metrics, such as accuracy, precision, recall, and F1 score. Validate the model's performance on unseen data to ensure its generalization ability.

By leveraging these tools, libraries, and considerations, you can develop robust sentiment analysis systems tailored to your specific needs and requirements, whether through pre-built solutions or custom implementations.

Sentiment Analysis Best Practices

Effective sentiment analysis requires careful consideration of various factors, from data preprocessing to model selection and evaluation. Here are some best practices and tips to enhance the accuracy and reliability of your sentiment analysis systems:

  • Define Clear Objectives : Clearly define the objectives and scope of your sentiment analysis project. Determine the specific sentiment categories of interest (e.g., positive, negative, neutral) and the target audience or domain.
  • Preprocess Text Data : Invest time in preprocessing your text data to clean and standardize it. Apply techniques like tokenization, normalization, and stopword removal to prepare the data for analysis. Consider language-specific preprocessing steps based on the characteristics of your text.
  • Collect Diverse Data : Ensure that your dataset includes diverse samples representing different demographics, geographic regions, and user segments, and leverage data collection tools like Appinio to simplify and automate the process. This helps to capture a comprehensive range of sentiments and avoids biases in the analysis.
  • Leverage Domain Knowledge : Gain domain knowledge relevant to your sentiment analysis task. Understand the context in which sentiment is expressed, including industry-specific terminology, slang, and cultural nuances. Domain knowledge can help improve the accuracy and relevance of sentiment analysis results.
  • Explore Feature Engineering : Experiment with different feature engineering techniques to represent text data effectively for sentiment analysis. Consider using word embeddings, TF-IDF vectors, or domain-specific features to capture meaningful information from text and improve model performance.
  • Select Appropriate Models : Choose the appropriate machine learning or deep learning models based on your data characteristics, task requirements, and computational resources. Consider factors such as model complexity, interpretability, and scalability when selecting models for sentiment analysis.
  • Evaluate Model Performance : Evaluate the performance of your sentiment analysis models using appropriate evaluation metrics, such as accuracy, precision, recall, and F1 score. Validate the models on diverse datasets and consider conducting cross-validation to assess robustness.
  • Address Bias and Fairness : Be mindful of bias and fairness considerations in sentiment analysis. Evaluate models for bias across different demographic groups and take steps to mitigate bias in training data and model predictions. Consider incorporating fairness-aware techniques into your sentiment analysis pipeline.
  • Monitor Model Performance : Continuously monitor the performance of your sentiment analysis models in real-world applications. Track changes in model performance over time and collect feedback from end-users to identify areas for improvement. Update models regularly to adapt to evolving language patterns and user preferences.
  • Document and Share Insights : Document your sentiment analysis workflow, including data preprocessing steps, model selection criteria, and evaluation results. Share insights and findings with stakeholders to foster collaboration and informed decision-making. Consider creating visualizations or dashboards to communicate sentiment analysis results effectively.
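
To make the preprocessing tip more concrete, here is a minimal sketch of tokenization, normalization, and stopword removal using only the Python standard library. The `STOPWORDS` set and the `preprocess` function are hypothetical names chosen for illustration, and the stopword list is deliberately tiny; a real project would likely use a larger, language-specific list.

```python
import re

# Hypothetical, deliberately small stopword list (an assumption for this sketch).
# Negations such as "not" are left out on purpose, because dropping them
# can flip the sentiment of a sentence.
STOPWORDS = {"a", "an", "and", "the", "is", "it", "was", "to", "of", "this"}


def preprocess(text: str) -> list[str]:
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    normalized = text.lower()                    # normalization
    tokens = re.findall(r"[a-z']+", normalized)  # simple tokenization
    return [t for t in tokens if t not in STOPWORDS]


print(preprocess("The delivery was late, and the packaging is damaged!"))
```

The regular expression only keeps ASCII letters and apostrophes, so this sketch is suited to English text; other languages would need a different tokenizer.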

By following these best practices, you can develop robust and reliable sentiment analysis systems that provide valuable insights into the attitudes, emotions, and opinions expressed in text data.

Conclusion for Sentiment Analysis

Sentiment analysis offers a powerful tool for understanding and interpreting human emotions and opinions expressed in text data. By leveraging advanced algorithms and techniques, businesses, researchers, and individuals can gain valuable insights into customer preferences, brand perception, market trends, and more. From improving product offerings to enhancing customer satisfaction and informing strategic decision-making, sentiment analysis has the potential to drive positive outcomes across various domains. As technology continues to evolve, sentiment analysis will remain a vital component of the data analytics toolkit, empowering organizations to stay competitive, responsive, and attuned to the needs and sentiments of their stakeholders. However, it's essential to acknowledge the challenges and limitations inherent in sentiment analysis, such as the ambiguity of language, cultural differences, and the subjective nature of sentiment interpretation. Despite these challenges, continuous advancements in machine learning, natural language processing, and data analytics hold promise for overcoming these obstacles and improving the accuracy and reliability of sentiment analysis systems.

How to Conduct Sentiment Analysis in Minutes?

Introducing Appinio, the real-time market research platform revolutionizing sentiment analysis. With Appinio, companies can effortlessly collect real-time consumer insights to fuel their data-driven decisions. Say goodbye to tedious research processes and hello to instant insights that drive business success.

Here's why Appinio is the ultimate tool for conducting sentiment analysis:

  • Get from questions to insights in minutes:  With our intuitive platform, you can design and deploy surveys in no time, allowing you to gather valuable sentiment data at lightning speed.
  • No research degree required:  Appinio's user-friendly interface makes it easy for anyone to conduct market research, regardless of their background. You don't need a PhD in research to navigate our platform and extract meaningful insights.
  • Reach your target audience quickly and accurately:  With access to over 1,200 characteristics and the ability to survey respondents in over 90 countries, you can precisely define your target group and gather sentiment data from the right audience.

Sentiment Analysis

1330 papers with code • 39 benchmarks • 93 datasets

Sentiment Analysis is the task of classifying the polarity of a given text. For instance, a text-based tweet can be categorized into either "positive", "negative", or "neutral". Given the text and accompanying labels, a model can be trained to predict the correct sentiment.

Sentiment Analysis techniques can be categorized into machine learning approaches, lexicon-based approaches, and hybrid methods. Some subcategories of research in sentiment analysis include multimodal sentiment analysis, aspect-based sentiment analysis, fine-grained opinion analysis, and language-specific sentiment analysis.

More recently, deep learning models such as RoBERTa and T5 have been used to train high-performing sentiment classifiers, which are evaluated using metrics like F1, recall, and precision. To evaluate sentiment analysis systems, benchmark datasets like SST, GLUE, and IMDB movie reviews are used.
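
As a rough illustration of how precision, recall, and F1 are computed for one class, the sketch below counts true positives, false positives, and false negatives by hand. The function name `precision_recall_f1` and the toy labels are assumptions made for this example, not part of any benchmark.

```python
def precision_recall_f1(y_true, y_pred, positive="positive"):
    """Compute precision, recall, and F1 for a single target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy gold labels and predictions, invented for illustration.
y_true = ["positive", "negative", "positive", "neutral", "positive"]
y_pred = ["positive", "positive", "negative", "neutral", "positive"]

print(precision_recall_f1(y_true, y_pred))
```

For a three-class task, the same function can be called once per class and the results averaged (macro-averaging), which is one common way to report a single number.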

Further readings:

  • Sentiment Analysis Based on Deep Learning: A Comparative Study


Benchmarks

Best model for each benchmark dataset:
T5-11B
RoBERTa-large with LlamBERT
Heinsen Routing + RoBERTa Large
XLNet
VLAWE
XLNet
MA-BERT
AnglE-LLaMA-7B
BERT large
BERT large
InstructABSA
W2V2-L-LL60K (pipeline approach, uses LM)
BERTweet
UDALM: Unsupervised Domain Adaptation through Language Modeling
RoBERTa-large 355M + Entailment as Few-shot Learner
k-RoBERTa (parallel)
CalBERT
LSTMs+CNNs ensemble with multiple conv. ops
RobBERT v2
AEN-BERT
RuBERT-RuSentiment
xlmindic-base-uniscript
LSTMs+CNNs ensemble with multiple conv. ops
FiLM
Space-XLNet
fastText, h=10, bigram
CNN-LSTM
CNN-LSTM
Random
RoBERTa-wwm-ext-large
RoBERTa-wwm-ext-large
AraBERTv1
AraBERTv1
AraBERTv1
Naive Bayes
SVM
RCNN
lstm+bert
CalBERT


Most implemented papers

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.

Convolutional Neural Networks for Sentence Classification

We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks.

Universal Language Model Fine-tuning for Text Classification

Inductive transfer learning has greatly impacted computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch.

Bag of Tricks for Efficient Text Classification

facebookresearch/fastText • EACL 2017

This paper explores a simple and efficient baseline for text classification.

RoBERTa: A Robustly Optimized BERT Pretraining Approach

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.

A Structured Self-attentive Sentence Embedding

This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention.

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).

Deep contextualized word representations

We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy).

Well-Read Students Learn Better: On the Importance of Pre-training Compact Models

Recent developments in natural language representations have been accompanied by large and expensive models that leverage vast amounts of general-domain text through self-supervised pre-training.

Domain-Adversarial Training of Neural Networks

Our approach is directly inspired by the theory on domain adaptation suggesting that, for effective domain transfer to be achieved, predictions must be made based on features that cannot discriminate between the training (source) and test (target) domains.

Sentiment analysis: Find out what users think about your brand & products

Last updated: 14 February 2023

Do you know exactly how your target audience feels about your business?

To thrive, your company needs an in-depth understanding of its customers. Beyond exploring their experience shopping with and using your brand, it’s equally important to study how your customers feel about your business.

Thankfully, there are systematic ways to collect information about your customers' opinions. Through detailed sentiment analysis, you can collate valuable data about positive, negative, and emotional responses that your target audiences hold about your brand.

Whether you’re looking to expand to a new market or trying to get to the bottom of a string of negative social media comments, sentiment analysis is a must-use tool. 

It’s perfect for informing your team about how your target audience sees your products and services.

  • What is sentiment analysis?

Sentiment analysis (also known as emotion AI, opinion mining, or affective rating) systematically analyzes and classifies text to determine a tone of positivity, negativity, or neutrality. Simply put, it is the process of using computerized systems to determine the emotional tone and context of words used in customer feedback.

Sentiment analysis tools look for particular words and phrases that convey tone and emotion. 

These tools can analyze sentiment by combing through social media posts, Google reviews, responses to customer satisfaction surveys, and more. 

So, what does this software look for?


Positive sentiment

This is when a customer enjoys or appreciates your product or service. They often express their appreciation with words like “good,” “great,” “amazing,” “fantastic,” and “awesome.”

In most cases, the more positive feedback that your business receives, the more in touch you are with your desired audience.

Negative sentiment

Negative sentiment comes with harsh and emotional language in feedback. Examples of words that imply negative sentiment include “bad,” “gross,” “difficult,” “terrible,” and “disappointing.”

You can find these emotionally-charged words in customer reviews, social media rants, and responses to product surveys.

Neutral sentiment 

Neutral sentiment is harder to discern because it is not a particularly emotionally-charged tone. Responses including words like “ok,” “alright,” and “fine” are examples of customer feedback that you could consider neutral. 

In most cases, this is the least common form of feedback a business receives because very few people leave mid-tier reviews for products or services.

Don’t forget how slang can play into feedback: Words such as “wicked” or “bad” can sometimes mean the opposite!
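
The word lists above can be turned into a very small lexicon-based classifier. The sketch below is an illustration only: the `POSITIVE` and `NEGATIVE` sets reuse the example words from this section, and the `classify` function is a hypothetical name. Commercial tools use much larger lexicons and take context into account.

```python
import re

# Example words taken from the positive and negative lists in this section.
POSITIVE = {"good", "great", "amazing", "fantastic", "awesome"}
NEGATIVE = {"bad", "gross", "difficult", "terrible", "disappointing"}


def classify(text: str) -> str:
    """Label text by counting positive and negative lexicon hits."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify("The support team was great and the app is amazing"))
print(classify("Terrible, disappointing service"))
print(classify("It was ok, fine I guess"))
# Slang is a known blind spot: "bad" counts as negative here even when
# the writer means it as praise.
print(classify("That new track is bad, in the best way"))
```

The last call shows why slang matters: a word-counting approach has no way to tell that “bad” was meant as a compliment.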

  • What is natural language processing (NLP) sentiment analysis?

To speed up the process of churning through thousands of pieces of unstructured customer feedback, natural language processing (NLP) sentiment analysis can be a helpful strategy.

Natural language processing is a complex interdisciplinary field that combines computer science, artificial intelligence, and linguistics. It teaches software systems how to interact, understand, and rate the emotion and nuances of human language.

NLP sentiment analysis software trains computers to identify and contextualize human words and phrases, saving time and energy throughout the analysis process. 

Depending on the type of information that your company is looking to gain, you can use NLP sentiment analysis for several purposes:

Identifying emotional tone

Qualifying the extent of positivity and negativity

Identifying specific, granular emotions such as happiness, disappointment, and anger 

Often, companies pay NLP sentiment analysis services to provide software for these tasks, offloading the work of creating and training these systems from internal resources.

  • What is a sentiment score?

A sentiment score is one of the metrics you get from sentiment analysis. Often expressed on a numeric scale, a sentiment score summarizes how positive or negative the feedback about your brand is.

One of the most common types of sentiment scores is grading customer feedback on a scale from one to ten, from most negative to most positive. 
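
As a sketch of how such a scale could be produced, the snippet below maps a polarity value between -1 (most negative) and +1 (most positive) onto a one-to-ten score. The polarity input and the `to_ten_point` function are assumptions for illustration; different tools define their scales differently.

```python
def to_ten_point(polarity: float) -> float:
    """Map a polarity in [-1, 1] linearly onto a 1-10 sentiment score."""
    clamped = max(-1.0, min(1.0, polarity))  # guard against out-of-range input
    return 1.0 + (clamped + 1.0) / 2.0 * 9.0


for value in (-1.0, 0.0, 0.5, 1.0):
    print(value, "->", to_ten_point(value))
```

A linear mapping keeps the midpoint of the scale (5.5) aligned with neutral feedback, which makes scores easy to compare over time.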

Depending on the type of information that you are looking to collect, sentiment scores can take into account information such as:

The number of positive or negative ratings 

The emotional strength of each writing piece (strongly enjoyed vs. intensely disliked)

The context surrounding the feedback, like the timing and reasoning behind the rating

Using your sentiment score as a guide, your company will have a clearer idea of which path to take moving forward. For example, if your score is lower than you wanted, you know that you need to focus on customer satisfaction and adjusting your product/service/user experience. 

If your score is higher, your business is meeting your customer’s expectations. Still, there are always areas to improve.

  • Example data sources for sentiment analysis

Depending on the information you want to collect, you can analyze different datasets to provide a deeper understanding of the general sentiment of your customers. Examples of common data sources for sentiment analysis include:

Survey text analysis

Have you recently sent out a customer satisfaction survey? This is the perfect opportunity to try sentiment analysis! You can use a sentiment analysis tool to evaluate survey feedback and report on your customers’ most commonly felt emotions toward your brand.

Customer review text analysis

Whether you collect customer reviews directly on your website, after a purchase, or from Google reviews, this is a goldmine of valuable sentiment information. Reviews are especially informative data sets because they tend to be polarized: often very positive or very negative.

Social media text analysis

As social media platforms like Twitter, Facebook, Instagram, and LinkedIn continue to rise in popularity, so do unprompted conversations about your brand. 

Your business can better understand your brand’s day-to-day sentiment changes using live tracking sentiment analysis tools. This is particularly helpful during product launches, website redesigns, or when a controversy or crisis affects your brand or service.  

  • What is sentiment analysis used for?

As the interest in customer satisfaction and experience increases across all industries, more companies of all sizes are integrating sentiment analysis into their user research practices.

Whether your business is in eCommerce, marketing, manufacturing, or market research, sentiment analysis can offer a wide variety of benefits:

Brand reputation management

Your brand is your reputation — are you doing enough to protect it from negative publicity?

Managing a brand is a complex, multi-faceted process. A significant portion of the public opinion of your brand comes from the value and experience of using your products or services. However, other factors like website usability, search engine optimization, and social media presence influence your customers' perception of your brand. 

Your company can run a sentiment analysis to keep tabs on what the general public and your target audience are saying about your brand. Look at social media posts, published articles, and general think-pieces that mention your brand in any way. Insight into the tone of feedback your company is getting can support data-driven decision-making.

Marketing research

No matter how long your company has been in business, conducting thorough marketing research to better understand your competitors and customers is part of a winning brand strategy.

As a great tool for providing detailed information about specific markets, niches, and customer spending habits, sentiment analysis helps you quickly and efficiently identify trends. If your business is looking to launch a new product or enter a new market, conducting a thorough sentiment analysis is one of the best ways to encourage long-term success.

Processing customer feedback

If your company is already using customer satisfaction surveys as part of your user research process, sentiment analysis can help you get even more information from your feedback.

Depending on the type of questions you’re using in your surveys, a sentiment analysis tool can parse the tone and emotion behind each submission. This style of data analysis is particularly helpful for surveys with open-ended text questions, where customers can freely type their feedback.

A sentiment analysis program can go through survey answers and provide insights into your customers’ general tone to determine whether your customers are happy. Once complete, this information will be essential in improving and adjusting your offers to best meet the needs of your target audience.

Crisis prevention and political management

We all know the saying, “any press is good press,” but it’s usually best to avoid negative publicity.

In the modern age of social media and trending topics, companies can easily catch unwanted negative attention seemingly overnight. 

In the event of a brand crisis, it’s vital to quickly identify negative sentiment trends by live-tracking social media and public contact form posts. High-quality sentiment analysis tools can keep your brand one step ahead of any swells of incoming negative feedback.
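
One simple way to spot a swell of negative feedback is to compare each day's share of negative posts with the average of the preceding days. The sketch below assumes the posts have already been labeled by some sentiment tool; the function names, the three-day window, and the 0.2 threshold are all hypothetical choices for illustration.

```python
def negative_share(labels):
    """Fraction of posts labeled 'negative' (0.0 for an empty day)."""
    return sum(label == "negative" for label in labels) / len(labels) if labels else 0.0


def flag_spikes(daily_labels, window=3, threshold=0.2):
    """Return indexes of days whose negative share exceeds the mean of the
    previous `window` days by more than `threshold`."""
    shares = [negative_share(day) for day in daily_labels]
    flagged = []
    for i in range(window, len(shares)):
        baseline = sum(shares[i - window:i]) / window
        if shares[i] - baseline > threshold:
            flagged.append(i)
    return flagged


# Toy data: three calm days, one day with many negative posts, one calm day.
calm = ["positive"] * 9 + ["negative"]
stormy = ["positive"] * 5 + ["negative"] * 5
print(flag_spikes([calm, calm, calm, stormy, calm]))
```

Comparing against a recent baseline, instead of a fixed cutoff, means a brand that always receives some criticism is only alerted when the level changes.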

  • Types of sentiment analysis

When starting a sentiment analysis project for your business, it is vital to know the different types and methods to get the specific results you need. 

Examples of some of the most effective types of sentiment analysis include:

Graded sentiment analysis

One of the most common types of sentiment analysis involves grading or numerically scaling the positive or negative sentiment in your data.

While filtering and processing the collected information, sentiment analysis tools can create a numerical value that represents the specific metric you are measuring. 

For example, graded sentiment analysis can create:

Sentiment scores on a scale from 1–10, ranging from negative to positive feedback

A star rating system, indicating happiness with your product/service from 1–5 stars

Positivity or negativity percentages, based on a set group of feedback

Ranges of descriptors such as “very positive” or “somewhat negative” with a product or service offering
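
To illustrate two of these outputs, the sketch below turns polarity values between -1 and +1 into descriptors and then reports the percentage of feedback in each group. The cutoffs in `label_for` and both function names are assumptions for this example; real tools choose their own boundaries.

```python
from collections import Counter


def label_for(polarity: float) -> str:
    """Turn a polarity in [-1, 1] into a descriptor (cutoffs are illustrative)."""
    if polarity <= -0.6:
        return "very negative"
    if polarity <= -0.2:
        return "somewhat negative"
    if polarity < 0.2:
        return "neutral"
    if polarity < 0.6:
        return "somewhat positive"
    return "very positive"


def share_by_label(polarities):
    """Percentage of feedback items that fall under each descriptor."""
    counts = Counter(label_for(p) for p in polarities)
    total = len(polarities)
    return {label: 100 * n / total for label, n in counts.items()}


print(share_by_label([0.9, 0.7, 0.3, -0.1, -0.8]))
```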

Aspect-based sentiment analysis

Product and service troubleshooting is an essential area that sentiment analysis can assist with.

You may have minor hitches when your business launches a new offering. Perhaps your customers can’t access their accounts, or your website has 404 errors. Whatever the issue, your business must be able to correctly identify it to find the solution.

Using aspect-based sentiment analysis, your company can collect and interpret valuable information about these events. This helps you identify trends and areas requiring additional assistance and tweaking. 

Examples of situations where aspect-based sentiment analysis is incredibly valuable include:

Customers reporting a payment error when trying to purchase your new product

Website errors reported to your customer service chatbot

Subscription or service cancellations based on a bad experience with a staff member

Multilingual sentiment analysis

Does your company have international reach? If so, it’s essential to consider the many languages used by your customer base when collecting valuable data about brand reputation.

Depending on the broadness of your target audience, cultural and language differences can significantly impact the type of data and feedback you will receive. 

While it is essential to collect this information to see how your brand is performing in different cultural and geographic locations, it poses challenges. Multilingual sentiment analysis can come up against issues when translating and examining the true meaning of customer feedback.

To tackle this issue, you can train multilingual sentiment analysis tools to perform the following essential steps to successful language translation:

Part-of-speech tagging

This is the process of identifying and labeling nouns, verbs, and descriptive, emotional words in each sentence. It’s the first step to understanding the tone and content of each sentence.

Word root identification

Also referred to as lemmatization, this step uses a language-specific tool to reduce the identified words to their root form.

For example, the words “drinking,” “drank,” and “drunk” come from the root word “drink.” Once the tool categorizes these words, the sentiment analysis tool better understands the sentence context.
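
The idea can be sketched with a hand-written lookup table. This is a toy stand-in for a real lemmatizer, which relies on a full dictionary and grammatical information; the `LEMMAS` table and the `lemmatize` function are hypothetical names used only for illustration.

```python
# Hand-written lookup table (an assumption for this sketch).
LEMMAS = {"drinking": "drink", "drank": "drink", "drunk": "drink"}


def lemmatize(tokens):
    """Replace each token with its root form when the table knows it."""
    return [LEMMAS.get(token.lower(), token.lower()) for token in tokens]


print(lemmatize(["Drinking", "drank", "drunk", "water"]))
```

Words that are missing from the table pass through unchanged (only lowercased), which is why coverage of the underlying dictionary matters so much for each language.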

Prior polarity

A sentiment analysis tool can search for words’ intent and emotional context. Words like “love” and “despise” carry a strong emotional meaning and signal a clearly polarized opinion.

Identifying these words helps the system interpret the feedback’s emotional tone and context.

Grammar assessment

Every language has a set list of grammatical rules so speakers can understand each other. Our languages range in complexity from minor sentence structure choices to contextual sarcasm and humor. 

That means multilingual sentiment analysis must be able to identify and understand the unique grammar quirks of each language.

Machine learning

Once you’ve completed all these steps, machine learning tools can compile the data into a coherent score or statistic for your company. In most cases, this is a positive or negative numerical score, indicating the emotional response to the area you are studying.

Emotion detection

As it says in the name, this style of analysis hyper-focuses on gaining information about the underlying emotions and experiences of your customers and target audience.

Instead of creating a numeric scale to represent the results, this sentiment analysis uses emotional words or images to create a more inclusive, broad result. 

Emotion detection sentiment analysis works best on longer written feedback from social media or feedback surveys. Through the analysis, the tool can highlight buzzwords and keywords, helping your team better understand how people perceive your brand and the level of engagement. 

This type of sentiment analysis can be beneficial in the following areas of business expansion:

Content creation and rating: Which blog topics connect with people the most?

Website redesign: Are your customers enjoying the design changes you made?

After a new product launch: Did your target audience respond well to your offering? 

  • Using sentiment analysis to grow your business

Is your business looking to expand, change, or explore new business opportunities and markets? Starting strong with detailed sentiment analysis is an essential step to success.

How customers feel about your brand is more important than many people realize. 

No matter how great, amazing, or revolutionary your offerings are, if your target audience finds reasons to dislike your brand, they will not spend money with your company. They may have political, emotional, or user-experience-based reasons. Whatever their reasons may be, you need to identify these sticking points. 

Sentiment analysis is a key strategy for your user research plan. It’s the best way to combat any developing trends of negativity, learn more about pain points, and lean into what your customers really enjoy.

Sentiment analysis: The Complete Guide

Posted on April 11th 2024

Lucas Carval | 21 min read

Sentiment analysis has become an essential tool in interpreting the textual data generated daily online. Wondering what it’s all about?

Essentially, it’s the process businesses and researchers use to measure the public’s feelings towards products, services, or topics by analyzing language and emotion in text.

Imagine being able to understand the collective mood of your customers through their tweets, reviews, or feedback – that’s exactly what sentiment analysis works towards.

From customer feedback to monitoring brand reputation, it’s helping companies to listen more closely to the voice of their customer than ever before.

It’s not just about whether the sentiment is positive or negative, but also about the why and how. Companies are leveraging this technique not only to better understand their audience’s reactions but also to tailor their strategies accordingly.

One might wonder, “Is it really a game-changer?”

Well, when sentiment analysis tools are applied to social media comments or customer reviews, businesses can rapidly identify a shift in public opinion, giving them a leg up in managing their brand image proactively.

While sentiment analysis sounds rather technical, it’s a very accessible concept with tools available for both beginners and experienced data analysts.

Whether for analyzing stock market trends, assessing campaign impact, or ensuring compliance, sentiment analysis helps in translating raw data into actionable insights.

It’s not just about numbers; it’s about understanding human emotion at scale.

So next time you see a company reacting swiftly to customer opinions or a product tweaked to better meet consumer needs, there may just be a sophisticated sentiment analysis system doing its job in the background.

Want to learn more about it? Let’s dive in!

What is Sentiment Analysis?

Sentiment analysis, sometimes referred to as opinion mining, has become an invaluable tool for interpreting large quantities of text. It’s there to decode the emotional subtext in everything from tweets to product reviews.

Definition and Scope

Sentiment analysis is a blend of text analysis and natural language processing.

But what does it really do?

Simply put, sentiment analysis is identifying and categorizing opinions expressed in text to determine the writer’s attitude towards a particular topic, product, or service. This includes pinpointing whether you have positive sentiment, negative sentiment, or neutral sentiment.

  • Text Analysis: Sorts through chunks of text to find relevant information.
  • Natural Language Processing (NLP): Helps computers understand human language.

Now, let’s talk scope!

Think of sentiment analysis as a radar for public opinion. Companies and organizations use this technique to sift through feedback, social media banter, and discussion forums.

But it’s not just about whether people are giving thumbs up or down. They try to gauge the intensity and emotional undertones of those sentiments.

  • Positive Sentiment: Yay! Your customers love the new coffee flavor.
  • Negative Sentiment: Uh-oh! Maybe that ad campaign wasn’t a hit after all.
  • Neutral Sentiment: No strong feelings here, it’s business as usual.

With sentiment analysis, one can extract invaluable insights from customer feedback, predict market trends, and even monitor brand health. And because it’s automated, one can analyze vast amounts of text swiftly, without needing a whole team of human readers.

Types of Sentiment Analysis

Let’s dive in and discover the various types of sentiment classification that give businesses the crystal ball to see how their customers really feel.

Polarity-Based Sentiment Analysis

This is the basic type of sentiment analysis. It’s like a compass that points to whether the sentiment of a text is positive, negative, or neutral. Companies depend on it to quickly analyze sentiment and measure public perception.

Fine-Grained Sentiment Analysis

Looking for more nuance?

Fine-grained sentiment analysis does just that, breaking down opinions into categories like “very positive”, “positive”, “neutral”, “negative” and “very negative”.

It’s not just a thumbs up or down; it’s the whole hand with all its gestures!

Emotion Detection

Now, this one is fascinating! Emotion detection identifies specific emotions like joy, anger, sadness, or surprise in the text.

Imagine it as a detective sifting through words to uncover hidden emotional clues.
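
Here’s a minimal sketch of what that detective work could look like with a small emotion lexicon. The `EMOTION_LEXICON` word lists and the `detect_emotions` function are made up for illustration; real emotion detection models are trained on labeled data and handle context far better.

```python
import re

# Tiny, hand-picked emotion lexicon (an assumption for this sketch).
EMOTION_LEXICON = {
    "joy": {"happy", "delighted", "love"},
    "anger": {"angry", "furious", "outraged"},
    "sadness": {"sad", "disappointed", "miss"},
    "surprise": {"surprised", "unexpected", "wow"},
}


def detect_emotions(text: str) -> dict:
    """Count how many words from each emotion's list appear in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    found = {}
    for emotion, words in EMOTION_LEXICON.items():
        hits = sum(token in words for token in tokens)
        if hits:
            found[emotion] = hits
    return found


print(detect_emotions("I was furious at first, but now I am happy and delighted."))
```

Notice that the sentence carries two emotions at once, which is exactly what a single positive or negative label would hide.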

Aspect-Based Sentiment Analysis (ABSA)

Ever wondered what particular part of a product people love or hate?

ABSA sheds light on this by linking sentiments to specific aspects or attributes of a product or service.

It’s like having a spotlight that only shines on what really matters.
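
To see the spotlight in action, here’s a rough sketch that splits a review into clauses and attaches a score to each aspect mentioned in a clause. The `ASPECTS` keywords, the word lists, and the `aspect_sentiment` function are all assumptions made for this example; production ABSA systems usually rely on trained models instead of keyword matching.

```python
import re

# Hypothetical aspect keywords and sentiment words for a phone review.
ASPECTS = {
    "battery": {"battery", "charge"},
    "screen": {"screen", "display"},
    "price": {"price", "cost"},
}
POSITIVE = {"great", "love", "superb", "good"}
NEGATIVE = {"terrible", "bad", "disappointed", "poor"}


def aspect_sentiment(review: str) -> dict:
    """Score each aspect by the sentiment words found in the same clause."""
    results = {}
    # Split on punctuation and on the word "but" to get rough clauses.
    for clause in re.split(r"[.,;!?]|\bbut\b", review.lower()):
        tokens = set(re.findall(r"[a-z]+", clause))
        score = len(tokens & POSITIVE) - len(tokens & NEGATIVE)
        for aspect, keywords in ASPECTS.items():
            if tokens & keywords:
                results[aspect] = results.get(aspect, 0) + score
    return results


print(aspect_sentiment("The battery is great, but the screen is terrible."))
```

Splitting on “but” matters here: without it, the praise for the battery and the complaint about the screen would cancel each other out.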

Intent Analysis

Lastly, there’s intent analysis. This is less about how people feel and more about what they intend to do.

Will they buy? Will they recommend? It’s the crystal ball for predicting future actions.

Why is Sentiment Analysis important?

Sentiment analysis stands out in the digital era as a pivotal tool for understanding vast streams of online communication.

Consider the enormous quantity of data generated every minute; companies find it invaluable to sort through and categorize sentiments efficiently.

Why search through data manually when algorithms can swiftly do the heavy lifting?

Imagine you’re a business trying to measure public opinion. Through sentiment analysis, it’s feasible to perform Real-Time Analysis, capturing the pulse of consumer sentiment as it fluctuates.

This not only saves time but also allows businesses to react proactively rather than retrospectively to market trends.

Accuracy is paramount.

Sentiment analysis tools have evolved, no longer just tallying up ‘good’ and ‘bad’ keywords but understanding context.

They’ve become better at handling sarcasm, slang, and nuance (though sarcasm in particular remains hard), offering a more fine-tuned breakdown of emotions.

The focus here is not just on what is being said, but how it’s being expressed.

Businesses leverage sentiment analysis to address customer feedback, monitor brand perception, and evolve strategies.

In short, it’s about staying informed and agile in a dynamic market landscape.

Sentiment Analysis Examples

Sentiment analysis, quite the game-changer, isn’t it? It’s fascinating how machines can now decipher our very emotions from words.

How do they do it? Let’s take a peek at some examples that might just resonate with everyone’s daily digital interactions.

Picture a customer tweeting this:

Positive sentiment analysis

A sentiment analysis algorithm would likely classify this as Positive sentiment. The jubilation is clear as day, with words like ‘absolutely love’ and ‘superb’ waving the positive flag.

On the flip side, consider the frustration in this one:

Negative sentiment analysis

Here, the negative word ‘disappointed’ steers this towards a Negative classification, a not-so-subtle nod to a not-so-great experience.

And what about those grey areas? “The movie was okay, nothing special but not bad either.”

Such a fence-sitting statement often lands in the Neutral zone, where neither excitement nor disappointment reigns.

It gets even more intricate. Sentiment analysis can dive deeper, pinpointing the intensity of a sentiment, whether negative or positive.

Imagine a spectrum from slightly perturbed to full-blown delight. That’s the kind of nuance sentiment analysis is aiming for!

Has your email provider ever flagged a perfectly normal email as urgent? Text analysis of this kind may well be the reason.

Think of a support center pinpointing the angriest emails to prioritize, thanks to algorithms catching words that scream SOS, like ‘urgent’ or ‘immediately’!
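The support-center scenario can be sketched with plain keyword spotting, the simplest version of the idea. The cue words in `URGENT_WORDS` and the helper names are assumptions for illustration, not how any particular help-desk product works.

```python
import re

# Hypothetical urgency cues; a real system would learn or curate these.
URGENT_WORDS = {"urgent", "immediately", "asap", "unacceptable", "furious"}


def urgency_score(message: str) -> int:
    """Count urgency cue words in a message."""
    tokens = re.findall(r"[a-z']+", message.lower())
    return sum(1 for token in tokens if token in URGENT_WORDS)


def prioritize(messages: list[str]) -> list[str]:
    """Return messages ordered from most to least urgent (stable for ties)."""
    return sorted(messages, key=urgency_score, reverse=True)


inbox = [
    "Could you update my billing address when you get a chance?",
    "This is urgent: fix my account immediately, this is unacceptable!",
    "My order is late, please respond ASAP.",
]
for message in prioritize(inbox):
    print(urgency_score(message), message)
```

Because it only counts words, a calm email that happens to mention “urgent” would also jump the queue, which is exactly the kind of false alarm described above.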

Whether it’s to deduce market sentiments toward stocks or to read the room in feedback forums, sentiment analysis is there, like a friend who always knows whether you’re crying out of joy or slicing onions.

It’s a tech buddy helping businesses measure reactions at scale. Pretty neat for a bunch of code, right?

How Do Sentiment Analysis Algorithms Work?

Let’s look at the different Sentiment Analysis algorithms and how they work.

Lexicon-Based Approaches

Lexicon-based sentiment analysis runs on the principle that certain words carry inherent emotional weight.

This method utilizes a dictionary, or lexicon, of words, each tagged with its respective sentiment score: imagine “happy” as +1 and “sad” as -1.

When analyzing text, these scores are tallied up to determine the overall sentiment.

It’s a bit like nutritional information on the back of your cereal box; every ingredient (or word) contributes to the total.

  • Pros: Easy to understand / No need for training data
  • Cons: Context-agnostic / Struggles with nuances such as sarcasm or idioms

IBM’s take on sentiment analysis suggests that while this approach is straightforward, it’s not without its pitfalls—context is king, and lexicon-based methods often lack the crown.
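The tallying described above fits in a few lines. This is a minimal sketch with a toy lexicon; the words, scores, and function names are assumptions for illustration.

```python
import re

# Toy lexicon: the words and scores are assumptions for illustration.
LEXICON = {"happy": 1, "love": 1, "superb": 1, "sad": -1, "disappointed": -1, "terrible": -1}


def lexicon_score(text: str) -> int:
    """Sum the lexicon scores of every word in the text (unknown words score 0)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(token, 0) for token in tokens)


def lexicon_label(text: str) -> str:
    """Turn the tallied score into a positive/negative/neutral label."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_label("I love this phone, the camera is superb"))          # positive
print(lexicon_label("Oh great, I am so happy my flight got cancelled"))  # positive
```

The second sentence is sarcastic, yet the tally still comes out positive because “happy” is the only word the lexicon recognizes. That is the context problem in miniature.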

Machine Learning Techniques

When lexicons fall short, machine learning steps in, flexing its computational muscle to understand text in a deeper, more context-rich way.

Algorithms are trained on large datasets, learning patterns that humans might miss, and classifying sentiments as positive, negative, or neutral.

Think of it as a highly refined palate, distinguishing subtle flavors in a complex dish.

Machine learning approaches can be:

  • Supervised: Trained on labeled data where the sentiment is pre-defined.
  • Unsupervised: No labels, relying on algorithms to find inherent structures and sentiments.
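To show what “trained on labeled data” means in practice, here is a minimal sketch of one classic supervised method, a Naive Bayes classifier with add-one smoothing, written with the standard library only. The four training sentences are made up, and the choice of Naive Bayes is an assumption for illustration; real projects train on thousands of examples, often with a dedicated machine learning library.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z']+", text.lower())


def train(examples: list[tuple[str, str]]):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def predict(model, text: str) -> str:
    """Pick the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token in vocabulary:  # ignore words never seen in training
                score += math.log(
                    (word_counts[label][token] + 1) / (total_words + len(vocabulary))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up labeled examples; real training sets are far larger.
training_data = [
    ("I love this product", "positive"),
    ("superb quality and fast delivery", "positive"),
    ("really disappointed with the service", "negative"),
    ("terrible quality, it broke in a day", "negative"),
]
model = train(training_data)
print(predict(model, "superb service, love it"))
print(predict(model, "disappointed, terrible product"))
```

Unlike a lexicon, nobody told this model that “superb” is positive; it picked that up from the labels.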

Hybrid Systems

Why choose one when you can have the best of both worlds? Hybrid systems meld lexicons with machine learning, creating a more accurate and context-sensitive system.

These systems can iron out the kinks that single-method systems encounter, especially for tricky tasks like identifying sarcasm or changing sentiments within a single text.

  • Pros: Combines the interpretability of lexicon-based approaches with the contextual understanding of machine learning. More robust to variances in language use.
  • Cons: More complex to implement. Potentially more resources required.

It’s akin to having both a trusty map and a local guide when trekking through unknown territory—the combined insight ensures a more nuanced journey through the terrain of human emotion in text.

By exploring these three approaches, we’ve seen the kaleidoscope through which sentiment analysis views the linguistic landscape.

Whether through the stark clarity of lexicons, the dynamic learning of machines, or the clever synergy of hybrid systems, sentiment analysis continues to evolve, offering sharper insights into our collective emotions.

Understanding Sentiment Scores

Sentiment scores are crucial in gauging the emotional response evoked by text. They summarize the overall feeling of a group of interactions, which is then sorted into categories that depend on the method used.

Let’s break this down a bit more, shall we?

Polarity and Subjectivity

Polarity refers to the orientation of the sentiment conveyed in a text. It’s a way to label emotions as either positive, negative, or neutral.

Typically, polarity is expressed on a scale that ranges from -1 to 1.

Think of it like a thermometer for feelings: numbers closer to -1 are chilly and negative, while those near 1 are warm and positive. Neutral sentiment? That’s your zero, the perfect balance.

The concept of subjectivity, on the other hand, involves interpreting how much personal opinion or emotion is expressed in the text. Unlike polarity, which is fairly cut and dried, subjectivity can be a little fuzzier, involving more nuanced judgement calls.
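As a rough sketch, both numbers can be computed from a small opinion lexicon. Here polarity is the average score of the opinion words found, and subjectivity is the share of words that carry an opinion at all. Both definitions, the lexicon, and the function name are assumptions for illustration; libraries that report these values define them in their own ways.

```python
import re

# Hypothetical lexicon: each opinion word carries a polarity in [-1, 1].
OPINION_WORDS = {"love": 1.0, "great": 0.8, "okay": 0.1, "bad": -0.7, "terrible": -1.0}


def polarity_and_subjectivity(text: str) -> tuple[float, float]:
    """Return (polarity, subjectivity) for a text.

    polarity: mean score of the opinion words found, in [-1, 1] (0.0 if none).
    subjectivity: share of tokens that are opinion words, in [0, 1].
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [OPINION_WORDS[token] for token in tokens if token in OPINION_WORDS]
    if not tokens or not scores:
        return 0.0, 0.0
    return sum(scores) / len(scores), len(scores) / len(tokens)


print(polarity_and_subjectivity("The package arrived on Tuesday"))
print(polarity_and_subjectivity("Terrible support, bad app"))
```

The first sentence is a plain statement of fact, so both values are zero. The second is half opinion words and lands deep in negative territory.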

Quantifying Emotions

Now, let’s talk about how we make sense of these emotions numerically.

A sentiment score is essentially the heart rate monitor for the text’s emotional state. It provides a way to translate the complex, multifaceted emotions conveyed in language into a single, quantifiable number.

  • Positive statements boost the score higher.
  • Negative ones pull it down.
  • Neutral sentiments? They don’t really move the needle either way.

To give you a solid example, words like ‘love’ and ‘happy’ can amp up the sentiment score, revealing positive vibes. Meanwhile, negative words such as ‘disappointed’ or ‘terrible’ can make the score drop, showing discontent and negative comments.

Neutral words like ‘the’ or ‘is’ are just along for the ride, not affecting the score much.

Understanding sentiment scores helps users, from business owners to social media analysts, get a quick and digestible look at how their content resonates with their audience.

It’s not just about the cold, hard numbers; it’s about connecting with people on an emotional level through text.

Challenges and Pitfalls

Navigating the realm of social sentiment analysis is akin to walking through a maze filled with unexpected turns. Each corner may present a unique challenge, from deciphering the tone behind a sarcastic tweet to making sense of a sea of emojis.

But don’t worry, our trusty map—the following sections—will guide you through!

Handling Unstructured Data

The vast ocean of unstructured data is both a treasure trove and a beast to tame. It holds countless pieces of text from social media posts, blog comments, and product reviews, all in a jumble of formats.

Trying to find actionable insights here is like looking for your keys in a room full of toddlers—anything but straightforward.

Contextual Ambiguity

Next, let’s tackle contextual ambiguity.

Imagine reading a novel where the author forgot to mention that it’s set in a dream. Context matters! In sentiment and semantic analysis, misjudging the context of a word or phrase can flip its meaning upside down, leading to interpretations that are as accurate as a weather forecast that changes every minute.

Irony and Sarcasm

Now, onto the curveball of language: irony and sarcasm.

Sarcasm detection is like trying to read someone’s poker face: it’s tricky. A sentence like “Oh, great! Another email!” could either be a cheer or a jeer, and that’s just the tip of the iceberg for sentiment analysis tools, as they often lack the human touch for detecting these emotions.

Finally, let’s not forget about emojis.

Who knew that a tiny yellow face could complicate things so much? Emojis can be like spices: Just a pinch can change the flavor of a message entirely. But interpreting them correctly? That’s a skill that can leave even the smartest algorithms scratching their digital heads.

Sentiment Analysis Use Cases

In today’s digital age, sentiment analysis has become a cornerstone for understanding public opinion and consumer behavior. From social media posts to online reviews, sentiment analysis works because it unlocks value in real-time feedback.

Social Media Monitoring

Sentiment analysis breathes life into social media monitoring, allowing companies to track what’s being said about their brand.

It’s like having a superpower to sift through tweets, posts, and comments, translating likes and shares into actionable data. Social monitoring tools are a radar for public sentiment, helping businesses stay on top of trends.

Brand Monitoring

Keeping an eye on brand health goes beyond social media platforms alone.

Brand monitoring through sentiment analysis taps into various sources—think forums, blogs, and news articles.

Brands can gain insights, unearth perceptions and position themselves strategically in the marketplace. This nuanced understanding aids in effective brand reputation management.

Voice of Customers (VoC)

Voice of the Customer programs are supercharged by sentiment analysis models.

By analyzing customer feedback, organizations get a candid look at customer joys and pain points. This insightful feedback shapes products and services, ensuring they hit the bullseye in addressing customer needs.

Customer Support Ticket Analysis

Each customer support ticket is a story, and sentiment analysis extracts the mood chapter by chapter.

Detecting frustration or satisfaction in customer communications helps companies tailor their approach and solutions, and improve customer service, resulting in smarter decision-making for future interactions.

Market Research

Understanding the market means listening to the collective voice.

Sentiment analysis turns market research from a guessing game into a fact-based strategy session.

By dissecting the positive and negative words in others’ opinions from reviews and surveys, companies can pivot or persevere with confidence.

Sentiment Analysis Tools and Software

With a wealth of sentiment analysis tools available, it’s crucial for businesses to choose those that best meet their needs.

From real-time feedback solutions to comprehensive analytics software, finding the right tool can transform how a company understands and reacts to its audience.

Best Sentiment Analysis Tools

Mention: Top of the list for excellent brand monitoring, Mention’s sentiment analysis tool enables businesses to identify the tone of conversations concerning their brand or competitors across social media and the web.

HubSpot: Offers tools that significantly cut down the waiting period for feedback analysis, allowing for a faster response to customer sentiments.

Hotjar: Focuses on user sentiments to help businesses understand underlying opinions and frustrations through effective analysis software.

Sprout Social: Provides social media management tools with features that include tagging sentiments in posts which aids in sorting and prioritizing customer engagement.

MonkeyLearn: Boasts a suite of text analysis tools with a particular emphasis on high-accuracy sentiment analysis that integrates well with other services.

Build your own tool: While building an in-house sentiment analysis tool is possible, it requires a substantial investment of time and resources from developers, making it less viable compared to ready-to-use, specialized solutions.

The appropriate sentiment analysis tool can reveal more than just the numbers; it offers valuable insights into the emotional tone behind the data, reflecting the subtleties of customer sentiment in precise, actionable ways.

Best Practices and Strategies

When harnessing sentiment analysis to its full potential, one must focus on both integrating it effectively and deriving actionable insights.

Adopting these strategies ensures not only an improvement in customer satisfaction but also a boost in the overall decision-making process.

Integrating Sentiment Analysis

Data Quality and Sources: A strong start is crucial, and it begins with selecting high-quality data sources—think customer reviews or social media feedback.

The principle here is simple: garbage in, garbage out. Companies need to prioritize sources that accurately reflect their audience’s voice to create a reliable sentiment analysis framework.

  • Real-Time Analysis: If you’re not analyzing sentiments in real time, you’re probably missing out. Why? Because real-time insights allow for immediate action, which is just what you need to stay ahead of the customer satisfaction game. You wouldn’t want to learn about a trend when it’s no longer trending, right?

Actionable Insights

Interpreting Scores and Trends: So, you’ve got your sentiment scores—what next? It’s about turning those numbers into strategies.

Deciphering the highs and lows in sentiment trends can guide product development and marketing efforts.

Feedback Loop: Remember, it’s not just about collecting data; it’s about what you do with it.

Establishing a feedback loop ensures that the insights gleaned from the sentiment analysis model are actually implemented.

If customers express appreciation for a new feature, that’s a signal to perhaps double down. Spot a recurring complaint pattern? That’s a clear sign to pivot or make improvements.

Specificity in Action: Lastly, always ask, “How can we act on this?”

Whether it’s to enhance the customer experience or to refine marketing messages, the goal is to translate gathered sentiments into palpable and constructive changes.

It’s like finding out from a friend that your party playlist is a hit or miss, and then tweaking it until everyone’s grooving. That’s how you make data work for you, ensuring each insight leads to a response that resonates with your target audience.

Sentiment analysis has emerged as a crucial tool in understanding opinions and emotions in various forms of textual communication. It’s a dynamic field that blends linguistics, computer science, and artificial intelligence to interpret and classify the sentiments expressed in text data.

Businesses and individuals can now tap into customer feedback, monitor brand perception, and comprehend market trends through advanced algorithms and techniques to perform sentiment analysis.

This translation of subjective information into actionable insights is what makes sentiment analysis invaluable in today’s data-driven world.

Lucas Carval

Lucas has been a Digital Marketing Specialist at Mention since October 2023. His areas of expertise include digital marketing, SEO and outreach. He grew a streetwear Instagram page network from 0 to 120k in a year, and now helps Mention improve their number of qualified leads. He's working on getting a Master’s degree in Digital Strategy by 2025.

Sentiment analysis: A survey on design framework, applications and future scopes

Monali Bordoloi

1 School of Computer Science and Engineering, VIT-AP University, Inavolu, Amaravati, Andhra Pradesh 522237 India

Saroj Kumar Biswas

2 Computer Science and Engineering Department, NIT Silchar, NIT Road, Silchar, Assam 788010 India

Sentiment analysis is a solution that enables the extraction of a summarized opinion or minute sentimental details regarding any topic or context from a voluminous source of data. Even though several research papers address various sentiment analysis methods, implementations, and algorithms, a paper that includes a thorough analysis of the process for developing an efficient sentiment analysis model is highly desirable. Various factors such as extraction of relevant sentimental words, proper classification of sentiments, dataset, data cleansing, etc. heavily influence the performance of a sentiment analysis model. This survey presents a systematic and in-depth knowledge of different techniques, algorithms, and other factors associated with designing an effective sentiment analysis model. The paper performs a critical assessment of different modules of a sentiment analysis framework while discussing various shortcomings associated with the existing methods or systems. The paper proposes potential multidisciplinary application areas of sentiment analysis based on the contents of data and provides prospective research directions.

Introduction

The advent of digitization accelerated the scope of the general public to express their sentiment or opinion on an online platform. An expert or general public nowadays desires to reach an optimal decision or opinion with the use of available opinionative data. Any online platform, such as an e-commercial website or a social media site, maintains a level of transparency, increasing its chance of influencing other users. However, a single topic or item can possess millions of varied opinions on a single platform. The opinions or sentiments expressed can hold minute details or even a general opinion, which increases the research community’s interest in further investigation. This was the beginning of the principle of sentiment analysis, also known as opinion mining. Sentiment analysis makes it easier to retrieve sentimental details, analyze opinionative/sentimental web data, and classify sentimental patterns in a variety of situations.

Sentiment analysis can be stated as the procedure to identify, recognize, and/or categorize the users’ emotions or opinions for any service like movies, product issues, events, or any attribute as positive, negative, or neutral (Mehta and Pandya 2020). When sentiment is stated as a polarity in computational linguistics, it is typically treated as a classification task. When sentiment scores lying inside a particular range are used to express the emotion, the task is instead regarded as a regression problem. Cortis et al. (2017) mentioned various research works where sentiment analysis is approached as either a classification or regression task. While analyzing the sentiments by assigning the instances sentiment scores within the range [−1, 1], Cortis et al. (2017) discovered that there can be circumstances where the prediction is sometimes considered to be a classification task and other times to be regression. To solve the regression/classification problem, the authors developed a novel approach that combined the use of two evaluation methods to compute the similarity matrix. Therefore, mining and analysis of sentiment are either limited to positive/negative/neutral or extended to an even deeper granular sentimental scale, depending on the necessity, topic, scenario, or application (Vakali et al. 2013).

In the last decade since the paper by Pang et al. (2002), a large number of techniques, methods, and enhancements have been proposed for the problem of sentiment analysis, in different tasks, at different levels. Numerous review papers on sentiment analysis are already available. It has been noted that the current studies do not give the scientific community a comprehensive picture of how to build a proper sentiment analysis model. A general, step-by-step framework that can be used as a guide by an expert or even by a new researcher would be ideal for designing a proper sentiment analysis model. Many of the existing surveys basically report the general approaches, methods, applications, and challenges available for sentiment analysis. The survey paper by Alessia et al. (2015) reports three basic levels of sentiment analysis, presents three types of sentiment classification approaches, discusses some of the available tools and methods, and points out four domains of applications of sentiment analysis. The study can be further extended to give more details about the different levels, methods/approaches, additional applications, and other related factors and areas. Wankhade et al. (2022) provided a detailed study of different sentiment analysis methods, four basic levels of sentiment analysis, applications based on domain and industries, and various challenges. The survey emphasizes several classification methods while discussing some of the necessary procedures in sentiment analysis. Instead of only concentrating on the procedures that are necessary for sentiment analysis, a detailed description of all the possible approaches is highly desirable as it can help in selecting the best among all for a certain type of sentiment analysis model. Each step/module of the sentiment analysis model should be discussed in detail to gain insight into which technique should be used given the domain, dataset availability, and other variables; or how to proceed further to achieve high performance. Further, applications of sentiment analysis are commonly described based on the domain or applicable industry. Possible application areas based purely on the dataset are rarely covered by recent review papers. Some of the survey papers focus on only one direction or angle of sentiment analysis. Multimodal sentiment analysis and its applications, as well as its prospects, challenges, and adjacent fields, were the main topics of the paper by Kaur and Kautish (2022). Schouten and Frasincar (2015) focused on the semantically rich concept-centric aspect-level sentiment analysis and foresaw the rise of machine learning techniques in this context in the future. Verma (2022) addressed the application of sentiment analysis to build a smart society, based on public services. The author showed that understanding the future research directions and changes in sentiment analysis for smart society unfolds immense opportunities for elated public services. Therefore, this survey paper aims to categorize sentiment analysis techniques in general, while critically evaluating and discussing various modules/steps associated with them.

This paper offers a broad foundation for creating a sentiment analysis model. Instead of focusing on specific areas, or enumerating the methodological steps in a scattered manner; this paper follows a systematic approach and provides an extensive discussion on different sentiment analysis levels, modules, techniques, algorithms, and other factors associated with designing an effective sentiment analysis model. The important contributions can be summarized as follows:

  • The paper outlines all the granularity levels at which sentiment analysis can be carried out, through appropriate representative examples.
  • The paper provides a generic step-by-step framework that can be followed while designing a simple as well as a high-quality sentiment analysis model. An overview of different techniques of data collection and standardization, along with pre-processing, which significantly influences the efficiency of the model, is presented in this research work. Keyword extraction and sentiment classification, which have a great impact on a sentiment analysis model, are thoroughly investigated.
  • Possible applications of sentiment analysis based on the available datasets are also presented in this paper.
  • The paper makes an effort to review the main research problems in recent articles in this field. To facilitate the future extension of studies on sentiment analysis, some of the research gaps along with possible solutions are also pointed out in this paper.

The remainder of the paper is organized into six sections to provide a clear vision of the different angles associated with a sentiment analysis process. Section 2 provides knowledge of the background of sentiment analysis along with its different granularity levels. A detailed discussion of the framework for performing sentiment analysis is presented in Sect. 3. Each module associated with designing an effective sentiment analysis model is discussed in this section. Section 4 discusses different performance measures which can be used to evaluate a sentiment analysis model. Section 5 presents various possible applications of sentiment analysis based on the content of the data. Section 6 discusses the future scope of research on sentiment analysis. Finally, Sect. 7 concludes the paper.

Background and granularity levels of sentiment analysis

The first ever paper that focused on public or expert opinion was published by Stagner (1940). However, at that time studies were survey based. As reported in Mäntylä et al. (2018), the earliest computer-based sentiment analysis was proposed by Wiebe (1990) to detect subjective sentences from a narrative. The research on modern sentiment analysis accelerated in 2002 with the paper by Pang et al. (2002), where ratings on movie reviews were used to perform machine learning-based sentiment classification. Pang et al. (2002) classified a document based on the overall sentiment, i.e., whether a review is positive or negative rather than based on the topic.

Current studies mostly concentrate on multilabel sentiment classification, while filtering out neutral opinions/sentiments. Due to the unavailability of proper knowledge of handling neutral opinion, the exclusion of neutral sentiment might lead to disruption in optimal decision-making or valuable information loss. Based on a consensus method, Valdivia et al. (2017) proposed two polarity aggregation models with neutrality proximity functions. Valdivia et al. (2018) filtered the neutral reviews using induced Ordered Weighted Averaging (OWA) operators based on fuzzy majority. Santos et al. (2020) demonstrated that the examination of neutral texts becomes more relevant and useful for comprehending and profiling particular frameworks when a specific polarity pre-dominates. Besides, there can be opinions that usually contain both positive and negative emotions as a result of noise. This kind of opinion is termed an ambivalence opinion, which is often misinterpreted as being neutral. Wang et al. (2020) presented a multi-level fine-scaled sentiment sensing and showed that the performance of the sentiment sensing improves with ambivalence handling. Wang et al. (2014) introduced the concept to classify a tweet with more positive than negative emotions into a positive category; and one with more negative emotions than the positive one into a negative sentiment category.
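The majority rule described for mixed-emotion tweets can be sketched as follows. This is an illustrative reading of the idea rather than the cited authors' implementation: the word lists are hypothetical, and the separate "ambivalent" label for ties is an assumption added here to keep such texts distinct from truly neutral ones.

```python
import re

# Hypothetical word lists standing in for a real emotion or sentiment lexicon.
POSITIVE = {"love", "great", "happy", "fast"}
NEGATIVE = {"hate", "slow", "broken", "sad"}


def classify_mixed(text: str) -> str:
    """Label a text by comparing its positive and negative word counts."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    if pos == 0 and neg == 0:
        return "neutral"       # no sentiment expressed at all
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "ambivalent"        # both polarities present in equal measure


print(classify_mixed("Love the design, great screen, but the app is slow"))
print(classify_mixed("Fast delivery but the box was broken"))
print(classify_mixed("The parcel arrived on Monday"))
```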

Computational linguistics, Natural Language Processing (NLP), text mining, and text analysis are different areas that are closely interlinked with the sentiment analysis process. The relationship between sentiment analysis and the different areas is summarized below:

Sentiment Analysis is a blend of linguistics and computer science (Taboada 2016; Hart 2013). Nowadays thousands of human languages and other abbreviated or special languages exist, say the ones used in social media, which are used to convey thoughts, emotions, or opinions. People might use one single language or a combination of different languages, say for example Hinglish (a combination of Hindi and English) along with emoticons or some symbols to convey their messages. Computational linguistics assists in obtaining the computer-executable and understandable language from the vast source of raw languages through proper representation, to extract the associated sentiments properly. While developing formal theories of parsing and semantics along with statistical methods like deep learning, computational linguistics forms the foundation for performing sentiment analysis.

Linguistics knowledge aids in the development of the corpus set that will be used for sentiment analysis while understanding the characteristics of the data it operates on and determining which linguistic features may be applied. Data-driven or rule-based computer algorithms are designed to extract subjective information or to score polarity with the help of linguistic features, corpus linguistics, computational semantics, part of speech tagging, and the development of analytical systems for parsing. Connotations and associations are used to construct sentiment lexicons.

Recognition of sarcasm, mood classification, and polarity classification are some of the tasks covered by sentiment analysis, which is just a small subset of the discipline of computational linguistics. Approaches to classifying moods introduce a new dimension that is based on external psychological models. Methods for detecting sarcasm make use of ideas like “content” and “non-content” terms, which coexist in linguistic theory. Language models, such as Grice’s well-known maxims, are used to define sarcasm.

NLP deciphers human language and makes it machine understandable. With the aid of NLP, the sentiments behind human-generated online comments, social media posts, blogs, and other information can be processed and represented by patterns and structures that can be used by software to comprehend and implement them. Sentiment analysis can be considered as a subset of NLP which helps users in opinionative/sentimental decision-making.

Different NLP tasks such as tokenization, stemming, lemmatization, negation detection, n-gram creation, and feature extraction aid in proper sentiment analysis. NLP-based pre-processing helps in improving the polarity classifier’s performance by analyzing the sentiment lexicons that are associated with the subject (Chong et al. 2014). As a result, NLP facilitates text comprehension, accurately captures text polarity, and ultimately facilitates improved sentiment analysis (Rajput 2020; Solangi et al. 2018).
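Two of the listed tasks, tokenization and n-gram creation, can be illustrated with a minimal sketch. The regular-expression tokenizer and the helper names are simplifying assumptions; dedicated NLP libraries handle punctuation, contractions, and multilingual text far more carefully.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """Return all contiguous n-token sequences."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize("The food was not good")
print(tokens)
print(ngrams(tokens, 2))
```

The bigram ("not", "good") keeps the negation attached to the word it modifies, which single-word features lose.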

Advanced NLP techniques are often needed when dealing with emoticons, multilingual data, idioms, sarcasm, sense or tone, bias, negation, etc. Otherwise, the outcome can drastically deteriorate. If the NLTK’s general stopwords list is utilized, words like not, nor, and no, for instance, are frequently deleted when removing stopwords during pre-processing. However, the removal of such words can alter the actual sentiment of the data. Thus, depending on its application, NLP tasks can either improve or deteriorate the result.
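The stopword issue can be demonstrated with a short sketch. To stay self-contained it uses a small hand-written stopword set in place of the NLTK list, together with a toy lexicon and a simple rule that a negation flips the next sentiment word; all three are assumptions for illustration.

```python
import re

# Small stand-in for a general-purpose stopword list that, like the one
# discussed above, treats the negations "not", "nor", and "no" as stopwords.
STOPWORDS = {"the", "was", "is", "a", "not", "nor", "no"}
NEGATIONS = {"not", "nor", "no"}
LEXICON = {"good": 1, "bad": -1}  # toy sentiment lexicon (an assumption)


def score(text: str, stopwords: set[str]) -> int:
    """Remove stopwords, then score; a negation flips the next sentiment word."""
    tokens = [t for t in re.findall(r"[a-z']+", text.lower()) if t not in stopwords]
    total, negate = 0, False
    for token in tokens:
        if token in NEGATIONS:
            negate = True
        elif token in LEXICON:
            total += -LEXICON[token] if negate else LEXICON[token]
            negate = False
    return total


review = "The food was not good"
print(score(review, STOPWORDS))              # 1: negation removed before scoring
print(score(review, STOPWORDS - NEGATIONS))  # -1: negation preserved
```

Keeping the negation words out of the stopword list is enough here to turn a wrongly positive score into the intended negative one.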

Text messages, comments, reviews, and blog posts are excellent sources of sentimental information. The extraction of useful information and knowledge hidden in textual data is an important aspect of sentiment analysis. Mining the relevant information from textual data possesses multi-dimensional advantages such as improved decision-making, public influence, national security, health and safety, etc. (Zhang et al. 2021 ; Wakade et al. 2012 ). Text mining involves the use of statistical techniques to retrieve quantifiable data from unstructured text, and uses NLP to transform the unstructured text into normalized, structured data, which makes it suitable for sentiment analysis.

Sentiment analysis, however, is not just confined to text. In most cases, such as when a sarcastic comment is made, or while pointing a finger at someone and saying- “You are responsible!”, the exact sentiment behind the plain text might not be conveyed properly. Non-text data like video, audio, and image are helpful in such a scenario to portray sentiment accurately.

A key part of sentiment analysis is extracting insightful information, trends, and patterns. Text analysis is a process that supports sentiment analysis by extracting them from unstructured and semi-structured text data. Using techniques including word spotting, manual rules, text classification, topic modeling, and thematic analysis, it helps extract meaning from the text. Text analysis can be used to specify individual lexical items (words or phrases) and observe their patterns.

Sentiment analysis, in contrast to basic text analytics, fundamentally shows the emotion concealed beneath the words, while text analytics analyses the grammar and relationships between words. Sentiment analysis essentially identifies whether a topic conveys a positive, negative, neutral, or any other sentiment, while text analysis is used to identify the most popular topics and prevalent ideas in texts. In addition, it can be more challenging to specify the intended target of a conveyed sentiment than it is to determine a document’s general subject.

A textual document with numerous opinions would have a mixed polarity overall, as opposed to having no polarity at all (being objective). It is also important to distinguish the polarity and the strength of a conveyed sentiment. One may have strong feelings about a product being decent, average, or awful while having mild feelings about a product being excellent (for example, because one had it only for a brief period before forming an opinion). Also, unlike topical (involving text) analysis, in many cases, such as quotations, it is critical to understand whether the sentiment conveyed in the document accurately reflects the author’s true intentions or not.

Analyzing the existence of an important word in conjunction with the use of a sentiment score approach can help to uncover the most profound and specific insights that can be used to make the best decision in many situations. Areas of application for sentiment analysis aided by appropriate text analysis include strategic decision-making, product creation, marketing, competition intelligence, content suggestion, regulatory compliance, and semantic search.

Granularity levels

At present, a sentiment analysis model can be implemented at various granular levels according to the requirement and scope. There are mainly four levels of sentiment analysis that have gained a lot of popularity. They are document level (Pang et al. 2002 ; Li and Li 2013 ; Hu and Li 2011 ; Li and Wu 2010 ; Rui et al. 2013 ; Zhan et al. 2009 ; Yu et al. 2010 ), sentence or phrase level (Nguyen and Nguyen 2017 ; Wilson et al. 2005 ; Narayanan et al. 2009 ; Liu et al. 2013 ; Yu et al. 2013 ; Tan et al. 2012 ; Mullen and Collier 2004 ), word level (Nielsen 2011 ; Dang et al. 2009 ; Reyes and Rosso 2012 ; Bollegala et al. 2012 ; Thelwall and Buckley 2013 ; Li et al. 2014 ), and entity or aspect level (Li et al. 2012 ; Li and Lu 2017 ; Quan and Ren 2014 ; Cruz Mata et al. 2013 ; Mostafa 2013 ; Yan et al. 2015 ; Li et al. 2015a ).

Some of the other research works concentrate on concept level (Zad et al. 2021 ; Tsai et al. 2013 ; Poria et al. 2013 ; Balahur et al. 2011 ; Cambria et al. 2022 ; Cambria 2013 ), link/user level (Rabelo et al. 2012 ; Bao et al. 2013 ; Tan et al. 2011 ), clause level (Kanayama and Nasukawa 2006 ; Liu et al. 2013 ), and sense level (Banea et al. 2014 ; Wiebe and Mihalcea 2006 ; Alfter et al. 2022 ) sentiment analysis. Some of the important levels of sentiment analysis are discussed in the following sub-sections. To understand the different levels, let us consider a customer review R as shown below.

R = “I feel the latest mobile from iPhone is really good. The camera has an outstanding resolution. It has a long battery life. I can even bear the mobile’s heating problem. However, I feel it could have been a bit light weighted. Given the configurations, it is a bit expensive; but I must give a thumbs up for the processor.”

In the following subsections, we will observe the analysis of review R based on different levels.

Document-level sentiment analysis

Document-level sentiment analysis aims to assess a document’s emotional content. It assumes that the overall document expresses a single sentiment (Pang et al. 2002; Hu and Li 2011). The general approach at this level is to combine the polarities of each word/sentence in the document to find the overall polarity (Kharde and Sonawane 2016). According to document-level sentiment analysis, the overall sentiment of the document represented by review R is positive. According to Turney (2002), there are two approaches to document sentiment classification, namely term counting and machine learning. Term counting derives a sentiment measure by counting the total positive and negative terms in the document. Machine learning approaches generally yield superior results compared to term-counting approaches. Document-level analysis assumes that the document is focused on only one object and thus holds an opinion about that particular object only. Thus, if the document contains opinions about different objects, this approach is not suitable.
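Term counting can be illustrated with a minimal sketch. The two lexicons are tiny, hand-made assumptions (a real system would rely on a curated resource), and the example text is a shortened paraphrase of review R rather than the full review.

```python
import re

# Tiny hand-made lexicons (hypothetical, for illustration only).
POSITIVE = {"good", "outstanding"}
NEGATIVE = {"problem", "expensive"}


def term_count_polarity(text):
    """Count positive and negative terms and label the document by the larger count."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(tok in POSITIVE for tok in tokens)
    neg = sum(tok in NEGATIVE for tok in tokens)
    if pos > neg:
        label = "positive"
    elif neg > pos:
        label = "negative"
    else:
        label = "neutral"
    return pos, neg, label


text = ("The camera has an outstanding resolution and the phone is really good, "
        "but it is a bit expensive.")
print(term_count_polarity(text))  # (2, 1, 'positive')
```

The sketch returns a single label for the whole text, which mirrors the single-object assumption of document-level analysis.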

Sentence/phrase-level sentiment analysis

The sentiment associated with each sentence of a set of data is analyzed at this level of sentiment analysis. The general approach is to combine the sentiment orientation of each word in a sentence/phrase to compute the sentiment of the sentence/phrase (Kharde and Sonawane 2016 ). It attempts to classify a sentence as conveying either positive/negative/neutral/mixed sentiment or as a subjective or objective sentence (Katrekar and AVP 2005 ). Objective sentences are facts and do not convey any sentiment about an object or entity. They do not play any role in polarity determination and thus need to be filtered out (Kolkur et al. 2015 ). The polarity of a sentence in review R is found to be positive/negative/mixed irrespective of its overall polarity.

Word-level sentiment analysis

Through proper examination of the polarity of each and every word, this sentiment analysis level investigates how impactful individual words can be on the overall sentiment. The two methods of automatically assigning sentiment at this level are dictionary-based and corpus-based methods (Kharde and Sonawane 2016 ). According to Reyes and Rosso ( 2012 ), in corpus-based techniques, the co-occurrence patterns of words are used for sentiment determination. However, most of the time, statistical information needed for the determination of a word’s sentiment orientation is large corpus dependent. The dictionary-based approaches use synonyms, antonyms, and hierarchies from lexical resources such as WordNet and SentiWordNet (SWN) to determine the sentiments of words (Kharde and Sonawane 2016 ). Such techniques assign positive, negative, and objective sentiment scores to each synset. If the words in review R such as outstanding, expensive, etc. are evaluated individually, different words within a particular sentence are observed to hold different polarities.

Aspect or entity-level sentiment analysis

For a specific target entity, this approach essentially identifies various aspects associated with it. Then, the sentiment expressed towards the target by each of its aspects is determined in this level of sentiment analysis. As a result, it can be divided into two different tasks, namely extraction of aspects and sentiment classification of aspects (Liu and Zhang 2012 ). For the different aspects such as resolution, weight, and price of the same product in review R, different sentiments are conveyed.

Concept-level sentiment analysis

Most of the time, merely using emotional words to determine sentiment or opinion is insufficient. To obtain the best results, a thorough examination of the underlying meaning of the concepts and their interactions is required. Concept-level sentiment analysis aims to capture the semantic and affective information associated with opinions, with the use of web ontologies or semantic networks (Cambria 2013). Rather than simply using word co-occurrences or other dictionary-based approaches as in word-level sentiment analysis, or finding the overall opinion about a single item as in document-level sentiment analysis, concept-level sentiment analysis generally makes use of feature spotting and polarity detection based on different concepts. For example, “long battery life” in review R is considered positive. However, a “long route” might not be preferable if someone wants to reach the destination in minimum time, and thus can be considered negative. Tsai et al. (2013) made use of features of the concept itself as well as features of the neighboring concepts.

User-level sentiment analysis

User-level sentiment analysis takes into account the fact that if there is a strong connection among users of a social platform, then the opinion of one user can influence other users. Also, they may hold similar sentiments/opinions for a particular topic (Tan et al. 2011). At the user level, all the followers of the reviewer of review R may get influenced by this review.

Clause-level sentiment analysis

A sentence can be a combination of multiple clauses, each conveying different sentiments. The clauses in review R can be observed to represent opposing polarity because they are separated by the word “but”. Clause-level sentiment analysis focuses on the sentiment associated with each clause based on aspect, associated condition, domain, grammatical dependencies of the words in the clause, etc.
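As a rough illustration, a sentence can be split at contrastive conjunctions and each clause scored separately. The cue lists and the splitting rule are simplifying assumptions; they ignore the grammatical dependencies and domain conditions mentioned above.

```python
import re

# Hypothetical cue phrases, for illustration only.
POSITIVE_CUES = ("thumbs up", "good", "outstanding")
NEGATIVE_CUES = ("expensive", "problem")


def split_clauses(sentence):
    """Split a sentence at contrastive conjunctions such as 'but'."""
    parts = re.split(r"\b(?:but|however|although)\b", sentence, flags=re.IGNORECASE)
    return [p.strip(" ,;.") for p in parts if p.strip(" ,;.")]


def clause_polarity(clause):
    """Label a clause by comparing the number of positive and negative cues it contains."""
    text = clause.lower()
    score = sum(cue in text for cue in POSITIVE_CUES) - sum(cue in text for cue in NEGATIVE_CUES)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


sentence = ("Given the configurations, it is a bit expensive; "
            "but I must give a thumbs up for the processor.")
for clause in split_clauses(sentence):
    print(clause_polarity(clause), "|", clause)
# negative | Given the configurations, it is a bit expensive
# positive | I must give a thumbs up for the processor
```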

Sense-level sentiment analysis

The words which form a sentence can take on different meanings based on their usage in the sentence. Specifically, when the same word has multiple meanings, the sense in which the word is used can strongly affect the sentiment orientation of the whole sentence or document. E.g., let us consider the word “bear” in review R. Does the word refer to the mammal, or does it indicate the bearing (holding) of something? In what sense is it used? Is it used as a noun or a verb? In such a case, proper knowledge of the grammatical structure or word sense can contribute immensely to the determination of the appropriate sentiment of any natural language text. Thus, resolving words’ syntactic ambiguity and performing word sense disambiguation (Wiebe and Mihalcea 2006) are vital parts of designing an advanced sentiment analysis model. Alfter et al. (2022) provided a sense-level annotated resource rather than word-level annotation and performed various experiments to explore the explanations of difficult words.

The analysis of the review R at different levels shows that the same review can have different interpretations based on the requirement. Single-level approaches work well in most cases. However, sometimes when the evaluation of sentiments is based on very short document(s) or even very long document(s), the model may fail to handle the flexibility. To determine the polarity of the overall documents, Li et al. ( 2010 ) combined phrase-level and sentence-level sentiment analysis to design a multi-level model. Valakunde and Patwardhan ( 2013 ) advised following a ladder-like computation. In this technique, aspect or entity-level sentiment is employed to compute the sentence-level sentiments and then use the weightage of entities along with the sentence-level sentiments for evaluation of the complete document.

General framework of sentiment analysis

The evolution of sentiment analysis marks the emergence of different models by different experts. After going through more than 500 sentiment analysis models proposed till now, a general framework of sentiment analysis is presented in Fig.  1 . The framework comprises mainly four modules along with an additional optional module. The modules perform collection and standardization of data; pre-processing of the dataset; extraction of features or keywords which represent the overall dataset; prediction or classification of the sentiments associated with the keywords or the whole sentence or document according to the requirement; and summarization of the overall sentiment associated with the dataset. The different modules are discussed in detail below.

Fig. 1 General framework of sentiment analysis

Data collection and standardization

With the growing number of platforms of expression, the types and formats in which people express their views, opinions, or sentiments on a particular subject are increasing. Among the different available types of data such as text, image, audio, or video, the research on textual data has gained momentum in the last few years. Currently, although multi-lingual text data has attracted a few researchers, 90% of sentiment analysis studies, experimentation, and design concentrate mainly on English textual data.

The development, examination, and validation of a system typically depend on the quality and structure of data used for building, operating, and maintaining the model. The overall functionality of a model depends to a great extent on the data drawn from the boundless and voluminous sources of available data. Many public data sources are available which are used by some researchers to design a sentiment analysis model. A publicly available dataset, namely Blitzer’s multi-domain sentiment data (Blitzer et al. 2007), is used by Dang et al. (2009). Public product reviews by Epinions (epinions.com) are also used by some researchers (Kharde and Sonawane 2016; Fahrni and Klenner 2008). The UCI Machine Learning Repository provides standard datasets for sentiment, namely Twitter data for Arabic Sentiment Analysis, Sentiment Labelled Sentences, Paper Reviews, Sentiment Analysis in Saudi Arabia about distance education during Covid-19, etc. The overwhelming rate of data production demands designing a system that keeps updating the database from time to time to avoid generality or biased interest at a particular time. A manual approach to collecting a substantial volume of data is not a desirable practice. Thus, automatic big data collection techniques are a vital aspect that must be keenly observed. Several tools or APIs have come up recently that help to collect data from online social or e-commerce platforms. Some of them are NodeXL, Google spreadsheet using Twitter Archiver, Zapier, Rapid Miner, Parsehub, BeautifulSoup in Python, WebHarvy, etc. Most of these tools or APIs help to collect real-time data. But the main problem occurs when someone desires to work with historical data, because many of these techniques, such as the Twitter API, do not permit extracting tweets older than seven days. Building a standard database involves dealing with the unstructured information attached to the data from the internet. 
For a dataset representing a particular topic, proper standardization in an appropriate type, format, and context, extensively boosts the overall outcome of the analysis. To design a robust system, the homogeneity of the data must be maintained. Besides, proper labelling of the collected data can improve the performance of the sentiment analysis model. Different online labelling techniques are available nowadays. However, online labelling techniques are sometimes full of noise, which leads to lower accuracy of the system. Designing an automatic labelling system, which makes use of various statistical knowledge of the whole corpus and appropriate domain knowledge of words, proves to contribute more to enhancing the sentiment analysis process.

Pre-processing

The process of removing any sort of noise from a textual dataset and preparing a cleaned, relevant, and well-structured dataset for the sentiment analysis process is called pre-processing. Appropriate pre-processing of any dataset noticeably improves the sentiment analysis process. For analyzing the sentiment of online movie reviews, a three-tier approach is adopted by Zin et al. (2017) to examine the effect of pre-processing tasks. In the first tier, they experimented with the removal of stopwords using the English stopwords list. Stopwords are words such as the articles a, an, the, etc., which have no effective role in determining sentiment. In the second tier, the sentiment analysis is performed after the removal of stopwords and all other meaningless characters/words such as dates (16/11/20), special characters (@, #), and words with no meaning (a+, a-, b+). In the third tier, more cleaning strategies are used, i.e., numbers and words having fewer than three characters are removed along with the stopwords and meaningless words. Their results demonstrate that the different combinations of the pre-processing steps show favorable improvement in the classification process, thus establishing the significance of the removal of stopwords, meaningless words such as special characters, numbers, and words with fewer than three characters. Jianqiang (2015) found that replacing negations and expanding acronyms have a positive effect on sentiment classification; however, the removal of URLs, numbers, and stopwords hardly changes the accuracy. Efficient pre-processing can increase the accuracy of a sentiment analysis model. To establish this, Haddi et al. (2013) combined various pre-processing methods using online reviews of movies and followed different steps such as cleaning online text, removal of white space, expansion of abbreviations, stemming, eliminating stopwords, and handling negation. 
Apart from these, they also considered feature selection as a pre-processing step. They used the chi-square method to filter out the less impactful features. To handle negation, a few researchers, such as Pang et al. (2002), tagged the words following a negation word until a punctuation mark occurs. However, Haddi et al. (2013) and Dave et al. (2003) observed that the results before and after the tagging remain almost the same. Therefore, Haddi et al. (2013) reduced the number of tagged following words to three and two. Saif et al. (2014) observed that a list of pre-compiled stopwords negatively affects Twitter sentiment classification. However, with the use of pre-processing the original feature space is significantly reduced. Jianqiang and Xiaolin (2017) show that stopword removal, acronym expansion, and replacing negation are effective pre-processing steps. According to Jianqiang and Xiaolin, URLs and numbers do not contain useful information for sentiment analysis. They also found that reverting words with repeated characters shows fluctuating performance. This may be because, in some situations, a word such as goooood gets replaced by goood, creating confusion about whether it should be interpreted as good or god. Such a situation may alter the actual polarity conveyed by the word. Therefore, reverting words with repeated characters is not recommended.
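Two of the pre-processing steps discussed above can be sketched in a few lines: tagging the words that follow a negation word (either up to the next punctuation mark or within a fixed window of two or three words), and reverting repeated characters. The `NOT_` prefix, the negation word list, and the function names are assumptions made for illustration; the cited works may differ in detail.

```python
import re

NEGATION_WORDS = {"not", "no", "never", "nor"}  # illustrative list, an assumption
PUNCTUATION = {".", ",", ":", ";", "!", "?"}


def tag_negation(text, window=None):
    """Prefix words after a negation word with 'NOT_'.

    window=None tags every word up to the next punctuation mark;
    window=k tags at most k following words.
    """
    tokens = re.findall(r"[a-z']+|[.,:;!?]", text.lower())
    out, active, left = [], False, 0
    for tok in tokens:
        if tok in PUNCTUATION:
            active = False                      # punctuation ends the negation scope
            out.append(tok)
        elif tok in NEGATION_WORDS or tok.endswith("n't"):
            active = True
            left = window if window is not None else -1  # -1 means "no limit"
            out.append(tok)
        elif active and left != 0:
            out.append("NOT_" + tok)
            left -= 1
        else:
            out.append(tok)
    return out


def squeeze_repeats(word, keep):
    """Shorten any run of more than `keep` identical characters to `keep` characters."""
    return re.sub(r"(.)\1{%d,}" % keep, lambda m: m.group(1) * keep, word)


sentence = "I do not like the battery, but the camera is great."
print(tag_negation(sentence)[3:6])            # ['NOT_like', 'NOT_the', 'NOT_battery']
print(tag_negation(sentence, window=2)[3:6])  # ['NOT_like', 'NOT_the', 'battery']

# The same elongated word maps to different dictionary words depending on the rule.
print(squeeze_repeats("goooood", keep=2))  # good
print(squeeze_repeats("goooood", keep=1))  # god
```

The last two lines show why reverting repeated characters can change the perceived polarity: the result depends on how many repetitions the rule keeps.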

Feature/keyword extraction

In a sentiment analysis model, the words and symbols within the corpus are mainly used as the features (O’Keefe and Koprinska 2009 ). Traditional topical text classification approaches are used in most sentiment analysis systems, in which a document is treated as a Bag of Words (BOW), projected as a feature vector, and then categorized using a proper classification technique. Experts use a variety of feature sets to boost sentiment classification efficiency, including higher-order n-grams (Pang et al. 2002 ; Dave et al. 2003 ; Joshi and Rosé 2009 ), word pairs and dependency relations (Dave et al. 2003 ; Joshi and Rosé 2009 ; Gamon 2004 ; Subrahmanian and Reforgiato 2008 ). Using different word-relation feature sets namely unigram (one word), bigram (two words), and dependency parsing, Xia et al. ( 2011 ) performed sentiment classification using an ensemble framework. Wiebe and Mihalcea ( 2006 ) introduced a ground-breaking study focused on the Measure of Concern (MOC) to assess public issues using Twitter data and the most significant unigrams. While conducting text opinion mining, Sidorov et al. ( 2013 ) demonstrated the supremacy of unigrams, as well as other suitable settings such as minimal classes, the efficacy of balanced and unbalanced corpus, the usage of appropriate machine learning classifiers, and so on. Every word present in a dataset is not always important in the context of sentiment analysis. The difficulty of determining precise sentiment classifications has been increased by the continuous growth of knowledge. Even after cleaning the dataset with various pre-processing steps, using all of the data in the dataset can result in dimensionality issues, longer computation times, and the use of irrelevant or less significant features or terms. Especially in the case of higher dimensional and multivariate data, these problems become even worse. According to Li et al. 
( 2017 ), a good word representation that captures sentiment is good at word sentiment analysis and sentence classification; and building document-level sentiment analysis dynamically based on words in need is the best practice. Keyword extraction is a method for extracting essential features/terms from textual data by defining particular terms, phrases, or words from a document to represent the document concisely (Benghuzzi and Elsheh 2020 ). If a text’s keywords are extracted correctly, the text’s subject can be thoroughly researched and evaluated, and a good decision can be made about the text. Given that, manually extracting keywords from such a large number of databases is a repetitive, time-consuming, and costly process, automated keyword extraction has become a popular field of research for most researchers in recent years. Automatic keyword extraction can be categorized into supervised, semi-supervised, and unsupervised methods (Beliga et al. 2015 ). The keywords are mainly represented using either Vector Space Model (VSM) or a Graph-Based Model (GBM) (Ravinuthala et al. 2016 ; Kwon et al. 2015 ). Once the datasets are represented using any of the VSM or GBM techniques, the keywords are extracted using simple statistics, linguistics, machine learning techniques, and hybridized methods (Bharti and Babu 2017 ). Simple methodologies that do not include training data and are independent of language and domain are included in the statistical keyword extraction methods. To identify keywords, researchers used frequency of terms, Term Frequency-Inverse Document Frequency (TF-IDF), co-occurrences of terms, n-gram statistics, PATricia (PAT) Tree, and other statistics from documents (Chen and Lin 2010 ). The linguistic approach examines the linguistic properties of words, sentences, and documents, with lexical, semantic, syntactic, and discourse analysis being the most frequently studied linguistic properties (HaCohen-Kerner 2003 ; Hulth 2003 ; Nguyen and Kan 2007 ). 
A machine learning technique takes into account supervised or unsupervised learning while extracting keywords. Supervised learning produces a system that is trained on a collection of relevant keywords followed by identification and analysis of keywords within unfamiliar texts (Medelyan and Witten 2006 ; Theng 2004 ; Zhang et al. 2006 ). All of these methods are combined in the hybrid method for keyword extraction. O’Keefe and Koprinska ( 2009 ) performed sentiment analysis using machine learning classifiers, which they validated using the movie review dataset. Along with the use of feature presence, feature frequency, and TF-IDF as feature weighting methods, they proposed SWN Word Score Groups (SWN-SG), SWN Word Polarity Groups (SWN-PG), and SWN Word Polarity Sums (SWN-PS) using words which are grouped by their SWN values. The authors suggest categorical Proportional Difference (PD), SWN Subjectivity Scores (SWNSS), and SWN Proportional Difference (SWNPD) as feature selection techniques. They discovered that feature weights based on unigrams, especially feature presence, outperformed SWN-based methods. Using different machine learning techniques; Tan and Zhang ( 2008 ) proposed a model for sentiment analysis in three domains: education, film, and home, which was written in Chinese and used various feature selection techniques for the purpose. Mars and Gouider ( 2017 ) proposed a MapReduce-based algorithm for determining opinion polarity using features of consumer opinions and big data technologies combined with Text Mining (TM) and machine learning tools. Using a supervised approach, Kummer and Savoy ( 2012 ) suggested a KL score for providing weightage to features for sentiment and opinion mining. All these research works establish that the machine learning approach of keyword extraction when incorporated with any other techniques has a great scope in the field of sentiment analysis. 
There are different kinds of methods that are used to perform keyword extraction using VSM and GBM approaches. They are discussed in detail below.

Vector space model

In VSM, the documents are represented as vectors of the terms (Wang et al. 2015). VSM involves building a matrix $V$, usually termed a document-term matrix, where the rows represent the documents in the dataset and the columns correspond to the terms of the whole dataset. Thus, if the set of documents is represented by $D = (d_1, d_2, \ldots, d_m)$ and the set of terms/tokens representing the entire corpus is $T = (t_1, t_2, \ldots, t_n)$, then the element $dt_{i,j} \in V_{m \times n}$, $i = 1, 2, \ldots, m$, and $j = 1, 2, \ldots, n$, is assigned a weight $w_{i,j}$. The weights can be assigned based on the word frequency associated with a document or the entire dataset. According to Abilhoa and De Castro (2014), the frequencies can be binary, absolute, relative, or weighted. Algorithms such as binary, Term Frequency (TF), TF-IDF, etc. are used in traditional term weighting schemes.

In the binary term weighting scheme, the element $dt_{i,j}$ of a term vector is assigned the value 1 if document $d_i$ contains the term $t_j$; otherwise, the value 0 is assigned (Salton and Buckley 1988). It has the obvious drawback of being unable to recognize the most representative words in a text. Using word frequency instead often helps to reflect the importance of terms in documents.

The limitation of the binary term weighting scheme motivates the use of term frequency as the weight of a term for a specific text. The number of times a word appears in a text is known as its term frequency. As a result, a value $w_{i,j}$ is assigned to $dt_{i,j}$, with $w_{i,j}$ equaling the number of times the word $t_j$ appears in the document $d_i$. However, as opposed to words that appear infrequently in documents, terms that appear consistently in all documents have less distinguishing power to describe a document (Kim et al. 2022). This is an area where the TF algorithm falls short.
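The binary and TF weighting schemes can be illustrated on a toy corpus. The three documents and the whitespace tokenizer are assumptions chosen only to keep the sketch short.

```python
# Toy corpus (hypothetical documents, whitespace tokenization).
docs = [
    "the camera is good",
    "the battery is good good",
    "the price is high",
]
tokenized = [d.split() for d in docs]

# Columns of the document-term matrix: all distinct terms, in sorted order.
vocab = sorted({term for doc in tokenized for term in doc})

# TF weighting: entry (i, j) is the number of times term j occurs in document i.
tf_matrix = [[doc.count(term) for term in vocab] for doc in tokenized]

# Binary weighting: entry (i, j) is 1 if term j occurs in document i, else 0.
binary_matrix = [[1 if count > 0 else 0 for count in row] for row in tf_matrix]

print(vocab)             # ['battery', 'camera', 'good', 'high', 'is', 'price', 'the']
print(tf_matrix[1])      # [1, 0, 2, 0, 1, 0, 1]
print(binary_matrix[1])  # [1, 0, 1, 0, 1, 0, 1]
```

In the second document, TF records that "good" occurs twice, whereas the binary scheme cannot distinguish it from a single occurrence.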

The number of documents in the entire document corpus where a word appears is known as its document frequency. If a word has a higher document frequency, it has a lower distinguishing power, and vice versa. As a result, the Inverse Document Frequency (IDF) metric is used as a global weighting factor to highlight a term’s ability to identify documents. Equation 1 (Zhang et al. 2020) may be used to describe a term’s TF-IDF weight as follows:

$$\text{TF-IDF}(t_k) = tf_k \times \log\left(\frac{m}{df_k}\right) \tag{1}$$

where $tf_k$ denotes the frequency of the term $t_k$ in a specific document and $df_k$ denotes the document frequency of the term $t_k$, i.e., the number of documents containing the term $t_k$. The total number of documents in the corpus is denoted by $m$.
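A minimal sketch of TF-IDF weighting follows, assuming the common form in which the weight of a term is its term frequency multiplied by log(m / df), with m the number of documents and df the term’s document frequency. Other variants (smoothing, different logarithm bases, normalization) exist, and the toy corpus is hypothetical.

```python
import math


def tf_idf(tokenized_docs):
    """Return (vocab, weights) with weights[i][j] = tf * log(m / df) for term j in document i."""
    m = len(tokenized_docs)
    vocab = sorted({term for doc in tokenized_docs for term in doc})
    # Document frequency: number of documents that contain the term at least once.
    df = {term: sum(term in doc for doc in tokenized_docs) for term in vocab}
    weights = [
        [doc.count(term) * math.log(m / df[term]) for term in vocab]
        for doc in tokenized_docs
    ]
    return vocab, weights


corpus = [
    "the camera is good".split(),
    "the battery is good good".split(),
    "the price is high".split(),
]
vocab, weights = tf_idf(corpus)

the_col = vocab.index("the")
good_col = vocab.index("good")
print(weights[1][the_col])             # 0.0, since "the" occurs in every document
print(round(weights[1][good_col], 3))  # 0.811, i.e. 2 * log(3 / 2)
```

A term that appears in every document receives weight zero, which reflects the lower distinguishing power of high document frequency terms.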

Using the traditional term-weighing techniques, many experts tried to propose their improvised version. Some of them are TF-CHI (Sebastiani and Debole 2003 ), TF-RF (Lan et al. 2008 ), TF-Prob (Liu et al. 2009 ), TF-IDF-ICSD (Ren and Sohrab 2013 ), and TF-IGM (Chen et al. 2016 ).

Graph based model

A graph $G$ is constructed in GBM, with each node or vertex $V_i$ representing a document term or feature $t_i$ and the edges $E_{i,j}$ representing the relationship between them (Beliga et al. 2015). Nasar et al. (2019) showed that various properties of a graph, like centrality measures, node co-occurrence, and others, play a significant role in keyword ranking. Semantic, syntactic, co-occurrence, and similarity relationships are some of the specific perspectives of graph-based text analysis. In GBM techniques, centrality measures tend to be the most significant deciding factor (Malliaros and Skianis 2015). The importance of a term is calculated as the centrality of the corresponding node in the graph. Beliga (2014) surveyed nineteen different measures which are used for extraction purposes. Degree centrality, closeness centrality, betweenness centrality, selectivity centrality, eigenvector centrality, PageRank, TextRank, strength centrality, neighborhood size centrality, coreness centrality, clustering coefficient, and other centrality measures have been proposed so far. Some of the popular centrality measures are discussed below.

Degree centrality is used to measure how often a term occurs with any other term. For a particular node, the total count of edges incident on it is used to measure the metric (Beliga 2014). The more edges that are incident on the node, the more significant it is in the graph. A node $V_i$’s degree centrality is measured using Eq. 2.

$$DC(V_i) = \frac{|n(V_i)|}{|N| - 1} \tag{2}$$

where $DC(V_i)$ represents node $V_i$’s degree centrality, $|N|$ indicates the total count of nodes, and $|n(V_i)|$ represents the number of nodes linked with the node $V_i$.
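Degree centrality on a term co-occurrence graph can be sketched as follows. The normalization by the number of other nodes, the adjacency-based co-occurrence rule, and the toy token sequence are assumptions made for illustration.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=2):
    """Link each term to the terms that follow it within `window` tokens.

    window=2 links adjacent terms only. The graph is undirected and unweighted.
    """
    graph = defaultdict(set)
    for i, term in enumerate(tokens):
        for other in tokens[i + 1:i + window]:
            if other != term:
                graph[term].add(other)
                graph[other].add(term)
    return dict(graph)


def degree_centrality(graph):
    """Number of neighbours of each node divided by the number of other nodes."""
    n = len(graph)
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


tokens = ["camera", "good", "battery", "good", "price", "high"]
dc = degree_centrality(cooccurrence_graph(tokens))
print(dc["good"], dc["price"], dc["camera"])  # 0.75 0.5 0.25
```

Here "good" co-occurs with three of the four other terms, so it ranks highest as a keyword candidate under this measure.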

Closeness centrality determines the closeness of a term to all other terms of the dataset. This metric is based on the average of the shortest distances from a given node to every other node in the graph. It is defined by Eq. 3 (Tamilselvam et al. 2017) as the reciprocal of the average distance from the node to all other nodes, i.e., the normalized inverse of farness.

$$CC(V_i) = \frac{|N| - 1}{\sum_{V_j \neq V_i} dist(V_i, V_j)} \tag{3}$$

where $CC(V_i)$ represents node $V_i$’s closeness centrality, $|N|$ represents the graph’s node count, and $dist(V_i, V_j)$ represents the shortest distance from node $V_i$ to node $V_j$.
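Closeness centrality can be sketched with a breadth-first search on an unweighted graph. The sketch assumes the normalized form, (number of other nodes) divided by (sum of shortest distances), and a connected graph; the small adjacency dictionary is hypothetical.

```python
from collections import deque


def shortest_distances(graph, source):
    """Breadth-first search: hop distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closeness_centrality(graph):
    """(n - 1) / sum of shortest distances; assumes the graph is connected."""
    n = len(graph)
    result = {}
    for node in graph:
        total = sum(shortest_distances(graph, node).values())
        result[node] = (n - 1) / total if total else 0.0
    return result


# Hypothetical undirected co-occurrence graph as an adjacency dictionary.
graph = {
    "camera": {"good"},
    "good": {"camera", "battery", "price"},
    "battery": {"good"},
    "price": {"good", "high"},
    "high": {"price"},
}
cc = closeness_centrality(graph)
print(round(cc["good"], 3), round(cc["price"], 3), round(cc["high"], 3))  # 0.8 0.667 0.444
```

On a disconnected graph some nodes are unreachable, so the distance sum no longer covers all nodes and the measure needs a different treatment.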

Betweenness centrality is used to see how often a term lies between other terms. This metric indicates how many times a node serves as a bridge on the shortest path between two other nodes. For a node $V_i$, it is calculated using Eq. 4 (Tamilselvam et al. 2017).

$$BC(V_i) = \sum_{V_x \neq V_i \neq V_y} \frac{\sigma_{V_x V_y}(V_i)}{\sigma_{V_x V_y}} \tag{4}$$

In Eq. 4, $BC(V_i)$ represents $V_i$’s betweenness centrality, $\sigma_{V_x V_y}$ represents the total number of shortest paths from node $V_x$ to $V_y$, and $\sigma_{V_x V_y}(V_i)$ represents the number of those shortest paths that pass through $V_i$.
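Betweenness centrality can be sketched by counting shortest paths with breadth-first search. The sketch is unnormalized, counts each unordered pair of nodes once, and assumes an undirected, unweighted graph; the adjacency dictionary is hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_counts(graph, source):
    """Return (dist, sigma): hop distances and number of shortest paths from `source`."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                sigma[neighbour] = 0
                queue.append(neighbour)
            if dist[neighbour] == dist[node] + 1:
                sigma[neighbour] += sigma[node]  # extend every shortest path reaching `node`
    return dist, sigma


def betweenness_centrality(graph):
    """Sum over node pairs (x, y) of the fraction of shortest x-y paths passing through v."""
    info = {node: bfs_counts(graph, node) for node in graph}
    bc = {node: 0.0 for node in graph}
    for x, y in combinations(graph, 2):
        dist_x, sigma_x = info[x]
        dist_y, sigma_y = info[y]
        if y not in dist_x:
            continue  # x and y are not connected
        for v in graph:
            if v in (x, y) or v not in dist_x or v not in dist_y:
                continue
            # v lies on a shortest x-y path exactly when the distances add up.
            if dist_x[v] + dist_y[v] == dist_x[y]:
                bc[v] += sigma_x[v] * sigma_y[v] / sigma_x[y]
    return bc


graph = {
    "camera": {"good"},
    "good": {"camera", "battery", "price"},
    "battery": {"good"},
    "price": {"good", "high"},
    "high": {"price"},
}
bc = betweenness_centrality(graph)
print(bc["good"], bc["price"], bc["camera"])  # 5.0 3.0 0.0
```

For larger graphs, Brandes' algorithm computes the same quantity more efficiently than this pairwise loop.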

Selectivity centrality $SC(V_i)$ (Beliga et al. 2015) is the average weight on a node’s edges. As shown in Eq. 5, $SC(V_i)$ is equal to the ratio of the strength of the node, $s(V_i)$, to its degree, $d(V_i)$.

$$SC(V_i) = \frac{s(V_i)}{d(V_i)} \tag{5}$$

As shown in Eq. 6, node $V_i$’s strength, $s(V_i)$, is the sum of the weights of all edges incident on $V_i$.

$$s(V_i) = \sum_{V_j} w_{V_i V_j} \tag{6}$$

Here $w_{V_i V_j}$ denotes the weight of the edge between $V_i$ and $V_j$.
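Strength and selectivity can be computed directly from a weighted adjacency dictionary. The graph below is hypothetical, with edge weights standing for co-occurrence counts.

```python
def strength(weighted_graph, node):
    """Sum of the weights of all edges incident on `node`."""
    return sum(weighted_graph[node].values())


def selectivity_centrality(weighted_graph):
    """Average edge weight per node: strength divided by degree."""
    return {
        node: strength(weighted_graph, node) / len(edges) if edges else 0.0
        for node, edges in weighted_graph.items()
    }


# Hypothetical undirected weighted graph: weighted_graph[u][v] is the weight of edge u-v.
weighted_graph = {
    "camera": {"good": 3, "resolution": 1},
    "good": {"camera": 3},
    "resolution": {"camera": 1},
}
sc = selectivity_centrality(weighted_graph)
print(strength(weighted_graph, "camera"))           # 4
print(sc["camera"], sc["good"], sc["resolution"])  # 2.0 3.0 1.0
```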

Eigenvector centrality determines the global importance of a term. It is calculated for a node from the centralities of its neighbors, using the adjacency matrix and a matrix calculation that determines the principal eigenvector (Golbeck 2013). Assume that $A$ is an $n \times n$ similarity matrix, $A = (\alpha_{V_i V_j})$, with $\alpha_{V_i V_j} = 1$ if $V_i$ is linked to $V_j$ and $\alpha_{V_i V_j} = 0$ otherwise. The $i$-th entry of the normalized eigenvector belonging to the largest eigenvalue of $A$ is then used as the eigenvector centrality $EVC(V_i)$ of node $V_i$. Equation 7 (Bonacich 2007) shows the formula for eigenvector centrality.
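In the notation of this paragraph, the standard eigenvector relation $A x = \lambda x$ gives the following entry-wise form (a reconstruction from the surrounding definitions):

$$EVC(V_i) = \frac{1}{\lambda} \sum_{j=1}^{n} \alpha_{V_j V_i}\, EVC(V_j) \tag{7}$$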

where $\lambda$ is the largest eigenvalue of $A$.

Castillo et al. (2015) suggested a supervised model that uses the degree and closeness centrality measures of a co-occurrence graph to determine the words belonging to each sentiment while representing existing relationships among document terms. Nagarajan et al. (2016) also suggested an algorithm for the extraction of keywords based on the degree and closeness centrality metrics. To obtain the optimal set of ranked keywords, Vega-Oliveros et al. (2019) used nine popular graph centralities for the determination of keywords and introduced a new multi-centrality metric. They found that all of the centrality measures are strongly related, and that degree centrality is the quickest and most efficient measure to compute. While experimenting with various centrality measures, Lahiri et al. (2014) also noticed that degree centrality makes keyword and keyphrase extraction much simpler. Abilhoa and De Castro (2014) suggested a keyword extraction model based on a graph representation and on the eccentricity and closeness centrality measures, with degree centrality as a tiebreaker. However, disconnected graphs are common in real-world models, and eccentricity and closeness centralities often fail to achieve the expected result on them. Yadav et al. (2014) recommended extracting keywords using degree, eccentricity, closeness, and other centralities of the graph while emphasizing the semantics of the terms. With the use of Part of Speech (PoS) tagging, Bronselaer and Pasi (2013) presented a method to represent textual documents in a graph-based representation. Using various centralities, Beliga et al. (2015) proposed a node selectivity-driven keyword extraction approach. Kwon et al. (2015) suggested yet another ground-breaking keyword weighting and extraction method using graphs.

To improve the traditional TextRank algorithm, Wang et al. (2018) used document frequency and Average Term Frequency (ATF) to calculate the node weight for the extraction of keywords belonging to a particular domain. Bellaachia and Al-Dhelaan (2012) introduced the Node and Edge rank (NE-rank) algorithm for keyword extraction, which combines node weight (TF-IDF in this case) with TextRank. Khan et al. (2016) suggested Term-ranker, a graph-based re-ranking approach for the extraction of single-word and multi-word terms using a statistical method. They identified classes of semantically related words while estimating term similarity using term embedding, and used graph refinement and centrality measures for the extraction of top-ranked terms. For directed graphs, Ravinuthala et al. (2016) weighted the edges based on themes and examined their framework for keywords produced both automatically and manually. Using the PageRank algorithm, Devika and Subramaniyaswamy (2021) extracted keywords based on the graph's semantics and centralities.

The above studies show that centrality measures are a catalyst for effective sentiment analysis, because a strong keyword often has a greater effect on the sentiment score than a weaker one. For the extraction of sentiment sentences, Shimada et al. (2009) suggested the use of a hierarchical directed acyclic graph and similarity estimation. For the sentiment representation of sentences, Wu et al. (2011) developed an integer linear programming-based structural learning system using graphs. Using graphs, Duari and Bhatnagar (2019) also suggested keyword score determination and extraction procedures based on co-occurrence within sentences with a window size set to 2, position-dependent weights, contextual hierarchy, and connections based on semantics. In comparison to other existing models, their model has an excessively high dimensionality, with terms in the text interpreted as nodes and edges representing node relationships in the graph.
A variety of unsupervised graph-driven automated keyword extraction approaches is investigated by Mothe et al. ( 2018 ) using node ranking and varying word embedding and co-occurrence hybridization. Litvak et al. ( 2011 ) suggested DegExt, an unsupervised cross-lingual keyphrase extractor that makes use of syntactic representation of text using graphs. Order-relationship between terms represented by nodes is represented by the edges of such graphs. However, without a restriction on the maximum number of possible nodes which can be used, their algorithm generates exponentially larger graphs with larger datasets. As a result, dimensionality is one of the consequences of a graph-based keyword extraction procedure that must be regulated using appropriate means for sentiment analysis to be efficient. Chen et al. ( 2019 ) suggested extracting keywords using an unsupervised approach that relied solely on the article as a corpus. Words are ranked in their model based on their occurrence in strong motifs. Bougouin et al. ( 2013 ) assessed the relevance of a document’s topic in order to suggest TopicRank, an unsupervised approach for extracting key phrases. However, it should be mentioned that their model does not have the optimal key selection approach. To retrieve topic-wise essential keywords, Zhao et al. ( 2011 ) suggested a three-stage algorithm. Edge-weighting is used to rate the keywords (i.e., nodes) using two words’ co-occurrence frequency, followed by generation as well as the ranking of candidate keyphrases. Shi et al. ( 2017 ) suggested an automated single document keyphrase extraction technique based on co-occurrence-based knowledge graphs, which learns hidden semantic associations between documents using Personalized PageRank (PPR). Thus, many experts have used co-occurrence graphs, as well as other graph properties such as centrality metrics, to demonstrate the effectiveness of these methods for keyword ranking in sentiment analysis.

Sentiment prediction and classification techniques

Many techniques have emerged for sentiment prediction and classification. Researchers group them by applicability, by the challenges they address, or simply by the general topics of sentiment analysis. According to Cambria (2016), affective computing can be performed by using knowledge-based techniques, statistical methods, or hybrid approaches. Knowledge-based techniques categorize text into affect categories with the use of popular sources of affect words or multi-word expressions, based on the presence of affect words such as 'happy', 'sad', or 'angry'. Statistical methods make use of an affectively annotated training corpus and determine the valence of affect keywords through word co-occurrence frequencies, the valence of other arbitrary keywords, etc. Hybrid approaches such as Sentic Computing (Cambria and Hussain 2015) make use of knowledge-driven linguistic patterns and statistical methods to infer polarity from the text.

Medhat et al. (2014) presented different classification techniques of sentiment analysis in a very refined and illustrative manner. Inspired by their paper, the current sentiment prediction and classification techniques are depicted in Fig. 2. The techniques are examined thoroughly below, to assist in choosing the best sentiment analysis classification or prediction method for a particular task.

Fig. 2 Sentiment classification techniques

Machine learning approach

The machine learning approach to sentiment classification uses well-known machine learning classifiers or algorithms along with linguistic features to classify the given set of data into appropriate sentiment classes (Cambria and Hussain 2015). Given a set of data, machine learning algorithms aim to build models that can learn from the representative data (Patil et al. 2016). The extraction and selection of the best set of features for detecting sentiment are crucial to the models' performance (Serrano-Guerrero et al. 2015). There are two basic types of machine learning techniques, namely supervised and unsupervised. However, some researchers also use a hybrid approach that combines both.

The supervised machine learning approach is based on the usage of the initial set of labeled documents/opinions, to determine the associated sentiment or opinion of any test set or new document. Among the different supervised learning techniques Support Vector Machine (SVM), Naive Bayes, Maximum Entropy, Artificial Neural Network (ANN), Random Forest, and Gradient Boosting are some of the most popular techniques which are employed in the sentiment analysis process. A brief introduction to each of these techniques is presented below; followed by a discussion on some of the research works using these algorithms either individually, in combination, or in comparison to each other.

The SVM classifier is basically designed for binary classification. To extend it to multi-class classification, a One-vs-Rest (OvR, also called one-against-all) or One-vs-One (OvO, also called one-against-one) strategy is applied (Hsu and Lin 2002). In OvR, the multi-class dataset is re-designed into multiple binary datasets, in each of which the data belonging to one class is considered positive while the rest is considered negative. A classifier is then trained on each binary dataset, and the final class is the one whose classifier classifies the test data with the greatest margin. In OvO, the original dataset is split into binary datasets that pair each class against every other class, one pair at a time, and the final class is the one selected by the majority of the pairwise classifiers.
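The two decomposition strategies can be sketched without any particular SVM implementation. The helper names below are hypothetical, and the margins and pairwise winners passed to the decision functions are made-up stand-ins for the outputs of trained binary classifiers.

```python
from collections import Counter
from itertools import combinations


def one_vs_rest_labels(labels):
    """One binary relabeling per class: that class is 1, every other class is 0."""
    return {c: [1 if y == c else 0 for y in labels] for c in sorted(set(labels))}


def one_vs_one_subsets(labels):
    """One binary problem per pair of classes: indices of the samples it keeps."""
    return {
        (a, b): [i for i, y in enumerate(labels) if y in (a, b)]
        for a, b in combinations(sorted(set(labels)), 2)
    }


def ovr_decide(margins):
    """OvR: pick the class whose binary classifier reports the greatest margin."""
    return max(margins, key=margins.get)


def ovo_decide(pairwise_winners):
    """OvO: pick the class chosen by the majority of pairwise classifiers."""
    return Counter(pairwise_winners).most_common(1)[0][0]


labels = ["pos", "neg", "neu", "pos", "neg"]
ovr = one_vs_rest_labels(labels)   # 3 binary datasets
ovo = one_vs_one_subsets(labels)   # 3 * (3 - 1) / 2 = 3 pairwise datasets

# Hypothetical classifier outputs for one test document.
ovr_choice = ovr_decide({"pos": 0.4, "neg": -1.2, "neu": 0.9})
ovo_choice = ovo_decide(["pos", "neg", "pos"])
```

For $k$ classes, OvR trains $k$ classifiers on the full dataset, while OvO trains $k(k-1)/2$ classifiers on smaller subsets.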

Ahmad et al. (2018) presented a systematic review of sentiment analysis using SVM. Based on the papers published from 2012 to 2017, they found that many research works use SVM directly for analysis, use it in a hybrid manner, or compare their proposed model with it. Some of the recent studies that used SVM for sentiment analysis are listed in Table 1.

Table 1 Recent literature on sentiment analysis using SVM

| Authors | Title of the paper | Contribution(s) |
| --- | --- | --- |
| Hidayat et al. | Sentiment analysis of Twitter data related to Rinca Island development using Doc2Vec and SVM and logistic regression as a classifier | Studied public opinion on Twitter regarding the development in Rinca Island using SVM and logistic regression. Used two types of Doc2Vec: the distributed memory model of a paragraph vector (PV-DM) and a paragraph vector with a distributed bag of words (PV-DBOW). PV-DBOW with SVM and PV-DM with SVM showed the best results. |
| Cepeda and Jaiswal | Sentiment Analysis on Covid-19 Vaccinations in Ireland using Support Vector Machine | Used tweets on the Covid-19 vaccination program in Ireland. The lexicon and rule-based VADER tool labeled the global dataset as negative, positive, and neutral; Irish tweets were then classified into different sentiments using SVM. Results show positive sentiment toward vaccines at the beginning of the vaccination drive, which gradually changed to negative in early 2021. |
| Mullen and Collier | Sentiment analysis using support vector machines with diverse information sources | Uses SVMs to bring together diverse sources of potentially pertinent information, including several favourability measures for phrases and adjectives and, where available, knowledge of the topic of the text. Hybrid SVMs which combine unigram-style feature-based SVMs with those based on real-valued favourability measures obtain superior performance. |
| Zainuddin and Selamat | Sentiment analysis using support vector machine | The features were extracted using N-grams and different weighting schemes. Using Chi-Square weighting to select informative features for classification with SVM improves the accuracy. |
| Luo et al. | Affective-feature-based sentiment analysis using SVM classifier | Considered text sentiment analysis as a binary classification. Proposed the feature selection method of Chi-square Difference between the Positive and Negative Categories (CDPNC), which considers the contribution of features over the entire corpus and within each category. Used the sentiment Vector Space Model (s-VSM) for text representation to solve data sparseness. With the combination of document frequency and Chi-Square, the experimental results were superior to those of other feature selection methods using SVM. |
| Patil et al. | Sentiment analysis using support vector machine | Stated that SVM acknowledges some properties of text, such as a high-dimensional feature space, few irrelevant features, and sparse instance vectors, and that it eliminates the need for feature selection through its ability to generalize in a high-dimensional feature space. Showed that textual sentiment analysis performed better using SVM than using ANN. |
| Prastyo et al. | Tweets Responding to the Indonesian Government's Handling of COVID-19: Sentiment Analysis Using SVM with Normalized PolyKernel | The SVM analysis of sentiments on general aspects using a two-class dataset achieved the highest performance in average accuracy, precision, recall, and f-measure. Demonstrated that the SVM algorithm with the Normalized Poly Kernel can predict sentiment on Twitter for new data quickly and accurately. |

Two Naive Bayes models are commonly used for text analysis: Multivariate Bernoulli Naive Bayes (MBNB) and Multinomial Naive Bayes (MNB) (Altheneyan and Menai 2014).

For continuous data, Gaussian Naive Bayes is also used. MBNB is used for classification when multiple keywords (features) represent a dataset. In MBNB, the document-term matrix is built using BoW, where each keyword is represented by 1 or 0 according to whether or not it occurs in the document.
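A minimal sketch of the binary document-term matrix that MBNB works on. Whitespace tokenization, lowercasing, and the two toy documents are assumptions made for the example.

```python
def binary_document_term_matrix(documents):
    """Rows are documents, columns are keywords; 1 marks occurrence, 0 absence."""
    tokenized = [set(doc.lower().split()) for doc in documents]
    vocabulary = sorted(set().union(*tokenized))
    matrix = [[1 if term in tokens else 0 for term in vocabulary] for tokens in tokenized]
    return vocabulary, matrix


# Hypothetical toy corpus.
docs = ["good good phone", "bad phone"]
vocabulary, matrix = binary_document_term_matrix(docs)
```

Note that the repeated word "good" in the first document still yields a single 1: the Bernoulli representation records presence only, not frequency.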

Whenever the count of occurrences is considered, MNB is used. In MNB, the distribution for class $c$ is parameterized by the vector $\theta_c = (\theta_{c1}, \theta_{c2}, \ldots, \theta_{cn})$, where $n$ is the number of keywords and $\theta_{ci}$ is the probability $P(V_i \mid Class_c)$ of keyword $V_i$ appearing in a dataset belonging to class $c$. For estimating $\theta_c$, a smoothed variant of maximum likelihood, namely relative frequency counting, is employed as shown below.
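The usual smoothed relative-frequency estimate, writing $n$ for the number of keywords, has the following form (a reconstruction from the surrounding definitions):

$$\hat{\theta}_{ci} = \frac{N_{ci} + \alpha}{N_c + \alpha n}$$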

where $\alpha$ is the smoothing factor, $N_{ci}$ is the number of times keyword $V_i$ appears in the training samples of class $c$, and $N_c$ is the total count of all keywords in class $c$.
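The smoothed estimate can be computed directly from keyword counts. In the sketch below, the function name, the toy documents, and the choice $\alpha = 1$ (Laplace smoothing) are assumptions for illustration.

```python
from collections import Counter


def estimate_theta(class_documents, vocabulary, alpha=1.0):
    """theta_ci = (N_ci + alpha) / (N_c + alpha * n) for the documents of one class."""
    counts = Counter(
        token
        for doc in class_documents
        for token in doc.lower().split()
        if token in vocabulary
    )
    n_c = sum(counts.values())  # N_c: all keyword occurrences in the class
    n = len(vocabulary)         # n: number of keywords
    return {term: (counts[term] + alpha) / (n_c + alpha * n) for term in vocabulary}


# Hypothetical positive-class training documents.
vocabulary = ["bad", "good", "phone"]
theta_pos = estimate_theta(["good good phone", "good phone"], vocabulary, alpha=1.0)
```

With $\alpha > 0$, the keyword "bad" receives a small non-zero probability even though it never occurs in the positive class, which keeps a single unseen keyword from zeroing out the class probability.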

To conduct a thorough investigation of the sentiment of micro-blog data, Le and Nguyen ( 2015 ) developed a sentiment analysis model using Naive Bayes and SVM, as well as information gain, unigram, bigram, and object-oriented feature extraction methods. Wawre and Deshmukh ( 2016 ) presented a system for sentiment classification that included comparisons of the common machine learning approaches Naive Bayes and SVM. Bhargav et al. ( 2019 ) used the Naive Bayes algorithm and NLP to analyze customer sentiments in various hotels.

Using the empirical probability distribution, maximum entropy models a given dataset by finding the highest entropy to satisfy the constraints of the prior knowledge. The unique distribution that shows maximum entropy is of the exponential form as shown in Eq.  12 .
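The standard exponential (log-linear) form of the maximum entropy classifier is given below as a reconstruction. The document is written $doc$ without an index so that $i$ ranges over features only, and $C'$ ranges over all classes; both notational choices are assumptions:

$$P(C \mid doc) = \frac{\exp\left(\sum_i \lambda_i f_i(doc, C)\right)}{\sum_{C'} \exp\left(\sum_i \lambda_i f_i(doc, C')\right)} \tag{12}$$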

Here, $f_i(doc, C)$ is a keyword feature and $\lambda_i$ is a parameter to be estimated. The denominator of Eq. 12 is a normalizing factor to ensure a proper probability.
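A minimal sketch of how such an exponential model turns weighted keyword features into class probabilities. The weights are made up rather than estimated, and each feature is assumed to be an indicator that fires when a given keyword occurs in the document and a given class is being scored.

```python
import math


def maxent_probabilities(doc_tokens, weights, classes):
    """weights maps (keyword, class) to lambda; the matching indicator feature
    is 1 when the keyword occurs in the document and the class is being scored."""
    scores = {
        c: sum(lam for (keyword, cls), lam in weights.items()
               if cls == c and keyword in doc_tokens)
        for c in classes
    }
    normalizer = sum(math.exp(score) for score in scores.values())
    return {c: math.exp(score) / normalizer for c, score in scores.items()}


# Hypothetical weights; in practice they are estimated, e.g. by iterative scaling.
weights = {
    ("great", "positive"): 1.5,
    ("great", "negative"): -0.5,
    ("poor", "negative"): 2.0,
}
probs = maxent_probabilities({"great", "screen"}, weights, ["positive", "negative"])
```

The normalizer plays the role of the denominator described above, so the returned values always sum to one.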

The flexibility offered by the maximum entropy classifier helps to augment syntactic, semantic, and pragmatic features with the stochastic rule systems. However, the computational resources and annotated training data required for the estimation of parameters for even the simplest maximum entropy model are very high. Thus, for large datasets, the model is not only expensive but is also sensitive to round-off errors because of the sparsely distributed features. For the estimation of parameters, different methods such as gradient ascent, conjugate gradient, variable metric methods, Generalized Iterative Scaling, and Improved Iterative Scaling are available (Hemalatha et al. 2013 ). Yan and Huang ( 2015 ) used the maximum entropy classifier to perform Tibetan sentences’ sentiment analysis, based on the probability difference between positive and negative outcomes. To identify the sentiment expressed by multilingual text, Boiy and Moens ( 2009 ) combined SVM, MNB, and maximum entropy describing different blogs, reviews, and forum texts using unigram feature vectors.

Deep learning (DL): Deep learning is essentially an ANN with three or more layers that can handle large datasets and their associated complexities, such as non-linearity and intricate patterns. It transforms and extracts features automatically as the data passes through multiple hidden layers, which facilitates self-learning in a way similar to humans. These advantages enhance the performance of sentiment analysis models and have made deep learning popular for this task since 2015. Word embeddings are generally the preferred input features of deep learning models. They can be learned from text data by using an embedding layer, Word2Vec, or GloVe vectors. Word2Vec embeddings can be learned with either the Continuous Bag of Words (CBOW) or the Continuous Skip-Gram model. Some of the common deep learning algorithms include CNNs, RecNN, RNN, Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Deep Belief Networks (DBN). The detailed study by Yadav and Vishwakarma (2020) on sentiment analysis using DL found that LSTM performs better than other popular DL algorithms.

Tembhurne and Diwan ( 2021 ) provided valuable insight into the usage of several architectural versions of sequential deep neural networks, such as RNN, for sentiment analysis of inputs in any form, including textual, visual, and multimodal inputs. Tang et al. ( 2015 ) introduced several deep NNs with the use of sentiment-specific word embeddings for performing word-level, sentence-level, and lexical-level sentiment analysis. To encode the sentiment polarity of sentences, the authors introduced different NNs including a prediction model and a ranking model. They discovered discriminative features from different domains using sentiment embeddings to perform sentiment classification of reviews. According to the authors, the SEHyRank model shows the best performance among all the other proposed models. To fit CNN in aspect-based sentiment analysis, Wang et al. ( 2021 ) proposed an aspect mask to keep the important sentiment words and reduce the noisy ones. Their work made use of the position of aspects to perform aspect-based sentiment analysis in a unified framework. Hidayatullah et al. ( 2021 ) performed sentiment analysis using tweets on the Indonesian President Election 2019 using various deep neural network algorithms. According to the authors, Bidirectional LSTM (Bi-LSTM) showed better results as compared to CNN, LSTM, CNN-LSTM, GRU-LSTM, and other machine learning algorithms namely SVM, Logistic Regression (LR), and MNB. Soubraylu and Rajalakshmi ( 2021 ) proposed a hybrid convolutional bidirectional recurrent neural network, where the rich set of phrase-level features are extracted by the CNN layer and the chronological features are extracted by Bidirectional Gated Recurrent Unit (BGRU) through long-term dependency in a multi-layered sentence. Priyadarshini and Cotton ( 2021 ) suggested a sentiment analysis model using LSTM-CNN for a fully connected deep neural network and a grid search strategy for hyperparameter tuning optimization.

The Emotional Recurrent Unit (ERU) is an RNN that contains a Generalized Neural Tensor Block (GNTB) and a Two-Channel Feature Extractor (TFE) designed to tackle conversational sentiment analysis. Generally, using ERU for sentiment analysis involves obtaining the context representation, incorporating the influence of the context information into an utterance, and extracting emotional features for classification. Li et al. (2022) employed ERU in a bidirectional manner to propose a Bidirectional Emotional Recurrent Unit (BiERU) for sentiment classification or regression. BiERU follows a two-step task instead of the three steps mentioned for simple ERUs. According to the source of context information, the authors presented two types of BiERUs, namely BiERU with global context (BiERU-gc) and BiERU with local context (BiERU-lc). Compared with c-LSTM (Poria et al. 2017), CMN (Hazarika et al. 2018), DialogueRNN (Majumder et al. 2019), DialogueGCN (Ghosal et al. 2019), and AGHMN (Jiao et al. 2020), BiERU showed better performance in most cases.

For a random forest, the low correlation between models is the key. Much as investments with low correlations come together to form a portfolio that is greater than the sum of its parts, uncorrelated models can produce ensemble predictions that are more accurate than any of the individual predictions. The reason for this effect is that the trees protect each other from their individual errors. While some trees may be wrong, many other trees will be right, so as a group the trees can move in the right direction. So, the requirements for the random forest to perform well are:

There should be some real signal in the features, so that models built using those features do better than random guessing.

The predictions made by the individual trees need to have low correlations with one another.

A forest is made up of trees, and more trees imply a more robust forest. Likewise, a random forest algorithm builds decision trees on data samples, gets the prediction from each of them, and finally selects the best solution by means of voting. It is an ensemble method that is superior to a single decision tree because it reduces over-fitting by averaging the result.
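The benefit of combining weakly correlated trees by voting can be illustrated with a small calculation. The sketch below assumes fully independent binary voters that are each correct with the same probability, which is an idealization: real trees in a forest are only partially decorrelated through bootstrap samples and random feature subsets. The function names are hypothetical.

```python
from collections import Counter
from math import comb


def majority_vote(predictions):
    """Final forest output: the label predicted by most trees."""
    return Counter(predictions).most_common(1)[0][0]


def majority_accuracy(n_trees, p_correct):
    """Probability that a majority of independent trees is right (odd n_trees)."""
    needed = n_trees // 2 + 1
    return sum(
        comb(n_trees, k) * p_correct**k * (1 - p_correct) ** (n_trees - k)
        for k in range(needed, n_trees + 1)
    )


label = majority_vote(["pos", "neg", "pos"])
single_tree = majority_accuracy(1, 0.6)
small_forest = majority_accuracy(3, 0.6)
larger_forest = majority_accuracy(25, 0.6)
```

Under these assumptions the accuracy of the vote grows with the number of trees, provided that each tree is better than random guessing, which mirrors the two requirements listed above.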

Baid et al. ( 2017 ) analyzed the movie reviews using various techniques like Naïve Bayes, K-Nearest Neighbour, and Random Forest. The authors showed that Naïve Bayes performed better as compared to other algorithms. While performing sentiment analysis of real-time 2019 election twitter data, Hitesh et al. ( 2019 ) demonstrated that Word2Vec with Random Forest improves the accuracy of sentiment analysis significantly compared to traditional methods such as BoW and TF-IDF. This is because Word2Vec improves the quality of features by considering the contextual semantics of words.

Jain and Dandannavar ( 2016 ) suggested a system for sentiment analysis of tweets based on an NLP-based technique and machine learning algorithms such as MNB and decision tree, which use features extracted based on various parameters. For sentiment analysis of online movie reviews, Sharma and Dey ( 2012 ) have developed a noteworthy comparison of seven current machine learning techniques in conjunction with various feature selection approaches. Tan and Zhang ( 2008 ) also introduced a similar work, in which sentiment analysis of various fields, such as education, movies, and houses, is carried out using various feature selection methods along with machine learning techniques. Depending on the applicability and need for better-quality models for sentiment analysis, experts in the field use a variety of cascaded and ensemble approaches to combine machine learning algorithms with other existing options (Ji et al. 2015 ; Tripathy et al. 2015 ; Xia et al. 2011 ; Ye et al. 2009 ).

In unsupervised learning, the models are trained using unlabeled datasets. This technique in most cases relies on clustering methods such as k-means clustering, expectation-maximization, and cobweb. Darena et al. ( 2012 ) used k-means clustering through the use of Cluto 2.1.2 to determine the sentiment associated with customer reviews.

In self-supervised learning, the model begins with unlabeled datasets and then trains itself to learn a part of the input by leveraging the underlying structure of the data. Although the use of an unlabeled dataset makes this learning technique appear unsupervised, it is basically designed to execute downstream tasks that are traditionally addressed by supervised learning. One of the self-supervised learning techniques that has gained a lot of popularity in recent years is the Pretrained Language Model (PLM).

Creating a sentiment analysis model from scratch typically involves making use of standard sentiment lexicons, sentiment scoring and data labeling by human experts, and proper tuning of the model parameters so that it performs well on the rest of the dataset. This procedure can be expensive and time-consuming. A PLM makes it simpler for developers to implement a sentiment analysis model in less training time and with improved accuracy, by providing extensive semantic and syntactic information through a few lines of code. A PLM acts as a reusable NLP model for various tasks associated with sentiment analysis, such as PoS tagging, lemmatization, dependency parsing, and tokenization. Thus, PLMs can be advantageous for solving similar new tasks using old experience, without training the sentiment analysis model from scratch.

Chan et al. (2022) provided a detailed study on the evolution and advancement of sentiment analysis using pretrained models. Additionally, the authors covered various tasks of sentiment analysis for which the pretrained models can be used. The early works on PLMs involved transferring a single pretrained embedding layer to the task-oriented network architecture. To cope with numerous challenges such as word sense, polysemy, grammatical structure, semantics, and anaphora, models are presently being improved to a higher representation level.

Bidirectional Encoder Representations from Transformers (BERT) (Devlin et al. 2018), NLTK (Loper and Bird 2002), Stanford NLP (Manning et al. 2014), Universal Language Model Fine-tuning (ULMFiT) (Howard and Ruder 2018), and Embeddings from Language Models (ELMo) (Sarzynska-Wawer et al. 2021) are some of the well-known PLMs and open-source NLP libraries used for sentiment analysis. Mathew and Bindu (2020), who thoroughly analyzed numerous PLMs that are frequently used for sentiment analysis, found the performance of BERT to be superior.

Many pre-trained models use self-supervision strategies to learn the semantic content; however, they give less importance to sentiment-specific knowledge during the pre-training phase. There might also be a risk of overfitting associated with a pretrained model, which may lead to a domain-specific sentiment mismatch between the source and the target domain. When dealing with social media content, the PLM might introduce biases into the results, because the language on which the PLM was trained might differ from the language generally used on social media platforms. Further in-depth analysis and model development may be constrained if the PLM behaves in a black-box manner. In a few cases, the PLM might not be able to handle a multi-class problem if it was originally designed for identifying single or binary classes. This might also lead to ignoring or mishandling one of the important classes, say the neutral class, if the PLM was initially designed for handling only positive and negative classes. Thus, while choosing a particular PLM, we must consider the domain and data it was originally designed for. Also, a human expert might be required to validate the results, whenever required, to assure the quality of the sentiment analysis model.

Mao et al. (2022) provided an in-depth analysis of how PLMs are biased in prompt-based sentiment analysis and emotion detection. According to the authors, the number of label classes, emotional label-word selections, prompt templates and positions, and the word forms of emotion lexicons lead to biased results. To address the issue of cross-domain tasks, Zhou et al. (2020) proposed SENTIX, a sentiment-aware model that learns domain-invariant sentiment knowledge during the pre-training phase. To address several factors related to sentiment analysis, experts have so far presented a variety of improved modifications of the original PLMs. Some of them include Dynamic Re-weighting BERT (DR-BERT) (Zhang et al. 2022), BERT-based Dilated CNN (BERT-DCNN) (Jain et al. 2022), Attention-based ELMo (A-ELMo) (Huang and Zhao 2022), Contextual Sentiment Embeddings (CoSE) (Wang et al. 2022a), and Extended Universal Language Model Fine-Tuning (Ext-ULMFiT) and Fine-Tuned BERT (FiT-BERT) (Fazlourrahman et al. 2022).

Many researchers combine supervised and unsupervised techniques to generate hybrid approaches or even semi-supervised techniques which can be used to classify sentiments (König and Brill 2006 ; Kim and Lee 2014 ). With new information generated every millisecond, finding a fully labeled large dataset representing all the required information is nearly impossible. In such a scenario, semi-supervised algorithms train an initial model on a few labeled samples and then iteratively apply it to the greater number of unlabelled data and make predictions on new data. Among various semi-supervised techniques, Graph Convolution Network (GCN) (Kipf and Welling 2016 ; Keramatfar et al. 2022 ; Dai et al. 2022 ; Zhao et al. 2022 ; Lu et al. 2022 ; Yu and Zhang 2022 ; Ma et al. 2022 ) has recently gained the attention of researchers for performing sentiment analysis.

GCN is based on CNN which operates directly on graphs while taking advantage of the syntactic structure and word dependency relation to correctly analyze sentiment. GCNs learn the features by inspecting neighboring nodes. By using a syntactic dependency tree, a GCN model captures the relation among different words and links specific aspects to syntax-related words. Each layer of the multi-layer GCN architecture encodes and updates the representation of the graph’s nodes using features from those nodes’ closest neighbors. GCNs assist in performing node-level, edge-level, and graph-level prediction tasks for sentiment analysis, such as determining how connections on a social media platform affect the opinions of the users within that network, creating user recommendations based on connections between various products previously purchased, suggesting movies, etc. Generally, GCNs focus on learning the dependency information from contextual words to aspect words based on the dependency tree of the sentence. As a result, GCN has mainly attracted researchers in the field of aspect-based sentiment analysis.
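A minimal sketch of how a single GCN layer updates node representations from their closest neighbors, following the widely used propagation rule $H' = \mathrm{ReLU}(\hat{D}^{-1/2}(A + I)\hat{D}^{-1/2} H W)$ of Kipf and Welling. The three-word chain graph, the feature vectors, and the identity weight matrix are made up for the example and do not come from a real dependency parse or a trained model.

```python
import math


def normalized_adjacency(adj):
    """Add self-loops, then scale symmetrically: D^-1/2 (A + I) D^-1/2."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, features, weight):
    """One propagation step: aggregate neighbor features, project, apply ReLU."""
    propagated = matmul(matmul(normalized_adjacency(adj), features), weight)
    return [[max(0.0, value) for value in row] for row in propagated]


# Toy graph for the words "screen", "is", "great", linked as a chain.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
features = [[1.0, 0.0],   # "screen": hypothetical aspect feature
            [0.0, 0.0],   # "is": no feature of its own
            [0.0, 1.0]]   # "great": hypothetical opinion feature
weight = [[1.0, 0.0],
          [0.0, 1.0]]     # identity, standing in for learned weights
hidden = gcn_layer(adj, features, weight)
```

After one layer, the middle word carries information from both the aspect word and the opinion word; stacking a second layer would let the two end words exchange information through it.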

Lu et al. ( 2021 ) built a GCN on the sentence dependency tree to fully utilize the syntactical and semantic information. Their methodology fixed the issues of incorrectly detecting irrelevant contextual words as clues for evaluating aspect sentiment, disregarding syntactical constraints, and long-range sentiment dependencies, which were present in earlier models. SenticGCN was proposed by Liang et al. ( 2022 ) to capture the affective dependencies of the sentences according to the specific aspects. To combine the affective knowledge between aspects and opinion words, the model performs aspect-based sentiment analysis using SenticNet along with GCN.

Along with the local structure information of a given sentence, such as locality, sequential knowledge, or syntactical dependency constraints within the sentence, global dependency information also holds importance in determining the sentiments accurately. Zhu et al. ( 2021 ) proposed a model named Global and Local Dependency Guided Graph Convolutional Networks (GL-GCN), where word global semantic dependency relations were revealed with the use of a word-document graph representing the entire corpus. An attention mechanism was adopted by the authors to combine both local and global dependency structure signals.

In general, the layers in GCN models are not devised distinctively for processing the aspect. To handle this issue, Chen et al. ( 2021 ) integrated GCN and co-attention networks for aspect-based sentiment analysis, to extract relevant information from contexts and remove the noise while considering colloquial texts. Tian et al. ( 2021 ) addressed the issues of the inability to learn from different layers of GCN, not considering dependency types, and lacking mechanisms for differentiating between various relations in the context of sentiment analysis. The authors utilized dependency types for aspect-based sentiment analysis with Type-aware GCN (T-GCN).

Opinion terms are used in a lexicon-based approach to execute sentiment classification tasks. This method suggests that a sentence’s or document’s cumulative polarity is the sum of the polarities of individual terms or phrases (Devika et al. 2016). According to Zhang et al. (2014), in opinion lexicon methods, evaluated and tagged sentiment-related words are counted and weighted with the use of a lexicon to perform sentiment analysis. This approach is based on sentiment lexicons, which are a compilation of recognized and pre-compiled terms, phrases, and idioms formed for traditional communication genres, according to Kharde and Sonawane (2016). More complex systems, such as dictionaries or ontologies, may also be used for this approach (Kontopoulos et al. 2013). Some of the recent lexicons available for sentiment analysis are discussed below in Table 2.

Table 2 Lexicons for sentiment analysis

S. no. | Sentiment lexica | Main feature | Pros and/or cons
1. Loughran and McDonald Sentiment Word Lists (Loughran and McDonald )

The dictionary reports counts, the proportion of the total, the average proportion per document, the standard deviation of proportion per document, document count, seven sentiment category identifiers, the number of syllables, and the source for each word.

Indicator for sentiment related to financial context: “negative”, “positive”, “litigious”, “uncertainty”, “constraining”, or “superfluous”.

Cons:

Does not contain acronyms, hyphenated words, names, phrases, or British English.

Contains a limited number of abbreviations.

2. Stock Market Lexicon (Oliveira et al. )

Learning stock market lexicon from StockTwits for the stock market and general financial applications.

About 17.44% of the StockTwits messages are labeled as “bullish” or “bearish” by their authors, to show their sentiment toward the mentioned stocks.

Pros:

Presents Sentiment oriented word embeddings for the stock market.

Cons:

Imbalanced dataset: the number of messages labeled as Bullish is much higher than that labeled as Bearish, with an overall ratio of 4.03.

3. SentiWordNet 3.0 (Baccianella et al. )

Makes use of WordNet 3.0 to assign positive, negative, and objective scores to terms.

Comprises more than 100,000 words that occur in different contexts.

Pros:

For machine learning based sentiment classification a mixture of documents of different domains achieves good results.

Cons:

For Cross-domain sentiment analysis, rule-based approaches with fixed opinion lexica are unsuited.

4. SenticNet 7 (Cambria et al. )

The input sentence is translated from natural language into a sort of ‘protolanguage’ sentence, which generalizes words and multiword expressions in terms of primitives and, hence, connects these (in a semantic-role-labeling fashion) to their corresponding emotion and polarity labels.

Pros:

Contains multiword expressions, which enable polarity disambiguation.

Cons:

Does not handle sarcasm or antithetic opinion targets perfectly.

5. VADER (Hutto and Gilbert )

A lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media.

Used for sentiment analysis of text which is sensitive to both the polarities, i.e., positive/negative and finds the intensity (strength) of emotion.

Especially attuned to microblog-like contexts.

Pros:

Not only presents the positivity and negativity score but also tells us about how positive or negative a sentiment is.

Cons:

May not work for complex data, does not recognize context, and requires additional tools for visualizing output.

6. Opinion Lexicon (Hu and Liu )

A list of positive and negative opinion words or sentiment words for English customer reviews (around 6800 words).

Cons:

Does not help to find features that are liked by customers.

7. MPQA Subjectivity Lexicon (Wilson and Wiebe )

In the corpus, individual expressions are marked that correspond to explicit mentions of private states, speech events, and expressive subjective elements.

Annotators were asked to judge all expressions in context.

Includes 5,097 negative and 2,533 positive words. Each word is assigned a strong or weak polarity.

Cons:

It is rooted in the subjective interpretations of a single person.

Works great for short sentences, such as tweets or Facebook posts.
8. NRC Hashtag Sentiment Lexicon (Mohammad and Kiritchenko ; Mohammad )

Association of words with positive (negative) sentiment generated automatically from tweets with sentiment-word hashtags such as #amazing and #terrible.

Number of terms: 54,129 unigrams, 316,531 bigrams, 308,808 pairs.

Association scores: real-valued.

Words can have multiple meanings and senses, and the meaning and sense that is common in one domain may not be common in another. Furthermore, words that are not generally considered sentiment-bearing can imply sentiments in specific contexts.

9. NRC Hashtag Emotion Lexicon (Mohammad et al. ; Zhu et al. ; Kiritchenko et al. )

Association of words with eight emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust) and two sentiments (negative and positive).

Manually annotated on Amazon’s Mechanical Turk.

Generated automatically from tweets with emotion-word hashtags such as #happy and #anger.

Number of terms: 16,862 unigrams (words), 5,000 word senses, Association scores: real-valued

Pros:

Available in 40 different languages.

Cons:

Words can have multiple meanings and senses, and the meaning and sense that is common in one domain may not be common in another. Furthermore, words that are not generally considered sentiment-bearing can imply sentiments in specific contexts.

10. NRC Hashtag Affirmative Context Sentiment Lexicon (Mohammad et al. ; Zhu et al. ; Kiritchenko et al. )

Association of words with positive (negative) sentiment in affirmative or negated contexts generated automatically from tweets with sentiment-word hashtags such as #amazing and #terrible.

Number of terms: Affirmative contexts: 36,357 unigrams, 159,479 bigrams.

Association scores: real-valued

Pros:

Available in 40 different languages.

Cons:

Words can have multiple meanings and senses, and the meaning and sense that is common in one domain may not be common in another. Furthermore, words that are not generally considered sentiment-bearing can imply sentiments in specific contexts.

11. NRC Hashtag Negated Context Sentiment Lexicon (Mohammad et al. ; Zhu et al. ; Kiritchenko et al. )

Association of words with positive (negative) sentiment in negated contexts generated automatically from tweets with sentiment-word hashtags such as #amazing and #terrible.

Number of terms: Negated contexts: 7,592 unigrams, 23,875 bigrams.

Association scores: real-valued

Pros:

Available in 40 different languages.

Cons:

Words can have multiple meanings and senses, and the meaning and sense that is common in one domain may not be common in another. Furthermore, words that are not generally considered sentiment-bearing can imply sentiments in specific contexts.

12. NRC Word-Emotion Association Lexicon/NRC Emotion Lexicon (Mohammad and Turney , )

Association of words with eight emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust) and two sentiments (negative and positive) manually annotated on Amazon’s Mechanical Turk.

Available in 40 different languages.

Number of terms: 14,182 unigrams (words), 25,000 word senses.

Association scores: binary (associated or not).

Pros:

Available in 40 different languages.

Cons:

Words can have multiple meanings and senses, and the meaning and sense that is common in one domain may not be common in another. Furthermore, words that are not generally considered sentiment-bearing can imply sentiments in specific contexts.

13. Emoticon Lexicon/Sentiment140 Lexicon (Mohammad et al. ; Zhu et al. ; Kiritchenko et al. )

Association of words with positive (negative) sentiment generated automatically from tweets with emoticons such as :) and :(.

Number of terms: 62,468 unigrams, 677,698 bigrams, 480,010 pairs.

Number of terms: 14,182 unigrams (words), 25,000 word senses.

Association scores: real-valued.

Pros:

Available in 40 different languages.

Cons:

Words can have multiple meanings and senses, and the meaning and sense that is common in one domain may not be common in another. Furthermore, words that are not generally considered sentiment-bearing can imply sentiments in specific contexts.

14. Sentiment140 Affirmative Context Lexicon (Mohammad et al. ; Zhu et al. ; Kiritchenko et al. )

Association of words with positive (negative) sentiment in affirmative contexts generated automatically from tweets with emoticons such as :) and :(.

Number of terms: Affirmative contexts: 45,255 unigrams, 240,076 bigrams.

Pros:

Available in 40 different languages.

Cons:

Words can have multiple meanings and senses, and the meaning and sense that is common in one domain may not be common in another. Furthermore, words that are not generally considered sentiment-bearing can imply sentiments in specific contexts.

15. Yelp Restaurant Sentiment Lexicon (Kiritchenko et al. )

The Yelp dataset is a subset of our businesses, reviews, and user data for use in personal, educational, and academic purposes.

Created from the Yelp dataset, from the subset of entries about these restaurant-related businesses.

Cons:

Few reviews are considered to be fake.

No proper boundary to detect neutrality.

Consists of 10 attributes, namely, unique Business ID, Date of Review, Review ID, Stars given by the user, Review given by the user, Type of text entered—Review, Unique User ID, Cool column: The number of cool votes the review received, Useful column: The number of useful votes the review received, Funny Column: The number of funny votes the review received.

Number of reviews: 183,935 reviews.

Have one to five-star ratings associated with each review.

16. Amazon Laptop Sentiment Lexicon (McAuley and Leskovec )

Collected reviews posted on Amazon.com from June 1995 to March 2013. Extracted from this subset are all reviews that mention either a laptop or notebook.

Have one to five-star ratings associated with each review.

26,577 entries for unigrams (includes affirmative and negated context entries), 155,167 entries for bigrams.

Cons:

May not work well with the neutral sentiment.

17. Macquarie Semantic Orientation Lexicon (Mohammad et al. )

76,400 terms.

Sentiments: negative, positive

Automatic: Using the structure of a thesaurus and affixes.

18. Harvard’s General Inquirer Lexicon (Stone and Hunt )

A lexicon attaching syntactic, semantic, and pragmatic information to part-of-speech tagged words.

2000 positive and 2000 negative words.

19. IMDB (Yenter and Verma )

50K movie reviews.

A set of 25,000 highly polar movie reviews for training and 25,000 for testing.

Pros:

The data is refreshed daily.

Cons:

IMDB reviews are not considered to be overly trustworthy, as big Hollywood studios generally dictate the scores and the overall consensus.

The algorithm used by IMDB to collate its reviews is generally considered inferior to those used by Rotten Tomatoes and similar sites.

20. AFINN (Nielsen )

AFINN is the simplest yet most popular lexicon used for sentiment analysis developed by Finn Årup Nielsen.

It contains 3300+ words with a polarity score associated with each word.

A list of English terms manually rated for valence with an integer between -5 (negative) and +5 (positive) by Finn Årup Nielsen between 2009 and 2011.

Primarily analyze Twitter sentiment.

Cons:

Using the raw AFINN score the longer texts may yield higher values simply because they contain more words.

21. Corpus of Business News (Moreno-Ortiz et al. )

Covers non-specific sentiment-carrying terms and phrases.

It contains 6,470 entries, both single and multi-word expressions, each with tags denoting their semantic orientation and intensity.

Pros:

A wide coverage, a domain-specific lexicon for the analysis of economic and financial texts in English.

22. DepecheMood Affective Lexicon (Staiano and Guerini )

Harvested crowd-sourced affective annotation from a social news network.

Considered the affective dimensions, namely Afraid, Amused, Angry, Annoyed, Don’t_Care, Happy, Inspired, and Sad.

37 thousand terms annotated with emotion scores.

Cons:

Cannot handle similar words which are not present in the training document.

23. Financial Phrasebank (Malo et al. )

Polar sentiment dataset of sentences from financial news.

The dataset consists of 4840 sentences from English-language financial news categorized by sentiment. The dataset is divided by an agreement rate of 5–8 annotators.

Pros:

Works well for NLP-related tasks in multi-class financial domain classifications.

The lexicon-based approach is categorized into three methods: manual, dictionary-based, and corpus-based methods based on the various approaches to classification (Zhang et al. 2014 ). Because of the considerable time investment, researchers seldom use the manual approach, though it is often paired with the other two automated approaches.
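Whichever method is used to build the lexicon, the scoring step described earlier, in which the cumulative polarity of a text is the sum of the polarities of its terms, can be sketched as follows. The five-word lexicon and its scores are hypothetical; a real system would load a resource such as those listed in Table 2.

```python
# Hypothetical toy lexicon (word -> polarity score), for illustration only.
LEXICON = {"good": 2, "great": 3, "bad": -2, "terrible": -3, "slow": -1}

def lexicon_polarity(text, lexicon=LEXICON):
    """Sum the polarity of every known term and map the total to a label."""
    tokens = [tok.strip(".,!?") for tok in text.lower().split()]
    score = sum(lexicon.get(tok, 0) for tok in tokens)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return score, label

result = lexicon_polarity("Great screen, but terrible battery and slow charging.")
```

Here "great" (+3), "terrible" (-3) and "slow" (-1) sum to -1, so the review is labeled negative. The sketch deliberately ignores negation and intensifiers, which is where the valence shifters discussed below come in.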

Dictionary-based approach starts with a series of manually annotated opinion seed terms. The collection is then extended by searching through a dictionary such as WordNet (Miller et al. 1990 ) to find synonyms and antonyms. SWN (Baccianella et al. 2010 ) is one of the earliest thesauri and makes use of WordNet to assign positive, negative, and objective ratings to terms. The new words are added to the initial list after they have been discovered. The next iteration begins and the method continues until no new words need to be added after a particular point. While considering valence shifters (intensifiers, downtoners, negation, and irrealis markers), Read and Carroll ( 2009 ) proposed a word-level sentiment analysis model called Semantic Orientation CALculator (SO-CAL). In SO-CAL, lexicon-based sentiment classification is performed using dictionaries of sentiment-bearing terms annotated with their polarities and strengths.
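The bootstrapping loop described above (start from seed terms, add synonyms and antonyms, repeat until nothing new is found) can be sketched as follows. The miniature thesaurus is a hypothetical stand-in for WordNet, and the rule that synonyms inherit polarity while antonyms receive the opposite one is the usual assumption for this kind of expansion.

```python
# Hypothetical miniature thesaurus standing in for WordNet.
SYNONYMS = {"good": ["fine", "nice"], "nice": ["pleasant"],
            "bad": ["poor"], "poor": ["inferior"]}
ANTONYMS = {"good": ["bad"], "pleasant": ["unpleasant"]}

def expand_seeds(seeds, synonyms=SYNONYMS, antonyms=ANTONYMS):
    """Grow a polarity lexicon from seed words (word -> +1 or -1)."""
    lexicon = dict(seeds)
    frontier = list(seeds)
    while frontier:  # stop when an iteration discovers no new word
        next_frontier = []
        for word in frontier:
            related = [(s, lexicon[word]) for s in synonyms.get(word, [])]
            related += [(a, -lexicon[word]) for a in antonyms.get(word, [])]
            for new_word, polarity in related:
                if new_word not in lexicon:
                    lexicon[new_word] = polarity
                    next_frontier.append(new_word)
        frontier = next_frontier
    return lexicon

lexicon = expand_seeds({"good": 1})
```

A single positive seed grows into an eight-word lexicon here; words reached only through several hops, such as "inferior", still receive a polarity.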

The use of a dictionary for sentiment analysis suffers from one major drawback. This methodology does not adequately handle the domain and context-sensitive orientations of opinion terms.

The corpus-based approach uses syntactic patterns or co-occurring patterns in a vast corpus to extend the original seed list of opinion terms (Cambria and Hussain 2015). It is very difficult to build a corpus large enough to cover every English word. However, using a domain corpus has the advantage of allowing one to identify domain- and context-related opinion terms as well as their orientations. The corpus-based approach aims to provide dictionaries that are specific to a particular domain (Kharde and Sonawane 2016). To expand the dictionary, statistical or semantic approaches may be used to look for similar words, as discussed below.

The statistical approach searches for co-occurrence patterns or seed opinion words. If the corpus is insufficient, the issue of certain words not being available can be solved by using the whole collection of indexed documents on the web as the corpus for creating the dictionary (Turney 2002). In a broad annotated corpus, even the frequency with which a word appears in positive or negative text may be used to determine its polarity (Read and Carroll 2009). Similar opinion words are likely to co-occur in a corpus, according to Cambria and Hussain (2015), and hence the polarity of an unfamiliar word can be calculated using the relative frequency of co-occurrence with another word. In this case, Pointwise Mutual Information (PMI) can be used (Turney 2002). The statistical approach to the semantic orientation of a word is used in conjunction with PMI (Cambria and Hussain 2015). Another such approach is Latent Semantic Analysis (LSA) (Deerwester et al. 1990).
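A PMI-based estimate of semantic orientation in the spirit of Turney (2002) can be sketched as the difference between a word's PMI with a positive seed and its PMI with a negative seed. The five-document corpus, the seed words "excellent" and "poor", document-level co-occurrence, and the smoothing constant are all assumptions made for this illustration.

```python
import math

def semantic_orientation(word, docs, pos_seed="excellent", neg_seed="poor",
                         smooth=0.01):
    """SO(word) = PMI(word, pos_seed) - PMI(word, neg_seed)."""
    token_sets = [set(d.lower().split()) for d in docs]

    def hits(*terms):
        # Number of documents that contain all of the given terms.
        return sum(all(t in s for t in terms) for s in token_sets)

    # p(word) and the corpus size cancel in the difference of the two PMIs,
    # leaving a ratio of co-occurrence counts; smoothing avoids log(0).
    num = (hits(word, pos_seed) + smooth) * (hits(neg_seed) + smooth)
    den = (hits(word, neg_seed) + smooth) * (hits(pos_seed) + smooth)
    return math.log2(num / den)

docs = [
    "excellent reliable service",
    "reliable and excellent build",
    "poor flimsy build",
    "flimsy and poor service",
    "reliable service",
]
so_reliable = semantic_orientation("reliable", docs)
so_flimsy = semantic_orientation("flimsy", docs)
so_service = semantic_orientation("service", docs)
```

"reliable" co-occurs only with the positive seed and receives a positive score, "flimsy" receives a negative one, and "service", which appears equally often with both seeds, scores zero.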

In the semantic approach, semantically close words are assigned similar polarities. This method relies on various criteria for measuring word similarity (Cambria and Hussain 2015). For example, the relative count of positive and negative synonyms of an unknown word, obtained through the semantic relationships given by WordNet, can be used to determine the polarity of that word (Kim and Hovy 2004).

A combination of both statistical and semantic approaches is also followed by a few researchers to perform sentiment analysis. Zhang et al. (2012) applied a mixture of both these approaches to online reviews to determine the weaknesses of products. Sentence-based sentiment analysis, according to their model, is carried out by taking into account the effect of degree adverbs to determine the polarity of each aspect within a sentence. To find the implicit features, they used a collocation statistics-based selection method, Pointwise Mutual Information (PMI). With the use of semantic methods, feature words of the products are grouped into corresponding aspects.

Ding et al. (2008) demonstrated that the same term can have multiple polarities in different contexts, even within the same domain. Therefore, rather than simply finding domain-dependent sentiment words using the corpus-based approach, they explored the notion of intra-sentential and inter-sentential sentiment consistency.

In the lexicon-based approach, two points are worth noting. First, the initial manual annotation of the seed list can be a costly procedure. Second, and most importantly, the use of a dictionary, even for seed list generation, can lead to insufficient handling of cross-domain problems. Thus, choosing a proper technique to generate a seed list for a lexicon-based approach is an open problem. Also, whenever linguistic rules are involved in handling knowledge, there might be situations where they fail to correctly grasp the affective sentiment.

Hybrid approaches which use sentiment lexicons in machine learning methods have also attracted many researchers to combine the benefits of both approaches. Trinh et al. ( 2018 ) used the hybrid approach to perform sentiment analysis of Facebook comments in the Vietnamese language. While their dictionary is partly based on SO-CAL, the authors manually built the dictionary to include nouns, verbs, adjectives, and adverbs along with emotional icons. They performed sentence-level sentiment analysis of product reviews using the SVM classifier. Appel et al. ( 2016 ) also performed sentence-level sentiment analysis using a combination of lexicon and machine learning approaches. They extended their sentiment lexicon with the use of SWN and used fuzzy sets to determine the polarity of sentences. Using an SVM classifier, Zhang et al. ( 2011 ) performed entity-level sentiment analysis of tweets, with the use of a lexicon that supports business marketing or social studies. They made use of the lexicon by Ding et al. ( 2008 ) along with some frequently used opinion hashtags to build the lexicon for their model. Pitogo and Ramos ( 2020 ) performed sentiment analysis for Facebook comments using a lexicon-based approach called Valence Aware Dictionary and Sentiment Reasoner (VADER) along with a hierarchical clustering algorithm.

Sentiment or opinion summarization

Sentiment or opinion summarization (or aggregation) aims to provide an idea of the overall influence or polarity depicted by the dataset, by summing up the polarity of all individual words/aspects/sentences/documents of the dataset. Sentiment summarization must not be confused with text summarization, though the two are related. Text summarization aims to provide a summary of the dataset, while sentiment summarization provides a generalized polarity depicted by the whole dataset.

Different types of summarization models have been proposed by researchers to obtain an average sentiment. Pang and Lee (2004) first extracted all subjective sentences and then summarized those subjective sentences. Blair-Goldensohn et al. (2008) used a tool to choose a few representative documents from a vast number of documents and then used them for emotion summarization based on aspects. By mining opinion features from product feedback, Hu and Liu (2004) suggested an aspect-based sentiment summarization strategy for online consumer reviews. Using the ratings on different aspects, Titov and McDonald (2008) proposed a model which can contribute to the sentiment summarization process. Their algorithm is designed to find related topics in text and collect textual evidence from reviews to support aspect ratings. Bahrainian and Dengel (2013) developed an emotion summarization model to summarize opinionated text on consumer goods by integrating different polarity detection techniques and automated aspect detection algorithms.

Performance analysis measures

The evaluation of performance is one of the principal concepts associated with building a resourceful model. Once the sentiments are classified as either positive or negative, the performance of the model needs to be evaluated. The paper by Sokolova and Lapalme (2009) provided a better understanding of the applicability of performance measures depending on the variability of the classification tasks. Among the different kinds of metrics available for measuring the performance of a textual sentiment analysis model, metrics based on the confusion matrix are widely used (Sokolova and Lapalme 2007, 2009; John and Kartheeban 2019). The confusion matrix shows the details of the classifications that are expected and those that are calculated by a classifier. A confusion matrix for binary classification problems consists of four separate data entries, namely True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN), as shown in Table 3.

Table 3 Confusion matrix for binary classification

Expectation | Actual classification: Positive | Actual classification: Negative
Positive | TP | FN
Negative | FP | TN

TP: overall positive data that are classified as positive; TN: overall negative data that are classified as negative; FP: overall negative data that are classified as positive; FN: overall positive data that are classified as negative.

The most frequently used performance metric is accuracy, which measures the overall effectiveness of the model. Accuracy is the proportion of the total number of instances (i.e., documents/sentences/words) that are correctly predicted by the sentiment analysis model. Equation 13 shows the formula for estimating the model’s accuracy.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (13)$$
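As a small worked example, the four confusion matrix entries and the accuracy can be computed from a list of expected and predicted labels. The label strings and the eight toy instances are assumptions for illustration.

```python
def confusion_counts(expected, predicted, positive="pos"):
    """Return (TP, TN, FP, FN) for a binary task."""
    tp = tn = fp = fn = 0
    for e, p in zip(expected, predicted):
        if e == positive and p == positive:
            tp += 1
        elif e != positive and p != positive:
            tn += 1
        elif e != positive and p == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Share of all instances that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

expected = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg"]
predicted = ["pos", "pos", "neg", "neg", "neg", "neg", "pos", "neg"]
tp, tn, fp, fn = confusion_counts(expected, predicted)
acc = accuracy(tp, tn, fp, fn)
```

With TP = 2, TN = 4, FP = 1 and FN = 1, six of the eight instances are correct, giving an accuracy of 0.75.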

Apart from accuracy, precision and recall are well-known metrics that are well suited for text applications (Sokolova and Lapalme 2007). The proportion of instances predicted as positive that are actually positive is given by the positive predictive value, or precision, while the proportion of instances predicted as negative that are actually negative is given by the negative predictive value. The proportion of actual positive instances that are correctly classified is given by sensitivity, or recall; the proportion of actual negative instances that are correctly classified is given by negative recall, or specificity.

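The following are the formulas for calculating them (Salari et al. 2014).

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (14)$$

$$\text{Negative predictive value} = \frac{TN}{TN + FN} \qquad (15)$$

$$\text{Recall} = \frac{TP}{TP + FN} \qquad (16)$$

$$\text{Specificity} = \frac{TN}{TN + FP} \qquad (17)$$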
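These four ratios follow directly from the confusion matrix entries. A minimal sketch is given below; the counts are illustrative, and returning 0.0 for an empty denominator is a convention assumed here, not one stated in the text.

```python
def ratio(numerator, denominator):
    """Safe division; returns 0.0 when the denominator is empty."""
    return numerator / denominator if denominator else 0.0

def precision(tp, fp):
    return ratio(tp, tp + fp)          # predicted positives that are correct

def negative_predictive_value(tn, fn):
    return ratio(tn, tn + fn)          # predicted negatives that are correct

def recall(tp, fn):
    return ratio(tp, tp + fn)          # actual positives that are found

def specificity(tn, fp):
    return ratio(tn, tn + fp)          # actual negatives that are found

tp, tn, fp, fn = 2, 4, 1, 1            # illustrative counts
scores = {
    "precision": precision(tp, fp),
    "npv": negative_predictive_value(tn, fn),
    "recall": recall(tp, fn),
    "specificity": specificity(tn, fp),
}
```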

Precision and recall are better indicators of a system’s success than accuracy for an imbalanced binary classifier. Yet, in certain situations, a system may have high precision but poor recall, or vice versa. In this case, the F-measure summarizes both concerns in a single number. Once the precision and recall for a binary or multi-class classification task have been calculated, the two scores are combined into the F-measure, as seen in Eq. 18.

$$F = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (18)$$

Accuracy or F-measure can show overoptimistic, inflated results, especially on imbalanced datasets. Matthews Correlation Coefficient (MCC) is a more reliable statistical rate that produces a high score only if the prediction obtained good results in all four confusion matrix categories, proportionally both to the size of positive elements and the size of negative elements in the dataset. The confusion matrix, or error matrix, can be summed up using MCC as shown in Eq. 19. MCC ranges over [−1, 1], where 1 indicates the best agreement between the predicted and actual values. The MCC helps to identify the ineffectiveness of the classifier in classifying, especially, the minority class samples.

$$MCC = \frac{TN \cdot TP - FN \cdot FP}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \qquad (19)$$

To measure the ability of a sentiment classifier to distinguish between the polarity classes, the Area Under the Curve (AUC) is employed. The curve is generally a ROC (Receiver Operating Characteristic) curve, a graph showing the performance of a classification model at all classification thresholds, as shown in Fig. 3. ROC plots the true positive rate against the false positive rate. AUC is an aggregated evaluation of the classifier as the threshold varies over all possible values. The Precision-Recall AUC likewise summarizes the precision-recall curve over a range of threshold values as a single score. AUC measures how the true positive rate (recall) and the false positive rate trade off. Specifically, for imbalanced datasets, where overfitting needs to be avoided, AUC works as a preferable evaluation metric. AUC represents the probability that a random positive instance is ranked above a random negative instance. AUC ranges from 0 to 1. An AUC of 0.0 denotes a model that makes all incorrect classifications, whereas an AUC of 1.0 denotes a model that makes all correct classifications.
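The F-measure, MCC and ROC AUC discussed above can be computed with a few lines of code. The sketch below uses the pairwise-ranking interpretation of AUC (the probability that a random positive instance scores higher than a random negative one); counting ties as one half and returning 0.0 for an empty denominator are conventions assumed here, and the inputs are toy values.

```python
import math

def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from the four confusion entries."""
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tn * tp - fn * fp) / den if den else 0.0

def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly by score."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

f = f_measure(0.5, 1.0)
m = mcc(2, 4, 1, 1)
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

In the AUC example, three of the four positive-negative pairs are ordered correctly, so the AUC is 0.75.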

Fig. 3 AUC under ROC

When a regression task is adopted for sentiment analysis, Mean Squared Error (MSE) is employed to find the average squared difference between actual and predicted values. It is an absolute measure of the goodness of fit of dependent variables in the model. The formula of MSE is given in Eq. 20. The lower the value of MSE, the better the sentiment analyzer. It can be used as a loss function, as MSE is differentiable. However, it is not very suitable when the dataset contains outliers.

$$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \qquad (20)$$

where $y_i$ and $\hat{y}_i$ are the actual and predicted values of the $i$-th instance and $n$ is the number of instances.

In contrast to the context dependency of MSE, R squared is a context-independent metric that is used for a regression task. It is a relative measure of how well the model fits dependent variables or how close the data is to the fitted regression line. Coefficient of Determination and Goodness of Fit are other names for R squared, and it is calculated using Eq. 21.

$$R^2 = 1 - \frac{SSR}{SSM} \qquad (21)$$

where SSR is the squared sum error of the regression line and SSM is the squared sum error of the mean line.
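For a regression-style sentiment score, MSE and R squared can be computed as in the following sketch, which uses SSR and SSM as defined above. The four actual and predicted values are toy numbers chosen for illustration.

```python
def mse(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """1 - SSR/SSM: error of the model relative to always predicting the mean."""
    mean = sum(actual) / len(actual)
    ssr = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # regression line
    ssm = sum((a - mean) ** 2 for a in actual)                  # mean line
    return 1 - ssr / ssm

actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
error = mse(actual, predicted)
fit = r_squared(actual, predicted)
```

Here SSR is 0.5 and SSM is 5.0, so the MSE is 0.125 and R squared is 0.9.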

Other performance evaluation metrics that can be also considered for evaluating a sentiment analysis model are Root Mean Squared Error (RMSE), Residual Standard Error (RSE), Mean Absolute Error (MAE), etc.

Applications of sentiment analysis

Sentiment analysis or opinion mining has recently been used in studies on e-commerce feedback, tweets, Facebook posts, YouTube content, blog entries, and a variety of other data mining and knowledge-based AI programs. As a result, it has progressed significantly in fields including Information Retrieval (IR), web data analysis, text mining, text analysis, NLP, computational linguistics, and biometrics. Using the different approaches/methods/frameworks analyzed earlier in this paper, sentiment analysis can be applied to various fields such as tourism, education, defense, business, politics, public, finance, hazards, health, and safety. The broad range of applications will aim to obtain the best possible combination of strengths, whether or not any of the components in Fig. 1 or any of the approaches indicated in Fig. 2 are present. Depending on the requirement/aim/framework of a sentiment analysis model, applications can vary from a straightforward prediction of the polarity of a single word to uncovering sensitive or hidden information, or even a pattern to protect a nation from a potential terrorist attack or disaster. Many research works mention different application areas based on the domains or approaches used (Alessia et al. 2015; Jain and Gupta 2022; Saxena et al. 2022; Feldman 2013; Govindarajan 2022; Ravi and Ravi 2015). However, a description of diverse application fields based purely on the dataset at hand is difficult to find in existing research papers. This paper aims to outline several sentiment analysis application areas based on the data/content/material at hand, which researchers can use for sentiment analysis.

Reviews on products

Sentiment analysis using reviews on different products with different brands is the most widespread practice, which encompasses different application angles. For a particular product, the number of brands has been increasing day to day. Also, the same brand may offer products with different specifications. Nowadays even different online shopping sites are available that sell the same product. This creates confusion among customers to reach an optimal decision. Though shopping sites offer the option of displaying comments and star ratings left by former customers to assist potential buyers, the count of current feedback can be so large that scrolling through thousands of them can be a time-consuming process. Sentiment analysis helps to alleviate this condition by giving a concise perspective on a product or brand as a whole, or even on a certain feature/aspect of the product. Also, it can be used by the sellers or manufacturers to concentrate on the suitable aspects or specifications, which can be used for upgrading the product or deciding the advertisement strategy. Product analysis by buyers, suppliers, and sellers; competitor analysis or market study by sellers or manufacturers; brand tracking and reputation management by manufacturers; customer service by e-commerce sites; and customer analysis by sellers and manufacturers are among the various application directions associated with sentiment analysis of product feedback. The necessity to detect fake reviews before using the available data for decision-making was highlighted in the research work by Vidanagama et al. ( 2022 ). The authors made use of a rule-based classifier, a domain feature ontology, and Mahalanobis distance to detect fake reviews while performing aspect-based sentiment analysis. Cao et al. ( 2022 ) have introduced a quality evaluation model of products by combining deep learning, word vector conversion, keyword clustering, and feature word extraction technologies. 
Their model improves product features based on online consumer reviews and then calculates customer satisfaction and attention from short text comments with sentiment tags. Using pre-trained word embeddings, Bhuvaneshwari et al. (2022) proposed a Bi-LSTM Self Attention based CNN (BAC) model for analyzing user reviews. Wang et al. (2022b) designed a multi-attention bi-directional LSTM (BLSTM(MA)) and used Latent Dirichlet Allocation (LDA) modeling to perform multimodal fusion for sentiment analysis of product reviews. Alantari et al. (2022) examined 260,489 reviews from five review platforms, covering 25,241 products in nine product categories. They found that pretrained neural network-based machine learning techniques, in particular, provide the most accurate predictions, while topic models such as LDA provide more thorough diagnostics. Conversely, topic models are poorly suited to making predictions, and neural network models are poorly suited to making diagnoses. As a result, whether analysts prioritize prediction or diagnostics is likely to determine which text review processing technologies are chosen in the future.
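
The idea of condensing a large volume of reviews into a per-aspect summary can be illustrated with a short sketch. The example below is a minimal illustration only, not the method of any work cited above: it assumes a tiny hand-made polarity lexicon (`POLARITY`) and a fixed aspect list (`ASPECTS`), both hypothetical, scores each sentence by summing the polarities of its words, and averages the scores of the sentences that mention each aspect.

```python
from collections import defaultdict

# Hypothetical toy lexicon and aspect list, for illustration only.
POLARITY = {"great": 1, "good": 1, "excellent": 1,
            "poor": -1, "bad": -1, "terrible": -1}
ASPECTS = {"battery", "screen", "camera"}


def summarize_aspects(reviews):
    """Return the mean polarity of the sentences mentioning each aspect."""
    scores = defaultdict(list)
    for review in reviews:
        # Naive sentence split on full stops; real text needs a proper tokenizer.
        for sentence in review.lower().split("."):
            words = sentence.split()
            # Sentence score: sum of the polarities of its lexicon words.
            score = sum(POLARITY.get(w, 0) for w in words)
            for aspect in ASPECTS.intersection(words):
                scores[aspect].append(score)
    return {aspect: sum(vals) / len(vals) for aspect, vals in scores.items()}


reviews = [
    "The battery is great. The screen is poor.",
    "Excellent camera. The battery is good.",
    "Terrible screen.",
]
summary = summarize_aspects(reviews)
```

Here `summary` maps each mentioned aspect to a score between -1 and 1. A practical system would replace the toy lexicon with a trained classifier and would need to handle negation, which this sketch ignores.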

Political tweets, Facebook comments, blog posts, and YouTube videos

Recently, people have begun to openly share their views and opinions on political parties, electoral candidates, government policies, and rules on public platforms such as Twitter, Facebook, YouTube, and blogs. These posts strongly influence their followers. Many experts therefore use them to predict the outcome of an election in advance, monitor public sentiment on political movements, or analyze public sentiment on a proposed government rule, bill, or law.

Using pre-trained models and the Chi-square test, Antypas et al. (2022) proposed a multilingual sentiment analysis model to analyze tweets from both influential and less popular politicians among the members of parliament of Greece, Spain, and the United Kingdom. Their study indicates that negative tweets spread more rapidly than positive ones. Using the Valence Aware Dictionary and sEntiment Reasoner (VADER) and 2 million tweets on the 2019 Indian Lok Sabha election, Passi and Motisariya (2022) analyzed the sentiments of Twitter users towards each of the Indian political parties. Yavari et al. (2022) designed an indicator of future election results using the aging estimation method together with the ratio of the positive message rate to the negative message rate.
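
A ratio of positive to negative messages, as used in the indicator described above, can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of Yavari et al.: it omits the aging estimation step entirely, assumes that messages have already been labelled `"pos"`, `"neg"`, or `"neu"` by some classifier, and uses hypothetical party names and data.

```python
from collections import Counter


def sentiment_ratio(labels):
    """Ratio of positive to negative messages; None if there are no negatives."""
    counts = Counter(labels)
    if counts["neg"] == 0:
        return None
    return counts["pos"] / counts["neg"]


# Hypothetical, pre-labelled messages per party.
labelled = {
    "party_a": ["pos", "pos", "neg", "neu", "pos"],
    "party_b": ["neg", "neg", "pos", "neu"],
}
ratios = {party: sentiment_ratio(labels) for party, labels in labelled.items()}
```

A ratio above 1 indicates more positive than negative messages for that party. Neutral messages are ignored in this sketch, which is one of several possible design choices.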

Tweets or comments on Facebook/YouTube/Instagram on social causes or events

Expressions of opinion on social causes and events have also increased recently. This widens the scope for application portals that analyze and monitor public sentiment, predict the possible outcomes of such an event or cause, and determine the steps to take if a chaotic situation breaks out.

Ouyang et al. (2017) built a multi-grained sentiment analysis and event summary method that uses crowd-sourced social media data on explosive accidents. The system can determine which components of the event draw users’ attention, identify which microblog is responsible for a large shift in sentiment, and detect those aspects of the event that affect users’ attention. Using a corpus of 8,013 tweets, Smith and Cipolli (2022) studied the emotional discourse before and after a prohibition on graphic photos of self-harm on Facebook and Instagram. By using statistical modeling to extract abstract topics from the discourse and clarify its topical content, the authors offered insight into how the policy change relating to self-harm was viewed by those with a vested interest.

Reviews on blogs/tweets/Facebook comments on movies

Reviews of an upcoming movie, or of one currently showing in theatres, can be used to predict its success or failure. Movie recommender systems can also be designed using audience reviews. In addition, distributors and producers can use such reviews to improve their advertising strategy based on the aspects that viewers like.

Using sentiment analysis to gain a deeper understanding of user preferences, Dang et al. (2021) proposed methods to enhance the functionality of recommender systems for streaming services. The Multimodal Album Reviews Dataset (MARD) and Amazon Movie Reviews were used to test and compare two combinations of LSTM and CNN, namely LSTM-CNN and CNN-LSTM. Their baseline was a version of the recommendation engine without sentiment analysis or genres. The results show that their models outperform this baseline in both rating prediction and evaluation of the top recommendation list. Pavitha et al. (2022) designed a system that analyzes movie reviews in different languages, classifies them as positive or negative using Naive Bayes and a Support Vector Classifier (SVC), and recommends similar movies to users based on cosine similarity. For B-T4SA and IMDB movie reviews, Zhu et al. (2022) proposed a self-supervised sentiment analysis model named Senti-ITEM. The model pairs a representative image with the social media text as a pretext task, extracts features in a shared embedding space, and uses an SVM for sentiment classification.
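
Recommending similar items by cosine similarity can be illustrated with a small sketch. The example below is a minimal illustration, not the system of Pavitha et al.: it assumes that each movie is represented by a bag-of-words vector built from a short description, and the titles and descriptions are hypothetical. The cosine similarity of two vectors $a$ and $b$ is $a \cdot b / (\lVert a \rVert \, \lVert b \rVert)$.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical movie descriptions.
movies = {
    "Movie A": "space crew alien rescue",
    "Movie B": "alien invasion space battle",
    "Movie C": "romantic comedy wedding",
}
vectors = {title: Counter(text.split()) for title, text in movies.items()}


def most_similar(title):
    """Return the other title whose description is closest to `title`."""
    others = (t for t in vectors if t != title)
    return max(others, key=lambda t: cosine_similarity(vectors[title], vectors[t]))


recommended = most_similar("Movie A")
```

In this toy data, "Movie A" and "Movie B" share two of their four terms, so their similarity is 0.5, while "Movie A" and "Movie C" share none. Practical systems typically use TF-IDF weights or learned embeddings instead of raw counts.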

Tweets/Facebook comments on pandemic/crisis/environmental issues

People facing sudden difficulties due to the COVID-19 pandemic or environmental events such as storms or earthquakes now post tweets or Facebook comments in real time. By properly analyzing these tweets or comments, governments, agencies, or even people nearby can offer help and carry out disaster management and crisis analysis.

Hodson et al. (2022) suggested a corpus-assisted discourse analysis approach for analyzing public opinion in COVID-19 tweets and YouTube comments related to the Canadian Public Health Office. The authors found key differences between comments on the two platforms, specifically relating to the tone used in YouTube videos as compared with the plain text of tweets. To capture sarcasm or obtain clear information, cross-platform and diverse methods must be adopted to support health-related communication and the analysis of public opinion. Chopra et al. (2022) employed logistic regression, Naive Bayes, XGBoost, LSTM, GloVe, and BERT to predict disaster warnings from tweets and evaluate the seriousness of the content.

Tweets/Facebook comments/YouTube videos on the stock market

One of the trending application areas of sentiment analysis is stock market prediction. A suitable sentiment analysis model can help identify stocks and shares with great potential and decide the best time to buy them at a low price and sell them at their peak. Using stock market data with SVM, Ren et al. (2018) suggested a model that forecasts movement direction and predicts stock prices while capturing investor psychology. Sousa et al. (2019) used BERT to analyze the sentiment of news articles and provide relevant information that facilitates quick stock market decision-making. Considering both positive and negative financial news, de Oliveira Carosia et al. (2021) analyzed its influence on the stock market using three deep artificial neural networks, namely the Multi-Layer Perceptron (MLP), LSTM, and CNN. Their findings revealed that recurrent neural networks such as LSTM perform better at capturing temporal characteristics when used to predict the stock market, whereas CNNs perform better at assessing text semantics.

Future scope of research in sentiment analysis

Numerous studies in the literature focus on the components of the sentiment analysis approach, either independently or in combination. Each of these sentiment analysis modules offers plenty of opportunities for further investigation, improvement, and innovation. Several challenges also arise while performing sentiment analysis and may hinder the proper functioning or performance of the model, including domain dependency, reference problems, sarcasm detection, spam detection, and time period. Most of these challenges drive the development of better techniques and algorithms to handle them. Some of the primary research gaps that offer scope for future sentiment analysis research are discussed below:

  • Current sentiment analysis techniques have been found not to employ effective data initialization and pre-processing. Rather than relying only on established NLP pre-processing techniques, an advanced pre-processing technique, such as a standard normalization that deliberately takes negation and mixed emotion into account, would be extremely beneficial.
  • One of the most critical steps in improving the performance of a sentiment analysis model is keyword extraction. Many sentiment analysis models extract keywords using generalized dictionaries. This produces inaccurate findings, however, since most of these dictionaries include keywords that are relevant to a specific domain, and in the real world there is no predefined list of keywords for a given domain or subject. Different researchers have shown the superiority of the degree centrality metric in graph-based methods for obtaining the best collection of representative and sentimental words, so it may be used to find key terms or phrases. Automatic keyword extraction techniques can be used for sentiment analysis in a variety of applications, both independently and in combination. Because they allow text records to be condensed, most of these techniques have found applications in a variety of research areas, including data analysis, TM, IR, and NLP.
  • Assigning polarity scores to keywords using sentiment dictionaries has gained a lot of attention in sentiment analysis. However, depending on its use in a specific domain, a term can act as a positive word in one setting and a negative word in another. Sentiment dictionaries with pre-defined word polarities are therefore not an appropriate basis for sentiment analysis, and existing sentiment dictionaries largely fail to handle sarcasm and negation. Many machine learning-based techniques are likewise trained to work only for a particular domain and do not consider that words can change polarity with the context and domain of application. As a result, when a trained classifier is applied to the same word in another domain, it can produce incorrect results.
  • New edge and node weighting approaches may be introduced and used in place of NE-Rank or TextRank centralities to determine keyword rank. Ensembles of centralities, or individual improved centralities, may be used to achieve better outcomes in the future. This establishes a framework for future research into graph mining algorithms for sentiment analysis in various fields.
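
The graph-based keyword ranking mentioned in the list above can be sketched with degree centrality on a word co-occurrence graph. The example below is a minimal illustration under stated assumptions, not the method of any cited work: two words are linked if they co-occur in at least one sentence, edges are unweighted, and the stopword list and sample sentences are hypothetical. The normalised degree centrality of a word is its number of distinct neighbours divided by $n - 1$, where $n$ is the number of nodes.

```python
from collections import defaultdict


def degree_centrality_keywords(sentences, stopwords, top_k=3):
    """Rank words by degree centrality in a sentence-level co-occurrence graph."""
    neighbours = defaultdict(set)
    for sentence in sentences:
        words = {w for w in sentence.lower().split() if w not in stopwords}
        for w in words:
            # Two words are linked if they co-occur in at least one sentence.
            neighbours[w].update(words - {w})
    n = len(neighbours)
    # Normalised degree centrality: degree / (n - 1).
    centrality = {
        w: (len(adj) / (n - 1) if n > 1 else 0.0)
        for w, adj in neighbours.items()
    }
    # Sort by descending centrality, breaking ties alphabetically.
    ranked = sorted(centrality.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Hypothetical sample data.
sentences = [
    "the battery life is great",
    "battery drains fast",
    "great screen and great battery",
]
top = degree_centrality_keywords(sentences, stopwords={"the", "is", "and"}, top_k=2)
```

In this toy graph, "battery" is linked to every other word and therefore ranks first. The edge and node weighting schemes suggested above would replace the unweighted degree count used here.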

The era of digitization has brought astonishing growth in subjective textual data online. Reflecting public sentiment on any topic accurately demands careful analysis of this textual information. Sentiment analysis has emerged as a key task that enhances decision-making by extracting the underlying sentiment or opinion from data. Even though sentiment analysis has progressed in recent years, modern models still have flaws such as domain dependence, poor negation handling, high dimensionality, and the failure to use efficient keyword extraction. This paper examines and discusses comprehensively the different perspectives related to the creation and implementation of an effective sentiment analysis model. The various modules of the sentiment analysis methodology are examined thoroughly to help plan and improve effective sentiment analysis models. The keyword extraction algorithm is vital to the success of a sentiment analysis model and is therefore studied in depth in this paper. The paper also discusses sentiment classification methods, an essential part of a sentiment analysis model, and reviews in detail both machine learning and lexicon-based approaches to sentiment analysis of textual data.

As a thorough, well-organized study of sentiment analysis, this work can assist academics and industry experts in analyzing and developing powerful sentiment analysis models in a wide range of domains. Sentiment analysis models have considerable potential for further development and use in the near future because they have a broad range of uses in social, industrial, political, economic, health and safety, education, defense, financial, and other contexts. Each of the sentiment analysis modules discussed in this paper can be investigated, improved, and supplemented with relevant algorithms to design an efficient sentiment analysis model. This study also offers prospective guidelines for carrying out sound sentiment analysis research.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Monali Bordoloi, Email: [email protected] .

Saroj Kumar Biswas, Email: bissarojkum@yahoo.com .

  • Abilhoa WD, De Castro LN. A keyword extraction method from twitter messages represented as graphs. Appl Math Comput. 2014;240:308–325.
  • Ahmad M, Aftab S, Bashir MS, Hameed N (2018) Sentiment analysis using SVM: a systematic literature review. Int J Adv Comput Sci Appl 9 (2)
  • Alantari HJ, Currim IS, Deng Y, Singh S. An empirical comparison of machine learning methods for text-based sentiment analysis of online consumer reviews. Int J Res Mark. 2022;39(1):1–19. doi: 10.1016/j.ijresmar.2021.10.011.
  • Alessia D, Ferri F, Grifoni P, Guzzo T. Approaches, tools and applications for sentiment analysis implementation. Int J Comput Appl. 2015;125(3):26–33.
  • Alfter D, Cardon R, François T (2022) A dictionary-based study of word sense difficulty. In: Proceedings of the 2nd workshop on tools and resources to empower people with REAding DIfficulties (READI) within the 13th language resources and evaluation conference. European Language Resources Association, pp 17–24
  • Altheneyan AS, Menai MEB. Naïve bayes classifiers for authorship attribution of Arabic texts. J King Saud Univ-Comput Inf Sci. 2014;26(4):473–484.
  • Antypas D, Preece A, Collados JC (2022) Politics and virality in the time of twitter: a large-scale cross-party sentiment analysis in Greece, Spain and united kingdom. arXiv preprint arXiv:2202.00396
  • Appel O, Chiclana F, Carter J, Fujita H. A hybrid approach to the sentiment analysis problem at the sentence level. Knowl-Based Syst. 2016;108:110–124. doi: 10.1016/j.knosys.2016.05.040.
  • Athanasiou V, Maragoudakis M. A novel, gradient boosting framework for sentiment analysis in languages where NLP resources are not plentiful: a case study for modern Greek. Algorithms. 2017;10(1):34. doi: 10.3390/a10010034.
  • Baccianella S, Esuli A, Sebastiani F (2010) Sentiwordnet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining. In: Proceedings of the seventh international conference on language resources and evaluation (LREC’10), vol 10. European Language Resources Association (ELRA), pp 2200–2204
  • Bahrainian S-A, Dengel A (2013) Sentiment analysis and summarization of twitter data. In: 2013 IEEE 16th international conference on computational science and engineering. IEEE, pp 227–234
  • Baid P, Gupta A, Chaplot N. Sentiment analysis of movie reviews using machine learning techniques. Int J Comput Appl. 2017;179(7):45–49.
  • Balahur A, Hermida JM, Montoyo A. Building and exploiting emotinet, a knowledge base for emotion detection based on the appraisal theory model. IEEE Trans Affect Comput. 2011;3(1):88–101. doi: 10.1109/T-AFFC.2011.33.
  • Banea C, Mihalcea R, Wiebe J. Sense-level subjectivity in a multilingual setting. Comput Speech Lang. 2014;28(1):7–19. doi: 10.1016/j.csl.2013.03.002.
  • Bao H, Li Q, Liao SS, Song S, Gao H. A new temporal and social PMF-based method to predict users’ interests in micro-blogging. Decis Support Syst. 2013;55(3):698–709. doi: 10.1016/j.dss.2013.02.007.
  • Beliga S (2014) Keyword extraction: a review of methods and approaches. University of Rijeka, Department of Informatics, Rijeka 1(9)
  • Beliga S, Meštrović A, Martinčić-Ipšić S. An overview of graph-based keyword extraction methods and approaches. J Inf Org Sci. 2015;39(1):1–20.
  • Bellaachia A, Al-Dhelaan M (2012) Ne-rank: a novel graph-based keyphrase extraction in twitter. In: 2012 IEEE/WIC/ACM international conferences on web intelligence and intelligent agent technology, vol 1. IEEE, pp 372–379
  • Benghuzzi H, Elsheh MM (2020) An investigation of keywords extraction from textual documents using word2vec and decision tree. Int J Comput Sci Inf Secur 18 (5)
  • Bhargav PS, Reddy GN, Chand RR, Pujitha K, Mathur A. Sentiment analysis for hotel rating using machine learning algorithms. Int J Innov Technol Explor Eng (IJITEE) 2019;8(6):1225–1228.
  • Bharti SK, Babu KS (2017) Automatic keyword extraction for text summarization: a survey. arXiv preprint arXiv:1704.03242 4:410–427
  • Bhuvaneshwari P, Rao AN, Robinson YH, Thippeswamy M. Sentiment analysis for user reviews using bi-lstm self-attention based CNN model. Multimedia Tools Appl. 2022;81(9):12405–12419. doi: 10.1007/s11042-022-12410-4.
  • Blair-Goldensohn S, Hannan K, McDonald R, Neylon T, Reis G, Reynar J (2008) Building a sentiment summarizer for local service reviews. WWW2008 workshop on NLP challenges in the information explosion era
  • Blitzer J, Dredze M, Pereira F (2007) Biographies, bollywood, boom-boxes and blenders: domain adaptation for sentiment classification. In: Proceedings of the 45th annual meeting of the Association of Computational Linguistics. ACL, pp 440–447
  • Boiy E, Moens M-F. A machine learning approach to sentiment analysis in multilingual web texts. Inf Retrieval. 2009;12(5):526–558. doi: 10.1007/s10791-008-9070-z.
  • Bollegala D, Weir D, Carroll J. Cross-domain sentiment classification using a sentiment sensitive thesaurus. IEEE Trans Knowl Data Eng. 2012;25(8):1719–1731. doi: 10.1109/TKDE.2012.103.
  • Bonacich P. Some unique properties of eigenvector centrality. Soc Netw. 2007;29(4):555–564. doi: 10.1016/j.socnet.2007.04.002.
  • Bougouin A, Boudin F, Daille B (2013) Topicrank: graph-based topic ranking for keyphrase extraction. In: International joint conference on natural language processing (IJCNLP), pp 543–551
  • Bronselaer A, Pasi G (2013) An approach to graph-based analysis of textual documents. In: 8th European Society for fuzzy logic and technology (EUSFLAT-2013). Atlantis Press, pp 634–641
  • Cambria E (2013) An introduction to concept-level sentiment analysis. In: Castro F, Gelbukh A, González M (eds) Advances in soft computing and its applications. Mexican international conference on artificial intelligence, MICAI 2013. Lecture notes in computer science, vol 8266. Springer, pp 478–483
  • Cambria E. Affective computing and sentiment analysis. IEEE Intell Syst. 2016;31(02):102–107. doi: 10.1109/MIS.2016.31.
  • Cambria E, Hussain A. Sentic computing. Cogn Comput. 2015;7(2):183–185. doi: 10.1007/s12559-015-9325-0.
  • Cambria E, Liu Q, Decherchi S, Xing F, Kwok K (2022) Senticnet 7: a commonsense-based neurosymbolic AI framework for explainable sentiment analysis. In: Proceedings of LREC 2022. European Language Resources Association (ELRA), pp 3829–3839
  • Cao J, Li J, Yin M, Wang Y. Online reviews sentiment analysis and product feature improvement with deep learning. Trans Asian Low-Resour Lang Inf Process. 2022. doi: 10.1145/3522575.
  • Castillo E, Cervantes O, Vilarino D, Báez D, Sánchez A (2015) Udlap: sentiment analysis using a graph-based representation. In: Proceedings of the 9th international workshop on semantic evaluation (SemEval 2015). ACL, pp 556–560
  • Cepeda K, Jaiswal R (2022) Sentiment analysis on covid-19 vaccinations in Ireland using support vector machine. In: 2022 33rd Irish signals and systems conference (ISSC). IEEE, pp 1–6
  • Chan JY-L, Bea KT, Leow SMH, Phoong SW, Cheng WK (2022) State of the art: a review of sentiment analysis based on sequential transfer learning. Artif Intell Rev 1–32
  • Chen P-I, Lin S-J. Automatic keyword prediction using google similarity distance. Expert Syst Appl. 2010;37(3):1928–1938. doi: 10.1016/j.eswa.2009.07.016.
  • Chen K, Zhang Z, Long J, Zhang H. Turning from TF-IDF to TF-IGM for term weighting in text classification. Expert Syst Appl. 2016;66:245–260. doi: 10.1016/j.eswa.2016.09.009.
  • Chen Y, Wang J, Li P, Guo P. Single document keyword extraction via quantifying higher-order structural features of word co-occurrence graph. Comput Speech Lang. 2019;57:98–107. doi: 10.1016/j.csl.2019.01.007.
  • Chen Z, Xue Y, Xiao L, Chen J, Zhang H (2021) Aspect-based sentiment analysis using graph convolutional networks and co-attention mechanism. In: International conference on neural information processing. Springer, pp 441–448
  • Chong WY, Selvaretnam B, Soon L-K (2014) Natural language processing for sentiment analysis: an exploratory analysis on tweets. In: 2014 4th international conference on artificial intelligence with applications in engineering and technology. IEEE, pp 212–217
  • Chopra M, Singh SK, Aggarwal K, Gupta A (2022) Predicting catastrophic events using machine learning models for natural language processing. In: Data mining approaches for big data and sentiment analysis in social media. IGI Global, pp 223–243
  • Cortis K, Freitas A, Daudert T, Huerlimann M, Zarrouk M, Handschuh S, Davis B (2017) Semeval-2017 task 5: Fine-grained sentiment analysis on financial microblogs and news. In: 11th International workshop on semantic evaluations (SemEval-2017): Proceedings of the workshop, Stroudsburg, PA, USA. Association for Computational Linguistics (ACL), pp 519–535
  • Cruz Mata F, Troyano Jiménez JA, de Salamanca Enríquez, Ros F, Ortega Rodríguez FJ, García Vallejo CA. ‘Long autonomy or long delay?’ The importance of domain in opinion mining. Expert Sys Appl. 2013;40(8):3174–3184. doi: 10.1016/j.eswa.2012.12.031.
  • Dai A, Hu X, Nie J, Chen J. Learning from word semantics to sentence syntax by graph convolutional networks for aspect-based sentiment analysis. Int J Data Sci Anal. 2022;14(1):17–26. doi: 10.1007/s41060-022-00315-2.
  • Dang Y, Zhang Y, Chen H. A lexicon-enhanced method for sentiment classification: an experiment on online product reviews. IEEE Intell Syst. 2009;25(4):46–53. doi: 10.1109/MIS.2009.105.
  • Dang CN, Moreno-García MN, De la Prieta F. Using hybrid deep learning models of sentiment analysis and item genres in recommender systems for streaming services. Electronics. 2021;10(20):2459. doi: 10.3390/electronics10202459.
  • Darena F, Zizka J, Burda K (2012) Grouping of customer opinions written in natural language using unsupervised machine learning. In: 2012 14th international symposium on symbolic and numeric algorithms for scientific computing. IEEE, pp 265–270
  • Dave K, Lawrence S, Pennock DM (2003) Mining the peanut gallery: opinion extraction and semantic classification of product reviews. In: Proceedings of the 12th international conference on World Wide Web. ACM, pp 519–528
  • de Oliveira Carosia AE, Coelho GP, da Silva AEA. Investment strategies applied to the Brazilian stock market: a methodology based on sentiment analysis with deep learning. Expert Syst Appl. 2021;184:115470. doi: 10.1016/j.eswa.2021.115470.
  • Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R. Indexing by latent semantic analysis. J Am Soc Inf Sci. 1990;41(6):391–407. doi: 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9.
  • Devika R, Subramaniyaswamy V. A semantic graph-based keyword extraction model using ranking method on big social data. Wirel Netw. 2021;27(8):5447–5459. doi: 10.1007/s11276-019-02128-x.
  • Devika M, Sunitha C, Ganesh A. Sentiment analysis: a comparative study on different approaches. Procedia Comput Sci. 2016;87:44–49. doi: 10.1016/j.procs.2016.05.124.
  • Devlin J, Chang M-W, Lee K, Toutanova K (2018) Bert: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT. ACL, pp 4171–4186
  • Ding X, Liu B, Yu PS (2008) A holistic lexicon-based approach to opinion mining. In: Proceedings of the 2008 international conference on web search and data mining, vol 39. Elsevier, pp 231–240
  • Duari S, Bhatnagar V. scake: semantic connectivity aware keyword extraction. Inf Sci. 2019;477:100–117. doi: 10.1016/j.ins.2018.10.034.
  • Fahrni A, Klenner M (2008) Old wine or warm beer: target-specific sentiment analysis of adjectives. University of Zurich, pp 60–63
  • Fazlourrahman B, Aparna B, Shashirekha H (2022) Coffitt-covid-19 fake news detection using fine-tuned transfer learning approaches. In: Congress on intelligent systems, vol 111. Springer, pp 879–890
  • Feldman R. Techniques and applications for sentiment analysis. Commun ACM. 2013;56(4):82–89. doi: 10.1145/2436256.2436274.
  • Gamon M (2004) Sentiment classification on customer feedback data: noisy data, large feature vectors, and the role of linguistic analysis. In: COLING 2004: proceedings of the 20th international conference on computational linguistics. ACL and AFNLP, pp 841–847
  • Ghosal D, Majumder N, Poria S, Chhaya N, Gelbukh A (2019) Dialoguegcn: a graph convolutional neural network for emotion recognition in conversation. In: EMNLP-IJCNLP 2019-2019 conference on empirical methods in natural language processing and 9th international joint conference on natural language processing, proceedings of the conference, pp 154–164
  • Golbeck J. Analyzing the Social Web. ACM: Newnes; 2013.
  • Govindarajan M (2022) Approaches and applications for sentiment analysis: a literature review. In: Data mining approaches for big data and sentiment analysis in social media. IGI Global, pp 1–23
  • HaCohen-Kerner Y (2003) Automatic extraction of keywords from abstracts. In: International conference on knowledge-based and intelligent information and engineering systems. Springer, pp 843–849
  • Haddi E, Liu X, Shi Y. The role of text pre-processing in sentiment analysis. Procedia Comput Sci. 2013;17:26–32. doi: 10.1016/j.procs.2013.05.005.
  • Hart L (2013) The linguistics of sentiment analysis. University Honors Theses. 10.15760/honors.19
  • Hazarika D, Poria S, Zadeh A, Cambria E, Morency L-P, Zimmermann R (2018) Conversational memory network for emotion recognition in dyadic dialogue videos. In: Proceedings of the conference. Association for Computational Linguistics. North American Chapter. Meeting, vol 2018. NIH Public Access, p 2122
  • Hemalatha I, Varma GS, Govardhan A. Sentiment analysis tool using machine learning algorithms. Int J Emerg Trends Technol Comput Sci. 2013;2(2):105–109.
  • Hidayat THJ, Ruldeviyani Y, Aditama AR, Madya GR, Nugraha AW, Adisaputra MW. Sentiment analysis of twitter data related to Rinca island development using doc2vec and SVM and logistic regression as classifier. Procedia Comput Sci. 2022;197:660–667. doi: 10.1016/j.procs.2021.12.187.
  • Hidayatullah AF, Cahyaningtyas S, Hakim AM (2021) Sentiment analysis on twitter using neural network: Indonesian presidential election 2019 dataset. In: IOP conference series: materials science and engineering, vol 1077. IOP Publishing, p 012001
  • Hitesh M, Vaibhav V, Kalki YA, Kamtam SH, Kumari S (2019) Real-time sentiment analysis of 2019 election tweets using word2vec and random forest model. In: 2019 2nd international conference on intelligent communication and computational techniques (ICCT). IEEE, pp 146–151
  • Hodson J, Veletsianos G, Houlden S. Public responses to covid-19 information from the public health office on twitter and Youtube: implications for research practice. J Inf Technol Polit. 2022;19(2):156–164. doi: 10.1080/19331681.2021.1945987.
  • Howard J, Ruder S (2018) Universal language model fine-tuning for text classification. In: Proceedings of the 56th annual meeting of the Association for Computational Linguistics, vol 1. ACL
  • Hsu C-W, Lin C-J. A comparison of methods for multiclass support vector machines. IEEE Trans Neural Netw. 2002;13(2):415–425. doi: 10.1109/72.991427.
  • Hu Y, Li W. Document sentiment classification by exploring description model of topical terms. Comput Speech Lang. 2011;25(2):386–403. doi: 10.1016/j.csl.2010.07.004.
  • Hu M, Liu B (2004) Mining opinion features in customer reviews. In: AAAI, vol 4. AAAI, pp 755–760
  • Huang C, Zhao Q. Sensitive information detection method based on attention mechanism-based Elmo. J Comput Appl. 2022;42(7):2009–2014.
  • Hulth A (2003) Improved automatic keyword extraction given more linguistic knowledge. In: Proceedings of the 2003 conference on empirical methods in natural language processing. ACL, pp 216–223
  • Hutto C, Gilbert E (2014) Vader: a parsimonious rule-based model for sentiment analysis of social media text. In: Proceedings of the international AAAI conference on web and social media, vol 8. AAAI, pp 216–225
  • Jain AP, Dandannavar P (2016) Application of machine learning techniques to sentiment analysis. In: 2016 2nd international conference on applied and theoretical computing and communication technology (iCATccT). IEEE, pp 628–632
  • Jain S, Gupta V (2022) Sentiment analysis: a recent survey with applications and a proposed ensemble algorithm. In: Computational intelligence in data mining. Springer, pp 13–25
  • Jain PK, Quamer W, Saravanan V, Pamula R (2022) Employing BERT-DCNN with sentic knowledge base for social media sentiment analysis. J Ambient Intell Human Comput 1–13
  • Ji X, Chun S, Wei Z, Geller J. Twitter sentiment classification for measuring public health concerns. Soc Netw Anal Min. 2015;5(1):1–25. doi: 10.1007/s13278-015-0253-5.
  • Jianqiang Z (2015) Pre-processing boosting twitter sentiment analysis? In: 2015 IEEE international conference on Smart City/SocialCom/SustainCom (SmartCity). IEEE, pp 748–753
  • Jianqiang Z, Xiaolin G. Comparison research on text pre-processing methods on twitter sentiment analysis. IEEE Access. 2017;5:2870–2879. doi: 10.1109/ACCESS.2017.2672677.
  • Jiao W, Lyu M, King I. Real-time emotion recognition via attention gated hierarchical memory network. Proceedings of the AAAI conference on artificial intelligence. 2020;34:8002–8009. doi: 10.1609/aaai.v34i05.6309.
  • John SM, Kartheeban K (2019) Sentiment scoring and performance metrics examination of various supervised classifiers. Int J Innov Technol Explor Eng 9(2S2), 1120–1126
  • Joshi M, Rosé C (2009) Generalizing dependency features for opinion mining. In: Proceedings of the ACL-IJCNLP 2009 conference short papers. ACL, pp 313–316
  • Kanayama H, Nasukawa T (2006) Fully automatic lexicon expansion for domain-oriented sentiment analysis. In: Proceedings of the 2006 conference on empirical methods in natural language processing. Association for Computational Linguistics, pp 355–363
  • Katrekar A, AVP BDA (2005) An introduction to sentiment analysis. GlobalLogic Inc 1–6
  • Kaur R, Kautish S (2022) Multimodal sentiment analysis: a survey and comparison. In: Research anthology on implementing sentiment analysis across multiple disciplines. IGI Global, pp 1846–1870
  • Keramatfar A, Amirkhani H, Bidgoly AJ. Modeling tweet dependencies with graph convolutional networks for sentiment analysis. Cognit Comput. 2022 doi: 10.1007/s12559-021-09986-8. [ CrossRef ] [ Google Scholar ]
  • Khan MT, Ma Y, Kim J-j (2016) Term ranker: a graph-based re-ranking approach. In: The twenty-ninth international flairs conference. AAAI
  • Kharde V, Sonawane P, et al. Sentiment analysis of twitter data: a survey of techniques. Int J Comput Appl. 2016; 975 :0975–8887. [ Google Scholar ]
  • Kim S-M, Hovy E (2004) Determining the sentiment of opinions. In: COLING 2004: Proceedings of the 20th international conference on computational linguistics. ACL, pp 1367–1373
  • Kim K, Lee J. Sentiment visualization and classification via semi-supervised nonlinear dimensionality reduction. Pattern Recogn. 2014; 47 (2):758–768. doi: 10.1016/j.patcog.2013.07.022. [ CrossRef ] [ Google Scholar ]
  • Kim H, Howland P, Park H, Christianini N. Dimension reduction in text classification with support vector machines. J Mach Learn Res. 2005; 6 (1):37–53. [ Google Scholar ]
  • Kim J, Kim H-U, Adamowski J, Hatami S, Jeong H. Comparative study of term-weighting schemes for environmental big data using machine learning. Environ Model Softw. 2022; 157 :105536. doi: 10.1016/j.envsoft.2022.105536. [ CrossRef ] [ Google Scholar ]
  • Kipf TN, Welling M (2016) Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907
  • Kiritchenko S, Zhu X, Mohammad SM. Sentiment analysis of short informal texts. J Artif Intell Res. 2014; 50 :723–762. doi: 10.1613/jair.4272. [ CrossRef ] [ Google Scholar ]
  • Kiritchenko S, Zhu X, Cherry C, Mohammad S (2014b) Nrc-canada-2014: Detecting aspects and sentiment in customer reviews. In: Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014). ACL, pp 437–442
  • Kolkur S, Dantal G, Mahe R. Study of different levels for sentiment analysis. Int J Curr Eng Technol. 2015; 5 (2):768–770. [ Google Scholar ]
  • König AC, Brill E (2006) Reducing the human overhead in text categorization. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 598–603
  • Kontopoulos E, Berberidis C, Dergiades T, Bassiliades N. Ontology-based sentiment analysis of twitter posts. Expert Syst Appl. 2013; 40 (10):4065–4074. doi: 10.1016/j.eswa.2013.01.001. [ CrossRef ] [ Google Scholar ]
  • Kummer O, Savoy J (2012) Feature weighting strategies in sentiment analysis. In: SDAD 2012: the first international workshop on sentiment discovery from affective data, pp 48–55
  • Kwon K, Choi C-H, Lee J, Jeong J, Cho W-S (2015) A graph based representative keywords extraction model from news articles. In: Proceedings of the 2015 international conference on big data applications and services. ACM, pp 30–36
  • Lahiri S, Choudhury SR, Caragea C (2014) Keyword and keyphrase extraction using centrality measures on collocation networks. arXiv preprint arXiv:1401.6571
  • Lan M, Tan CL, Su J, Lu Y. Supervised and traditional term weighting methods for automatic text categorization. IEEE Trans Pattern Anal Mach Intell. 2008; 31 (4):721–735. doi: 10.1109/TPAMI.2008.110. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Le B, Nguyen H (2015) Twitter sentiment analysis using machine learning techniques. In: Advanced computational methods for knowledge engineering. Springer, pp 279–289
  • Li Y-M, Li T-Y. Deriving market intelligence from microblogs. Decis Support Syst. 2013; 55 (1):206–217. doi: 10.1016/j.dss.2013.01.023. [ CrossRef ] [ Google Scholar ]
  • Li H, Lu W (2017) Learning latent sentiment scopes for entity-level sentiment analysis. In: Thirty-first AAAI conference on artificial intelligence. AAAI, pp 3482–3489
  • Li N, Wu DD. Using text mining and sentiment analysis for online forums hotspot detection and forecast. Decis Support Syst. 2010; 48 (2):354–368. doi: 10.1016/j.dss.2009.09.003. [ CrossRef ] [ Google Scholar ]
  • Li S, Zhang H, Xu W, Chen G, Guo J (2010) Exploiting combined multi-level model for document sentiment analysis. In: 2010 20th international conference on pattern recognition. IEEE, pp 4141–4144
  • Li S-K, Guan Z, Tang L-Y, Chen Z. Exploiting consumer reviews for product feature ranking. J Comput Sci Technol. 2012; 27 (3):635–649. doi: 10.1007/s11390-012-1250-z. [ CrossRef ] [ Google Scholar ]
  • Li X, Xie H, Chen L, Wang J, Deng X. News impact on stock price return via sentiment analysis. Knowl-Based Syst. 2014; 69 :14–23. doi: 10.1016/j.knosys.2014.04.022. [ CrossRef ] [ Google Scholar ]
  • Li S, Zhou L, Li Y. Improving aspect extraction by augmenting a frequency-based method with web-based similarity measures. Inf Process Manag. 2015; 51 (1):58–67. doi: 10.1016/j.ipm.2014.08.005. [ CrossRef ] [ Google Scholar ]
  • Li X, Li J, Wu Y. A global optimization approach to multi-polarity sentiment analysis. PLoS ONE. 2015; 10 (4):0124672. doi: 10.1371/journal.pone.0124672. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Li Y, Pan Q, Yang T, Wang S, Tang J, Cambria E. Learning word representations for sentiment analysis. Cognit Comput. 2017; 9 (6):843–851. doi: 10.1007/s12559-017-9492-2. [ CrossRef ] [ Google Scholar ]
  • Li W, Shao W, Ji S, Cambria E. Bieru: bidirectional emotional recurrent unit for conversational sentiment analysis. Neurocomputing. 2022; 467 :73–82. doi: 10.1016/j.neucom.2021.09.057. [ CrossRef ] [ Google Scholar ]
  • Liang B, Su H, Gui L, Cambria E, Xu R. Aspect-based sentiment analysis via affective knowledge enhanced graph convolutional networks. Knowl-Based Syst. 2022; 235 :107643. doi: 10.1016/j.knosys.2021.107643. [ CrossRef ] [ Google Scholar ]
  • Litvak M, Last M, Aizenman H, Gobits I, Kandel A (2011) Degext—a language-independent graph-based keyphrase extractor. In: Advances in intelligent web mastering—3 vol 86. Springer, pp 121–130
  • Liu B, Zhang L (2012) A survey of opinion mining and sentiment analysis. In: Mining text data. Springer, pp 415–463
  • Liu Y, Loh HT, Sun A. Imbalanced text classification: a term weighting approach. Expert Syst Appl. 2009; 36 (1):690–701. doi: 10.1016/j.eswa.2007.10.042. [ CrossRef ] [ Google Scholar ]
  • Liu H, He J, Wang T, Song W, Du X. Combining user preferences and user opinions for accurate recommendation. Electron Commer Res Appl. 2013; 12 (1):14–23. doi: 10.1016/j.elerap.2012.05.002. [ CrossRef ] [ Google Scholar ]
  • Loper E, Bird S (2002) Nltk: The natural language toolkit. In: Proceedings of the ACL-02 workshop on effective tools and methodologies for teaching natural language processing and computational linguistics, vol 1. ACM, pp 63–70
  • Loughran T, McDonald B. The use of word lists in textual analysis. J Behav Financ. 2015; 16 (1):1–11. doi: 10.1080/15427560.2015.1000335. [ CrossRef ] [ Google Scholar ]
  • Lu Q, Zhu Z, Zhang G, Kang S, Liu P. Aspect-gated graph convolutional networks for aspect-based sentiment analysis. Appl Intell. 2021; 51 (7):4408–4419. doi: 10.1007/s10489-020-02095-3. [ CrossRef ] [ Google Scholar ]
  • Lu Q, Sun X, Sutcliffe R, Xing Y, Zhang H. Sentiment interaction and multi-graph perception with graph convolutional networks for aspect-based sentiment analysis. Knowl-Based Syst. 2022; 256 :109840. doi: 10.1016/j.knosys.2022.109840. [ CrossRef ] [ Google Scholar ]
  • Luo F, Li C, Cao Z (2016) Affective-feature-based sentiment analysis using svm classifier. In: 2016 IEEE 20th international conference on computer supported cooperative work in design (CSCWD). IEEE, pp 276–281
  • Ma Y, Song R, Gu X, Shen Q, Xu H (2022) Multiple graph convolutional networks for aspect-based sentiment analysis. Appl Intell 1–14
  • Majumder N, Poria S, Hazarika D, Mihalcea R, Gelbukh A, Cambria E (2019) Dialoguernn: An attentive RNN for emotion detection in conversations. In: Proceedings of the AAAI conference on artificial intelligence, vol 33. IEEE, pp 6818–6825
  • Malliaros FD, Skianis K (2015) Graph-based term weighting for text categorization. In: Proceedings of the 2015 IEEE/ACM international conference on advances in social networks analysis and mining 2015. ACM, pp 1473–1479
  • Malo P, Sinha A, Korhonen P, Wallenius J, Takala P. Good debt or bad debt: detecting semantic orientations in economic texts. J Am Soc Inf Sci. 2014; 65 (4):782–796. [ Google Scholar ]
  • Manning CD, Surdeanu M, Bauer J, Finkel JR, Bethard S, McClosky D (2014) The stanford corenlp natural language processing toolkit. In: Proceedings of 52nd annual meeting of the Association for Computational Linguistics: system demonstrations. ACL, pp 55–60
  • Mäntylä MV, Graziotin D, Kuutila M. The evolution of sentiment analysis-a review of research topics, venues, and top cited papers. Comput Sci Rev. 2018; 27 :16–32. doi: 10.1016/j.cosrev.2017.10.002. [ CrossRef ] [ Google Scholar ]
  • Mao R, Liu Q, He K, Li W, Cambria E (2022) The biases of pre-trained language models: An empirical study on prompt-based sentiment analysis and emotion detection. IEEE Trans Affect Comput
  • Mars A, Gouider MS. Big data analysis to features opinions extraction of customer. Procedia Comput Sci. 2017; 112 :906–916. doi: 10.1016/j.procs.2017.08.114. [ CrossRef ] [ Google Scholar ]
  • Mathew L, Bindu V (2020) A review of natural language processing techniques for sentiment analysis using pre-trained models. In: 2020 fourth international conference on computing methodologies and communication (ICCMC). IEEE, pp 340–345
  • McAuley J, Leskovec J (2013) Hidden factors and hidden topics: understanding rating dimensions with review text. In: Proceedings of the 7th ACM conference on recommender systems. ACM, pp 165–172
  • Medelyan O, Witten IH (2006) Thesaurus based automatic keyphrase indexing. In: Proceedings of the 6th ACM/IEEE-CS joint conference on digital libraries. ACM, pp 296–297
  • Medhat W, Hassan A, Korashy H. Sentiment analysis algorithms and applications: a survey. Ain Shams Eng J. 2014; 5 (4):1093–1113. doi: 10.1016/j.asej.2014.04.011. [ CrossRef ] [ Google Scholar ]
  • Mehta P, Pandya S. A review on sentiment analysis methodologies, practices and applications. Int J Sci Technol Res. 2020; 9 (2):601–609. [ Google Scholar ]
  • Miller GA, Beckwith R, Fellbaum C, Gross D, Miller KJ. Introduction to wordnet: an on-line lexical database. Int J Lexicogr. 1990; 3 (4):235–244. doi: 10.1093/ijl/3.4.235. [ CrossRef ] [ Google Scholar ]
  • Mohammad S (2012) # emotional tweets. In: * SEM 2012: The first joint conference on lexical and computational semantics–Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the sixth international workshop on semantic evaluation (SemEval 2012). ACL, pp 246–255
  • Mohammad SM, Kiritchenko S. Using hashtags to capture fine emotion categories from tweets. Comput Intell. 2015; 31 (2):301–326. doi: 10.1111/coin.12024. [ CrossRef ] [ Google Scholar ]
  • Mohammad S, Turney P (2010) Emotions evoked by common words and phrases: Using mechanical Turk to create an emotion lexicon. In: Proceedings of the NAACL HLT 2010 workshop on computational approaches to analysis and generation of emotion in text. ACL, pp 26–34
  • Mohammad SM, Turney PD. Crowdsourcing a word-emotion association lexicon. Comput Intell. 2013; 29 (3):436–465. doi: 10.1111/j.1467-8640.2012.00460.x. [ CrossRef ] [ Google Scholar ]
  • Mohammad S, Dunne C, Dorr B (2009) Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus. In: Proceedings of the 2009 conference on empirical methods in natural language processing. ACL, pp 599–608
  • Mohammad SM, Kiritchenko S, Zhu X (2013) Nrc-canada: building the state-of-the-art in sentiment analysis of tweets. In: Second joint conference on lexical and computational semantics (* SEM), Volume 2: Proceedings of the seventh international workshop on semantic evaluation (SemEval 2013), vol 2. ACL, pp 321–327
  • Moreno-Ortiz A, Fernández-Cruz J, Hernández CPC (2020) Design and evaluation of Sentiecon: a fine-grained economic/financial sentiment lexicon from a corpus of business news. In: Proceedings of The 12th language resources and evaluation conference. ACL, pp 5065–5072
  • Mostafa MM. More than words: social networks’ text mining for consumer brand sentiments. Expert Syst Appl. 2013; 40 (10):4241–4251. doi: 10.1016/j.eswa.2013.01.019. [ CrossRef ] [ Google Scholar ]
  • Mothe J, Ramiandrisoa F, Rasolomanana M (2018) Automatic keyphrase extraction using graph-based methods. In: Proceedings of the 33rd annual ACM symposium on applied computing. ACM, pp 728–730
  • Mullen T, Collier N (2004) Sentiment analysis using support vector machines with diverse information sources. In: Proceedings of the 2004 conference on empirical methods in natural language processing. ACL, pp 412–418
  • Nagarajan R, Nair S, Aruna P, Puviarasan N. Keyword extraction using graph based approach. Int J Adv Res Comput Sci Softw Eng. 2016; 6 (10):25–29. [ Google Scholar ]
  • Narayanan R, Liu B, Choudhary A (2009) Sentiment analysis of conditional sentences. In: Proceedings of the 2009 conference on empirical methods in natural language processing. ACL and AFNLP, pp 180–189
  • Nasar Z, Jaffry SW, Malik MK. Textual keyword extraction and summarization: state-of-the-art. Inf Process Manage. 2019; 56 (6):102088. doi: 10.1016/j.ipm.2019.102088. [ CrossRef ] [ Google Scholar ]
  • Nguyen TD, Kan M-Y (2007) Keyphrase extraction in scientific publications. In: International conference on Asian digital libraries. Springer, pp 317–326
  • Nguyen H, Nguyen M-L (2017) A deep neural architecture for sentence-level sentiment classification in twitter social networking. In: International conference of the Pacific Association for Computational Linguistics. Springer, pp 15–27
  • Nielsen FÅ (2011) A new anew: evaluation of a word list for sentiment analysis in microblogs. In: 1st workshop on making sense of Microposts: big things come in small packages, pp 93–98
  • Nielsen FÅ (2017) afinn project
  • O’Keefe T, Koprinska I (2009) Feature selection and weighting methods in sentiment analysis. In: Proceedings of the 14th Australasian document computing symposium, Sydney, pp 67–74
  • Oliveira N, Cortez P, Areal N (2014) Automatic creation of stock market lexicons for sentiment analysis using stocktwits data. In: Proceedings of the 18th international database engineering & applications symposium. ACM, pp 115–123
  • Ouyang Y, Guo B, Zhang J, Yu Z, Zhou X. Sentistory: multi-grained sentiment analysis and event summarization with crowdsourced social media data. Pers Ubiquit Comput. 2017; 21 (1):97–111. doi: 10.1007/s00779-016-0977-x. [ CrossRef ] [ Google Scholar ]
  • Pang B, Lee L (2004) A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. ACM
  • Pang B, Lee L, Vaithyanathan S (2002) Thumbs up? Sentiment classification using machine learning techniques. In: Proceedings of the ACL-02 conference on empirical methods in natural language processing, Morristown, NJ, USA. Association for Computational Linguistics, pp 79–86
  • Passi K, Motisariya J (2022) Twitter sentiment analysis of the 2019 indian election. In: IOT with smart systems. Springer, pp 805–814
  • Patil P, April Yalagi P. Sentiment analysis levels and techniques: a survey. Int J Innov Eng Technol. 2016; 6 :523. [ Google Scholar ]
  • Patil G, Galande V, Kekan V, Dange K. Sentiment analysis using support vector machine. Int J Innov Res Comput Commun Eng. 2014; 2 (1):2607–2612. [ Google Scholar ]
  • Pavitha N, Pungliya V, Raut A, Bhonsle R, Purohit A, Patel A, Shashidhar R (2022) Movie recommendation and sentiment analysis using machine learning. In: Global transitions proceedings. Elsevier
  • Pitogo VA, Ramos CDL (2020) Social media enabled e-participation: a lexicon-based sentiment analysis using unsupervised machine learning. In: Proceedings of the 13th international conference on theory and practice of electronic governance, pp 518–528. ACM
  • Poria S, Gelbukh A, Hussain A, Howard N, Das D, Bandyopadhyay S. Enhanced senticnet with affective labels for concept-based opinion mining. IEEE Intell Syst. 2013; 28 (2):31–38. doi: 10.1109/MIS.2013.4. [ CrossRef ] [ Google Scholar ]
  • Poria S, Cambria E, Hazarika D, Majumder N, Zadeh A, Morency L-P (2017) Context-dependent sentiment analysis in user-generated videos. In: Proceedings of the 55th annual meeting of the Association for Computational Linguistics (volume 1: Long Papers). ACL, pp 873–883
  • Prasad AG, Sanjana S, Bhat SM, Harish B (2017) Sentiment analysis for sarcasm detection on streaming short text data. In: 2017 2nd International conference on knowledge engineering and applications (ICKEA). IEEE, pp 1–5
  • Prastyo PH, Sumi AS, Dian AW, Permanasari AE. Tweets responding to the Indonesian government’s handling of covid-19: sentiment analysis using svm with normalized poly kernel. J Inf Syst Eng Bus Intell. 2020; 6 (2):112–122. doi: 10.20473/jisebi.6.2.112-122. [ CrossRef ] [ Google Scholar ]
  • Priyadarshini I, Cotton C. A novel lstm-cnn-grid search-based deep neural network for sentiment analysis. J Supercomput. 2021; 77 (12):13911–13932. doi: 10.1007/s11227-021-03838-w. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Quan C, Ren F. Unsupervised product feature extraction for feature-oriented opinion determination. Inf Sci. 2014; 272 :16–28. doi: 10.1016/j.ins.2014.02.063. [ CrossRef ] [ Google Scholar ]
  • Rabelo JC, Prudêncio RB, Barros FA (2012) Using link structure to infer opinions in social networks. In: 2012 IEEE international conference on systems, man, and cybernetics (SMC). IEEE, pp 681–685
  • Rajput A. Natural language processing, sentiment analysis, and clinical analytics. In: Lytras MD, Sarirete A, editors. Innovation in health informatics. Academic Press: Elsevier; 2020. pp. 79–97. [ Google Scholar ]
  • Ravi K, Ravi V. A survey on opinion mining and sentiment analysis: tasks, approaches and applications. Knowl-Based Syst. 2015; 89 :14–46. doi: 10.1016/j.knosys.2015.06.015. [ CrossRef ] [ Google Scholar ]
  • Ravinuthala MKV et al (2016) Thematic text graph: a text representation technique for keyword weighting in extractive summarization system. Int J Inf Eng Electron Bus 8(4)
  • Read J, Carroll J (2009) Weakly supervised techniques for domain-independent sentiment classification. In: Proceedings of the 1st international CIKM workshop on topic-sentiment analysis for mass opinion. ACM, pp 45–52
  • Ren F, Sohrab MG. Class-indexing-based term weighting for automatic text classification. Inf Sci. 2013; 236 :109–125. doi: 10.1016/j.ins.2013.02.029. [ CrossRef ] [ Google Scholar ]
  • Ren R, Wu DD, Liu T. Forecasting stock market movement direction using sentiment analysis and support vector machine. IEEE Syst J. 2018; 13 (1):760–770. doi: 10.1109/JSYST.2018.2794462. [ CrossRef ] [ Google Scholar ]
  • Reyes A, Rosso P. Making objective decisions from subjective data: detecting irony in customer reviews. Decis Support Syst. 2012; 53 (4):754–760. doi: 10.1016/j.dss.2012.05.027. [ CrossRef ] [ Google Scholar ]
  • Rui H, Liu Y, Whinston A. Whose and what chatter matters? The effect of tweets on movie sales. Decis Support Syst. 2013; 55 (4):863–870. doi: 10.1016/j.dss.2012.12.022. [ CrossRef ] [ Google Scholar ]
  • Saif H, Fernández M, He Y, Alani H (2014) On stopwords, filtering and data sparsity for sentiment analysis of Twitter, pp 810–817
  • Salari N, Shohaimi S, Najafi F, Nallappan M, Karishnarajah I. A novel hybrid classification model of genetic algorithms, modified k-nearest neighbor and developed backpropagation neural network. PLoS ONE. 2014; 9 (11):112987. doi: 10.1371/journal.pone.0112987. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Salton G, Buckley C. Term-weighting approaches in automatic text retrieval. Inf Process Manage. 1988; 24 (5):513–523. doi: 10.1016/0306-4573(88)90021-0. [ CrossRef ] [ Google Scholar ]
  • Santos G, Mota VF, Benevenuto F, Silva TH. Neutrality may matter: sentiment analysis in reviews of AIRBNB, booking, and Couchsurfing in Brazil and USA. Soc Netw Anal Min. 2020; 10 (1):1–13. doi: 10.1007/s13278-020-00656-5. [ CrossRef ] [ Google Scholar ]
  • Sarzynska-Wawer J, Wawer A, Pawlak A, Szymanowska J, Stefaniak I, Jarkiewicz M, Okruszek L. Detecting formal thought disorder by deep contextualized word representations. Psychiatry Res. 2021; 304 :114135. doi: 10.1016/j.psychres.2021.114135. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Saxena A, Reddy H, Saxena P (2022) Introduction to sentiment analysis covering basics, tools, evaluation metrics, challenges, and applications. In: Principles of social networking. Springer, pp 249–277
  • Schouten K, Frasincar F. Survey on aspect-level sentiment analysis. IEEE Trans Knowl Data Eng. 2015; 28 (3):813–830. doi: 10.1109/TKDE.2015.2485209. [ CrossRef ] [ Google Scholar ]
  • Sebastiani F, Debole F (2003) Supervised term weighting for automated text categorization. In: Proceeding the 18th ACM symposium on applied computing. ACM, pp 784–788
  • Serrano-Guerrero J, Olivas JA, Romero FP, Herrera-Viedma E. Sentiment analysis: a review and comparative analysis of web services. Inf Sci. 2015; 311 :18–38. doi: 10.1016/j.ins.2015.03.040. [ CrossRef ] [ Google Scholar ]
  • Sharma A, Dey S (2012) A comparative study of feature selection and machine learning techniques for sentiment analysis. In: Proceedings of the 2012 ACM research in applied computation symposium, pp 1–7
  • Shi W, Zheng W, Yu JX, Cheng H, Zou L. Keyphrase extraction using knowledge graphs. Data Sci Eng. 2017; 2 (4):275–288. doi: 10.1007/s41019-017-0055-z. [ CrossRef ] [ Google Scholar ]
  • Shimada K, Hashimoto D, Endo T (2009) A graph-based approach for sentiment sentence extraction. In: Pacific-Asia conference on knowledge discovery and data mining. Springer, pp 38–48
  • Sidorov G, Miranda-Jiménez S, Viveros-Jiménez F, Gelbukh A, Castro-Sánchez N, Velásquez F, Díaz-Rangel I, Suárez-Guerra S, Trevino A, Gordon J (2013) Empirical study of machine learning based approach for opinion mining in tweets. In: Mexican international conference on artificial intelligence. Springer, pp 1–14
  • Smith H, Cipolli W. The Instagram/Facebook ban on graphic self-harm imagery: a sentiment analysis and topic modeling approach. Policy Internet. 2022; 14 (1):170–185. doi: 10.1002/poi3.272. [ CrossRef ] [ Google Scholar ]
  • Sokolova M, Lapalme G (2007) Performance measures in classification of human communications. In: Conference of the Canadian Society for computational studies of intelligence. Springer, pp 159–170
  • Sokolova M, Lapalme G. A systematic analysis of performance measures for classification tasks. Inf Process Manage. 2009; 45 (4):427–437. doi: 10.1016/j.ipm.2009.03.002. [ CrossRef ] [ Google Scholar ]
  • Solangi YA, Solangi ZA, Aarain S, Abro A, Mallah GA, Shah A (2018) Review on natural language processing (NLP) and its toolkits for opinion mining and sentiment analysis. In: 2018 IEEE 5th international conference on engineering technologies and applied sciences (ICETAS). IEEE, pp 1–4
  • Soubraylu S, Rajalakshmi R. Hybrid convolutional bidirectional recurrent neural network based sentiment analysis on movie reviews. Comput Intell. 2021; 37 (2):735–757. doi: 10.1111/coin.12400. [ CrossRef ] [ Google Scholar ]
  • Sousa MG, Sakiyama K, de Souza Rodrigues L, Moraes PH, Fernandes ER, Matsubara ET (2019) Bert for stock market sentiment analysis. In: 2019 IEEE 31st international conference on tools with artificial intelligence (ICTAI). IEEE, pp 1597–1601
  • Stagner R. The cross-out technique as a method in public opinion analysis. J Soc Psychol. 1940; 11 (1):79–90. doi: 10.1080/00224545.1940.9918734. [ CrossRef ] [ Google Scholar ]
  • Staiano J, Guerini M (2014) Depechemood: a lexicon for emotion analysis from crowd-annotated news. ACL, pp 427–433. arXiv preprint arXiv:1405.1605
  • Stone PJ, Hunt EB (1963) A computer approach to content analysis: studies using the general inquirer system. In: Proceedings of the May 21-23, 1963, Spring Joint Computer Conference. ACL, pp 241–256
  • Subrahmanian VS, Reforgiato D. Ava: adjective-verb-adverb combinations for sentiment analysis. IEEE Intell Syst. 2008; 23 (4):43–50. doi: 10.1109/MIS.2008.57. [ CrossRef ] [ Google Scholar ]
  • Taboada M. Sentiment analysis: an overview from linguistics. Ann Rev Linguist. 2016; 2 :325–347. doi: 10.1146/annurev-linguistics-011415-040518. [ CrossRef ] [ Google Scholar ]
  • Tamilselvam S, Nagar S, Mishra A, Dey K (2017) Graph based sentiment aggregation using conceptnet ontology. In: Proceedings of the eighth international joint conference on natural language processing (Volume 1: Long Papers), vol 1. ACL, pp 525–535
  • Tan S, Zhang J. An empirical study of sentiment analysis for Chinese documents. Expert Syst Appl. 2008; 34 (4):2622–2629. doi: 10.1016/j.eswa.2007.05.028. [ CrossRef ] [ Google Scholar ]
  • Tan C, Lee L, Tang J, Jiang L, Zhou M, Li P (2011) User-level sentiment analysis incorporating social networks. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining. Association for Computing Machinery, pp 1397–1405
  • Tan LK-W, Na J-C, Theng Y-L, Chang K. Phrase-level sentiment polarity classification using rule-based typed dependencies and additional complex phrases consideration. J Comput Sci Technol. 2012; 27 (3):650–666. doi: 10.1007/s11390-012-1251-y. [ CrossRef ] [ Google Scholar ]
  • Tang D, Wei F, Qin B, Yang N, Liu T, Zhou M. Sentiment embeddings with applications to sentiment analysis. IEEE Trans Knowl Data Eng. 2015; 28 (2):496–509. doi: 10.1109/TKDE.2015.2489653. [ CrossRef ] [ Google Scholar ]
  • Tembhurne JV, Diwan T. Sentiment analysis in textual, visual and multimodal inputs using recurrent neural networks. Multimedia Tools Appl. 2021; 80 (5):6871–6910. doi: 10.1007/s11042-020-10037-x. [ CrossRef ] [ Google Scholar ]
  • Thelwall M, Buckley K. Topic-based sentiment analysis for the social web: the role of mood and issue-related words. J Am Soc Inform Sci Technol. 2013; 64 (8):1608–1617. doi: 10.1002/asi.22872. [ CrossRef ] [ Google Scholar ]
  • Theng Y-L (2004) Design and usability of digital libraries: case studies in the Asia Pacific: case studies in the Asia Pacific. IGI Global, pp 129–152
  • Tian Y, Chen G, Song Y (2021) Aspect-based sentiment analysis with type-aware graph convolutional networks and layer ensemble. In: Proceedings of the 2021 conference of the North American chapter of the Association for Computational Linguistics: Human Language Technologies. ACL, pp 2910–2922
  • Titov I, McDonald R (2008) A joint model of text and aspect ratings for sentiment summarization. In: Proceedings of ACL-08: HLT. ACL, pp 308–316
  • Trinh S, Nguyen L, Vo M (2018) Combining lexicon-based and learning-based methods for sentiment analysis for product reviews in Vietnamese language. In: International conference on computer and information science. Springer, pp 57–75
  • Tripathy A, Agrawal A, Rath SK. Classification of sentimental reviews using machine learning techniques. Procedia Comput Sci. 2015; 57 :821–829. doi: 10.1016/j.procs.2015.07.523. [ CrossRef ] [ Google Scholar ]
  • Tsai AC-R, Wu C-E, Tsai RT-H, Hsu JY. Building a concept-level sentiment dictionary based on commonsense knowledge. IEEE Intell Syst. 2013; 28 (2):22–30. doi: 10.1109/MIS.2013.25. [ CrossRef ] [ Google Scholar ]
  • Turney PD (2002) Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In: Proceedings of the 40th annual meeting of the Association for Computational Linguistics, pp 417–424
  • Vakali A, Chatzakou D, Koutsonikola V, Andreadis G (2013) Social data sentiment analysis in smart environments-extending dual polarities for crowd pulse capturing. In: International conference on data management technologies and applications, vol 2. SCITEPRESS, pp 175–182
  • Valakunde N, Patwardhan M (2013) Multi-aspect and multi-class based document sentiment analysis of educational data catering accreditation process. In: 2013 International conference on cloud & ubiquitous computing & emerging technologies. IEEE, pp 188–192
  • Valdivia A, Luzíón MV, Herrera F (2017) Neutrality in the sentiment analysis problem based on fuzzy majority. In: 2017 IEEE international conference on fuzzy systems (FUZZ-IEEE). IEEE, pp 1–6
  • Valdivia A, Luzón MV, Cambria E, Herrera F. Consensus vote models for detecting and filtering neutrality in sentiment analysis. Inf Fusion. 2018; 44 :126–135. doi: 10.1016/j.inffus.2018.03.007. [ CrossRef ] [ Google Scholar ]
  • Vega-Oliveros DA, Gomes PS, Milios EE, Berton L. A multi-centrality index for graph-based keyword extraction. Inf Process Manag. 2019; 56 (6):102063. doi: 10.1016/j.ipm.2019.102063. [ CrossRef ] [ Google Scholar ]
  • Verma S. Sentiment analysis of public services for smart society: literature review and future research directions. Gov Inf Q. 2022; 39 :101708. doi: 10.1016/j.giq.2022.101708. [ CrossRef ] [ Google Scholar ]
  • Vidanagama D, Silva A, Karunananda A. Ontology based sentiment analysis for fake review detection. Expert Syst Appl. 2022; 206 :117869. doi: 10.1016/j.eswa.2022.117869. [ CrossRef ] [ Google Scholar ]
  • Wakade S, Shekar C, Liszka KJ, Chan C-C (2012) Text mining for sentiment analysis of twitter data. In: Proceedings of the international conference on information and knowledge engineering (IKE). The Steering Committee of The World Congress in Computer Science, Computer  … , pp 1–6
  • Wang Z, Joo V, Tong C, Xin X, Chin HC (2014) Anomaly detection through enhanced sentiment analysis on social media data. In: 2014 IEEE 6th international conference on cloud computing technology and science. IEEE, pp 917–922
  • Wang T, Cai Y, Leung H-f, Cai Z, Min H (2015) Entropy-based term weighting schemes for text categorization in VSM. In: 2015 IEEE 27th international conference on tools with artificial intelligence (ICTAI). IEEE, pp 325–332
  • Wang J, Li C, Xia C. Improved centrality indicators to characterize the nodal spreading capability in complex networks. Appl Math Comput. 2018; 334 :388–400. [ Google Scholar ]
  • Wang Z, Ho S-B, Cambria E. Multi-level fine-scaled sentiment sensing with ambivalence handling. Int J Uncertain Fuzz Knowl-Based Syst. 2020; 28 (04):683–697. doi: 10.1142/S0218488520500294. [ CrossRef ] [ Google Scholar ]
  • Wang X, Li F, Zhang Z, Xu G, Zhang J, Sun X. A unified position-aware convolutional neural network for aspect based sentiment analysis. Neurocomputing. 2021; 450 :91–103. doi: 10.1016/j.neucom.2021.03.092. [ CrossRef ] [ Google Scholar ]
  • Wang J, Zhang Y, Yu L-C, Zhang X. Contextual sentiment embeddings via bi-directional GRU language model. Knowl-Based Syst. 2022; 235 :107663. doi: 10.1016/j.knosys.2021.107663. [ CrossRef ] [ Google Scholar ]
  • Wang Z, Gao P, Chu X. Sentiment analysis from customer-generated online videos on product review using topic modeling and multi-attention BLSTM. Adv Eng Inform. 2022; 52 :101588. doi: 10.1016/j.aei.2022.101588. [ CrossRef ] [ Google Scholar ]
  • Wankhade M, Rao ACS, Kulkarni C (2022) A survey on sentiment analysis methods, applications, and challenges. Artif Intell Rev 1–50
  • Wawre SV, Deshmukh SN. Sentiment classification using machine learning techniques. IJSR. 2016; 5 (4):819–821. doi: 10.21275/v5i4.NOV162724. [ CrossRef ] [ Google Scholar ]
  • Wiebe JM (1990) Recognizing subjective sentences: a computational investigation of narrative text. State University of New York at Buffalo
  • Wiebe J, Mihalcea R (2006) Word sense and subjectivity. In: Proceedings of the 21st International conference on computational linguistics and 44th annual meeting of the Association for Computational Linguistics, pp 1065–1072
  • Wilson T, Wiebe J (2003) Annotating opinions in the world press. In: Proceedings of the fourth SIGdial workshop of discourse and dialogue. ACL, pp 13–22
  • Wilson T, Wiebe J, Hoffmann P (2005) Recognizing contextual polarity in phrase-level sentiment analysis. In: Proceedings of human language technology conference and conference on empirical methods in natural language processing. ACL, pp 347–354
  • Wu Y, Zhang Q, Huang X-J, Wu L (2011) Structural opinion mining for graph-based sentiment representation. In: Proceedings of the 2011 conference on empirical methods in natural language processing. ACL, pp 1332–1341
  • Xia R, Zong C, Li S. Ensemble of feature sets and classification algorithms for sentiment classification. Inf Sci. 2011; 181 (6):1138–1152. doi: 10.1016/j.ins.2010.11.023. [ CrossRef ] [ Google Scholar ]
  • Yadav A, Vishwakarma DK. Sentiment analysis using deep learning architectures: a review. Artif Intell Rev. 2020; 53 (6):4335–4385. doi: 10.1007/s10462-019-09794-5. [ CrossRef ] [ Google Scholar ]
  • Yadav CS, Sharan A, Joshi ML (2014) Semantic graph based approach for text mining. In: 2014 international conference on issues and challenges in intelligent computing techniques (ICICT). IEEE, pp 596–601
  • Yan X, Huang T (2015) Tibetan sentence sentiment analysis based on the maximum entropy model. In: 2015 10th international conference on broadband and wireless computing, communication and applications (BWCCA). IEEE, pp 594–597
  • Yan Z, Xing M, Zhang D, Ma B. Exprs: an extended pagerank method for product feature extraction from online consumer reviews. Inf Manag. 2015; 52 (7):850–858. doi: 10.1016/j.im.2015.02.002. [ CrossRef ] [ Google Scholar ]
  • Yavari A, Hassanpour H, Rahimpour Cami B, Mahdavi M. Election prediction based on sentiment analysis using twitter data. Int J Eng. 2022; 35 (2):372–379. doi: 10.5829/IJE.2022.35.02B.13. [ CrossRef ] [ Google Scholar ]
  • Ye Q, Zhang Z, Law R. Sentiment classification of online reviews to travel destinations by supervised machine learning approaches. Expert Syst Appl. 2009; 36 (3):6527–6535. doi: 10.1016/j.eswa.2008.07.035. [ CrossRef ] [ Google Scholar ]
  • Yenter A, Verma A (2017) Deep cnn-lstm with combined kernels from multiple branches for imdb review sentiment analysis. In: 2017 IEEE 8th annual ubiquitous computing, electronics and mobile communication conference (UEMCON). IEEE, pp 540–546
  • Yu B, Zhang S (2022) A novel weight-oriented graph convolutional network for aspect-based sentiment analysis. J Supercomput 1–26
  • Yu X, Liu Y, Huang X, An A. Mining online reviews for predicting sales performance: a case study in the movie domain. IEEE Trans Knowl Data Eng. 2010; 24 (4):720–734. doi: 10.1109/TKDE.2010.269. [ CrossRef ] [ Google Scholar ]
  • Yu Y, Duan W, Cao Q. The impact of social and conventional media on firm equity value: a sentiment analysis approach. Decis Support Syst. 2013; 55 (4):919–926. doi: 10.1016/j.dss.2012.12.028. [ CrossRef ] [ Google Scholar ]
  • Zad S, Heidari M, Jones JH, Uzuner O (2021) A survey on concept-level sentiment analysis techniques of textual data. In: 2021 IEEE World AI IoT Congress (AIIoT). IEEE, pp 0285–0291
  • Zainuddin N, Selamat A (2014) Sentiment analysis using support vector machine. In: 2014 international conference on computer, communications, and control technology (I4CT), pp 333–337. IEEE
  • Zhan J, Loh HT, Liu Y. Gather customer concerns from online product reviews-a text summarization approach. Expert Syst Appl. 2009; 36 (2):2107–2115. doi: 10.1016/j.eswa.2007.12.039. [ CrossRef ] [ Google Scholar ]
  • Zhang K, Xu H, Tang J, Li J (2006) Keyword extraction using support vector machine. In: International conference on web-age information management. Springer, pp 85–96
  • Zhang L, Ghosh R, Dekhil M, Hsu M, Liu B (2011) Combining lexicon-based and learning-based methods for twitter sentiment analysis. HP Laboratories, Technical Report HPL-2011 89, 1–8. HP Laboratories
  • Zhang W, Xu H, Wan W. Weakness finder: find product weakness from Chinese reviews by using aspects based sentiment analysis. Expert Syst Appl. 2012; 39 (11):10283–10291. doi: 10.1016/j.eswa.2012.02.166. [ CrossRef ] [ Google Scholar ]
  • Zhang H, Gan W, Jiang B (2014) Machine learning and lexicon based methods for sentiment classification: a survey. In: 2014 11th web information system and application conference. IEEE, pp 262–265
  • Zhang Y, Zhou Y, Yao J (2020) Feature extraction with tf-idf and game-theoretic shadowed sets. In: International conference on information processing and management of uncertainty in knowledge-based systems, pp 722–733. Springer
  • Zhang Q, Yi GY, Chen L-P, He W (2021) Text mining and sentiment analysis of covid-19 tweets. arXiv preprint arXiv:2106.15354
  • Zhang K, Zhang K, Zhang M, Zhao H, Liu Q, Wu W, Chen E (2022) Incorporating dynamic semantics into pre-trained language model for aspect-based sentiment analysis. In: Findings of the Association for Computational Linguistics: ACL 2022. ACL, pp 3599–3610
  • Zhao WX, Jiang J, He J, Song Y, Achanauparp P, Lim E-P, Li X (2011) Topical keyphrase extraction from twitter. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies, pp 379–388
  • Zhao Z, Tang M, Tang W, Wang C, Chen X. Graph convolutional network with multiple weight mechanisms for aspect-based sentiment analysis. Neurocomputing. 2022; 500 :124–134. doi: 10.1016/j.neucom.2022.05.045. [ CrossRef ] [ Google Scholar ]
  • Zhou J, Tian J, Wang R, Wu Y, Xiao W, He L (2020) Sentix: a sentiment-aware pre-trained model for cross-domain sentiment analysis. In: Proceedings of the 28th international conference on computational linguistics. ACL, pp 568–579
  • Zhu X, Kiritchenko S, Mohammad S (2014) Nrc-canada-2014: recent improvements in the sentiment analysis of tweets. In: Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014). ACL, pp 443–447
  • Zhu X, Zhu L, Guo J, Liang S, Dietze S. Gl-gcn: Global and local dependency guided graph convolutional networks for aspect-based sentiment classification. Expert Syst Appl. 2021; 186 :115712. doi: 10.1016/j.eswa.2021.115712. [ CrossRef ] [ Google Scholar ]
  • Zhu H, Zheng Z, Soleymani M, Nevatia R (2022) Self-supervised learning for sentiment analysis via image-text matching. In: ICASSP 2022–2022 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 1710–1714
  • Zin HM, Mustapha N, Murad MAA, Sharef NM (2017) The effects of pre-processing strategies in sentiment analysis of online movie reviews. In: AIP conference proceedings, vol 1891. AIP Publishing LLC, p. 020089

Sentiment Analysis – Tools, Techniques and Examples

Sentiment Analysis

Sentiment analysis, also referred to as opinion mining, is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information.

In other words, it’s the process of determining the emotional tone or subjective opinion within large amounts of text. This could be used on social media posts, customer reviews, or any other text data where people are expressing their opinions or feelings.

Sentiment analysis can classify text as being positive, negative, or neutral. More advanced sentiment analysis methods can also categorize text into more specific emotional states like “happy,” “frustrated,” “excited,” etc.

Sentiment Analysis Methods

Sentiment analysis methods can be broadly divided into three categories: rule-based, machine learning-based, and hybrid.

Rule-based methods

This approach uses a set of manually crafted rules to identify sentiment. This often involves creating or using a sentiment lexicon—a list of words and phrases each assigned a sentiment score (positive, negative, or neutral). The overall sentiment of a text is then determined based on the scores of the individual words or phrases it contains. Rule-based methods might also take into account more complex linguistic features, like negations (“not good”) and intensifiers (“very good”).
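
The rule-based idea above can be sketched in a few lines of Python. This is a minimal illustration, not a production lexicon: the word lists, the scores, and the two-token look-back window are all assumptions made up for the example.

```python
import re

# Toy lexicon and modifier lists: illustrative assumptions, not a published resource.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATIONS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}

def lexicon_score(text: str) -> float:
    """Sum word scores, flipping after a negation and scaling after an intensifier."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        score = LEXICON[tok]
        # Look back at up to two preceding tokens for modifiers (assumed window size).
        for prev in tokens[max(0, i - 2):i]:
            if prev in INTENSIFIERS:
                score *= INTENSIFIERS[prev]
            elif prev in NEGATIONS:
                score = -score
        total += score
    return total

def label(text: str) -> str:
    """Map the numeric score to a positive / negative / neutral label."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With this toy lexicon, "very good" scores higher than "good", and "not good" comes out negative, which is the behaviour the negation and intensifier rules are meant to capture.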

Machine learning-based methods

These methods involve training a machine learning model on a dataset of text where each piece of text is labeled with its sentiment. The model learns to associate features of the text (like the words it contains, the order of the words, etc.) with the sentiment. When given new, unlabeled text, it can then predict the sentiment based on these learned associations. The machine learning model used could be a traditional algorithm like Naive Bayes, Support Vector Machines, or a Decision Tree, or a more complex neural network model like a Recurrent Neural Network (RNN), Convolutional Neural Network (CNN), or a transformer model like BERT or GPT.
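
As a rough sketch of the learning-based approach, the following trains a small multinomial Naive Bayes classifier in plain Python. The four training sentences and the `pos`/`neg` labels are invented for illustration; a real system would use a much larger labeled dataset and usually an established library.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.word_counts = defaultdict(Counter)      # word frequencies per class
        for text, lab in zip(texts, labels):
            self.word_counts[lab].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        n_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for lab, n in self.class_counts.items():
            logp = math.log(n / n_docs)              # class prior
            total = sum(self.word_counts[lab].values())
            for w in tokenize(text):
                if w not in self.vocab:
                    continue                         # skip words never seen in training
                logp += math.log((self.word_counts[lab][w] + 1) / (total + len(self.vocab)))
            if logp > best_logp:
                best_label, best_logp = lab, logp
        return best_label

# Hypothetical labeled data, made up for this example.
train_texts = ["great phone love it", "love the screen",
               "terrible battery", "hate the terrible screen"]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_texts, train_labels)
```

Calling `model.predict("love it")` picks the class whose training texts make those words most likely; the same pattern (fit on labeled text, predict on new text) applies to the other algorithms mentioned above.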

Hybrid methods

As the name suggests, hybrid methods combine rule-based and machine-learning-based approaches. They might use a rule-based method to generate features that are fed into a machine-learning model, or use a machine-learning model to predict sentiment, which is then refined using a set of rules. The idea is to get the best of both worlds: the linguistic knowledge encapsulated in the rules, and the ability of the machine learning model to learn complex patterns in the data.
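
One way to picture the first hybrid variant (rules generate features, a model learns from them) is the sketch below. The word lists, the negation interaction feature, and the choice of a perceptron are all assumptions for illustration; any linear classifier could take its place.

```python
POSITIVE = {"good", "great", "love"}   # toy lexicon, assumed for the example
NEGATIVE = {"bad", "poor", "hate"}
NEGATIONS = {"not", "never"}

def rule_features(text):
    """Rule-based step: lexicon counts plus a negation interaction term."""
    tokens = text.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    negated = 1.0 if any(t in NEGATIONS for t in tokens) else 0.0
    # Features: bias, positive count, negative count, polarity under negation.
    return [1.0, float(pos), float(neg), negated * (pos - neg)]

def train_perceptron(examples, epochs=10):
    """Learning step: fit a linear model on the rule-derived features."""
    weights = [0.0] * 4
    for _ in range(epochs):
        for text, target in examples:              # target is +1 or -1
            x = rule_features(text)
            score = sum(w * xi for w, xi in zip(weights, x))
            if target * score <= 0:                # misclassified: move weights toward target
                weights = [w + target * xi for w, xi in zip(weights, x)]
    return weights

def predict(weights, text):
    score = sum(w * xi for w, xi in zip(weights, rule_features(text)))
    return "positive" if score > 0 else "negative"

# Hypothetical training pairs, made up for this example.
EXAMPLES = [("great phone", 1), ("bad screen", -1), ("not good", -1), ("not bad", 1)]
WEIGHTS = train_perceptron(EXAMPLES)
```

Here the lexicon supplies the linguistic knowledge, while the learned weight on the negation feature decides how strongly a negation should flip polarity.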

How does sentiment analysis work?

The overall process can be divided into several steps:

  • Data Collection : The first step involves collecting data for sentiment analysis. This could be social media posts, customer reviews, survey responses, or any other text data where people are expressing their opinions or feelings.
  • Text Preprocessing : The collected raw text data often needs to be cleaned and standardized before analysis. This could involve removing irrelevant data (like HTML tags or URLs), converting all text to lowercase, correcting spelling mistakes, removing stop words (commonly used words like “is”, “and”, “the”, etc. that don’t carry much meaning), and other techniques to make the data more uniform. This stage often also involves tokenization (breaking the text into individual words or tokens), and sometimes lemmatization (reducing words to their base or root form).
  • Feature Extraction : In this stage, meaningful features are extracted from the preprocessed text data. The simplest approach is the bag-of-words model, where the text is represented as a multiset (a “bag”) of its words, disregarding grammar and word order but keeping the frequency of each word. More complex approaches could involve word embeddings (where each word is represented as a vector in a multi-dimensional space), or even sentence or paragraph embeddings.
  • Model Training (Machine Learning Approach) : If using a machine learning approach, the next step is to use the features extracted from the text to train a model. The model learns to associate certain features with positive, negative, or neutral sentiments based on the training data. Different algorithms could be used in this stage, including logistic regression, support vector machines, decision trees, or even deep learning models like recurrent neural networks or transformers.
  • Sentiment Classification (Rule-Based Approach) : If using a rule-based approach, instead of training a model, a set of manually crafted rules is used to determine the sentiment of the text. For example, the text might be classified as positive if it contains more positive words than negative ones.
  • Evaluation : The final step involves evaluating the performance of the sentiment analysis system, usually by comparing its predictions to a set of manually labeled data.
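
The preprocessing, feature extraction, and evaluation steps above can be sketched as follows. The stop-word list is deliberately tiny and the regular expressions are simplified assumptions; real pipelines typically rely on a dedicated NLP library for these steps.

```python
import re
from collections import Counter

STOP_WORDS = {"is", "and", "the", "a", "an", "of", "to"}  # tiny illustrative list

def preprocess(text):
    """Clean raw text and return a list of lowercase tokens."""
    text = re.sub(r"<[^>]+>", " ", text)        # strip HTML tags
    text = re.sub(r"https?://\S+", " ", text)   # strip URLs
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def bag_of_words(tokens):
    """Represent a document by its word frequencies, ignoring order."""
    return Counter(tokens)

def accuracy(predicted, gold):
    """Share of predictions that match the manually assigned labels."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)
```

For example, `preprocess("<p>The battery is GREAT!</p> See https://example.com")` drops the tags, the URL, and the stop words before tokenizing, and `accuracy` compares a system's labels with a hand-labeled set as described in the evaluation step.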

Sentiment Analysis Techniques

Common sentiment analysis techniques include the following:

Lexicon-based Sentiment Analysis

This is a simple and straightforward technique in which the sentiment of a text is determined by the words it contains. A sentiment lexicon is a list of lexical features (e.g., words) which are labeled according to their semantic orientation as either positive, negative, or neutral. In this technique, the sentiment of a text is calculated by identifying the sentiment words and the way they’re combined.

Machine Learning-based Sentiment Analysis

This technique uses machine learning algorithms to classify text as positive, negative, or neutral. This is usually done by training a model on a pre-labeled dataset and then using this model to classify new, unseen data. Algorithms used could be traditional ones like Naive Bayes, Support Vector Machines, Decision Trees, or more complex methods like Neural Networks and Deep Learning techniques (e.g., Convolutional Neural Networks, Recurrent Neural Networks, or Transformer-based models like BERT and GPT).

Hybrid Sentiment Analysis

This method combines the lexicon and machine learning-based approaches. For example, it might use a lexicon-based approach to help label data for machine learning, or use machine learning to automate and improve lexicon-based sentiment analysis.

Aspect-Based Sentiment Analysis (ABSA)

In ABSA, the goal is not only to understand the sentiment of the text but also to understand the specific aspects or features that the sentiment is associated with. For example, in a product review, the user might express positive sentiment about the battery life of a phone (aspect: battery life) but negative sentiment about its weight (aspect: weight).
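
A very rough way to illustrate the idea is to split a review into clauses and score each clause that mentions a known aspect. The aspect list, the opinion lexicon, and the clause-splitting rule below are assumptions chosen for this example; practical ABSA systems learn aspects and opinions from data instead.

```python
import re

# Hypothetical opinion lexicon and aspect list for a phone-review domain.
OPINION_LEXICON = {"great": 1, "excellent": 1, "love": 1,
                   "heavy": -1, "poor": -1, "disappointing": -1}
ASPECTS = ["battery life", "weight", "screen", "camera"]

def aspect_sentiments(review):
    """Return {aspect: label} using a clause-level lexicon score."""
    results = {}
    # Split on punctuation and on the contrastive conjunction "but".
    clauses = re.split(r"[,.;]|\bbut\b", review.lower())
    for clause in clauses:
        words = re.findall(r"[a-z']+", clause)
        score = sum(OPINION_LEXICON.get(w, 0) for w in words)
        for aspect in ASPECTS:
            if aspect in clause:
                if score > 0:
                    results[aspect] = "positive"
                elif score < 0:
                    results[aspect] = "negative"
                else:
                    results[aspect] = "neutral"
    return results
```

For the review "The battery life is great, but the weight is disappointing.", this assigns a positive label to battery life and a negative label to weight, mirroring the example in the paragraph above.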

Emotion Detection

This goes beyond basic sentiment analysis and aims to detect specific emotions expressed in a text, such as happiness, anger, sadness, etc. This could be done using emotion-specific lexicons or more complex machine learning models.

Social Media Sentiment Analysis

Social media platforms like Twitter and Facebook provide a rich source of text for sentiment analysis. Special techniques might be needed to handle the short, informal, and often misspelled or abbreviated text common on these platforms.

Multilingual Sentiment Analysis

This is the application of sentiment analysis techniques to text in multiple languages. This often requires language-specific resources like lexicons and labeled data, as well as techniques for handling translation and cultural differences in how sentiment is expressed.

Sarcasm Detection

This is a particularly challenging area of sentiment analysis, as sarcastic comments often say the opposite of what they mean, making them difficult to interpret correctly. Techniques for sarcasm detection often rely on context and common patterns in the way sarcasm is used.

Sentiment Analysis Tools

There are numerous sentiment analysis tools and libraries available that cater to different needs. Some are designed for researchers and data scientists and require programming skills, while others are commercial platforms designed for businesses. Here are a few examples:

  • Natural Language Toolkit (NLTK) : A popular Python library for natural language processing. It includes functionality for sentiment analysis, along with many other NLP tasks.
  • TextBlob : Another Python library that provides a simple API for diving into common NLP tasks such as part-of-speech tagging, noun phrase extraction, and sentiment analysis.
  • VADER Sentiment Analysis : VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that’s specifically attuned to sentiments expressed in social media.
  • Stanford CoreNLP : A Java-based toolkit providing various NLP tools that support many languages. It includes a sentiment analysis tool.
  • spaCy : A Python NLP library that can be extended with separate machine learning models, allowing more complex sentiment analysis.
  • Google Cloud Natural Language API : A cloud-based tool that uses machine learning to analyze text. It provides sentiment analysis, entity analysis, entity sentiment analysis, and more.
  • IBM Watson Tone Analyzer : It provides sentiment analysis and also detects seven tones in written text: anger, fear, joy, sadness, confident, analytical, and tentative.
  • Microsoft Azure Text Analytics API : Part of Azure’s Cognitive Services, it provides sentiment analysis, key phrase extraction, and language detection.
  • Amazon Comprehend : This is a natural language processing (NLP) service that uses machine learning to find insights and relationships in text. It provides sentiment analysis functionality and supports multiple languages.
  • MonkeyLearn : An AI platform that allows you to classify and extract actionable data from raw text. It has pre-trained models for sentiment analysis and also allows you to train custom models.

What are the challenges in sentiment analysis?

Sentiment analysis, while a powerful tool, is not without its challenges. Some of the main challenges include:

  • Sarcasm and Irony : Sarcasm and irony involve saying something but meaning the opposite, which can be very difficult for sentiment analysis tools to correctly interpret. For example, a statement like “Oh great, just what I needed” might be labeled as positive by a sentiment analysis tool, while a human would recognize the sarcasm and label it as negative.
  • Contextual Ambiguity : The sentiment of a word or phrase can depend heavily on its context. For example, “unpredictable” might be negative when used to describe a car’s handling but positive when used to describe a book’s plot.
  • Domain-Specific Language : Different fields or industries might use language in unique ways, including jargon and slang. A word that’s positive in one domain might be negative in another, and a sentiment analysis tool trained on general language data might not be able to accurately analyze text from a specific domain.
  • Negations and Double Negatives : Phrases with negations or double negatives can be tricky for sentiment analysis. For example, “not bad” is a positive sentiment, and “not uninteresting” is also generally positive.
  • Emotionally Complex Statements : Sentiment analysis often classifies text as positive, negative, or neutral, but human emotions are more complex. A text could contain multiple emotions, or emotions that don’t fit neatly into the positive-negative scale, and a simple sentiment analysis might not capture this complexity.
  • Language and Cultural Differences : Sentiment can be expressed differently in different languages and cultures. For example, the same phrase translated into different languages might have different sentiment due to cultural nuances. A sentiment analysis tool trained on English-language data might not work as well on other languages.
  • Lack of Labeled Data : Machine learning-based sentiment analysis tools require large amounts of labeled data for training, and it can be time-consuming and expensive to create this data. There might also be a lack of labeled data in specific domains or languages.

Applications of Sentiment Analysis

Sentiment analysis is used in a variety of fields and for a wide range of applications, leveraging the fact that much of our communication and expression is now in digital text form. Here are some notable applications:

  • Business and Customer Insights : Companies can use sentiment analysis to monitor customer reviews of their products and services on different platforms to understand what their customers like or dislike. This can guide improvements and innovation.
  • Social Media Monitoring : Sentiment analysis can help monitor social media platforms to understand public sentiment about a brand, a product, or a service. This can also help in crisis management, as spikes in negative sentiment can be early indicators of a problem.
  • Market Research and Analysis : By gauging public sentiment on social media and other online platforms, companies can gain insights into market trends and consumer behaviors, helping them strategically plan their marketing efforts.
  • Political Campaigns and Polls : Politicians and political parties can use sentiment analysis to understand public opinion about them or about key issues, allowing them to adjust their campaigns or policies accordingly.
  • Financial Market Analysis : Some traders use sentiment analysis to predict market trends. For example, negative sentiments from company reports, financial news, or social media discussions could potentially signal a fall in stock prices.
  • Healthcare and Public Health : Sentiment analysis can be used to understand public sentiment about health interventions, disease outbreaks, or health behaviors, which can inform public health efforts.
  • Product Analytics : Sentiment analysis can be used to analyze user reviews and feedback about software products. It can help to identify common pain points or highly appreciated features, guiding product development.
  • Human Resources and Employee Feedback : Sentiment analysis can be used to analyze employee feedback or comments, helping HR identify common themes, improve employee satisfaction, and reduce churn.
  • Entertainment Industry : Sentiment analysis can be used to gauge public opinion about movies, music, games, and other entertainment products. For example, movie producers can use sentiment analysis to predict how well a movie will be received.
  • Automated Customer Service : Sentiment analysis can be used in chatbots and other automated systems to detect the sentiment of user inputs and adjust responses accordingly.

Advantages of Sentiment Analysis

Sentiment analysis offers several key benefits, especially in our digitally connected world where vast amounts of textual data are generated every day. Here are some of the primary advantages:

  • Customer Insights : Sentiment analysis allows companies to gain a deeper understanding of their customers’ perceptions, opinions, and feelings about their products or services. This information can inform business strategies, guide product improvements, and enhance overall customer experience.
  • Brand Monitoring : Companies can use sentiment analysis to keep track of their brand’s reputation in real time. By analyzing sentiments in social media posts, reviews, and comments, companies can detect shifts in public opinion and respond proactively.
  • Competitive Analysis : By applying sentiment analysis on social media conversations or product reviews related to competitors, companies can gain insights into strengths and weaknesses of competitors’ offerings, helping in strategic decision-making.
  • Crisis Management : Sentiment analysis can help identify negative sentiment in real time, which can act as an early warning system for crises or issues that need immediate attention.
  • Market Research : Sentiment analysis can be used to gauge public opinion on a large scale, which is invaluable for market research. Companies can get insights into consumer reactions towards product launches, marketing campaigns, or events.
  • Improved Customer Service : By integrating sentiment analysis in customer service, companies can prioritize responses based on sentiment scores. Customers with negative sentiments can be prioritized to improve their experience and mitigate potential churn.
  • Efficient and Scalable : Manual analysis of textual data can be incredibly time-consuming, particularly when dealing with large volumes of data. Sentiment analysis automates this process, making it more efficient and scalable.
  • Enhanced Employee Feedback Analysis : Organizations can use sentiment analysis to understand employee feedback, identify areas of improvement, and foster a better workplace environment.
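
The prioritization idea mentioned under Improved Customer Service can be sketched very simply: once each message has a sentiment score, the queue is sorted so the most negative messages come first. The ticket ids and scores below are hypothetical, and the convention that scores range from -1 (very negative) to 1 (very positive) is an assumption.

```python
def prioritize(tickets):
    """Return ticket ids ordered from most negative to most positive score."""
    return sorted(tickets, key=tickets.get)

# Hypothetical sentiment scores per support ticket.
queue = prioritize({"T1": 0.6, "T2": -0.8, "T3": -0.1})
```

In this made-up queue, ticket T2 would be handled first because it carries the most negative score.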

Disadvantages of Sentiment Analysis

Despite its many advantages, sentiment analysis also has its limitations. Here are some of the key disadvantages or challenges:

  • Difficulty with Sarcasm and Irony : Automated sentiment analysis systems can struggle with understanding sarcasm and irony, which often involve saying one thing but meaning the opposite. This can lead to misinterpretation of sentiments.
  • Understanding Context : The sentiment value of certain phrases may change based on the context in which they’re used. Sentiment analysis algorithms may find it challenging to correctly interpret such context-dependent phrases.
  • Handling Negations : Sentiment analysis systems might struggle with phrases that include negations. For example, the phrase “not great” is negative, but a simplistic sentiment analysis algorithm might interpret it as positive because it contains the word “great”.
  • Lack of Nuance : Most sentiment analysis tools categorize text into positive, negative, or neutral sentiments. Human emotions, however, are far more complex and nuanced. As a result, such tools might oversimplify the sentiment.
  • Cultural and Linguistic Differences : Sentiment analysis tools might struggle with slang, idioms, and language-specific expressions. In addition, they might not account for cultural differences in expressing sentiments.
  • Need for Large Amounts of Labeled Data : Machine learning-based sentiment analysis tools require large amounts of labeled data for training. Creating this data can be time-consuming and expensive, and there might be a lack of labeled data in specific domains or languages.
  • Accuracy : The accuracy of sentiment analysis can vary depending on the complexity of the text and the quality of the tool used. Misinterpretations can lead to misleading conclusions.
  • Spam and Bots : In areas like social media analysis, it’s often difficult to distinguish between genuine user content and content generated by spam or bots. This can influence the sentiment analysis results.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer


  • Open access
  • Published: 25 June 2024

Unifying aspect-based sentiment analysis BERT and multi-layered graph convolutional networks for comprehensive sentiment dissection

Kamran Aziz, Donghong Ji, Prasun Chakrabarti, Tulika Chakrabarti, Muhammad Shahid Iqbal & Rashid Abbasi

Scientific Reports volume 14, Article number: 14646 (2024)

  • Data mining
  • Machine learning

Aspect-Based Sentiment Analysis (ABSA) represents a fine-grained approach to sentiment analysis, aiming to pinpoint and evaluate sentiments associated with specific aspects within a text. ABSA encompasses a set of sub-tasks that together facilitate a detailed understanding of the multifaceted sentiment expressions. These tasks include aspect and opinion terms extraction (ATE and OTE), classification of sentiment at the aspect level (ALSC), the coupling of aspect and opinion terms extraction (AOE and AOPE), and the challenging integration of these elements into sentiment triplets (ASTE). Our research introduces a comprehensive framework capable of addressing the entire gamut of ABSA sub-tasks. This framework leverages the contextual strengths of BERT for nuanced language comprehension and employs a biaffine attention mechanism for the precise delineation of word relationships. To address the relational complexity inherent in ABSA, we incorporate a Multi-Layered Enhanced Graph Convolutional Network (MLEGCN) that utilizes advanced linguistic features to refine the model’s interpretive capabilities. We also introduce a systematic refinement approach within MLEGCN to enhance word-pair representations, which leverages the implicit outcomes of aspect and opinion extractions to ascertain the compatibility of word pairs. We conduct extensive experiments on benchmark datasets, where our model significantly outperforms existing approaches. Our contributions establish a new paradigm for sentiment analysis, offering a robust tool for the nuanced extraction of sentiment information across diverse text corpora. This work is anticipated to have significant implications for the advancement of sentiment analysis technology, providing deeper insights into consumer preferences and opinions for a wide range of applications.

Introduction

Aspect-Based Sentiment Analysis (ABSA) is a granular approach to parsing sentiment in text, focusing on the specific aspects or features discussed and the sentiment directed towards them [1, 2, 3, 4]. It begins with aspect term extraction (ATE), which identifies the nouns or phrases that are the focal points of sentiment within the text [5, 6, 7]. Opinion term extraction (OTE) then locates the adjectives or adverbs that express feelings or attitudes towards these aspects [8, 9, 10]. Moving beyond identification, aspect-level sentiment classification (ALSC) categorizes the sentiment towards each aspect as positive, negative, or neutral [11, 12, 13]. Aspect-oriented Opinion Extraction (AOE) then associates these sentiments with the corresponding aspects [14, 15], while Aspect Extraction and Sentiment Classification (AESC) combines ATE and ALSC for efficiency [16]. Aspect-Opinion Pair Extraction (AOPE) pairs each aspect with its qualifying opinion, and the most integrative task, Aspect Sentiment Triplet Extraction (ASTE), combines aspects, opinions, and sentiments into a comprehensive triplet for each aspect mentioned [17, 18, 19, 20]. Figure 1 presents an example sentence annotated with universal dependencies and part-of-speech tags, while Table 1 displays the outcomes of the various ABSA sub-tasks for this particular review.

Figure 1: Universal dependency and part-of-speech tagging for the given example.

ABSA intricately mines text data to identify sentiments toward the specific aspects mentioned within it [21, 22]. In evaluating a smartphone review like “The camera delivers stunning images, but the battery life is quite disappointing,” ABSA performs a series of sub-tasks. ATE identifies the features under scrutiny, “camera” and “battery life”; OTE captures the corresponding evaluative terms “stunning” and “disappointing”; Aspect-Level Sentiment Classification (ALSC) assigns sentiments, labeling the camera as positive and the battery life as negative. AOE links these sentiments to their respective aspects, crafting a direct association between “stunning” and “camera” and between “disappointing” and “battery life”. Aspect Extraction and Sentiment Classification (AESC) streamlines the process by tagging “camera” with a positive sentiment and “battery life” with a negative sentiment in one step. AOPE then pairs aspects with their opinions, forming the pairs (“camera”, “stunning”) and (“battery life”, “disappointing”), which is critical for pinpointing precise consumer attitudes. Finally, Aspect Sentiment Triplet Extraction (ASTE) integrates these elements, producing a comprehensive sentiment overview with the triplets (“camera”, “stunning”, positive) and (“battery life”, “disappointing”, negative), offering granular insights into the multifaceted nature of consumer feedback [22, 23, 24, 25, 26].
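
To make the relationship between the sub-tasks concrete, the sketch below stores the two triplets from the smartphone review and derives the outputs of the other sub-tasks from them. The `Triplet` class and variable names are hypothetical, introduced only for illustration; the sketch shows what each sub-task produces, not how the proposed model computes it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Triplet:
    aspect: str
    opinion: str
    sentiment: str

# ASTE output for the example review, as given in the text.
triplets = [
    Triplet("camera", "stunning", "positive"),
    Triplet("battery life", "disappointing", "negative"),
]

ate = [t.aspect for t in triplets]                 # aspect term extraction
ote = [t.opinion for t in triplets]                # opinion term extraction
alsc = {t.aspect: t.sentiment for t in triplets}   # aspect-level sentiment (the pairs AESC tags in one step)
aope = [(t.aspect, t.opinion) for t in triplets]   # aspect-opinion pairs
# AOE: for each given aspect, the opinion terms linked to it.
aoe = {t.aspect: [u.opinion for u in triplets if u.aspect == t.aspect] for t in triplets}
```

Seen this way, ASTE is the most informative output: every other sub-task result is a projection of the triplets.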

The implementation of ABSA is fraught with challenges that stem from the complexity and nuances of human language 27 , 28 . One significant hurdle is the inherent ambiguity in sentiment expression, where the same term can convey different sentiments in different contexts. Moreover, sarcasm and irony pose additional difficulties, as they often invert the literal sentiment of terms, requiring sophisticated detection techniques to interpret correctly 29 . Another challenge is co-reference resolution, where pronouns and other referring expressions must be accurately linked to the correct aspects to maintain sentiment coherence 30 , 31 . Additionally, the detection of implicit aspects, where sentiments are expressed without explicitly mentioning the aspect, necessitates a deep understanding of implied meanings within the text. Furthermore, multilingual and cross-domain ABSA require models that can transfer knowledge and adapt to various languages and domains, given that sentiment indicators and aspect expressions can vary significantly across cultural and topical boundaries 32 , 33 , 34 , 35 . The continuous evolution of language, especially with the advent of internet slang and new lexicons in online communication, calls for adaptive models that can learn and evolve with language use over time. These challenges necessitate ongoing research and development of more sophisticated ABSA models that can navigate the intricacies of sentiment analysis with greater accuracy and contextual sensitivity.

To effectively navigate the complex landscape of ABSA, the field has increasingly relied on the advanced capabilities of deep learning. Neural sequential models like Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU) have set the stage by adeptly capturing the semantics of textual reviews 36 , 37 , 38 . These models contextualize the sequence of words, identifying the sentiment-bearing elements within. The Transformer architecture, with its innovative self-attention mechanisms, along with Embeddings from Language Models (ELMo), has further refined the semantic interpretation of texts 39 , 40 , 41 . These advancements have provided richer, more nuanced semantic insights that significantly enhance sentiment analysis. However, challenges remain when dealing with the complex syntactic relationships inherent in language, that is, the connections between aspect terms, opinion expressions, and sentiment polarities 42 , 43 , 44 . To bridge this gap, tree-hierarchy models like Tree LSTM and Graph Convolutional Networks (GCN) have emerged, integrating syntactic tree structures into their learning frameworks 45 , 46 . This incorporation has led to a more granular analysis that combines semantic depth with syntactic precision, allowing for a more accurate sentiment interpretation in complex sentence constructions. Furthermore, the integration of external syntactic knowledge into these models has been shown to add another layer of understanding, enhancing the models’ performance and leading to a more sophisticated sentiment analysis process.

In our approach to ABSA, we introduce an advanced model that incorporates a biaffine attention mechanism to determine the relationship probabilities among words within sentences. This mechanism generates a multi-dimensional vector where each dimension corresponds to a specific type of relationship, effectively forming a relation adjacency tensor for the sentence. To accurately capture the intricate connections within the text, our model converts sentences into a multi-channel graph. This graph treats words as nodes and the elements of the relation adjacency tensor as edges, thereby mapping the complex network of word relationships. Our model stands out by integrating a wide array of linguistic features. These include lexical and syntactic information such as part-of-speech tags, types of syntactic dependencies, tree-based distances, and relative positions between pairs of words. Each set of features is transformed into edges within the multi-channel graph, substantially enriching the model’s linguistic comprehension. This comprehensive integration of linguistic features is novel in the context of the ABSA task, particularly in the ASTE task, where such an approach has seldom been applied. Additionally, we implement a refining strategy that utilizes the outcomes of aspect and opinion extractions to enhance the representation of word pairs. This strategy allows for a more precise determination of whether word pairs correspond to aspect-opinion relationships within the context of the sentence. Overall, our model is adept at navigating all seven sub-tasks of ABSA, showcasing its versatility and depth in understanding and analyzing sentiment at a granular level.

We present an advanced neural network architecture that addresses every sub-task associated with Aspect-Based Sentiment Analysis (ABSA). This model establishes a new benchmark for the integration of syntactic and semantic data. It markedly improves the accuracy in detecting aspect and opinion terms along with their corresponding sentiment classifications.

Our research integrates an extensive range of linguistic features, such as syntactic dependencies and part-of-speech patterns, into the ABSA framework. This integration substantially enhances the model’s ability to capture the nuances of language, leading to improved sentiment analysis accuracy.

We have crafted a novel refining strategy that utilizes the initial results of aspect and opinion extractions. This strategy refines the representation of word pairs, sharpening the alignment between aspects and their corresponding opinions. This step is vital for the precise detection of sentiment orientations and intensities, which lie at the heart of ABSA.

The remainder of this paper is organized as follows. Sect. Related work discusses the relevant literature and prior work in the domain. Sect. Proposed framework details our proposed methodology, encompassing the techniques and algorithms we employed. Sect. Experiments presents the experimental results, and Sect. Model analysis reports an ablation study. Finally, Sect. Conclusion concludes the paper, summarizing our contributions and suggesting potential avenues for future research.

Related work

In this segment, we explore the landscape of Aspect Based Sentiment Analysis research, focusing on both individual tasks and integrated sub-tasks. We begin by delving into early research that highlights the application of graph neural network models in ABSA. This is followed by an examination of studies that leverage attention mechanisms and pre-trained language models, showcasing their impact and evolution in the field of ABSA.

Aspect based sentiment analysis and its subtasks

The field of ABSA has garnered significant attention over the past ten years, paralleling the rise of e-commerce platforms. Xue and Li present a streamlined convolutional neural network model with gating mechanisms for ABSA, offering improved accuracy and efficiency over traditional LSTM and attention-based methods, particularly in aspect-category and aspect-term sentiment analysis 47 . Ma et al. enhance ABSA by integrating commonsense knowledge into an LSTM with a hierarchical attention mechanism, leading to a novel ’Sentic LSTM’ that outperforms existing models in targeted sentiment tasks 48 . Yu et al. propose a multi-task learning framework, the Multiplex Interaction Network (MIN), for ABSA, emphasizing the importance of ATE and OTE. Their approach, which adeptly handles interactions among subtasks, showcases flexibility and robustness, especially in scenarios where certain subtasks are missing, and their model’s proficiency in both ATE and OTE stands out in extensive benchmark testing 49 . Dai et al. demonstrate that fine-tuned RoBERTa (FT-RoBERTa) models, with their intrinsic understanding of sentiment-word relationships, can enhance ABSA and achieve state-of-the-art results across multiple languages 50 . Chen et al. propose a Hierarchical Interactive Network (HI-ASA) for joint aspect-sentiment analysis, which excels in capturing the interplay between aspect extraction and sentiment classification. This method, integrating a cross-stitch mechanism for feature blending and mutual information for output constraint, showcases the effectiveness of interactive tasks, particularly in Aspect Extraction and Sentiment Classification (AESC) 51 . Zhao et al. address the challenge of extracting aspect-opinion pairs in ABSA by introducing an end-to-end Pair-wise Aspect and Opinion Terms Extraction (PAOTE) method. 
This approach diverges from traditional sequence tagging by considering the task through the lens of joint term and relation extraction, utilizing a multi-task learning framework that supervises term extraction via span boundaries while concurrently identifying pair-wise relations. Their extensive testing indicates that this model sets a new benchmark, surpassing previous state-of-the-art methods 52 , 53 .

Innovations in ABSA have introduced models that outpace traditional methods in efficiency and accuracy. New techniques integrating commonsense knowledge into advanced LSTM frameworks have improved targeted sentiment analysis 54 . Multi-task learning models now effectively juggle multiple ABSA subtasks, showing resilience when certain data aspects are absent. Pre-trained models like RoBERTa have been adapted to better capture sentiment-related syntactic nuances across languages. Interactive networks bridge aspect extraction with sentiment classification, offering more complex sentiment insights. Additionally, novel end-to-end methods for pairing aspect and opinion terms have moved beyond sequence tagging to refine ABSA further. These strides are streamlining sentiment analysis and deepening our comprehension of sentiment expression in text 55 , 56 , 57 , 58 , 59 .

Innovative approaches to sentiment analysis leveraging attention mechanisms

Attention mechanisms have gained traction in deep learning (DL) models addressing ABSA sub-components, recognized for their effectiveness in semantically linking aspects with contextual words. In addressing aspect-based sentiment classification, Liu et al. identified a gap in current neural attention models, which tend to highlight sentiment words without adequately linking them to the relevant aspects within a sentence. This shortcoming becomes particularly evident in sentences with multiple aspects and complex structures. They introduced a novel attention-based model that incorporates dual mechanisms: a sentence-level attention for global aspect relevance, and a context-level attention that accounts for the sequence and interrelation of words. Their empirical results showed that this dual mechanism approach significantly improves performance over existing models 60 . Lin et al. advanced the interpretability of sentence embeddings by leveraging a self-attention mechanism. Their novel approach represents embeddings as 2-D matrices, allowing each row to focus on distinct segments of a sentence. This not only enhances the model’s performance on tasks such as author profiling, sentiment classification, and textual entailment, but also provides an intuitive method for visualizing the parts of the sentence that contribute to the embedding’s formation 61 . Chen et al. explored the integration of Graph Convolutional Networks (GCN) with co-attention mechanisms to enhance aspect-based sentiment analysis (ABSA). Their model effectively utilizes both semantic and syntactic information to filter out irrelevant context, demonstrating significant improvements in identifying the sentiment polarity of specific aspects within sentences 62 . Wang et al. targeted the challenge of discerning sentiment polarity towards specific aspects in text, a task complicated by the subtleties of language and the presence of multiple aspects within a single sentence.
Their solution involves a novel encoding of syntactic information into a unified aspect-oriented dependency tree structure. By deploying a relational graph attention network (R-GAT) that operates on this refined tree structure, their method more accurately identifies connections between aspects and opinion words, leading to notable improvements in sentiment analysis performance on prominent datasets 63 .

Attention mechanisms have revolutionized ABSA, enabling models to home in on text segments critical for discerning sentiment toward specific aspects 64 . These models excel in complex sentences with multiple aspects, adjusting focus to relevant segments and improving sentiment predictions. Their interpretability and enhanced performance across various ABSA tasks underscore their significance in the field 65 , 66 , 67 .

Syntax-driven approaches to aspect-level sentiment analysis

Zhang and Qian’s model improves aspect-level sentiment analysis by using hierarchical syntactic and lexical graphs to capture word co-occurrences and differentiate dependency types, outperforming existing methods on benchmarks 68 . In the field of ALSC, Zheng et al. have highlighted the importance of syntactic structures for understanding sentiments related to specific aspects. Their novel neural network model, RepWalk, leverages replicated random walks on syntax graphs to better capture the informative contextual words crucial for sentiment analysis. This method has shown superior performance over existing models on multiple benchmark datasets, underscoring the value of incorporating syntactic structure into sentiment classification representations 69 . Zhang and Li’s research advances aspect-level sentiment classification by introducing a proximity-weighted convolution network that captures syntactic relationships between aspects and context words. Their model enhances LSTM-derived contexts with syntax-aware weights, effectively distinguishing sentiment for multiple aspects and improving the overall accuracy of sentiment predictions 70 . Huang and Li’s work enhances aspect-level sentiment classification by integrating syntactic structure and pre-trained language model knowledge. Employing a graph attention network on dependency trees alongside BERT’s subword features, their approach achieves refined context-aspect interactions, leading to more precise sentiment polarity determinations in complex sentences 71 . Xu, Pang, Wu, Cai, and Peng’s research focuses on leveraging comprehensive syntactic structures to improve aspect-level sentiment analysis. They introduce “Scope” as a novel concept to outline structural text regions pertinent to specific targets. Their hybrid graph convolutional network (HGCN) merges insights from both constituency and dependency tree analyses, enhancing sentiment-relation modeling and effectively sifting through noisy opinion words 72 . 
Xiao et al. enhance aspect-based sentiment classification by introducing a graph neural network model that leverages a part-of-speech guided syntactic dependency graph and a syntactic distance attention layer, significantly outperforming traditional methods on public datasets 73 . Incorporating syntax-aware techniques, the Enhanced Multi-Channel Graph Convolutional Network (EMC-GCN) for ASTE stands out by effectively leveraging word relational graphs and syntactic structures. Its use of biaffine attention to construct relation-aware representations, combined with a unique refining strategy for syntactically-informed word-pair representations, results in significant improvements over existing methods as evidenced by benchmark dataset performances 19 .

The integration of syntactic structures into ABSA has significantly improved the precision of sentiment attribution to relevant aspects in complex sentences 74 , 75 . Syntax-aware models excel in handling sentences with multiple aspects, leveraging grammatical relationships to enhance sentiment discernment. These models not only deliver superior performance but also offer better interpretability, making them invaluable for applications requiring clear rationale. The adoption of syntax in ABSA underscores the progression toward more human-like language processing in artificial intelligence 76 , 77 , 78 .

While existing literature lays a solid groundwork for Aspect-Based Sentiment Analysis, our model addresses critical limitations by advancing detection and classification capabilities in complex linguistic contexts. Our Multi-Layered Enhanced Graph Convolutional Network (MLEGCN) integrates a biaffine attention mechanism and a sophisticated graph-based approach to enhance nuanced text interpretation. This model effectively handles multiple sentiments within a single context and dynamically adapts to various ABSA sub-tasks, improving both theoretical and practical applications of sentiment analysis. This not only overcomes the simplifications seen in prior models but also broadens ABSA’s applicability to diverse real-world datasets, setting new standards for accuracy and adaptability in the field.

Proposed framework

In this section, we introduce the formal definitions pertinent to the sub-tasks of ABSA and then formally outline the problem in terms of these definitions. Figure  3 shows the overall architecture of the Fine-grained Sentiments Comprehensive Model for Aspect-Based Analysis.

Given an input sentence \(S = \{w_1, w_2, \ldots , w_n\}\) comprising \(n\) words, our model excels in performing seven subtasks of ABSA. It identifies \(a\) as an aspect term and \(o\) as an opinion term, while \(s\) represents the sentiment polarity associated with the aspect. This sentiment polarity is classified within a label set \(X = \{\text {POS}, \text {NEU}, \text {NEG}\}\) , encompassing three sentiment polarities: positive, neutral, and negative. The model processes the sentence to discern and interpret these specific elements.

Aspect Term Extraction (ATE): Extracts all aspect terms from the given sentence \(S\) .

ATE: \(A = \{a_i | a_i \in S\}\)

Opinion Term Extraction (OTE): Identifies all opinion terms within the sentence \(S\) .

OTE: \(O = \{o_j | o_j \in S\}\)

Aspect Level Sentiment Classification (ALSC): Predicts the sentiment polarity of each aspect term in \(S\) , with polarities defined in \(X\) .

ALSC: \(S_A = \{s(a_i) | a_i \in A, s(a_i) \in X\}\)

Aspect-oriented Opinion Extraction (AOE): Extracts, for each aspect term in \(S\) , its corresponding opinion terms.

AOE: \(AO = \{(a_i, o_j) | a_i \in A, o_j \in O\}\)

Aspect Extraction and Sentiment Classification (AESC): Simultaneously identifies aspect terms and their sentiments from \(S\) .

AESC: \(AS = \{(a_i, s(a_i)) | a_i \in A, s(a_i) \in X\}\)

Aspect-Opinion Pair Extraction (AOPE): Finds pairs of aspect and opinion terms that are related within \(S\) .

AOPE: \(AOM = \{(a_i, o_j) | a_i \in A, o_j \in O, \text {related}\}\)

Aspect-Sentiment-Triplet Extraction (ASTE): Forms triplets from \(S\) that consist of an aspect term, opinion term, and sentiment polarity.

ASTE: \(T = \{(a_i, o_j, s_k) | a_i \in A, o_j \in O, s_k \in X\}\)

Relation definition and table filling

Figure 2. Table filling for ABSA in a sentence. Each cell denotes a word pair with a relation or label.

The study employs a framework that categorizes word relationships within sentences into ten distinct types, following the methodology introduced by Chen et al. 19 . Four specific labels, {B-A, I-A, B-O, I-O}, are applied to accurately identify terms that represent aspects and opinions. This refined strategy enhances boundary definition within the model, offering improvements over the GTS approach previously outlined by Wu et al. 79 . The ‘B’ and ‘I’ labels signify the start and continuation of a term, respectively, and the suffixes -A and -O categorize a term as either an aspect or an opinion. In Table  1 , the A and O relations assist in determining whether pairs of distinct words pertain to the same aspect or opinion term. Moreover, the sentiment relations, {POS, NEU, NEG}, serve a dual purpose: they confirm whether word pairs correspond and also ascertain the sentiment polarity linked with aspect-opinion pairs. By implementing the table-filling technique, as detailed by Miwa & Sasaki 80 and Gupta et al. 81 , a relation table is constructed for each annotated sentence. This process is exemplified in Figure  2 , which illustrates the word pairs along with their designated relations, with each table cell denoting a specific word-to-word relationship (Figure  3 ).
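The table-filling idea can be illustrated with a small sketch. It assumes that the boundary tags (B-A, I-A, B-O, I-O) sit on the diagonal, that the A and O relations fill the off-diagonal cells inside one term, that sentiment relations mark aspect-opinion word pairs symmetrically, and that a placeholder label "N" stands for the tenth, no-relation type. The function name `fill_relation_table` and the span encoding are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices

def fill_relation_table(
    n: int,
    aspects: List[Span],
    opinions: List[Span],
    triplets: List[Tuple[Span, Span, str]],
) -> List[List[str]]:
    """Build an n x n relation table; "N" marks the assumed no-relation label."""
    table = [["N"] * n for _ in range(n)]
    for spans, tag in ((aspects, "A"), (opinions, "O")):
        for start, end in spans:
            for i in range(start, end + 1):
                for j in range(start, end + 1):
                    if i == j:
                        # Diagonal cell: boundary tag of the word itself.
                        table[i][i] = ("B-" if i == start else "I-") + tag
                    else:
                        # Off-diagonal cell: both words belong to the same term.
                        table[i][j] = tag
    for (a_start, a_end), (o_start, o_end), polarity in triplets:
        for i in range(a_start, a_end + 1):
            for j in range(o_start, o_end + 1):
                # Sentiment relation marks a matched aspect-opinion word pair.
                table[i][j] = polarity
                table[j][i] = polarity
    return table

tokens = ["The", "battery", "life", "is", "quite", "disappointing"]
aspect, opinion = (1, 2), (5, 5)
table = fill_relation_table(len(tokens), [aspect], [opinion], [(aspect, opinion, "NEG")])
for token, row in zip(tokens, table):
    print(f"{token:>13}: {row}")
```

Decoding runs in the opposite direction: contiguous diagonal tags recover the term boundaries, and the sentiment cells connecting an aspect span to an opinion span yield the triplet.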

Figure a. Comprehensive ABSA algorithm with classified subtasks.

Model layers and formation

Input layer.

BERT, short for Bidirectional Encoder Representations from Transformers , was introduced by Devlin et al. in 2019. This model has been widely recognized for its outstanding performance on various natural language processing tasks. BERT utilizes a deep learning technique known as the Transformer, which employs attention mechanisms to capture contextual information from all words in a sentence, irrespective of their positions 82 . When BERT processes an input sentence \(S\) , which consists of a sequence of tokens \(S = \{w_1, w_2, \ldots , w_n\}\) where \(n\) is the number of tokens, it generates a corresponding sequence of hidden states \(H = \{h_1, h_2, \ldots , h_n\}\) . These hidden states are derived from the last layer of the Transformer block within BERT, capturing the nuanced contextual relationships between the input tokens. This representation power of BERT enables it to serve as an effective sentence encoder for various downstream tasks, providing enriched feature representations that can significantly enhance the performance of natural language understanding systems.

Figure 3. The overall architecture of the fine-grained sentiments comprehensive model for aspect-based analysis.

Attention module

In our model, we employ a biaffine attention module to determine the relational probability distribution between word pairs in a sentence. The effectiveness of biaffine attention in syntactic dependency parsing is well-documented 83 . The biaffine attention mechanism is defined by several key equations, as outlined below:

Equation ( 1 ) defines the transformation of hidden state \(h_i\) through the attention module:

Equation ( 2 ) similarly transforms hidden state \(h_j\) :

Equation ( 3 ) calculates the interaction score \(g_{i,j}\) for word pairs:

Equation ( 4 ) normalizes these scores to determine the relation probability \(r_{i,j,k}\) for each relation type:

Finally, Equation ( 5 ) applies the biaffine attention to obtain the adjacency tensor \(R\) :
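One form of Equations (1) to (5) that is consistent with the symbols used in this section, following the standard biaffine formulation from dependency parsing, is sketched here. The names \(\mathrm{MLP}_a\) and \(\mathrm{MLP}_o\) for the two projection networks, and the superscripts \(a\) and \(o\) on the projected states, are assumptions introduced for this reconstruction.

\[ h_i^{a} = \mathrm{MLP}_a(h_i) \tag{1} \]

\[ h_j^{o} = \mathrm{MLP}_o(h_j) \tag{2} \]

\[ g_{i,j} = (h_i^{a})^{\top} U^{1} h_j^{o} + U^{2} \left( h_i^{a} \oplus h_j^{o} \right) + b \tag{3} \]

\[ r_{i,j,k} = \frac{\exp (g_{i,j,k})}{\sum _{l=1}^{m} \exp (g_{i,j,l})} \tag{4} \]

\[ R = \mathrm{Biaffine}\left( \mathrm{MLP}_a(H), \mathrm{MLP}_o(H) \right) \tag{5} \]

Under this reading, \(g_{i,j}\) is an \(m\)-dimensional score vector for the word pair \((w_i, w_j)\), and the softmax in Equation (4) turns it into a probability distribution over the \(m\) relation types.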

These equations collectively model the relations between words in a sentence, where \(m\) represents the number of relation types, and each relation type corresponds to a channel in the adjacency tensor \(R\) . The trainable parameters \(U^1\) , \(U^2\) , and \(b\) , along with the concatenation operation \(\oplus\) , are integral to this process.
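To make the tensor shapes concrete, the following pure-Python sketch scores every word pair with a bilinear term, a linear term over the concatenated pair, and a bias, and then normalizes over relation types. It is a toy illustration under stated assumptions: the input vectors are taken to be already projected hidden states, all parameters are random, and the name `biaffine_relations` is hypothetical.

```python
import math
import random
from typing import List

Vector = List[float]

def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over one list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def biaffine_relations(Ha, Ho, U1, U2, b):
    """Return an n x n x m tensor of relation probabilities.

    Ha, Ho: n vectors of size d (assumed to be already projected states).
    U1: m matrices of size d x d, U2: m vectors of size 2d, b: m biases.
    """
    d = len(Ha[0])
    R = []
    for h_a in Ha:
        row = []
        for h_o in Ho:
            pair = h_a + h_o  # concatenation of the two vectors
            scores = []
            for U1_k, U2_k, b_k in zip(U1, U2, b):
                bilinear = sum(h_a[p] * U1_k[p][q] * h_o[q]
                               for p in range(d) for q in range(d))
                linear = sum(u * x for u, x in zip(U2_k, pair))
                scores.append(bilinear + linear + b_k)
            row.append(softmax(scores))  # distribution over m relation types
        R.append(row)
    return R

# Toy setup: n = 3 words, d = 2 features, m = 4 relation types.
rng = random.Random(0)
n, d, m = 3, 2, 4
Ha = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
Ho = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
U1 = [[[rng.uniform(-1, 1) for _ in range(d)] for _ in range(d)] for _ in range(m)]
U2 = [[rng.uniform(-1, 1) for _ in range(2 * d)] for _ in range(m)]
b = [0.0] * m
R = biaffine_relations(Ha, Ho, U1, U2, b)
print(len(R), len(R[0]), len(R[0][0]))  # n, n, m
```

Each slice `R[i][j]` is a distribution over relation types for one word pair, and fixing the last index gives one channel of the adjacency tensor.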

Multi-layered enhanced graph convolutional network (MLEGCN)

The MLEGCN represents a significant development over traditional Graph Convolutional Networks (GCN), designed to process graph-structured data more effectively in natural language processing tasks. Originating from the adaptation of Convolutional Neural Networks (CNNs) to graph data 84 , 85 , the MLEGCN enhances this model by introducing mechanisms that capture complex relational dynamics within sentences.

In the MLEGCN framework, each node in the graph corresponds to a word, while edges reflect the syntactic dependencies between these words. This setup facilitates in-depth modeling of sentence structures. The connections between nodes are represented using an adjacency matrix \(A \in \mathbb {R}^{n \times n}\) , where \(n\) is the number of words in a sentence. In this matrix, \(A_{ij} = 1\) indicates a direct syntactic link between the words corresponding to nodes \(i\) and \(j\) , and \(A_{ij} = 0\) otherwise.

A significant enhancement in MLEGCN is the integration of soft edges, which express the probabilistic strengths of connections between node pairs. This concept is inspired by advancements in attention mechanisms 86 , allowing the network to adjust the influence of each connection dynamically. The model incorporates a multi-channel adjacency tensor \(R^{ba} \in \mathbb {R}^{n \times n \times m}\) , where each of the \(m\) channels corresponds to a unique type of relational dynamic, modulated through a biaffine attention module.

The computational operations in MLEGCN are detailed as follows:

In Equation  6 , \(R^{ba}_{:,:,k}\) represents the \(k\) -th relational channel within \(R^{ba}\) . \(W_k\) and \(b_k\) denote the weight and bias specific to that channel. The function \(\sigma\) is an activation function, such as ReLU, used to introduce non-linearity into the network. The function \(f(\cdot )\) , a pooling operation, combines the hidden representations from all channels to produce a unified node representation.
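A formulation of Equations (6) and (7) that matches the symbols described here is given as a sketch. The name \(\tilde{H}^{ba}_{k}\) for the hidden representation of channel \(k\) is an assumption, and \(H\) denotes the node representations produced by the input layer.

\[ \tilde{H}^{ba}_{k} = \sigma \left( R^{ba}_{:,:,k} \, H \, W_k + b_k \right) \tag{6} \]

\[ \hat{H}^{ba} = f\left( \tilde{H}^{ba}_{1}, \tilde{H}^{ba}_{2}, \ldots , \tilde{H}^{ba}_{m} \right) \tag{7} \]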

Through its channel-specific convolutions, MLEGCN is able to differentiate and analyze various types of word relationships. This capability allows for a more nuanced understanding of language, making MLEGCN particularly effective for tasks like sentiment analysis, entity recognition, and syntactic parsing. The consolidated output \({\hat{H}}^{ba}\) , derived by pooling across channels (Equation  7 ), provides a holistic view of the word relationships, crucial for performing complex downstream tasks.

Enhanced understanding of syntactic features

Chen et al. 2022’s innovative framework employs a comprehensive suite of linguistic features that critically examine the interrelations between word pairs within sentences. These features, which include combinations of part-of-speech tags, varieties of syntactic dependencies, tree-based hierarchical distances, and relative positioning within the sentence, contribute to the detailed understanding of language structure.

In practical terms, the model initiates an examination of each word pair \((w_i, w_j)\) by assigning a self-dependency feature that signifies the inherent syntactic role associated with the words. This is operationalized through the initialization of four adjacency tensors: \(R_{\text {psc}}\) for part-of-speech combinations, \(R_{\text {dep}}\) for syntactic dependencies, \(R_{\text {tbd}}\) for tree-based distances, and \(R_{\text {rpd}}\) for relative positions, each offering a different perspective on the sentence structure.

Focusing on the syntactic dependency dimension as an illustrative case, when there is a recognized dependency type such as ’nsubj’ (nominal subject) between \(w_i\) and \(w_j\) , the corresponding location in the tensor \(R_{\text {dep}}\) is embedded with a vector representation of ’nsubj’. This embedding is retrieved from a dynamically learned table, encapsulating the relationship’s essence. Conversely, the absence of a dependency connection is indicated by a zero vector at the respective tensor indices.
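A minimal sketch of how such a dependency tensor might be populated follows. The embedding table is filled with random vectors as a stand-in for learned parameters, dependency edges are stored symmetrically, and a "self" label is placed on the diagonal; these choices, the hand-written parse, and the name `build_dep_tensor` are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Tuple

def build_dep_tensor(
    n: int,
    edges: List[Tuple[int, int, str]],
    dim: int = 4,
    seed: int = 0,
) -> List[List[List[float]]]:
    """Return an n x n x dim tensor holding one embedding per dependency edge."""
    rng = random.Random(seed)
    table: Dict[str, List[float]] = {}

    def embed(label: str) -> List[float]:
        # Random vectors stand in for embeddings that a model would learn.
        if label not in table:
            table[label] = [rng.uniform(-1, 1) for _ in range(dim)]
        return list(table[label])

    # Word pairs without a dependency link keep the zero vector.
    tensor = [[[0.0] * dim for _ in range(n)] for _ in range(n)]
    for i in range(n):
        tensor[i][i] = embed("self")  # self-dependency on the diagonal
    for head, dependent, label in edges:
        tensor[head][dependent] = embed(label)
        tensor[dependent][head] = embed(label)  # symmetric storage (assumption)
    return tensor

tokens = ["The", "camera", "delivers", "stunning", "images"]
# (head index, dependent index, relation) from a hand-written parse.
edges = [(1, 0, "det"), (2, 1, "nsubj"), (2, 4, "obj"), (4, 3, "amod")]
R_dep = build_dep_tensor(len(tokens), edges)
print(R_dep[2][1])  # embedding of 'nsubj' between "delivers" and "camera"
print(R_dep[0][3])  # zero vector: no direct dependency
```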

The tensors undergo a process of graph convolutions, refining the raw node representations into enriched forms \({\hat{H}}_{\text {psc}}, {\hat{H}}_{\text {dep}}, {\hat{H}}_{\text {tbd}},\) and \({\hat{H}}_{\text {rpd}}\) . Through techniques such as average pooling and concatenation, these representations are synthesized into holistic node and edge descriptors for the sentence as given by:

Here, \(H\) encapsulates the ensemble of node representations \(\{h_1, h_2, \ldots , h_n\}\) , while \(R\) aggregates the edge representations \(\{r_{1,1}, r_{1,2}, \ldots , r_{n,n}\}\) which collectively enhance the model’s proficiency in recognizing and interpreting complex linguistic constructs, thereby substantially improving its applicability in diverse NLP tasks.
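One way to write these two aggregation steps that is consistent with the description is sketched here, assuming that \(f(\cdot )\) denotes average pooling over the five node representations and that \(\oplus\) denotes concatenation along the channel dimension:

\[ H = f\left( {\hat{H}}^{ba}, {\hat{H}}_{\text {psc}}, {\hat{H}}_{\text {dep}}, {\hat{H}}_{\text {tbd}}, {\hat{H}}_{\text {rpd}} \right) \]

\[ R = R^{ba} \oplus R_{\text {psc}} \oplus R_{\text {dep}} \oplus R_{\text {tbd}} \oplus R_{\text {rpd}} \]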

Figure  4 illustrates the matrices corresponding to the syntactic features utilized by the model. The Part-of-Speech Combinations and Dependency Relations matrices reveal the frequency and types of grammatical constructs present in a sample sentence. Similarly, the Tree-based Distances and Relative Position Distance matrices display numerical representations of word proximities and their respective hierarchical connections within the same sentence. These visualizations underscore the framework’s capacity to capture and quantify the syntactic essence of language.

Figure 4. Matrices depicting the syntactic features leveraged by the framework for analyzing word pair relationships in a sentence, illustrating part-of-speech combinations, dependency relations, tree-based distances, and relative positions.
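The two distance-based matrices can be computed directly from a dependency parse. The sketch below assumes that the tree-based distance is the number of edges on the path between two words in the undirected dependency tree and that the relative position distance is the absolute difference of token indices; the parse is written by hand and the function names are hypothetical.

```python
from collections import deque
from typing import List, Tuple

def relative_position(n: int) -> List[List[int]]:
    """Absolute token-index distance between every word pair."""
    return [[abs(i - j) for j in range(n)] for i in range(n)]

def tree_distance(n: int, edges: List[Tuple[int, int]]) -> List[List[int]]:
    """Number of edges between every word pair in the undirected parse tree."""
    neighbours: List[List[int]] = [[] for _ in range(n)]
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    dist = [[-1] * n for _ in range(n)]  # -1 marks "not reached yet"
    for source in range(n):
        dist[source][source] = 0
        queue = deque([source])
        while queue:  # breadth-first search from each word
            u = queue.popleft()
            for v in neighbours[u]:
                if dist[source][v] < 0:
                    dist[source][v] = dist[source][u] + 1
                    queue.append(v)
    return dist

tokens = ["The", "camera", "delivers", "stunning", "images"]
edges = [(1, 0), (2, 1), (2, 4), (4, 3)]  # (head, dependent) pairs
tbd = tree_distance(len(tokens), edges)
rpd = relative_position(len(tokens))
print(tbd[1][3], rpd[1][3])  # "camera" vs "stunning"
```

For "camera" and "stunning", the path camera, delivers, images, stunning gives a tree distance of 3, while the two words are 2 positions apart in the sentence.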

Correlation constraints

To ensure the precise delineation of word relationships within a sentence, the model enforces a constraint on the adjacency tensor, which originates from the biaffine attention framework. This constraint is quantified by the following expression in equation  10 :

Where in the formula:

\(I(\cdot )\) stands for the indicator function.

\(y_{ij}\) denotes the verified relationship type between the word pair \((w_i, w_j)\) .

\(C\) represents the comprehensive set of all potential relationship types.

\(r_{ij|c}\) is the model’s forecasted probability score for the relationship type \(c\) between the word pair \((w_i, w_j)\) .
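Read together, these symbols describe a cross-entropy penalty over all word pairs. A form consistent with the definitions above, given as a reconstruction in which the name \(L_{ba}\) follows the loss components listed for the overall objective and \(n\) is the sentence length, would be:

\[ L_{ba} = - \sum _{i=1}^{n} \sum _{j=1}^{n} \sum _{c \in C} I(y_{ij} = c) \, \log \left( r_{ij|c} \right) \tag{10} \]

For each word pair, the indicator keeps only the term of the verified relation type, so the loss falls as the probability assigned to that type approaches 1.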

In addition, the same constraint is applied to the four other adjacency tensors, each linked to a distinct linguistic feature set. The resulting loss terms are denoted \(L_{psc}\) , \(L_{dep}\) , \(L_{tbd}\) , and \(L_{rpd}\) .

Systematic refinement and prediction module

The predictive capabilities of our model are heavily reliant on accurately determining the sentiment relationship between word pairs \((w_i, w_j)\) . This process begins with the combination of individual node representations \(h_i\) and \(h_j\) , along with the edge representation \(r_{ij}\) , as illustrated in equation  11 87 . To enhance this initial representation, we introduce a systematic refinement strategy that utilizes additional self-referential edge representations \(r_{ii}\) and \(r_{jj}\) . These are crucial in contexts where words may have self-related sentiment implications that affect their interaction with other words in the sentence.

Refinement Strategy Rationale and Mechanism:

Enhanced Contextual Understanding: The inclusion of \(r_{ii}\) and \(r_{jj}\) allows our model to incorporate not only direct relational dynamics between \(w_i\) and \(w_j\) but also each word’s relationship with itself. This dual consideration is critical, especially in complex sentences where aspects and opinions can be nuanced.

Aspect and Opinion Extraction Influences: When \(w_i\) is an aspect and \(w_j\) an opinion, the combined representation \(s_{ij}\) is enriched by this systematic refinement approach. We leverage outcomes from aspect and opinion extractions to better assess and predict the potential sentiment (positive, neutral, or negative) based on empirical observations that aspects and opinions typically generate strong sentiment indicators.
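The refinement and prediction steps can be sketched in a few lines. The example assumes that the refined representation is a plain concatenation of the two node vectors with the three edge vectors, and that prediction is a single linear layer followed by a softmax; the names `refine_pair`, `predict`, `W_p`, and `b_p` are hypothetical, and the zero weights are untrained placeholders.

```python
import math
from typing import List

Vector = List[float]

def refine_pair(h_i: Vector, h_j: Vector,
                r_ij: Vector, r_ii: Vector, r_jj: Vector) -> Vector:
    """Concatenate node, pair-edge, and self-edge representations."""
    return h_i + h_j + r_ij + r_ii + r_jj

def predict(s_ij: Vector, W_p: List[Vector], b_p: Vector) -> Vector:
    """Linear layer followed by a softmax over the relation labels."""
    logits = [sum(w * x for w, x in zip(row, s_ij)) + bias
              for row, bias in zip(W_p, b_p)]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy vectors: 2-dimensional nodes and 1-dimensional edges.
s_ij = refine_pair([0.1, 0.2], [0.3, 0.4], [0.5], [0.6], [0.7])
num_labels = 3  # e.g. POS, NEU, NEG
W_p = [[0.0] * len(s_ij) for _ in range(num_labels)]  # untrained placeholder
b_p = [0.0] * num_labels
p_ij = predict(s_ij, W_p, b_p)
print(len(s_ij), p_ij)
```

With the placeholder weights the output is uniform; in a trained model the entries of `s_ij` contributed by \(r_{ii}\) and \(r_{jj}\) are what let the aspect and opinion extraction results influence the predicted label.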

Figure 5. Illustration of systematic refinement.

This refined representation \(s_{ij}\) is processed through a linear layer followed by a softmax activation to calculate the probabilities of the sentiment label distribution also depicted in Figure  5 :

Impact of Refinement on Prediction Accuracy: To validate the effectiveness of our refinement strategy, we conducted error analysis comparing model outputs with and without the inclusion of self-referential edges. Our findings reveal that models incorporating \(r_{ii}\) and \(r_{jj}\) consistently perform better in scenarios involving implicit sentiment relations and complex aspect-opinion structures. Specifically, error rates decrease significantly in cases involving subtle sentiment expressions, underscoring the importance of our systematic refinement strategy. Equation  11 refines the representation of word pairs by integrating additional context that enhances the model’s sensitivity to nuanced linguistic features. Equation  12 then leverages this refined representation to predict the most likely sentiment label for each word pair, demonstrating a tangible improvement in the model’s ability to discern and classify sentiment relationships accurately. This enhancement is crucial for robust performance across diverse datasets and is supported by quantitative improvements in prediction accuracy in our experimental results section.
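Based on the description in this section, Equations (11) and (12) can be reconstructed as follows. The symbols \(p_{ij}\) , \(W_p\) , and \(b_p\) for the predicted distribution and the parameters of the linear layer are assumed names.

\[ s_{ij} = h_i \oplus h_j \oplus r_{ij} \oplus r_{ii} \oplus r_{jj} \tag{11} \]

\[ p_{ij} = \mathrm{softmax}\left( W_p \, s_{ij} + b_p \right) \tag{12} \]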

Loss function

The loss function \(L\) that we aim to minimize is composed of the following terms:

\(L_p\) is the standard cross-entropy loss for the ASTE task.

\(\alpha\) and \(\beta\) are coefficients that balance the influence of the different components of the loss function.

\(L_{ba}\), \(L_{psc}\), \(L_{dep}\), \(L_{tbd}\), and \(L_{rpd}\) represent additional loss components, addressing specific constraints and aspects of the task.

The structure of \(L\) combines the primary task-specific loss with additional terms that incorporate constraints and auxiliary objectives, each weighted by their respective coefficients.
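
To make the structure of \(L\) concrete, the sketch below computes a cross-entropy term and a weighted sum of the components. The grouping is an assumption: the text does not state which coefficient multiplies which term, so one plausible arrangement is used in which \(\alpha\) scales \(L_{ba}\) and \(\beta\) scales the sum of the remaining four terms. The default coefficient values are the ones reported under Implementation details; the function names and all example loss values are hypothetical.

```python
import math


def cross_entropy(gold_labels, predicted_probs):
    """Summed negative log-probability of the gold label for each word pair."""
    return -sum(math.log(probs[gold])
                for gold, probs in zip(gold_labels, predicted_probs))


def total_loss(l_p, l_ba, l_psc, l_dep, l_tbd, l_rpd, alpha=0.1, beta=0.01):
    """Primary loss plus weighted auxiliary terms (assumed grouping)."""
    return l_p + alpha * l_ba + beta * (l_psc + l_dep + l_tbd + l_rpd)


# Two word pairs, three labels; gold labels index into each distribution
gold = [0, 2]
probs = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
l_p = cross_entropy(gold, probs)

# Auxiliary loss values below are placeholders, not measured quantities
loss = total_loss(l_p, l_ba=2.0, l_psc=1.0, l_dep=1.0, l_tbd=1.0, l_rpd=1.0)
```

With small coefficients such as these, the auxiliary terms act as regularizers: they nudge the representation without overwhelming the primary task loss.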

Experiments

The study examines the efficacy of the method on two benchmark datasets in the field of ABSA. Both are associated with the Semantic Evaluation (SemEval) challenges held in three consecutive years, 2014 through 2016 [88, 89].

The first dataset, referred to as Dataset 1 (D1), was introduced by Wu et al. (2020a). The second, Dataset 2 (D2), was annotated by Xu et al. (2020); it is an enhanced and corrected version of an earlier dataset released by Peng et al. (2020), intended to rectify previous inaccuracies [79, 90, 91].

Summary statistics for both datasets are compiled in Table 2. Properties such as dataset size, the balance of sentiment classes, and the variety of aspects covered are essential for judging the strength and effectiveness of the ABSA method under review.

Additional resources and tools relevant to this study can be found at the following GitHub repositories: ( https://github.com/xuuuluuu/SemEval-Triplet-data/tree/master/ASTE-Data-V2-EMNLP2020 , https://github.com/huggingface/transformers , https://github.com/NJUNLP/GTS ).

Implementation details

In our research, we use the BERT-base-uncased version 5 as the core sentence encoder. To optimize this encoder, we employ the AdamW optimizer proposed by Loshchilov and Hutter (2018) [92], with a learning rate of \(2 \times 10^{-5}\) for fine-tuning the BERT component. The other trainable parts of the model use a separate learning rate of \(10^{-3}\). Splitting the learning rates in this way lets the BERT model be fine-tuned with precision while the other model components are trained more aggressively. Additionally, we set the dropout rate to 0.5 to mitigate the risk of overfitting, a common concern in deep learning models.
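
The two learning rates can be expressed as optimizer parameter groups. The sketch below, in plain Python, only builds the group specification. It assumes that encoder parameters can be recognized by a `bert.` name prefix, which is a hypothetical naming convention, and `build_param_groups` is an illustrative helper rather than part of any library.

```python
def build_param_groups(named_params, bert_prefix="bert.",
                       bert_lr=2e-5, other_lr=1e-3):
    """Split (name, parameter) pairs into two groups with separate learning rates."""
    bert_params, other_params = [], []
    for name, param in named_params:
        if name.startswith(bert_prefix):
            bert_params.append(param)
        else:
            other_params.append(param)
    return [
        {"params": bert_params, "lr": bert_lr},    # fine-tuned encoder
        {"params": other_params, "lr": other_lr},  # remaining trainable components
    ]


# Placeholder strings stand in for real parameter tensors
named = [
    ("bert.embeddings.weight", "p0"),
    ("bert.encoder.layer0.weight", "p1"),
    ("gcn.layer0.weight", "p2"),
    ("classifier.weight", "p3"),
]
groups = build_param_groups(named)
```

A list of dictionaries with `params` and `lr` keys is the format that PyTorch optimizers such as `torch.optim.AdamW` accept for per-group options, so with real tensors the result could be passed to the optimizer directly.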

The hidden state sizes for BERT and the Graph Convolutional Network (GCN) are set to 768 and 300, respectively. This difference reflects the different nature of the data each component handles. Our model, termed MLEGCN, diverges from the traditional EMC-GCN framework. It is trained for 100 epochs with a batch size of 16, a setting chosen to balance computational efficiency with effective learning. To control the influence of the relation constraints, we tune two hyperparameters: \(\alpha\) is set to 0.1 and \(\beta\) to 0.01. The number of channels in our model equals the predefined number of relations, a design choice that follows from the fixed nature of these relation constraints.

For parsing and preparing the input sentences, we employ the Stanza tool developed by Qi et al. (2020), whose robust parsing capabilities are critical for preparing the textual data for our model. We save the model parameters that achieve the best performance on the development set, a practice aimed at maximizing the model’s efficacy in real-world applications [93]. Furthermore, to report reliable results, we average the results of five runs, each initialized with a different random seed, which accounts for run-to-run variability.
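
The reporting protocol, which selects the checkpoint with the best development score in each run and averages the test results over five seeds, can be sketched as follows. All scores are made-up placeholders for illustration only and are not results from the paper; `best_checkpoint` and `summarize_runs` are hypothetical helpers.

```python
from statistics import mean, stdev


def best_checkpoint(epoch_scores):
    """Return the (dev_f1, test_f1) pair from the epoch with the highest dev F1."""
    return max(epoch_scores, key=lambda pair: pair[0])


def summarize_runs(runs):
    """Average the test F1 of each run's best-dev checkpoint across seeds."""
    test_scores = [best_checkpoint(epochs)[1] for epochs in runs.values()]
    return mean(test_scores), stdev(test_scores)


# seed -> list of (dev_f1, test_f1) per epoch; all numbers are placeholders
runs = {
    0: [(60.0, 58.0), (62.0, 61.0)],
    1: [(61.0, 60.0), (59.0, 63.0)],
    2: [(63.0, 62.0), (62.5, 64.0)],
    3: [(58.0, 57.0), (64.0, 63.0)],
    4: [(62.0, 64.0), (61.0, 60.0)],
}
avg_f1, std_f1 = summarize_runs(runs)
```

Note that selection uses only the development score: for seed 1 the chosen checkpoint is not the one with the highest test score, which is what keeps the reported test figure unbiased.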

We evaluate the proposed method against a diverse set of baseline models, as detailed in Table 3. Many of these baselines address only specific subsets of the Aspect-Based Sentiment Analysis (ABSA) sub-tasks, and only a few provide comprehensive solutions for all of them.

OTE-MTL [94] conceptualizes ABSA as a process of extracting opinion triplets and utilizes a multi-task learning approach with distinct detection heads along with a sentiment dependency parser.

Li-Unified+ [95] introduces a unified model for target-based sentiment analysis, employing dual RNNs to predict unified tags and determine target boundaries.

RINANTE+ [96] uses rules derived from dependency parsing outputs to extract aspect and opinion terms, applying these rules on auxiliary data and refining the approach through a neural model.

TS [97] addresses the extraction of aspect sentiment triplets, advocating a two-step methodology for the prediction and association of aspects, opinions, and sentiments.

CMLA+ [98] offers a comprehensive solution for the simultaneous extraction of aspect and opinion terms using a multi-layer attention mechanism.

EMC-GCN [19] incorporates word relationships within a multi-channel graph structure, representing these relationships as nodes and edges for extracting aspect sentiment triplets.

SPAN-ASTE [99] explores the interaction between complete spans of aspects and opinions to predict sentiment relationships essential for triplet extraction.

IMN-BERT [100] learns multiple tasks associated with ABSA at both token and document levels simultaneously, using a multi-task network approach.

JET-BERT [101] employs an end-to-end model for triplet extraction with a position-aware tagging scheme to capture complex interactions among triplets.

DMRC [102] tackles all ABSA tasks in a unified framework, jointly training two BERT MRC models with shared parameters.

BMRC [103] conceptualizes ASTE as a multi-turn MRC problem, deploying a bidirectional MRC architecture to identify sentiment triplets.

BART-ABSA [104] converts ABSA tasks into a generative model framework using BART for an integrated approach.

SE-GCN [105] presents a Syntax-Enhanced Graph Convolutional Network, which integrates semantic and syntactic insights through graph convolution and attention mechanisms, thereby improving performance across various benchmarks.

Performance evaluation and comparative analysis

Our experimental evaluation on the D1 dataset, presented in Table 4, covers a variety of models on the OTE, AESC, AOP, and ASTE tasks. The models are assessed by precision, recall, and F1-score, which together give a comprehensive view of their performance in Aspect-Based Sentiment Analysis.
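
For reference, precision, recall, and F1 for triplet extraction are commonly computed by exact matching of predicted and gold triplets. The sketch below assumes that matching criterion, which is not restated in this section; the helper name and the example triplets are invented for illustration.

```python
def precision_recall_f1(predicted, gold):
    """Exact-match precision, recall and F1 over (aspect, opinion, sentiment) triplets."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


gold = [
    ("screen", "bright", "positive"),
    ("keyboard", "mushy", "negative"),
]
predicted = [
    ("screen", "bright", "positive"),
    ("keyboard", "mushy", "neutral"),  # wrong sentiment, so not counted as correct
    ("price", "fair", "positive"),     # spurious triplet
]
p, r, f1 = precision_recall_f1(predicted, gold)
```

Here one of three predictions is correct and one of two gold triplets is recovered, so precision is 1/3, recall is 1/2, and F1 is 0.4. A model that predicts more triplets tends to raise recall at the cost of precision, which is the trade-off visible in the comparison tables.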

Our model (listed as “Ours”) showed consistently high performance across all tasks, most notably in its F1-scores. This indicates a well-balanced trade-off between precision and recall, which is crucial for nuanced tasks in natural language processing. SE-GCN also emerged as a top performer, particularly in F1-score, which suggests that it copes well with the complex challenges of sentiment analysis.

In the OTE task, SE-GCN, BMRC, and our model achieved high F1-scores, indicating that they accurately identify opinion terms within texts. For AESC, our model and SE-GCN performed exceptionally well, demonstrating their ability to extract and analyze aspects and sentiments in tandem.

In the Aspect-Opinion Pairing task, our model and SE-GCN were again the strongest, suggesting that they pair aspects with their corresponding opinions correctly. In the ASTE task, our model demonstrated superior performance, underlining its capability to extract linked aspect-sentiment entities.

Compared with traditional models such as Li-Unified+ and RINANTE+, our model performs better on almost all metrics, which could be attributed to the more advanced or specialized methodologies it employs. RACL-BERT also performed well on certain tasks, likely benefiting from the contextual understanding provided by BERT embeddings. The TS model, while not topping any category, showed consistent performance across tasks, suggesting its robustness.

An interesting observation from the results is the trade-off between precision and recall in several models. Model selection for practical applications should therefore consider specific needs, such as whether precision matters more than recall or vice versa. The trade-off also marks an area for improvement: future research could explore integrating context-aware embeddings and more sophisticated neural network architectures to improve performance in Aspect-Based Sentiment Analysis.

In conclusion, our model demonstrates excellent performance across various tasks in ABSA on the D1 dataset, suggesting its potential for comprehensive and nuanced sentiment analysis in natural language processing. However, the choice of the model for specific applications should be aligned with the unique requirements of the task, considering the inherent trade-offs in precision, recall, and the complexities of natural language understanding. This study opens avenues for further research to enhance the accuracy and effectiveness of sentiment analysis models.

In Table 5, we compare various models for ASTE across four datasets: Lap14, Res14, Res15, and Res16. The evaluation metrics, Precision (P), Recall (R), and F1-score (F1), provide a comprehensive view of each model’s performance in complex sentiment analysis tasks. Notably, SE-GCN stands out on the Lap14 dataset, achieving the highest F1-score (59.72), which reflects its effective handling of sentiment relationships. However, our model is exceptionally consistent across all datasets, either closely matching or surpassing SE-GCN in F1-score. This is particularly evident on the Res14 and Res15 datasets, where our model records the highest F1-scores, showcasing its precision and robustness in sentiment analysis.

While other models like SPAN-ASTE and BART-ABSA show competitive performances, they are slightly outperformed by the leading models. In the Res16 dataset, our model continues its dominance with the highest F1-score (71.49), further establishing its efficacy in ASTE tasks. This performance indicates a refined balance in identifying and linking aspects and sentiments, a critical aspect of effective sentiment analysis. In contrast, models such as RINANTE+ and TS, despite their contributions, show room for improvement, especially in achieving a better balance between precision and recall.

The results presented in Table 5 emphasize the varying efficacy of models across different datasets. Each dataset’s unique characteristics, including the complexity of language and the nature of expressed aspects and sentiments, significantly impact model performance. The consistent top-tier performance of our model across diverse datasets highlights its adaptability and nuanced understanding of sentiment dynamics. Such adaptability is crucial in real-world scenarios, where data variability is a common challenge. Overall, these findings from Table 5 underscore the significance of developing versatile and robust models for Aspect-Based Sentiment Analysis, capable of adeptly handling a variety of linguistic and contextual complexities.

Model analysis

Ablation study.

The ablation study results reveal several important insights about the contributions of the model’s components. First, the complete model configuration, comprising the refinement process, syntactic features, and the integration of the MLEGCN and attention modules, consistently yields the highest F1 scores on both the Res14 and Lap14 datasets. This underscores the synergy between the components and suggests that each plays a role in the model’s ability to process and analyze linguistic data. In particular, removing the refinement process produces a uniform but relatively slight decrease in performance across all model variations and datasets. This suggests that the refinement process contributes a consistent yet modest gain, improving the final stage of the model’s predictions by fine-tuning the representations.

As shown in Table 6, the effects of removing the syntactic features and the MLEGCN and attention mechanisms are more pronounced. Excluding the syntactic features has varied impacts on performance, with larger declines in tasks that likely require a deeper understanding of linguistic structures, such as AESC, AOPE, and ASTE. This indicates that syntactic features are integral to the model’s ability to parse complex syntactic relationships. Even more critical is the role of the MLEGCN and attention mechanisms, whose removal causes the most substantial decreases in F1 scores across nearly all tasks and both datasets. This drop highlights their pivotal role in helping the model focus on and interpret intricate relational dynamics within the data. The attention mechanisms, in particular, weight the importance of different elements of the input, which suggests that their ability to direct the model’s focus is essential for tasks requiring nuanced understanding and interpretation.

These observations from the ablation study not only validate the design choices made in constructing the model but also highlight areas for further refinement and exploration. The consistent performance degradation observed upon the removal of these components confirms their necessity and opens up avenues for further enhancing these aspects of the model. Future work could explore more sophisticated or varied attention mechanisms and delve deeper into optimizing syntactic feature extraction and integration to boost the model’s performance, particularly in tasks that heavily rely on these components.

Syntactic features qualitative analysis

The visualizations in Figure 6 serve as a qualitative analysis of the model’s syntactic feature representation. The observable patterns in the embedding spaces provide insights into the model’s capacity to encode syntactic roles, dependencies, and relationships inherent in the linguistic data. For instance, the discernible clusters in the POS embeddings suggest that the model has learned distinct representations for different grammatical categories, which is crucial for tasks reliant on POS tagging. Moreover, the spread and arrangement of points in the dependency embeddings indicate the model’s ability to capture a variety of syntactic dependencies, a key aspect for parsing and related NLP tasks. Such qualitative observations complement our quantitative findings, together forming a comprehensive evaluation of the model’s performance.

Figure 6: Comprehensive visualization of the embeddings for four key syntactic features.

The case study presented in Table 7 examines our model’s capabilities in Aspect-Based Sentiment Analysis (ABSA) against established baselines such as BART-ABSA and BMRC. Across a diverse set of product reviews, our model consistently demonstrates superior accuracy in deciphering complex aspect-sentiment relationships. For example, in Review 3, our model accurately captures the positive opinion “superb” associated with “noise cancellation” and the negative opinion “short” tied to “battery life”, aligning perfectly with the ground truth. This precision is attributed to our model’s advanced linguistic feature extraction and refined sentiment contextualization, which outperforms the competing models, particularly in cases where the sentiment is subtle or the aspect term is compound. Moreover, the case study reveals the models’ error patterns: BART-ABSA occasionally falters in associating sentiments with the correct aspects, and BMRC sometimes misinterprets complex sentiment expressions. In contrast, our model exhibits a robust understanding of intricate linguistic cues, leading to its enhanced performance. These insights not only reaffirm our model’s adeptness at tackling the multifaceted nature of sentiment analysis but also highlight its potential as a tool for understanding and quantifying nuanced customer feedback across various product domains.

This research presents a pioneering framework for ABSA, significantly advancing the field. The model uniquely combines a biaffine attention mechanism with an MLEGCN, adeptly handling the complexities of syntactic and semantic structures in textual data. This approach allows for precise extraction and interpretation of aspects, opinions, and sentiments. The model’s proficiency in addressing all ABSA sub-tasks, including the challenging ASTE, is demonstrated through its integration of extensive linguistic features. The systematic refinement strategy further enhances its ability to align aspects with corresponding opinions, ensuring accurate sentiment analysis. Overall, this work sets a new standard in sentiment analysis, offering potential for various applications like market analysis and automated feedback systems. It paves the way for future research into combining linguistic insights with deep learning for more sophisticated language understanding.

Data availability

The datasets analyzed during the current study are available in the Wu et al. repository (https://github.com/NJUNLP/GTS) and the Xu et al. repository (https://github.com/xuuuluuu/SemEval-Triplet-data/tree/master/ASTE-Data-V2-EMNLP2020).

Ruder, S., Ghaffari, P. & Breslin, J. G. A hierarchical model of reviews for aspect-based sentiment analysis. CoRR (2016). arXiv:1609.02745 .

Mohammad, A.-S., Al-Ayyoub, M., Al-Sarhan, H. & Jararweh, Y. Using aspect-based sentiment analysis to evaluate arabic news affect on readers. In 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing (UCC) , 436–441 (IEEE, 2015).

Phan, M. H. & Ogunbona, P. O. Modelling context and syntactical features for aspect-based sentiment analysis. In Jurafsky, D., Chai, J., Schluter, N. & Tetreault, J. (eds.) Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics , 3211–3220, https://doi.org/10.18653/v1/2020.acl-main.293 (Association for Computational Linguistics, Online, 2020).

Xu, H., Liu, B., Shu, L. & Yu, P. S. BERT post-training for review reading comprehension and aspect-based sentiment analysis. CoRR (2019). arXiv:1904.02232 .

Chen, Z. & Qian, T. Enhancing aspect term extraction with soft prototypes. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) , 2107–2117 (2020).

Toh, Z. & Wang, W. Dlirec: Aspect term extraction and term polarity classification system. In Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014) , 235–240 (2014).

Li, X., Bing, L., Li, P., Lam, W. & Yang, Z. Aspect term extraction with history attention and selective transformation. CoRR (2018). arXiv:1805.00760 .

Wu, C., Wu, F., Wu, S., Yuan, Z. & Huang, Y. A hybrid unsupervised method for aspect term and opinion target extraction. Knowl.-Based Syst. 148 , 66–73. https://doi.org/10.1016/j.knosys.2018.01.019 (2018).

Dai, H. & Song, Y. Neural aspect and opinion term extraction with mined rules as weak supervision. CoRR (2019). arXiv:1907.03750 .

Kumar, A. et al. Aspect term extraction for opinion mining using a hierarchical self-attention network. Neurocomputing 465 , 195–204 (2021).

Tian, Y., Chen, G. & Song, Y. Enhancing aspect-level sentiment analysis with word dependencies. In Merlo, P., Tiedemann, J. & Tsarfaty, R. (eds.) Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume , 3726–3739, https://doi.org/10.18653/v1/2021.eacl-main.326 (Association for Computational Linguistics, Online, 2021).

Schouten, K. & Frasincar, F. Survey on aspect-level sentiment analysis. IEEE Trans. Knowl. Data Eng. 28 , 813–830. https://doi.org/10.1109/TKDE.2015.2485209 (2016).

Sun, K., Zhang, R., Mensah, S., Mao, Y. & Liu, X. Aspect-level sentiment analysis via convolution over dependency tree. In Inui, K., Jiang, J., Ng, V. & Wan, X. (eds.) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) , 5679–5688, https://doi.org/10.18653/v1/D19-1569 (Association for Computational Linguistics, Hong Kong, China, 2019).

Zhou, J. et al. Moit: A novel task for mining opinions towards implicit targets. Eng. Appl. Artif. Intell. 126 , 106841. https://doi.org/10.1016/j.engappai.2023.106841 (2023).

Marrese-Taylor, E., Velásquez, J. D. & Bravo-Marquez, F. A novel deterministic approach for aspect-based opinion mining in tourism products reviews. Expert Syst. Appl. 41 , 7764–7775. https://doi.org/10.1016/j.eswa.2014.05.045 (2014).

Dey, S. Aspect extraction and sentiment classification of mobile apps using app-store reviews. CoRR (2017). arXiv:1712.03430 .

Yuan, L., Wang, J., Yu, L.-C. & Zhang, X. Encoding syntactic information into transformers for aspect-based sentiment triplet extraction. IEEE Trans. Affect. Comput. https://doi.org/10.1109/TAFFC.2023.3291730 (2023).

Liu, S., Li, K. & Li, Z. A robustly optimized BMRC for aspect sentiment triplet extraction. In Carpuat, M., de Marneffe, M.-C. & Meza Ruiz, I. V. (eds.) Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , 272–278, https://doi.org/10.18653/v1/2022.naacl-main.20 (Association for Computational Linguistics, Seattle, United States, 2022).

Chen, H., Zhai, Z., Feng, F., Li, R. & Wang, X. Enhanced multi-channel graph convolutional network for aspect sentiment triplet extraction. In Muresan, S., Nakov, P. & Villavicencio, A. (eds.) Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , 2974–2985, https://doi.org/10.18653/v1/2022.acl-long.212 (Association for Computational Linguistics, Dublin, Ireland, 2022).

Aziz, K. et al. Urduaspectnet: Fusing transformers and dual gcn for urdu aspect-based sentiment detection. ACM Trans. Asian Low-Resour. Lang. Inf. Process. https://doi.org/10.1145/3663367 (2024). Just Accepted.

Fei, H. et al. On the robustness of aspect-based sentiment analysis: Rethinking model, data, and training. ACM Trans. Inf. Syst. https://doi.org/10.1145/3564281 (2022).

Shi, L., Han, D., Han, J., Qiao, B. & Wu, G. Dependency graph enhanced interactive attention network for aspect sentiment triplet extraction. Neurocomputing 507 , 315–324. https://doi.org/10.1016/j.neucom.2022.07.067 (2022).

Liu, J. et al. Unified instance and knowledge alignment pretraining for aspect-based sentiment analysis. IEEE/ACM Transactions on Audio, Speech, and Language Processing 31 , 2629–2642. https://doi.org/10.1109/TASLP.2023.3290431 (2023).

Yang, H., Zhang, C. & Li, K. Pyabsa: A modularized framework for reproducible aspect-based sentiment analysis. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, CIKM ’23, 5117–5122, https://doi.org/10.1145/3583780.3614752 (Association for Computing Machinery, New York, NY, USA, 2023).

Chen, C., Teng, Z., Wang, Z. & Zhang, Y. Discrete opinion tree induction for aspect-based sentiment analysis. In Muresan, S., Nakov, P. & Villavicencio, A. (eds.) Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , 2051–2064, https://doi.org/10.18653/v1/2022.acl-long.145 (Association for Computational Linguistics, Dublin, Ireland, 2022).

Mao, R. & Li, X. Bridging towers of multi-task learning with a gating mechanism for aspect-based sentiment analysis and sequential metaphor identification. In Proceedings of the AAAI conference on artificial intelligence 35 , 13534–13542 (2021).

Nazir, A., Rao, Y., Wu, L. & Sun, L. Issues and challenges of aspect-based sentiment analysis: A comprehensive survey. IEEE Trans. Affect. Comput. 13 , 845–863. https://doi.org/10.1109/TAFFC.2020.2970399 (2022).

Liu, H., Chatterjee, I., Zhou, M., Lu, X. S. & Abusorrah, A. Aspect-based sentiment analysis: A survey of deep learning methods. IEEE Trans. Comput. Soc. Syst. 7 , 1358–1375. https://doi.org/10.1109/TCSS.2020.3033302 (2020).

Hoang, M., Bihorac, O. A. & Rouces, J. Aspect-based sentiment analysis using BERT. In Hartmann, M. & Plank, B. (eds.) Proceedings of the 22nd Nordic Conference on Computational Linguistics , 187–196 (Linköping University Electronic Press, Turku, Finland, 2019).

Pandey, S. V. & Deorankar, A. V. A study of sentiment analysis task and it’s challenges. In 2019 IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT) , 1–5, https://doi.org/10.1109/ICECCT.2019.8869160 (2019).

T, B. S. et al. Asvm: Adaboost with svm-based classifier implementation for aspect-based opinion mining to appraise products. In 2022 International Conference on Inventive Computation Technologies (ICICT) , 1034–1041, https://doi.org/10.1109/ICICT54344.2022.9850655 (2022).

Liu, P., Zhang, L. & Gulla, J. A. Multilingual review-aware deep recommender system via aspect-based sentiment analysis. ACM Trans. Inf. Syst. https://doi.org/10.1145/3432049 (2021).

Jafarian, H., Taghavi, A. H., Javaheri, A. & Rawassizadeh, R. Exploiting bert to improve aspect-based sentiment analysis performance on persian language. In 2021 7th International Conference on Web Research (ICWR) , 5–8, https://doi.org/10.1109/ICWR51868.2021.9443131 (2021).

Zhang, W., He, R., Peng, H., Bing, L. & Lam, W. Cross-lingual aspect-based sentiment analysis with aspect term code-switching. In Moens, M.-F., Huang, X., Specia, L. & Yih, S. W.-t. (eds.) Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing , 9220–9230, https://doi.org/10.18653/v1/2021.emnlp-main.727 (Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, 2021).

Xiao, L. et al. Atlantis: Aesthetic-oriented multiple granularities fusion network for joint multimodal aspect-based sentiment analysis. Information Fusion 102304 (2024).

Ali, W., Yang, Y., Qiu, X., Ke, Y. & Wang, Y. Aspect-level sentiment analysis based on bidirectional-gru in siot. IEEE Access 9 , 69938–69950. https://doi.org/10.1109/ACCESS.2021.3078114 (2021).

Zhang, X., Yu, L. & Tian, S. Bgat: Aspect-based sentiment analysis based on bidirectional GRU and graph attention network. J. Intell. Fuzzy Syst. 44 , 3115–3126 (2023).

Nagelli, A. & Saleena, B. Optimal trained bi-long short term memory for aspect based sentiment analysis with weighted aspect extraction. J. Web Eng. 21 , 2115–2148 (2022).

Üveges, I. & Ring, O. Hunembert: A fine-tuned bert-model for classifying sentiment and emotion in political communication. IEEE Access 11 , 60267–60278. https://doi.org/10.1109/ACCESS.2023.3285536 (2023).

Du, K., Xing, F. & Cambria, E. Incorporating multiple knowledge sources for targeted aspect-based financial sentiment analysis. ACM Trans. Manage. Inf. Syst. https://doi.org/10.1145/3580480 (2023).

Lengkeek, M., van der Knaap, F. & Frasincar, F. Leveraging hierarchical language models for aspect-based sentiment analysis on financial data. Inf. Process. Manag. 60 , 103435. https://doi.org/10.1016/j.ipm.2023.103435 (2023).

Tian, Y., Chen, G. & Song, Y. Aspect-based sentiment analysis with type-aware graph convolutional networks and layer ensemble. In Toutanova, K. et al. (eds.) Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , 2910–2922, https://doi.org/10.18653/v1/2021.naacl-main.231 (Association for Computational Linguistics, Online, 2021).

Li, R. et al. Dual graph convolutional networks for aspect-based sentiment analysis. In Zong, C., Xia, F., Li, W. & Navigli, R. (eds.) Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) , 6319–6329, https://doi.org/10.18653/v1/2021.acl-long.494 (Association for Computational Linguistics, Online, 2021).

Veyseh, A. P. B. et al. Improving aspect-based sentiment analysis with gated graph convolutional networks and syntax-based regulation. CoRR (2020). arXiv:2010.13389 .

Khan, J., Ahmad, N., Khalid, S., Ali, F. & Lee, Y. Sentiment and context-aware hybrid DNN with attention for text sentiment classification. IEEE Access 11 , 28162–28179. https://doi.org/10.1109/ACCESS.2023.3259107 (2023).

Zhang, Q., Wang, S. & Li, J. A contrastive learning framework with tree-lstms for aspect-based sentiment analysis. Neural Processing Letters 1–18 (2023).

Xue, W. & Li, T. Aspect based sentiment analysis with gated convolutional networks. CoRR (2018). arXiv:1805.07043 .

Ma, Y., Peng, H. & Cambria, E. Targeted aspect-based sentiment analysis via embedding commonsense knowledge into an attentive lstm. In Proceedings of the AAAI conference on artificial intelligence , vol. 32 (2018).

Yu, G. et al. Making flexible use of subtasks: A multiplex interaction network for unified aspect-based sentiment analysis. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 , 2695–2705 (2021).

Dai, J., Yan, H., Sun, T., Liu, P. & Qiu, X. Does syntax matter? A strong baseline for aspect-based sentiment analysis with roberta. CoRR (2021). arXiv:2104.04986 .

Chen, W., Du, J., Zhang, Z., Zhuang, F. & He, Z. A hierarchical interactive network for joint span-based aspect-sentiment analysis (2022). arXiv:2208.11283 .

Zhao, H., Huang, L., Zhang, R., Lu, Q. & Xue, H. SpanMlt: A span-based multi-task learning framework for pair-wise aspect and opinion terms extraction. In Jurafsky, D., Chai, J., Schluter, N. & Tetreault, J. (eds.) Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics , 3239–3248, https://doi.org/10.18653/v1/2020.acl-main.296 (Association for Computational Linguistics, Online, 2020).

Mao, R., Liu, Q., He, K., Li, W. & Cambria, E. The biases of pre-trained language models: An empirical study on prompt-based sentiment analysis and emotion detection. IEEE transactions on affective computing (2022).

Chen, P., Sun, Z., Bing, L. & Yang, W. Recurrent attention network on memory for aspect sentiment analysis. In Palmer, M., Hwa, R. & Riedel, S. (eds.) Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing , 452–461, https://doi.org/10.18653/v1/D17-1047 (Association for Computational Linguistics, Copenhagen, Denmark, 2017).

Tang, D., Qin, B. & Liu, T. Aspect level sentiment classification with deep memory network. arXiv preprint arXiv:1605.08900 (2016).

Medhat, W., Hassan, A. & Korashy, H. Sentiment analysis algorithms and applications: A survey. Ain Shams Eng J 5 , 1093–1113 (2014).

Brody, S. & Elhadad, N. An unsupervised aspect-sentiment model for online reviews. In Human language technologies: The 2010 annual conference of the North American chapter of the association for computational linguistics , 804–812 (2010).

Feldman, R. Techniques and applications for sentiment analysis. Commun. ACM 56 , 82–89 (2013).

Zhang, L., Wang, S. & Liu, B. Deep learning for sentiment analysis: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 8 , e1253 (2018).

Liu, Q., Zhang, H., Zeng, Y., Huang, Z. & Wu, Z. Content attention model for aspect based sentiment analysis. In Proceedings of the 2018 World Wide Web Conference , WWW ’18, 1023-1032, https://doi.org/10.1145/3178876.3186001 (International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, 2018).

Lin, Z. et al. A structured self-attentive sentence embedding. CoRR (2017). arXiv:1703.03130 .

Chen, Z., Xue, Y., Xiao, L., Chen, J. & Zhang, H. Aspect-based sentiment analysis using graph convolutional networks and co-attention mechanism. In Neural Information Processing: 28th International Conference, ICONIP 2021, Sanur, Bali, Indonesia, December 8–12, 2021, Proceedings, Part VI 28 , 441–448 (Springer, 2021).

Wang, K., Shen, W., Yang, Y., Quan, X. & Wang, R. Relational graph attention network for aspect-based sentiment analysis. CoRR (2020). arXiv:2004.12362 .

Galassi, A., Lippi, M. & Torroni, P. Attention in natural language processing. IEEE Trans Neural Netw Learn Syst 32 , 4291–4308 (2020).

Huang, Z., Zhao, H., Peng, F., Chen, Q. & Zhao, G. Aspect category sentiment analysis with self-attention fusion networks. In Database Systems for Advanced Applications: 25th International Conference, DASFAA 2020, Jeju, South Korea, September 24–27, 2020, Proceedings, Part III 25 , 154–168 (Springer, 2020).

Zhang, C., Li, Q. & Song, D. Aspect-based sentiment classification with aspect-specific graph convolutional networks. CoRR (2019). arXiv:1909.03477 .

Chaudhari, S., Mithal, V., Polatkan, G. & Ramanath, R. An attentive survey of attention models. ACM Trans Intell Syst Technol (TIST) 12 , 1–32 (2021).

Zhang, M. & Qian, T. Convolution over hierarchical syntactic and lexical graphs for aspect level sentiment analysis. In Webber, B., Cohn, T., He, Y. & Liu, Y. (eds.) Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) , 3540–3549, https://doi.org/10.18653/v1/2020.emnlp-main.286 (Association for Computational Linguistics, Online, 2020).

Zheng, Y., Zhang, R., Mensah, S. & Mao, Y. Replicate, walk, and stop on syntax: an effective neural network model for aspect-level sentiment classification. In Proceedings of the AAAI conference on artificial intelligence 34 , 9685–9692 (2020).

Zhang, C., Li, Q. & Song, D. Syntax-aware aspect-level sentiment classification with proximity-weighted convolution network. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval , SIGIR’19, 1145-1148, https://doi.org/10.1145/3331184.3331351 (Association for Computing Machinery, New York, NY, USA, 2019).

Huang, L., Sun, X., Li, S., Zhang, L. & Wang, H. Syntax-aware graph attention network for aspect-level sentiment classification. In Scott, D., Bel, N. & Zong, C. (eds.) Proceedings of the 28th International Conference on Computational Linguistics , 799–810, https://doi.org/10.18653/v1/2020.coling-main.69 (International Committee on Computational Linguistics, Barcelona, Spain (Online), 2020).

Xu, L., Pang, X., Wu, J., Cai, M. & Peng, J. Learn from structural scope: Improving aspect-level sentiment analysis with hybrid graph convolutional networks. Neurocomputing 518 , 373–383. https://doi.org/10.1016/j.neucom.2022.10.071 (2023).

Xiao, L. et al. Exploring fine-grained syntactic information for aspect-based sentiment classification with dual graph neural networks. Neurocomputing 471 , 48–59 (2022).

Zhang, R., Chen, Q., Zheng, Y., Mensah, S. & Mao, Y. Aspect-level sentiment analysis via a syntax-based neural network. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30 , 2568–2583. https://doi.org/10.1109/TASLP.2022.3190731 (2022).

Xiao, L. et al. Cross-modal fine-grained alignment and fusion network for multimodal aspect-based sentiment analysis. Information Processing & Management 60 , 103508. https://doi.org/10.1016/j.ipm.2023.103508 (2023).

Liang, S., Wei, W., Mao, X.-L., Wang, F. & He, Z. BiSyn-GAT: Bi-syntax aware graph attention network for aspect-based sentiment analysis. In Findings of the Association for Computational Linguistics: ACL 2022 , https://doi.org/10.18653/v1/2022.findings-acl.144 (Association for Computational Linguistics, 2022).

Huang, B. et al. Crf-gcn: An effective syntactic dependency model for aspect-level sentiment analysis. Knowl.-Based Syst. 260 , 110125 (2023).

Bao, X., Wang, Z., Jiang, X., Xiao, R. & Li, S. Aspect-based sentiment analysis with opinion tree generation. In IJCAI 2022 , 4044–4050 (2022).

Wu, Z. et al. Grid tagging scheme for aspect-oriented fine-grained opinion extraction. In Cohn, T., He, Y. & Liu, Y. (eds.) Findings of the Association for Computational Linguistics: EMNLP 2020 , 2576–2585, https://doi.org/10.18653/v1/2020.findings-emnlp.234 (Association for Computational Linguistics, Online, 2020).

Miwa, M. & Sasaki, Y. Modeling joint entity and relation extraction with table representation. In Moschitti, A., Pang, B. & Daelemans, W. (eds.) Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) , 1858–1869, https://doi.org/10.3115/v1/D14-1200 (Association for Computational Linguistics, Doha, Qatar, 2014).

Gupta, P., Schütze, H. & Andrassy, B. Table filling multi-task recurrent neural network for joint entity and relation extraction. In Matsumoto, Y. & Prasad, R. (eds.) Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers , 2537–2547 (The COLING 2016 Organizing Committee, Osaka, Japan, 2016).

Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Burstein, J., Doran, C. & Solorio, T. (eds.) Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) , 4171–4186, https://doi.org/10.18653/v1/N19-1423 (Association for Computational Linguistics, Minneapolis, Minnesota, 2019).

Dozat, T. & Manning, C. D. Deep biaffine attention for neural dependency parsing. arXiv preprint arXiv:1611.01734 (2016).

Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. CoRR (2016). arXiv:1609.02907 .

Guo, Z., Zhang, Y. & Lu, W. Attention guided graph convolutional networks for relation extraction. In Korhonen, A., Traum, D. & Màrquez, L. (eds.) Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics , 241–251, https://doi.org/10.18653/v1/P19-1024 (Association for Computational Linguistics, Florence, Italy, 2019).

Read, J., Pfahringer, B., Holmes, G. & Frank, E. Classifier chains for multi-label classification. Mach. Learn. 85 , 333–359. https://doi.org/10.1007/s10994-011-5256-5 (2011).

Article   MathSciNet   Google Scholar  

Pontiki, M., Galanis, D., Papageorgiou, H., Manandhar, S. & Androutsopoulos, I. SemEval-2015 task 12: Aspect based sentiment analysis. In Nakov, P., Zesch, T., Cer, D. & Jurgens, D. (eds.) Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015) , 486–495, https://doi.org/10.18653/v1/S15-2082 (Association for Computational Linguistics, Denver, Colorado, 2015).

Pontiki, M. et al. Semeval-2016 task 5: Aspect based sentiment analysis. In ProWorkshop on Semantic Evaluation (SemEval-2016) , 19–30 (Association for Computational Linguistics, 2016).

Peng, H. et al. Knowing what, how and why: A near complete solution for aspect-based sentiment analysis. Proceedings of the AAAI Conference on Artificial Intelligence 34 , 8600–8607. https://doi.org/10.1609/aaai.v34i05.6383 (2020).

Xu, L., Li, H., Lu, W. & Bing, L. Position-aware tagging for aspect sentiment triplet extraction. In Webber, B., Cohn, T., He, Y. & Liu, Y. (eds.) Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) , 2339–2349, https://doi.org/10.18653/v1/2020.emnlp-main.183 (Association for Computational Linguistics, Online, 2020).

Loshchilov, I. & Hutter, F. Fixing weight decay regularization in adam (2018).

Qi, P., Zhang, Y., Zhang, Y., Bolton, J. & Manning, C. D. Stanza: A python natural language processing toolkit for many human languages. In Celikyilmaz, A. & Wen, T.-H. (eds.) Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations , 101–108, https://doi.org/10.18653/v1/2020.acl-demos.14 (Association for Computational Linguistics, Online, 2020).

Zhang, C., Li, Q., Song, D. & Wang, B. A multi-task learning framework for opinion triplet extraction. In Cohn, T., He, Y. & Liu, Y. (eds.) Findings of the Association for Computational Linguistics: EMNLP 2020 , 819–828, https://doi.org/10.18653/v1/2020.findings-emnlp.72 (Association for Computational Linguistics, Online, 2020).

Li, X., Bing, L., Li, P. & Lam, W. A unified model for opinion target extraction and target sentiment prediction. 33 , 6714–6721. https://doi.org/10.1609/aaai.v33i01.33016714 (2019).

Dai, H. & Song, Y. Neural aspect and opinion term extraction with mined rules as weak supervision. In Korhonen, A., Traum, D. & Màrquez, L. (eds.) Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics , 5268–5277, https://doi.org/10.18653/v1/P19-1520 (Association for Computational Linguistics, Florence, Italy, 2019).

Wang, W., Pan, S. J., Dahlmeier, D. & Xiao, X. Coupled multi-layer attentions for co-extraction of aspect and opinion terms. 31 , https://doi.org/10.1609/aaai.v31i1.10974 (2017).

Xu, L., Chia, Y. K. & Bing, L. Learning span-level interactions for aspect sentiment triplet extraction. In Zong, C., Xia, F., Li, W. & Navigli, R. (eds.) Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) , 4755–4766, https://doi.org/10.18653/v1/2021.acl-long.367 (Association for Computational Linguistics, Online, 2021).

He, R., Lee, W. S., Ng, H. T. & Dahlmeier, D. An interactive multi-task learning network for end-to-end aspect-based sentiment analysis. In Korhonen, A., Traum, D. & Màrquez, L. (eds.) Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics , 504–515, https://doi.org/10.18653/v1/P19-1048 (Association for Computational Linguistics, Florence, Italy, 2019).

Mao, Y., Shen, Y., Yu, C. & Cai, L. A joint training dual-mrc framework for aspect based sentiment analysis. Proceedings of the AAAI Conference on Artificial Intelligence 35 , 13543–13551. https://doi.org/10.1609/aaai.v35i15.17597 (2021).

Chen, S., Wang, Y., Liu, J. & Wang, Y. Bidirectional machine reading comprehension for aspect sentiment triplet extraction. Proceedings of the AAAI Conference on Artificial Intelligence 35 , 12666–12674. https://doi.org/10.1609/aaai.v35i14.17500 (2021).

Yan, H., Dai, J., Ji, T., Qiu, X. & Zhang, Z. A unified generative framework for aspect-based sentiment analysis. In Zong, C., Xia, F., Li, W. & Navigli, R. (eds.) Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) , 2416–2429, https://doi.org/10.18653/v1/2021.acl-long.188 (Association for Computational Linguistics, Online, 2021).

Shi, J., Li, W., Bai, Q., Yang, Y. & Jiang, J. Syntax-enhanced aspect-based sentiment analysis with multi-layer attention. Neurocomputing 557 , 126730. https://doi.org/10.1016/j.neucom.2023.126730 (2023).

Download references

Acknowledgements

The authors are grateful for the support provided by the National Natural Science Foundation of China (NSFC), which funded this research under Project No. 62176187.

Author information

Authors and affiliations.

Key Laboratory of Aerospace Information Security and Trusted Computing Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan, China

Kamran Aziz & Donghong Ji

Department of Computer Science and Engineering, Sir Padampat Singhania University, Udaipur, 313601, Rajasthan, India

Prasun Chakrabarti

Department of Chemistry, Sir Padampat Singhania University, Udaipur, 313601, Rajasthan, India

Tulika Chakrabarti

Department of Computer Science and Information Technology, Women University, Bagh, Azad Jammu and Kashmir, Pakistan

Muhammad Shahid Iqbal

School of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, 325035, China

Rashid Abbasi

Contributions

K.A. conceived the study, conducted the majority of the experiments, and wrote the main manuscript text. D.J. provided critical feedback and helped shape the research, analysis, and manuscript. P.C. contributed to the design and implementation of the research, and T.C. assisted in data analysis and interpretation. M.S.I. and R.A. contributed to the preparation of figures and data visualization. All authors reviewed and approved the manuscript.

Corresponding author

Correspondence to Donghong Ji.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article.

Aziz, K., Ji, D., Chakrabarti, P. et al. Unifying aspect-based sentiment analysis BERT and multi-layered graph convolutional networks for comprehensive sentiment dissection. Sci Rep 14, 14646 (2024). https://doi.org/10.1038/s41598-024-61886-7

Received: 28 January 2024

Accepted: 10 May 2024

Published: 25 June 2024

DOI: https://doi.org/10.1038/s41598-024-61886-7

Sentiment analysis examples: How marketers are unlocking consumer insights

Written by Ronnie Gomez

Published on June 20, 2024

Reading time: 5 minutes

We’ve all heard the phrase “the customer is always right”. The standard brick-and-mortar shopping experiences of the past made that a pretty straightforward idea. Today, consumers can share their opinion anywhere—in store, on review pages, on their personal media profiles, on your brand’s social media profiles. The list goes on.

Advancements in sentiment analysis technology have made it so businesses can keep up with their customers at scale by quickly synthesizing audience thoughts and feelings about brands, industries, trending topics and more.

In this article, we’re sharing three sentiment analysis examples designed to help you think bigger about its potential role in your business strategy.

Why is sentiment analysis important for businesses?

By understanding the sentiment behind customer interactions, businesses can learn what they are doing well and what they need to improve. This insight enables teams throughout the organization to make better decisions and enhance their products and services.

Here are three of the most meaningful impacts of sentiment analysis in business.

It strengthens customer service

Sentiment analysis helps businesses monitor customer satisfaction trends and pinpoint areas for improvement. Marketers can use these insights to craft strategies that boost customer loyalty and retention.

We know consistently analyzing customer feedback allows businesses to address issues before they escalate, boosting satisfaction and trust. That said, customers are human. Their feedback has nuances, whether it be in the form of regional slang, spelling or even the use of emojis.

An X post celebrating a product from the skin care brand Topicals. The post uses slang and emojis.

Aspect-based sentiment analysis tools enable businesses to detect nuances and identify trends in large volumes of customer feedback, so brands can take appropriate action.

It improves brand reputation

Sentiment data helps brands understand the timeless question: What do people think about us? What used to take countless focus groups and user surveys can now be determined from intuitive reporting dashboards.

The Sentiment Trends table, found in the performance tab of Sprout Social's Listening tool. The table displays trends in sentiment over time.

By analyzing the emotional tone of online conversations, companies can precisely identify their brand strengths and areas to improve. Equipped with this insight, they can develop targeted marketing plans, enhance product or service offerings and, eventually, strengthen their brand’s image.

Businesses that consistently manage their brand reputation will attract and retain customers, nurture loyalty and fuel business growth.

It provides consumer insight

Sentiment data isn’t just about knowing your current customers; it’s also your ticket to winning more in the future.

Sentiment analysis presents businesses with a unique opportunity to dive into the minds of their target audience, gaining a deeper understanding of their needs and expectations. This knowledge acts as a guiding compass, helping businesses develop products and services that precisely meet consumer demands, leading to increased growth and profitability.

Approaching sentiment data from an industry lens also allows brands to benchmark their offerings against key competitors, making it easier to identify opportunities to swoop in and win their customers.

3 sentiment analysis examples to inspire your approach

The following sentiment analysis examples showcase how other brands use sentiment analysis data to refine their approach to product development, customer care and more. Use them to inspire your audience research strategy and playbook.

1. Dig into product-specific trends in consumer perception

If your business offers a variety of different products, services or in-house brands, you can use sentiment analysis data to uncover which offerings are outperforming expectations and why. Here’s a real-world social media sentiment analysis example that proves it.

A food and beverage company used sentiment analysis tools available in Sprout’s Social Listening solution to determine which snack bar flavors consumers preferred most. While the team knew a few products were more popular than others, they didn’t know why.

To find out, they used Listening Topic Themes, which are additional groupings applied to a Listening Topic’s messages to compare, filter and analyze data. The team created a Theme for each flavor of their snack bar to compare Listening data across categories. By filtering the surfaced messages by sentiment, they could break down the distribution of positive, negative and neutral messages shared under each Theme.
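The breakdown described here can be pictured with a small sketch. It does not use Sprout's product or API; the message data, theme names and the `sentiment_distribution` helper are all hypothetical, and the only assumption is that each message already carries a theme and a sentiment label.

```python
from collections import Counter, defaultdict

LABELS = ("positive", "negative", "neutral")

# Hypothetical sample data: (theme, sentiment label) pairs, one per message.
messages = [
    ("chocolate", "positive"),
    ("chocolate", "positive"),
    ("chocolate", "neutral"),
    ("peanut butter", "negative"),
    ("peanut butter", "positive"),
    ("peanut butter", "neutral"),
]


def sentiment_distribution(pairs):
    """Return {theme: {label: share of that theme's messages}}."""
    counts = defaultdict(Counter)
    for theme, label in pairs:
        counts[theme][label] += 1

    distribution = {}
    for theme, counter in counts.items():
        total = sum(counter.values())
        # A Counter returns 0 for labels that never occurred in a theme.
        distribution[theme] = {label: counter[label] / total for label in LABELS}
    return distribution


shares = sentiment_distribution(messages)
for theme, by_label in shares.items():
    print(theme, by_label)
```

Comparing shares rather than raw counts keeps a high-volume flavor from looking more positive simply because more people mention it.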

A preview of Sprout’s Listening dashboard highlighting Sentiment Summary and Sentiment Trends.

After reviewing the data aggregated by Sprout’s Listening tool, the team found that chocolate and chocolate-related flavors generated a significantly higher ratio of positive sentiment. The positive sentiment stemmed from consumer preferences toward product consistency. Their fans liked chewy over crunchy.

This information was then passed on to the research and development team, who used it to inform future product releases.  

2. Perform risk assessments ahead of high profile campaigns and initiatives

As internet conversations grow more vast and complex, brand safety has become more crucial than ever. Even the most well-resourced public relations teams can’t proactively identify brand safety issues without smart tools that guide them on where to look.

Big Machine Label Group uses sentiment analysis tools to stay on top of what people are saying online, both good and bad.

“[Sentiment analysis] has been helpful as far as trying to understand how people are reacting to us and our artists. It’s also extremely helpful to have a listening tool where we can flag certain conversations internally and amongst the team, including conversations that could be delicate.” – Matt Brum, Director of Digital Strategy and Social Media, BMLG

Conducting a sentiment analysis on campaign themes and topics before major launches is the only surefire way to consider your message from every possible angle. By analyzing messages and conversations that drive negative sentiment, you can identify potential issues early. This can mean the difference between a well-executed campaign and a well-executed crisis management plan.

3. Identify customer experience opportunities

If you’re only paying attention to the customer feedback that comes in the form of glowing reviews or angry DMs, you’re not getting the full story. Customers share feedback every time they interact with your brand. Even a simple heart-eyes emoji left in your comment section can say a lot.

Conducting sentiment analysis on incoming social messages and reviews provides a more holistic view of your customer experience. Brands using Sprout do this with the sentiment analysis features available in the Smart Inbox and Review Management tools.

Sprout’s sentiment analysis is built using a machine learning technique called a Deep Neural Network (DNN). When you receive a new review or social message, the DNN computes a probability score to determine whether the content is positive, negative or neutral. Messages are automatically given a sentiment label, but users can change those classifications if needed.
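As a rough illustration of that last step, here is a minimal sketch of how raw model scores could become a probability per label, with room for a manual correction. It is not Sprout's implementation: the raw scores, the softmax step and the `classify` function are assumptions made for the example.

```python
import math

LABELS = ("positive", "negative", "neutral")


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def classify(raw_scores, manual_label=None):
    """Return (label, probabilities); a manual label overrides the prediction."""
    if manual_label is not None and manual_label not in LABELS:
        raise ValueError(f"unknown label: {manual_label}")
    probabilities = dict(zip(LABELS, softmax(raw_scores)))
    predicted = max(probabilities, key=probabilities.get)
    label = manual_label if manual_label is not None else predicted
    return label, probabilities


# Hypothetical raw scores that a trained network might emit for one message.
label, probs = classify([2.0, 0.5, 1.0])
overridden, _ = classify([2.0, 0.5, 1.0], manual_label="neutral")
print(label, probs)
print(overridden)
```

Keeping the probabilities alongside the final label means a corrected message still records what the model originally predicted.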

Brands use this tool in collaboration with the Tagging feature to better understand the sentiment driving trends in customer feedback. For example, an automotive brand can create a Tag for feedback specific to their dealership experience. From there, they can filter Tagged messages by sentiment classification for deeper insights—like all the feedback about a positive dealership experience.

Feeling inspired by these sentiment analysis examples?

You know what they say about inspiration. It’s like a spark; you have to act on it before it fades.

Schedule a demo of Sprout Social’s social media listening tools for a personalized assessment of your social strategy. Our team will put you on the path to getting more from your audience sentiment data, so you can drive even more impact from social, faster.

Sentiment analysis, or opinion mining, is an AI technique that determines whether the sentiment in a piece of data is positive, negative or neutral. It uses algorithms that work alongside other AI techniques, such as named entity recognition (NER), natural language processing (NLP) and machine learning (ML), to assess sentiment quickly and accurately.

Sentiment Analysis and Corpus: Cognitive Perspective and Overhead-accuracy Tradeoff

Index terms

Computing methodologies

Artificial intelligence

Natural language processing

Machine learning

Machine learning approaches

Kernel methods

Support vector machines

Information systems

Information retrieval

Retrieval tasks and goals

Sentiment analysis

Published in

ACM Transactions on Asian and Low-Resource Language Information Processing

Google, USA

Association for Computing Machinery

New York, NY, United States

Author tags

  • Corpus of Contemporary American
  • sentiment analysis
  • support vector machine
  • natural language processing
  • Research-article

Funding Sources

  • Natural Science Foundation of Zhejiang Province
  • Humanities and Social Sciences Research Project of the Ministry of Education

Regional Economic Sentiment: Constructing Quantitative Estimates from the Beige Book and Testing Their Ability to Forecast Recessions

Ilias Filippou, Christian Garciga, James Mitchell, My T. Nguyen

June 25, 2024

Federal Reserve Research: Cleveland

We use natural language processing methods to quantify the sentiment expressed in the Federal Reserve’s anecdotal summaries of current economic conditions in the national and 12 Federal Reserve District-level economies as published eight times per year in the Beige Book since 1970. We document that both national and District-level economic sentiment tend to rise and fall with the US business cycle. But economic sentiment is extremely heterogeneous across Districts, and we find that national economic sentiment is not always the simple aggregation of District-level sentiment. We show that the heterogeneity in District-level economic sentiment can be used, over and above the information contained in national economic sentiment, to better forecast US recessions.
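The abstract does not say which natural language processing methods the authors use, so the following is only a toy sketch of the general idea: score each District's text, then look at both the average and the spread across Districts. The word lists, the district names, the report snippets and the `net_tone` function are all hypothetical.

```python
import statistics

# Hypothetical word lists; a real study would use a vetted lexicon or a trained model.
POSITIVE = {"strong", "growth", "improved", "expanded"}
NEGATIVE = {"weak", "decline", "slowed", "contracted"}


def net_tone(text):
    """Return (positive - negative) / (positive + negative) word counts, or 0.0."""
    words = [w.strip(".,;") for w in text.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


# Hypothetical district-level summaries, not quotations from the Beige Book.
district_reports = {
    "District A": "Retail sales improved and hiring expanded.",
    "District B": "Manufacturing slowed and demand was weak.",
    "District C": "Activity was unchanged.",
}

scores = {name: net_tone(text) for name, text in district_reports.items()}
mean_score = statistics.mean(scores.values())    # simple cross-district average
dispersion = statistics.pstdev(scores.values())  # spread across districts
print(scores, mean_score, dispersion)
```

In this toy data the average is flat while the spread is large, which is the kind of cross-district heterogeneity that an average alone would hide.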

Dancing in the syntax forest: fast, accurate and explainable sentiment analysis with SALSA

  • Gómez-Rodríguez, Carlos
  • Imran, Muhammad
  • Vilares, David
  • Solera, Elena
  • Kellert, Olga

Sentiment analysis is a key technology for companies and institutions to gauge public opinion on products, services or events. However, for large-scale sentiment analysis to be accessible to entities with modest computational resources, it needs to be performed in a resource-efficient way. While some efficient sentiment analysis systems exist, they tend to apply shallow heuristics, which do not take into account syntactic phenomena that can radically change sentiment. Conversely, alternatives that take syntax into account are computationally expensive. The SALSA project, funded by the European Research Council under a Proof-of-Concept Grant, aims to leverage recently-developed fast syntactic parsing techniques to build sentiment analysis systems that are lightweight and efficient, while still providing accuracy and explainability through the explicit use of syntax. We intend our approaches to be the backbone of a working product of interest for SMEs to use in production.
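To see why syntax can radically change sentiment, consider negation. The sketch below is not the SALSA system: it uses a hand-built dependency tree instead of a parser, and the tiny lexicon, the `Node` class and both scoring functions are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical mini-lexicon and negator list.
LEXICON = {"good": 1.0, "great": 1.0, "bad": -1.0, "terrible": -1.0}
NEGATORS = {"not", "never", "no"}


@dataclass
class Node:
    """A word in a dependency tree, with its dependents as children."""
    word: str
    children: List["Node"] = field(default_factory=list)


def bag_of_words_score(words):
    """Shallow heuristic: add up word polarities and ignore structure."""
    return sum(LEXICON.get(w, 0.0) for w in words)


def polarity(node):
    """Score a subtree; a negator attached to a head flips that head's score."""
    score = LEXICON.get(node.word, 0.0)
    negated = False
    for child in node.children:
        if child.word in NEGATORS:
            negated = True
        else:
            score += polarity(child)
    return -score if negated else score


# Hand-built tree for "the food was not good", with "good" as the root.
sentence_tree = Node("good", [Node("food", [Node("the")]), Node("was"), Node("not")])

flat_score = bag_of_words_score(["the", "food", "was", "not", "good"])
tree_score = polarity(sentence_tree)
print(flat_score, tree_score)
```

The shallow score comes out positive because it only sees the word "good", while the tree-based score is negative because the negator is attached to it. The tree also shows which word triggered the flip, which is one way explicit syntax can support explainability.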

  • Computer Science - Computation and Language

COMMENTS

  1. What Is Sentiment Analysis?

    Sentiment analysis, or opinion mining, is the process of analyzing large volumes of text to determine whether it expresses a positive sentiment, a negative sentiment or a neutral sentiment. Companies now have access to more data about their customers than ever before, presenting both an opportunity and a challenge: analyzing the vast amounts of ...

  2. Sentiment Analysis and How to Leverage It

    Sentiment analysis is a powerful tool that offers a number of advantages, but like any research method, it has some limitations. Advantages of sentiment analysis: Accurate, unbiased results; Enhanced insights; More time and energy available for staff to do higher-level tasks; Consistent measures you can use to track sentiment over time

  3. What is Sentiment Analysis?

    Sentiment analysis in research is a powerful tool for understanding insights in the context of how research participants feel about a particular object, concept, or phenomenon. In this article, we will examine how sentiments expressed in data can provide critical insights about individual perspectives.

  4. Sentiment Analysis Guide

    Sentiment Analysis Research & Courses. After learning the basics of sentiment analysis, and understanding how it can help you, you might want to delve further into the topic: Sentiment Analysis Papers. The literature around sentiment analysis is massive; there are more than 55,700 scholarly articles, papers, theses, books, and abstracts out there.

  5. A survey on sentiment analysis methods, applications, and challenges

    Sentiment analysis is the process of gathering and analyzing people's opinions, thoughts, and impressions regarding various topics, products, subjects, and services. ... and dominance in order to establish a baseline for future research on sentiment and attentiveness. LingPipe can work on a wide range of activities, such as ...

  6. Sentiment Analysis: A Complete Guide [Updated for 2023]

    Sentiment analysis, also known as opinion mining, is the process of determining the emotions behind a piece of text. Sentiment analysis aims to categorize the given text as positive, negative, or neutral. Furthermore, it then identifies and quantifies subjective information about those texts with the help of: ...

  7. Sentiment analysis

    Sentiment analysis (also known as opinion mining or emotion AI) is the use of natural language processing, text analysis, computational linguistics, ... The research revealed that there is a positive correlation between favorites and retweets in terms of sentiment valence. Others have examined the impact of YouTube on the dissemination of ...

  8. Sentiment Analysis: Concept, Analysis and Applications

    Sentiment analysis is the contextual mining of text that identifies and extracts subjective information in source material, helping a business understand the social sentiment of its brand, product or service while monitoring online conversations. However, analysis of social media streams is usually restricted to just basic sentiment ...

  9. More than a Feeling: Accuracy and Application of Sentiment Analysis

    This makes accuracy, i.e., the share of correct sentiment predictions out of all predictions, also known as hit rate, a critical concern for sentiment research. Hartmann et al. (2019) were among the first to conduct a systematic comparison of the accuracy of sentiment analysis methods for marketing applications.

  10. How to Conduct Sentiment Analysis

    This How-to Guide describes what sentiment analysis is, when it might be used, how sentiment analysis software works, and how sentiment analysis can be applied in academic research projects. Essentially, the task of sentiment analysis software is to guess the emotions or opinions expressed in (usually) texts.

  11. Sentiment Analysis

    Sentiment analysis is a critical NLP technique for understanding the sentiment of text, and is essential when looking at customer feedback. ... Market research. Sentiment analysis can help companies identify emerging trends, analyze competitors, and probe new markets. Companies may want to analyze reviews on competitors' products or services.

  12. What Is Sentiment Analysis And How It Works

    Sentiment analysis employs AI techniques, using natural language processing and machine learning, to determine if data sentiment is positive, negative or neutral. It simplifies countless social media interactions so that you can understand and analyze customer sentiment easily.

  13. Introduction (Chapter 1)

    Summary. Sentiment analysis, also called opinion mining, is the field of study that analyzes people's opinions, sentiments, appraisals, attitudes, and emotions toward entities and their attributes expressed in written text. The entities can be products, services, organizations, individuals, events, issues, or topics.

  14. What is Sentiment Analysis? Guide, Tools, Examples

    Market Research and Competitive Analysis. In market research, sentiment analysis is a powerful tool for understanding consumer preferences, market trends, and competitive landscapes. For instance, analyzing sentiment in product reviews and online forums can reveal emerging trends, feature preferences, and competitor strengths and weaknesses.

  15. Top 15 sentiment analysis tools to consider in 2024

    6. Buffer. Buffer offers easy-to-use social media management tools that help with publishing and with analyzing performance and engagement. One of the tool's features is tagging the sentiment in posts as 'negative', 'question' or 'order' so brands can sort through conversations, and plan and prioritize their responses.

  16. Sentiment Analysis

    Some subcategories of research in sentiment analysis include: multimodal sentiment analysis, aspect-based sentiment analysis, fine-grained opinion analysis, language specific sentiment analysis. More recently, deep learning techniques, such as RoBERTa and T5, are used to train high-performing sentiment classifiers that are evaluated using ...

  17. What is Sentiment Analysis? Guide, Explanation, How-to, Examples

    Sentiment analysis (also known as emotions AI, opinion mining, or affective rating) systematically analyzes and classifies text to determine a tone of positivity, negativity, or neutrality. Simply put, it is the process of using computerized systems to determine the emotional tone and context of words used in customer feedback.

  18. Sentiment analysis: The Complete Guide

    Sentiment analysis turns market research from a guessing game into a fact-based strategy session. By dissecting positive and negative words in others' opinions from reviews and surveys, companies can pivot or persevere with confidence. Sentiment Analysis Tools and Software.

  19. What is Sentiment Analysis?

    Sentiment analysis technologies allow the public relations team to be aware of related ongoing stories. The team can evaluate the underlying mood to address complaints or capitalize on positive trends. Market research. A sentiment analysis system helps businesses improve their product offerings by learning what works and what doesn't.

  20. Sentiment analysis: A survey on design framework, applications and

    Sentiment analysis is a solution that enables the extraction of a summarized opinion or minute sentimental details regarding any topic or context from a voluminous source of data. Even though several research papers address various sentiment analysis methods, implementations, and algorithms, a paper that includes a thorough analysis of the ...

  21. Sentiment Analysis

    Crisis Management: Sentiment analysis can help in identifying negative sentiments in real-time, which can act as an early warning system for crises or issues that need immediate attention. Market Research: Sentiment analysis can be used to gauge public opinion on a large scale, which is invaluable for market research. Companies can get insights ...

  22. Unifying aspect-based sentiment analysis BERT and multi ...

    Aspect Based Sentiment Analysis represents a granular approach to parsing sentiments in text, focusing on the specific aspects or features discussed and the sentiment directed towards them 1,2,3,4 ...

  23. Sentiment Analysis: A Comparative Study on Different Approaches

    Sentiment analysis (SA) is an intellectual process of extracting users' feelings and emotions. It is one of the actively pursued fields of Natural Language Processing (NLP). ... A lot of research work is being carried out in the field of sentiment analysis due to its significance in the marketing level competition and the changing needs of the people ...

  24. Sentiment Analysis Examples and Use Cases

    3 sentiment analysis examples to inspire your approach. The following sentiment analysis examples showcase how other brands use sentiment analysis data to refine their approach to product development, customer care and more. Use them to inspire your audience research strategy and playbook. 1. Dig into product-specific trends in consumer perception

  25. Measuring News Sentiment

    This paper demonstrates state-of-the-art text sentiment analysis tools while developing a new time-series measure of economic sentiment derived from economic and financial newspaper articles from January 1980 to April 2015. ... Lastly, we provide two applications to the economic research on sentiment. First, we show that daily news sentiment is ...

  26. Ensuring Transparency in Using ChatGPT for Public Sentiment Analysis

    Sentiment analysis of public services for smart society: Literature review and future research directions. Government Information Quarterly 39, 3 (2022), 101708.

  27. Sentiment Analysis and Corpus: Cognitive Perspective and Overhead

    Therefore, sentiment analysis on user comments and product discussions, such as Amazon reviews, becomes increasingly useful and important. In this paper, the effect of corpus on sentiment analysis of the Amazon review dataset with the aid of support vector machine is studied. ... Annals of Operations Research 300 (2020), 493-513. Crossref ...

  28. Regional Economic Sentiment: Constructing Quantitative Estimates from

    Federal Reserve Research: Cleveland. We use natural language processing methods to quantify the sentiment expressed in the Federal Reserve's anecdotal summaries of current economic conditions in the national and 12 Federal Reserve District-level economies as published eight times per year in the Beige Book since 1970.

  29. Improving stock market prediction accuracy using sentiment and

    The utilization of sentiment analysis as a method for predicting stock market trends has gained significant attention recently, especially during economic crises. This research aims to assess the predictive accuracy of sentiment analysis in the stock market by constructing a reinforced model that integrates both sentiment and technical analysis. While prior studies have concentrated on social ...

  30. Dancing in the syntax forest: fast, accurate and explainable sentiment

    Sentiment analysis is a key technology for companies and institutions to gauge public opinion on products, services or events. However, for large-scale sentiment analysis to be accessible to entities with modest computational resources, it needs to be performed in a resource-efficient way. While some efficient sentiment analysis systems exist, they tend to apply shallow heuristics, which do ...
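
Item 9 above defines accuracy, or hit rate, as the share of correct sentiment predictions out of all predictions. As a minimal Python sketch of that definition (the function name and the sample labels are hypothetical and are not taken from any of the cited studies):

```python
def accuracy(predicted: list[str], actual: list[str]) -> float:
    """Share of predictions that match the reference labels (hit rate)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one prediction is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels for four texts; three of the four predictions match.
predicted = ["positive", "negative", "neutral", "positive"]
actual = ["positive", "negative", "positive", "positive"]
print(accuracy(predicted, actual))  # 0.75
```

Accuracy on its own can be misleading when one sentiment class dominates the data, so it is usually worth reporting alongside per-class measures.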