BERT: The Aspect-Based Sentiment Analyser


While BERT is great for sentiment analysis in natural language processing, it has a few limitations too.

Sentiment analysis of a complex sentence often requires a contextual understanding of the entire sentence rather than of the words in isolation or from left to right. Such contextual understanding requires reading the sentence bidirectionally. Bidirectional Encoder Representations from Transformers, i.e., BERT, has become the standard choice for Aspect-Based Sentiment Analysis (ABSA) because of its ability to capture both the context and the directionality of a sentence.

ABSA is an advanced NLP technique that can analyse texts beyond general sentiment (e.g., “This phone is okay”) to specific feature-level insights (e.g., “The battery is great (+), but the camera is blurry (-)”).

The ABSA concept can be divided into four key terms.

Aspect Terms (AT): The specific words used to describe a feature (e.g., ‘attended’, ‘cost’).

Aspect Categories (AC): The categories of the aspect term (e.g., ‘Service’, ‘Hardware’).

Opinion Terms (OT): The adjectives used to qualify the aspect term (e.g., ‘slow’, ‘brilliant’).

Sentiment polarity: The final classification—Positive, Negative, or Neutral.
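As a hand-labelled illustration of these four terms, the review 'The battery is terrible' decomposes as follows (the category name 'Hardware' is an assumed label for this sketch, not taken from a real dataset):

```python
# Hand-labelled ABSA decomposition of one example review
review = "The battery is terrible"

absa_labels = {
    "aspect_term": "battery",          # AT: the specific feature word
    "aspect_category": "Hardware",     # AC: the category the aspect belongs to
    "opinion_term": "terrible",        # OT: the adjective qualifying the aspect
    "sentiment_polarity": "Negative",  # the final classification
}

print(absa_labels)
```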

Implementation pipeline

Step 1: Aspect Extraction (AE)

Identify the aspect terms. Either rule-based or supervised methods can be used. In the rule-based approach, POS (part-of-speech) tagging is used to find nouns or noun phrases as aspects. In the supervised approach, a BERT-based named entity recognition (NER) model trained to label aspect terms is used.

Step 2: Sentiment Classification (SC)

To identify the sentiment with respect to the aspect term, feed the model with both the sentence and the aspect as a pair: [CLS] Sentence [SEP] Aspect [SEP].
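The pair input can be built with a standard BERT tokenizer; calling it with two text arguments produces the [CLS] Sentence [SEP] Aspect [SEP] layout automatically (sketched here with the generic `bert-base-uncased` checkpoint rather than an ABSA-specific one):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

sentence = "The battery is great, but the camera is blurry."
aspect = "battery"

# Passing two texts yields: [CLS] sentence [SEP] aspect [SEP]
encoded = tokenizer(sentence, aspect)
print(tokenizer.decode(encoded["input_ids"]))
```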

Step 3: Triplet Extraction (ASTE)

Extract sentiment as a triplet: (Aspect, Sentiment, Opinion); for example: (‘mobile’, ‘positive’, ‘good’).

To train a model, the following industry-standard datasets can be used.

Dataset name          | Domain                  | Description
SemEval-2014 (Task 4) | Laptops and restaurants | The 'gold standard' for ABSA research. Heavily annotated.
AWARE (2025)          | Smartphone apps         | Modern dataset covering Productivity, Social, and Games.
M-ABSA (2025)         | Multilingual            | Covers 21 languages and 7 domains (travel, movies, etc).
Amazon/Yelp ABSA      | E-commerce              | Subsets of Amazon reviews specifically labelled for feature-level sentiment.

Python implementation

Here is a simple Python implementation of the above ABSA approach for the review text: “I bought a spectacular camera but it has a terrible battery.”

import os
os.environ["HF_HUB_DISABLE_SYMLINKS_WARNING"] = "1"

import spacy
import torch
from transformers import pipeline

# Load SpaCy for aspect/opinion extraction (Steps 1 and 3)
nlp = spacy.load("en_core_web_sm")

# Step 2: load a pre-trained BERT pipeline for sentiment,
# using a model specifically fine-tuned on product reviews.
# This checks if a GPU is available and sets the device accordingly;
# device=0 means the first GPU, device=-1 means CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
d = 0 if torch.cuda.is_available() else -1
sentiment_analyzer = pipeline(
    "sentiment-analysis",
    model="LiYuan/amazon-review-sentiment-analysis",
    device=d
)

def run_absa_pipeline(text):
    doc = nlp(text)
    triplets = []
    # --- STEP 1: Aspect Extraction (finding nouns) ---
    # We look for nouns as potential aspects
    aspects = [token for token in doc if token.pos_ in ["NOUN", "PROPN"]]
    for aspect in aspects:
        # --- STEP 3 (Part A): finding opinion terms ---
        # We look for adjectives (opinions) connected to the aspect
        # in the dependency tree
        opinions = [child.text for child in aspect.children if child.pos_ == "ADJ"]
        if not opinions:
            continue  # Skip if no descriptive word is found
        for opinion in opinions:
            # --- STEP 2: Sentiment Classification ---
            # We construct an "auxiliary sentence" for BERT: [Aspect] is [Opinion]
            # This helps BERT focus its context.
            aux_sentence = f"{aspect.text} is {opinion}"
            result = sentiment_analyzer(aux_sentence)[0]
            sentiment = result["label"]
            # --- STEP 3 (Part B): Final Triplet Extraction ---
            triplets.append({
                "aspect": aspect.text,
                "opinion": opinion,
                "sentiment": sentiment
            })
    return triplets

# Test the pipeline
review = "I bought a spectacular camera but it has a terrible battery."
results = run_absa_pipeline(review)
print(f"Review: {review}\n")
print(f"{'Aspect':<15} | {'Opinion':<15} | {'Sentiment'}")
print("-" * 45)
for t in results:
    print(f"{t['aspect']:<15} | {t['opinion']:<15} | {t['sentiment']}")

The output is:

Using device: cuda
Device set to use cuda:0
Review: I bought a spectacular camera but it has a terrible battery.
Aspect | Opinion | Sentiment
----------------------------------------------------
camera | spectacular | 5 stars
battery | terrible | 1 star
Note: This ABSA pipeline is sensitive to how the review text is phrased; simple, clearly structured sentences give the most reliable aspect-sentiment pairs.

Limitations

Although BERT has revolutionised NLP and is widely used in modern ABSA, it isn’t a faultless model.

Size limitation: Standard BERT cannot process sequences longer than 512 tokens; it simply truncates anything beyond this limit.

Memory limitation: As BERT uses quadratic self-attention, for every doubling of the length of the input the computational cost increases fourfold.

Limitation of statistical vs logical pattern matching: BERT is a statistical pattern matcher, and it can misinterpret negation or sarcasm if the training data is not diverse enough. For example, a BERT model may misread a construction like 'The screen is anything but small'.

Computational cost: A fine-tuned BERT system usually needs a GPU (such as an NVIDIA card). Training on a standard laptop CPU is slow.

Pre-training mismatch: BERT is pre-trained as a masked language model (guessing hidden words), not as a classifier, so ABSA systems built on it typically need task-specific fine-tuning before their sequence classification works well.

Domain incompatibility: The standard BERT models are generally trained on Wikipedia and books, so they often struggle with texts containing slang, emojis or technical jargon.
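The 512-token ceiling can be observed directly: with truncation enabled, a BERT tokenizer simply drops everything past the limit (a sketch using the generic `bert-base-uncased` checkpoint):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

long_text = "word " * 1000  # far more than 512 tokens
ids = tokenizer(long_text, truncation=True, max_length=512)["input_ids"]
print(len(ids))  # capped at 512; everything beyond is discarded
```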
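The quadratic memory cost is easy to verify numerically: self-attention compares every token with every other token, so an input of n tokens needs on the order of n squared comparisons (a back-of-the-envelope sketch, ignoring attention heads and hidden size):

```python
def attention_pairs(n_tokens):
    """Number of token-to-token comparisons in one self-attention layer."""
    return n_tokens * n_tokens

for n in (128, 256, 512):
    print(n, attention_pairs(n))

# Doubling the input length quadruples the comparison count
assert attention_pairs(256) == 4 * attention_pairs(128)
```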

Sentiment analysis using the BERT mechanism is a simple and elegant approach. It offers an easy way to understand the contextual meaning of a sentence rather than understanding the words in isolation or from left to right. While BERT has enhanced NLP and is the engine behind modern ABSA, it has several limitations too. Because BERT is ultimately a statistical model, care must still be taken over how sentences are constructed and the logic they express.
