[Paper] BanglaASTE: A Novel Framework for Aspect-Sentiment-Opinion Extraction in Bangla E-commerce Reviews Using Ensemble Deep Learning

Published: November 26, 2025 at 08:27 AM EST

Source: arXiv - 2511.21381v1

Overview

The paper presents BanglaASTE, the first end‑to‑end framework for automatically extracting aspect terms, opinion expressions, and their sentiment polarity from Bangla‑language e‑commerce reviews. By releasing a new annotated dataset and an ensemble deep‑learning model, the authors push aspect‑based sentiment analysis (ABSA) forward for a low‑resource language that has been largely ignored by the research community.

Key Contributions

  • BanglaASTE dataset – 3,345 manually annotated product reviews from Daraz, Facebook, and Rokomari, each labeled with aspect‑opinion‑sentiment triplets.
  • Hybrid matching pipeline – a graph‑based algorithm that links aspect and opinion spans using semantic similarity, handling informal spelling and code‑mixing typical of Bangla social text.
  • Ensemble model – combines BanglaBERT contextual embeddings with an XGBoost classifier, delivering a strong boost over vanilla transformer or classic baselines.
  • Comprehensive evaluation – reports 89.9 % accuracy and 89.1 % F1, outperforming prior multilingual ABSA approaches on the same data.
  • Open‑source release – code, trained models, and the dataset are made publicly available for reproducibility and downstream applications.

Methodology

  1. Data collection & annotation – Reviews were scraped from three major Bangla e‑commerce platforms. Trained annotators marked three elements per sentence:
    • Aspect (e.g., “battery life”)
    • Opinion (e.g., “lasting long”)
    • Sentiment (positive/negative/neutral).
  2. Pre‑processing – Normalization steps address common Bangla quirks: inconsistent spelling, mixed English numerals, and emoticons.
  3. Graph‑based matching – Each sentence is turned into a bipartite graph where nodes are candidate aspect spans and opinion spans. Edge weights are computed via cosine similarity of their BanglaBERT embeddings, and a maximum‑weight matching algorithm selects the most plausible aspect‑opinion pairs.
  4. Ensemble classification
    • BanglaBERT generates contextual vectors for each candidate span.
    • XGBoost consumes these vectors (plus handcrafted features like POS tags and distance metrics) to predict the sentiment polarity of the pair.
    • Each graph‑matched pair is combined with its XGBoost‑predicted sentiment to form the final triplet list.
  5. Training & evaluation – 80 % of the dataset is used for training, 10 % for validation, and 10 % for testing. Standard metrics (accuracy, precision, recall, F1) are reported per component and for the full triplet extraction task.
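The normalization step (2) can be illustrated with a minimal sketch. The exact rules the authors use are not given in this summary; the Bangla‑digit mapping, emoticon pattern, and whitespace collapsing below are illustrative assumptions:

```python
import re

# Map Bangla numerals to ASCII digits.
BANGLA_DIGITS = str.maketrans("০১২৩৪৫৬৭৮৯", "0123456789")

def normalize(text: str) -> str:
    """Illustrative normalization along the lines described in the paper:
    Bangla digits -> ASCII, strip common emoticons, collapse whitespace."""
    text = text.translate(BANGLA_DIGITS)
    text = re.sub(r"[:;=]-?[)(DPp]", " ", text)  # basic ASCII emoticons
    return re.sub(r"\s+", " ", text).strip()

print(normalize("দাম ৫০০ টাকা :)"))  # → "দাম 500 টাকা"
```

A real pipeline would also need spelling‑variant handling, which is harder to capture in a few rules.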
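The graph‑based matching in step (3) reduces to maximum‑weight bipartite matching on a cosine‑similarity matrix. A minimal sketch using SciPy's assignment solver, where the toy vectors stand in for BanglaBERT span embeddings and the 0.3 threshold is an assumed hyperparameter:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_aspects_to_opinions(aspect_vecs, opinion_vecs, threshold=0.3):
    """Pair candidate aspect spans with opinion spans via maximum-weight
    bipartite matching on cosine similarity. In the paper the vectors come
    from BanglaBERT; here they are plain NumPy arrays."""
    A = aspect_vecs / np.linalg.norm(aspect_vecs, axis=1, keepdims=True)
    O = opinion_vecs / np.linalg.norm(opinion_vecs, axis=1, keepdims=True)
    sim = A @ O.T                              # cosine-similarity matrix
    rows, cols = linear_sum_assignment(-sim)   # maximize total similarity
    # Keep only pairs whose similarity clears the threshold.
    return [(i, j, float(sim[i, j]))
            for i, j in zip(rows, cols) if sim[i, j] >= threshold]

pairs = match_aspects_to_opinions(
    np.array([[1.0, 0.0], [0.0, 1.0]]),        # two aspect embeddings
    np.array([[0.9, 0.1], [0.1, 0.9]]))        # two opinion embeddings
print(pairs)  # pairs aspect 0 with opinion 0, aspect 1 with opinion 1
```

The thresholding matters in practice: a sentence can contain aspects with no matching opinion, and forcing a full assignment would produce spurious pairs.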
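The ensemble in step (4) concatenates contextual span vectors with handcrafted features and feeds them to a gradient‑boosted tree classifier. In this sketch, random class‑separated vectors stand in for BanglaBERT embeddings, the two extra columns stand in for features like POS‑tag ids and token distance, and scikit‑learn's GradientBoostingClassifier stands in for XGBoost:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Toy stand-ins for BanglaBERT span embeddings, one cluster per class.
n, dim = 200, 16
X_embed = np.vstack([rng.normal(0.0, 1.0, (n // 2, dim)),
                     rng.normal(2.0, 1.0, (n // 2, dim))])
# Hypothetical handcrafted features (e.g., POS-tag id, token distance).
X_hand = rng.integers(0, 5, (n, 2)).astype(float)
X = np.hstack([X_embed, X_hand])
y = np.array([0] * (n // 2) + [1] * (n // 2))  # 0 = negative, 1 = positive

# Gradient-boosted trees over the concatenated features
# (GradientBoostingClassifier used here in place of XGBoost).
clf = GradientBoostingClassifier(random_state=0).fit(X, y)
acc = clf.score(X, y)
```

The design point the paper makes is that tree ensembles over frozen embeddings remain competitive when labeled data is scarce, since they need far fewer examples than fine‑tuning the transformer end‑to‑end.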

Results & Findings

Model                        Accuracy   Precision   Recall   F1
Baseline CRF + Word2Vec      71.4 %     68.9 %      66.2 %   67.5 %
Multilingual BERT (mBERT)    82.1 %     80.5 %      78.9 %   79.7 %
BanglaASTE (Ensemble)        89.9 %     88.6 %      89.6 %   89.1 %
  • The graph‑matching step alone raises aspect‑opinion pairing F1 by ~9 % over a naïve sequential tagging baseline.
  • Adding XGBoost for sentiment classification yields the final 2‑point F1 gain, confirming that shallow‑tree ensembles still complement deep embeddings in low‑resource settings.
  • Error analysis shows most remaining mistakes stem from highly ambiguous opinions (“meh”) and extreme spelling variations not covered by the normalization rules.

Practical Implications

  • E‑commerce analytics – Companies can automatically surface product‑level pain points (e.g., “slow charger”) and strengths (“crisp display”) from Bangla reviews, enabling faster product‑roadmap decisions.
  • Customer support automation – Chatbots can be equipped with the triplet extractor to flag negative aspects in real time and route tickets to the right support team.
  • Localized sentiment dashboards – Marketing teams can monitor sentiment trends across regions where Bangla is dominant, without needing manual tagging.
  • Transferable pipeline – The graph‑matching + XGBoost pattern can be adapted to other low‑resource languages that suffer from spelling noise and code‑mixing, reducing the need for massive labeled corpora.

Limitations & Future Work

  • Dataset size – 3.3 k reviews, while a solid start, is still modest; larger, domain‑diverse corpora could improve generalization.
  • Domain specificity – The current data is limited to product reviews; extending to social media or news comments may require additional preprocessing tweaks.
  • Aspect granularity – The model treats each aspect as a flat span; hierarchical aspect taxonomies (e.g., “camera → resolution”) are not yet supported.
  • Future directions suggested by the authors include:
    1. Semi‑supervised data augmentation to mitigate sparsity.
    2. Incorporating a multilingual pre‑training step to better handle code‑mixed Bangla‑English text.
    3. Exploring graph neural networks for end‑to‑end aspect‑opinion pairing.

Authors

  • Ariful Islam
  • Md Rifat Hossen
  • Abir Ahmed
  • B M Taslimul Haque

Paper Information

  • arXiv ID: 2511.21381v1
  • Categories: cs.LG, cs.CL
  • Published: November 26, 2025