Exploring Somali Sentiment Analysis: A Resource-Light Approach for Small-scale Text Classification



Authors

  • Kadar Bahar Karabuk University
  • Nehad T.A Ramaha Karabuk University

DOI:

https://doi.org/10.59287/icaens.1069

Keywords:

Somali Language, Sentiment Analysis, NLP, Under-Resourced Languages, Resource-Light Approach, Tokenization, Stemming, Stopwords Removal, Negation Handling, Sentiment Classification Model

Abstract

Sentiment analysis, a fundamental task in natural language processing (NLP), plays a crucial role in understanding people's opinions and emotions expressed in textual data. While sentiment analysis has been extensively studied for major languages, under-resourced languages like Somali have received limited attention in this domain. This paper aims to address this research gap by proposing a resource-light approach for sentiment analysis in Somali, which is tailored to the language's unique characteristics and limited linguistic resources. We present a methodology that combines lexicon-based methods and feature engineering techniques to effectively extract sentiment information from Somali text. A sentiment-annotated dataset was created through crowdsourcing, enabling the training and evaluation of a sentiment classification model specifically designed for Somali. Experimental results demonstrate the competitive performance of our approach compared to existing sentiment analysis techniques for under-resourced languages. The findings highlight the feasibility of sentiment analysis in Somali, even with a small-scale dataset, and shed light on the implications for sentiment analysis in other under-resourced languages. This research contributes to the advancement of sentiment analysis capabilities for under-resourced languages, empowering researchers and practitioners to gain insights from sentiment information in diverse linguistic contexts.
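
As a rough illustration of the kind of lexicon-based pipeline the abstract and keywords describe (tokenization, stopword removal, negation handling), the following Python sketch scores a short text against a tiny polarity lexicon. It is not the authors' implementation: the lexicon entries, the negator and stopword sets, and the rule that a negator flips the next sentiment-bearing word are all assumptions made for illustration, and stemming is omitted.

```python
import re

# Hypothetical toy resources, for illustration only (not the paper's lexicon).
LEXICON = {"wanaagsan": 1, "fiican": 1, "xun": -1}  # word -> polarity
NEGATORS = {"ma", "aan"}                            # assumed negation particles
STOPWORDS = {"waa", "iyo"}                          # assumed stopwords


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score(text):
    """Sum lexicon polarities; a negator flips the next sentiment word."""
    total = 0
    negate = False
    for token in tokenize(text):
        if token in NEGATORS:
            negate = True
            continue
        if token in STOPWORDS:
            continue
        polarity = LEXICON.get(token, 0)
        if polarity:
            total += -polarity if negate else polarity
            negate = False  # negation is consumed by one sentiment word
    return total


def classify(text):
    """Map the aggregate score to a three-way sentiment label."""
    s = score(text)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"
```

Under these assumptions, `classify("waa wanaagsan")` yields `"positive"` and `classify("ma fiican")` yields `"negative"`. In a feature-engineering setting, the raw score and the count of negated terms could be passed as features to a trained classifier instead of being thresholded directly.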

Author Biographies

Kadar Bahar, Karabuk University

Computer Engineering, Turkey

Nehad T.A Ramaha, Karabuk University

Computer Engineering, Turkey

Published

2023-07-21

How to Cite

Bahar, K., & Ramaha, N. T. (2023). Exploring Somali Sentiment Analysis: A Resource-Light Approach for Small-scale Text Classification. International Conference on Applied Engineering and Natural Sciences, 1(1), 620–628. https://doi.org/10.59287/icaens.1069