Background of the Study
Language modeling is a fundamental aspect of natural language processing, designed to predict word sequences and capture linguistic patterns. In Nigeria, language use is often characterized by code-switching between Nigerian English and Pidgin. This code-switched speech is prevalent in social media, informal communication, and even formal contexts. Recent advancements in deep learning have enabled the development of sophisticated language models that can handle multilingual and code-switched data (Adejumo, 2023). However, the dynamic nature of code-switching, with its frequent alternation between languages and the blending of grammatical structures, poses unique challenges for conventional language models. Studies (Okeke, 2024) suggest that models trained on monolingual data often fail to capture the nuances of code-switched communication. Recent research (Umar, 2025) emphasizes the need for specialized training on mixed-language corpora to improve model performance. This study examines language modeling approaches for code-switched Nigerian English and Pidgin corpora, assessing model accuracy and proposing adaptations that account for the complexities of bilingual language use in Nigeria.
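The gap between monolingual and mixed-language training described above can be illustrated with a minimal sketch. It assumes a bigram model with add-one smoothing, which is a deliberately simple stand-in and not the modeling approach of this study. The sentences are toy examples invented for illustration and are not drawn from the study's corpora; the names train_bigram and perplexity are hypothetical helpers.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def perplexity(sentence, unigrams, bigrams):
    """Perplexity of a sentence under add-one smoothed bigram probabilities."""
    vocab_size = len(unigrams) + 1  # +1 reserves probability mass for unseen words
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(p)
    # Normalize by the number of predicted tokens (everything after <s>).
    return math.exp(-log_prob / (len(tokens) - 1))


# Toy data (assumption): invented sentences, not from any real corpus.
english_only = [
    "i am coming now",
    "how are you today",
    "i am going to the market",
]
mixed = english_only + [
    "i dey come now",
    "how you dey today",
    "i dey go market",
]
code_switched_test = "i dey go to the market now"

ppl_mono = perplexity(code_switched_test, *train_bigram(english_only))
ppl_mixed = perplexity(code_switched_test, *train_bigram(mixed))
print(f"English-only model perplexity:   {ppl_mono:.2f}")
print(f"Mixed-corpus model perplexity:   {ppl_mixed:.2f}")
```

On this toy data the model trained on the mixed corpus assigns lower perplexity to the code-switched sentence than the English-only model does. The effect is small and the setup is far too simple to support any empirical claim; it only shows the direction of the argument.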
Statement of the Problem
Existing language models frequently struggle with code-switched text, leading to poor prediction accuracy and limited applicability in real-world Nigerian contexts (Adejumo, 2023). The alternating structures and blended vocabularies of Nigerian English and Pidgin create challenges that monolingual models are not equipped to handle (Okeke, 2024). This deficiency degrades downstream applications such as speech recognition, translation, and sentiment analysis. The scarcity of annotated code-switched corpora exacerbates the problem. Addressing these issues is essential to developing language models that accurately reflect the bilingual dynamics of Nigerian communication.
Objectives of the Study
Research Questions
Significance of the Study
This study is significant because it addresses the complex issue of code-switching in Nigerian language use by evaluating and enhancing language models. Improved models will benefit applications such as speech recognition, machine translation, and sentiment analysis, leading to more accurate digital tools for multilingual communities. The findings will provide insights for researchers and developers working on bilingual NLP systems and contribute to the preservation and effective processing of Nigeria’s linguistic diversity.
Scope and Limitations of the Study
This study focuses on language modeling for code-switched Nigerian English and Pidgin corpora and does not cover other language pairs or broader sociolinguistic factors.
Definitions of Terms