ACL
Table 1: Sub-tasks for classification and generation. In this paper, we aim to answer: (1) how each category of models performs on Urdu language tasks, and (2) which model type is more effective in practical applications for Urdu-speaking users.
arXiv
Oct 10, 2025: The rapid advancement of LLMs (Zhao et al., 2024) has revolutionized natural language processing (NLP) across multiple languages and applications. However, a significant disparity persists between high-resource languages, such as English, and low-resource languages, such as Urdu. These disparities create technological barriers for billions of speakers of underrepresented languages, limiting ...
Research
This is the first study to address hope speech detection in code-mixed Roman Urdu by introducing a carefully annotated dataset, thereby filling a critical gap in inclusive NLP research for low-resource, informal language varieties.
Web
Identifying this research gap, this paper presents a framework for concept-level sentiment analysis, aiming to enhance the accuracy of sentiment analysis (SA). A comprehensive Urdu language dataset was constructed by collecting data from YouTube, consisting of various talks and reviews on topics such as movies, politics, and commercial products.
ACL
Abstract: This study evaluates the question-answering capabilities of Large Language Models (LLMs) in Urdu, addressing a critical gap in low-resource language processing. Four models (GPT-4, mBERT, XLM-R, and mT5) are assessed across monolingual, cross-lingual, and mixed-language settings using the UQuAD1.0 and SQuAD2.0 datasets. Results reveal significant performance gaps between English and ...
Research
Apr 15, 2024: These challenges stem from the intricate linguistic characteristics of Urdu, including morphological diversity, a context-dependent lexicon, and the scarcity of training data. This study addresses these issues by focusing on Urdu Named Entity Recognition (U-NER) and introducing three key contributions.
arXiv
Jan 25, 2026: The complete methodology, including corpus, tokenizer, model weights, and evaluation benchmarks, is released openly to establish a baseline for Urdu NLP research and provide a scalable framework for other underrepresented languages.