arXiv Research
Abstract: We present LEGAL-UQA, the first Urdu legal question-answering dataset derived from Pakistan's constitution. This parallel English-Urdu dataset includes 619 question-answer pairs, each with corresponding legal article contexts, addressing the need for domain-s…
arXiv Research
Abstract: Large Language Models (LLM… we evaluate LLMs such as GPT-4, Llama 2, and Gemini to analyze their effectiveness in English compared to other low-resource languages from South Asia (e.g., Bangla, Hindi, and Urdu). Specifically, we utilized zero-shot prompting and five different prompt settings to extensively investigate the effectiveness of the LLMs in cross-lingual translated prompts.…
arXiv Research
Abstract: Known by more than 1.5 billion people in the Indian subcontinent, Indic languages present unique challenges and opportunities for natural language processing (NLP) research due to their rich cultural heritage, linguistic diversity, and complex structures. IndicMMLU-Pro is a comprehensive benchmark designed to evaluate Large Language Models (LLMs) across Indi…
arXiv Research
Abstract: Document level Urdu Sentiment Analysis (SA) is a challenging Natural Language Processing (NLP) task as it deals with large documents in a resource-poor language. In large documents, there are ample amounts of words that exhibit different viewpoints. Deep learning (DL) models comprise of complex ne…
arXiv Research
Abstract: Named Entity Recognition (NER), a fundamental task in Natural Language Processing (NLP), has shown significant advancements for high-resource languages. However, due to a lack of annotated datasets and limited representation in Pre-trained Language Models (PLMs), it remains understudied and challenging for low-resource languages. To address these challenges,…
arXiv Research
Abstract: Named Entity Recognition (NER) plays a pivotal role in various Natural Language Processing (NLP) tasks by identifying and classifying named entities (NEs) from unstructured data into predefined categories such as person, organization, location, date, and time. While extensive research exists for high-resource languages and general domains, NER in…
arXiv Research
Abstract: This paper evaluates the performance of Large Multimodal Models (LMMs) on Optical Character Recognition (OCR) in the low-resource Pashto language. Natural Language Processing (NLP) in Pashto faces several challenges due to the cursive nature of its script and a scarcity of structured datasets. To address this, we developed a synthetic Pashto OCR dataset, PsO…
arXiv Research
Abstract: The rapid adoption of Large Language Models (LLMs) has raised important concerns about the factual reliability of their outputs, particularly in low-resource languages such as Urdu. Existing automated fact-checking systems are predominantly developed for English, leaving a significant gap for the more than 200 million…
arXiv Research
Abstract: Large Language Models (L… closely tied to political, religious, and regional ideologies. We present a systematic evaluation of political bias in 13 state-of-the-art LLMs across five Pakistani languages: Urdu, Punjabi, Sindhi, Pashto, and Balochi. Our framework integrates a culturally adapted Political Compass Test (PCT) with multi-level framing analysis, capturing both ideological st…
arXiv Research
Abstract: Hope is a positive emotional state involving the expectat… that promotes optimism, resilience, and support, particularly in adverse contexts. Although hope speech detection has gained attention in Natural Language Processing (NLP), existing research mainly focuses on high-resource languages and standardized scripts, often overlooking informal and underrepresented forms such as Roman…
arXiv Research
Abstract: Large Language Models (LLMs) are now… this growing ability has also made it harder to tell whether a piece of text was written by a human or by a machine. This challenge becomes even more serious for languages like Urdu, where there are very few tools available to detect AI-generated text. To address this gap, we propose a novel AI-generated text detection framework tailored for the…
arXiv Research
Abstract: Ironic identification is a challenging task in Natural Language Processing, particularly when dealing with languages that differ in syntax and cultural context. In this work, we aim to detect irony in Urdu by translating an English Ironic Corpus into the…