Urdu AI Dashboard

Monitoring AI advancements in Urdu

AI News in Urdu

Latest Urdu-focused AI updates and research-backed news

arXiv Research

CorIL: Towards Enriching Indian Language to Indian Language Parallel Corpora and Machine Translation Systems

Abstract: India's linguistic landscape is one of the most diverse in the world, comprising over 120 major languages and approximately 1,600 additional languages, with 22 officially recognized as scheduled languages in the Indian Constitution. Despite recent pro…

arXiv Research

Multilingual Hope Speech Detection: A Comparative Study of Logistic Regression, mBERT, and XLM-RoBERTa with Active Learning

Abstract: Hope speech, language that fosters encouragement and optimism, plays a vital role in promoting positive discourse online. However, its detection remains challenging, especially in multilingual and low-resource settings. This paper presents a multilingual framework for hope speech detection using an active learning approach and transformer-based…

arXiv Research

Detecting Hope Across Languages: Multiclass Classification for Positive Online Discourse

Abstract: The detection of hopeful speech in social media has emerged as a critical task for promoting positive discourse and well-being. In this paper, we present a machine learning approach to multiclass hope speech detection across multiple languages, including English,…

arXiv Research

Fine-Tuning Large Language Models with QLoRA for Offensive Language Detection in Roman Urdu-English Code-Mixed Text

Abstract: The use of derogatory terms in languages that employ code mixing, such as Roman Urdu, presents challenges for Natural Language Processing systems due to unstated grammar, inconsistent spelling, and a scarcity of labeled data. In this work, we propose a QLoRA-based fine-tuning framework to improve offensive langu…

arXiv Research

Do LLMs Know They Are Being Tested? Evaluation Awareness and Incentive-Sensitive Failures in GPT-OSS-20B

Abstract: Benchmarks for large language models (LLMs) often rely on rubric-scented prompts that request visible reasoning and strict formatting, whereas real deployments demand terse, contract-bound answers. We investigate whether such "evaluation scent" inflates measured performance without commensurate capability gains. Using a single open-weights model (GPT-OSS-20B)…

arXiv Research

Alif: Advancing Urdu Large Language Models via Multilingual Synthetic Data Distillation

Abstract: Developing high-performing large language models (LLMs) for low-resource languages such as Urdu presents several challenges. These challenges include the scarcity of high-quality datasets, multilingual inconsistencies, and safety concerns. Existing multilingual LLMs often address these issues by translating large volumes of available data. Howe…

arXiv Research

Celebrity Profiling on Short Urdu Text using Twitter Followers' Feed

Abstract: Social media has become a … among the most active users and often reveal aspects of their personal and professional lives through online posts. Platforms such as Twitter provide an opportunity to analyze language and behavior for understanding demographic and social patterns. Since followers frequently share linguistic traits and interests with the celebrities they follow, textual data…

arXiv Research

VLURes: Benchmarking VLM Visual and Linguistic Understanding in Low-Resource Languages

Abstract: Vision Language Models (VLMs) are pivotal for advancing perception in intelligent agents. Yet, evaluation of VLMs remains limited to predominantly English-centric benchmarks in which the image-text pairs comprise short texts. To evaluate VLM fine-grained abilities, in four languages under long-text settings, we introduce a novel multilingual benchmark VLURes featuring eigh…

arXiv Research

Evaluating Large Language Models on Urdu Idiom Translation

Abstract: Idiomatic translation remains a significant challenge in machine translation, especially for low resource languages such as Urdu, and has received limited prior attention. To advance research in this area, we introduce the first evaluation datasets for Urdu to English…

arXiv Research

Cross-Corpus Validation of Speech Emotion Recognition in Urdu using Domain-Knowledge Acoustic Features

Abstract: Speech Emotion Recognition (SER) is a key affective computing technology that enables emotionally intelligent artificial intelligence. While SER is challenging in general, it is particularly difficult for low-resource languages such as…

arXiv Research

Handwritten Text Recognition for Low Resource Languages

Abstract: Despite considerable progress in handwritten text recognition, paragraph-level handwritten text recognition, especially in low-resource languages, such as Hindi, Urdu and similar scripts, remains a challenging problem. These lan…

arXiv Research

Efficient ASR for Low-Resource Languages: Leveraging Cross-Lingual Unlabeled Data

Abstract: Automatic speech recognition for low-resource languages remains fundamentally constrained by the scarcity of labeled data and computational resources required by state-of-the-art models. We present a systematic…
