Urdu Bench | Urdu LLM Benchmark
Evaluating Language Models on the nuances of Urdu: an Urdu LLM benchmark and leaderboard for translation, grammar, and comprehension.
Monitoring AI enhancements in Urdu
Latest Urdu-focused AI updates and research-backed news
Abstract: Neural methods in Text-to-Speech (TTS) synthesis have demonstrated momentous advancement in terms of the naturalness and intelligibility of the synthesized speech. In this paper we present a neural speech synthesis system for Urdu, a low-resource language. The main challenge faced in this study was the lack of any publicly available Urdu speech synthesis corpora ...
This paper documents the exploration and refinement of the Common Voice Urdu Corpus dataset version 12.0 to create a clean and refined dataset suitable for training Urdu Text-to-Speech models.
Urdu is a language spoken by millions of people around the globe, especially in South Asia. Existing TTS models focus mainly on English and Chinese, with minimal attention to Urdu and other low-resource languages. In this paper, we propose a generative Urdu TTS system.
Dec 6, 2024: This paper introduces a comprehensive approach to building natural-sounding Urdu Text-to-Speech (TTS) and voice cloning systems, addressing the lack of computational resources for Urdu. We developed a large-scale dataset of over 100 hours of Urdu speech, carefully cleaned and phonetically aligned through an automated transcription pipeline to preserve linguistic accuracy. The dataset was then used ...
Abstract: Whisper, a large-scale multilingual model, has demonstrated strong performance in speech recognition benchmarks, but its effectiveness on low-resource languages remains under-explored. This paper evaluates Whisper's performance on Pashto, Punjabi, and Urdu, three underrepresented languages. While Automatic Speech Recognition (ASR) has advanced for widely spoken languages, low...
5 days ago: This paper evaluates Whisper's performance on Pashto, Punjabi, and Urdu, three underrepresented languages. While Automatic Speech Recognition (ASR) has advanced for widely spoken languages, low-resource languages still face challenges due to limited data.
Mar 1, 2025: This study focuses on end-to-end ASR for low-resource languages. For instance, Urdu, despite being the 10th most spoken language globally, qualifies as a low-resource language. The scarcity of benchmark datasets in Urdu has compelled researchers to employ increasingly innovative methods to overcome this challenge.
Aug 13, 2025: This study evaluates the feasibility of lightweight Whisper models (Tiny, Base, Small) for Urdu speech recognition in low-resource settings. Despite Urdu being the 10th most spoken language globally with over 230 million speakers, its representation in automatic speech recognition (ASR) systems remains limited due to dialectal diversity, code-switching, and sparse training data. We benchmark ...
Abstract: Named entity recognition (NER) is a fundamental part of other natural language processing tasks such as information retrieval, question answering, and machine translation. Considerable progress has already been made in research on English NER systems. However, Urdu NER is still in its infancy due to the complexity and morphological richness of the Urdu ...
Mar 28, 2024: Named Entity Recognition (NER) is a natural language processing task that has been widely explored for different languages over the past decade but is still an under-researched area for the Urdu ...
Oct 29, 2025: This study unveils a Named Entity Recognition (NER) system specifically designed for Urdu news headlines, aimed at bridging crucial linguistic resource gaps. We meticulously developed a comprehensive corpus from diverse news sources, specifically tailored to reflect Urdu's unique orthographic and morphological characteristics. Our approach incorporates state-of-the-art (SOTA) neural ...