Web
Jan 2, 2025: I'm currently studying at Lasbela University of Agriculture, Water and Marine Sciences (LUAWMS), where I focus on web development, artificial intelligence, and using tech for social good. As the founder and lead of the UrduAI project, I'm passionate about making artificial intelligence more accessible for Urdu-speaking communities.
arXiv
May 24, 2024: Our study focuses on evaluating the potential of both closed and open LLMs for supporting Urdu, a low-resource language with limited data coverage in LLMs' pre-training. In our experiments we utilize GPT-3.5 Turbo by OpenAI, Llama 2 by Meta, and Bloomz 3B and 7B1 by BigScience in a zero-shot setting, and perform evaluation on 14 Urdu NLP tasks, analysing their performances with the existing ...
Web
Evaluating Language Models on the nuances of Urdu. Urdu LLM benchmark and leaderboard for translation, grammar, and comprehension.
ACL
Abstract: Neural methods in Text-to-Speech synthesis (TTS) have demonstrated momentous advancement in terms of the naturalness and intelligibility of the synthesized speech. In this paper we present a neural speech synthesis system for the Urdu language, a low-resource language. The main challenge faced in this study was the non-availability of any publicly available Urdu speech synthesis corpora ...
Research
This paper documents the exploration and refinement of the Common Voice Urdu Corpus dataset version 12.0 to create a clean and refined dataset suitable for training Urdu Text-to-Speech models.
Web
Urdu is a language spoken by millions of people around the globe, especially in South Asia. Existing TTS models focus mainly on English and Chinese languages, having a minimal focus on Urdu and other low-resource languages. In this paper, we propose a generative Urdu TTS system.
Web
Dec 6, 2024: This paper introduces a comprehensive approach to building natural-sounding Urdu Text-to-Speech (TTS) and voice cloning systems, addressing the lack of computational resources for Urdu. We developed a large-scale dataset of over 100 h of Urdu speech, carefully cleaned and phonetically aligned through an automated transcription pipeline to preserve linguistic accuracy. The dataset was then used ...
Web
Abstract: Whisper, a large-scale multilingual model, has demonstrated strong performance in speech recognition benchmarks, but its effectiveness on low-resource languages remains under-explored. This paper evaluates Whisper's performance on Pashto, Punjabi, and Urdu, three underrepresented languages. While Automatic Speech Recognition (ASR) has advanced for widely spoken languages, low...
ACL
5 days ago: This paper evaluates Whisper's performance on Pashto, Punjabi, and Urdu, three underrepresented languages. While Automatic Speech Recognition (ASR) has advanced for widely spoken languages, low-resource languages still face challenges due to limited data.
Web
Mar 1, 2025: This study focuses on end-to-end ASR for low-resource languages. For instance, Urdu, despite being the 10th most spoken language globally, qualifies as a low-resource language. The scarcity of benchmark datasets in Urdu has compelled researchers to employ increasingly innovative methods to overcome this challenge.
arXiv
Aug 13, 2025: This study evaluates the feasibility of lightweight Whisper models (Tiny, Base, Small) for Urdu speech recognition in low-resource settings. Despite Urdu being the 10th most spoken language globally with over 230 million speakers, its representation in automatic speech recognition (ASR) systems remains limited due to dialectal diversity, code-switching, and sparse training data. We benchmark ...
Web
Abstract: Named entity recognition (NER) is a fundamental part of other natural language processing tasks such as information retrieval, question answering systems and machine translation. Progress and success have already been achieved in research on the English NER systems. However, the Urdu NER system is still in its infancy due to the complexity and morphological richness of the Urdu ...