Research Interests
I mostly work on Large Language Models, Vision Language Models, Responsible AI, Privacy & Bias, model efficiency, and building downstream industrial applications using RAG methods and LLM deployment. I have also curated high-quality datasets and benchmarks for Multilingual LMMs, Bias Mitigation, and Industrial Applications for the MENA region.
|
News
[Aug 2025] - Our work GAEA is accepted at WACV 2026.
[Aug 2025] - Our work VIMUL is accepted at EMNLP 2025.
[Aug 2025] - Our work GRAMVIS is accepted at EMNLP Findings 2025.
[Jun 2025] - Co-organized CVPR 2025 Workshop for VidLLMs.
[May 2025] - Started internship at Pinterest as an MLE Intern in the Visual Search Team.
[Mar 2025] - Our work MobiLlama is accepted at ICLR-SLLM Workshop 2025 (Spotlight).
[Feb 2025] - Our work ALM-Bench is accepted at CVPR 2025 (Highlight).
[Oct 2024] - Our work VURF is accepted at NeurIPS VLM Workshop, 2024.
[Aug 2024] - Joined UCF (CRCV) as a Master's student in Computer Vision.
[Jan 2024] - Promoted to Research Engineer at MBZUAI.
[Dec 2023] - Merit Award - Tertiary Student Project APICTA Awards held in Hong Kong 2023.
[Dec 2023] - People's Choice Award at APICTA Awards held in Hong Kong 2023.
[Nov 2023] - Released Jais Climate as Lead Engineer.
[Aug 2023] - Joined MBZUAI as a Research Assistant.
|
Publications
selected publications, full here
* denotes joint first authors
|
|
All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages
Ashmal Vayani,
Dinura Dissanayake,
Hasindri Watawana,
Omkar Thawakar,
Michael Felsberg,
Thamar Solorio,
Monojit Choudhury,
Ivan Laptev,
Mubarak Shah,
Salman Khan,
Fahad Shahbaz Khan
CVPR 2025 (Highlight)
Paper
/
Project
/
Code
/
Data
|
|
GAEA: A Geolocation Aware Conversational Assistant
Ron Campos*,
Ashmal Vayani*,
Parth Parag Kulkarni*,
Rohit Gupta,
Aizan Zafar,
Aritra Dutta,
Mubarak Shah
WACV 2026
Paper
/
Project
/
Code
/
Data
|
|
A Culturally-diverse Multilingual Multimodal Video Benchmark & Model
Bhuiyan Sanjid Shafique,
Ashmal Vayani,
Muhammad Maaz,
Hanoona Abdul Rasheed,
Dinura Dissanayake,
Michael Felsberg,
Mubarak Shah,
Salman Khan,
Fahad Shahbaz Khan
EMNLP 2025 (Main)
Paper
/
Project
/
Code
/
Data
|
|
Beyond Content: How Grammatical Gender Shapes Visual Representation in Text-to-Image Models
Muhammed Saeed,
Shaina Raza,
Ashmal Vayani,
Muhammad Abdul-Mageed,
Ali Emami,
Shady Shehata
EMNLP 2025 (Findings)
Paper
|
|
VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding
Ashmal Vayani*,
Ahmad Mahmood*,
Muzammal Naseer,
Salman Khan,
Fahad Shahbaz Khan
NeurIPS Vision Language Models Workshop 2024
Paper
/
Code
|
|
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
Omkar Thawakar*,
Ashmal Vayani*,
Salman Khan,
Hisham Cholakkal,
Rao Muhammad Anwer,
Michael Felsberg,
Timothy Baldwin,
Eric P. Xing,
Fahad Shahbaz Khan
ICLR SLLM Workshop 2025 (Spotlight)
Paper
/
Code
/
Models
|
|
SB-Bench: Stereotype Bias Benchmark for Large Multimodal Models
Vishal Narnaware*,
Ashmal Vayani*,
Rohit Gupta,
Swetha Sirnam,
Mubarak Shah
Under Review
Paper
/
Project
/
Code
/
Data
|
|