Research Interests
I mostly work on Large Language Models (LLMs) and Vision Language Models (VLMs), with a focus on their efficiency and on building downstream industrial applications through retrieval-augmented generation (RAG) and LLM deployment. I have also curated high-quality datasets and benchmarks for Multilingual LMMs, Bias Mitigation, and Industrial Applications for the MENA region.
Publications
* denotes joint first authors
VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding
Ashmal Vayani*, Ahmad Mahmood*, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan
NeurIPS Vision Language Models Workshop 2024.
Paper
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
Omkar Thawakar*, Ashmal Vayani*, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Michael Felsberg, Timothy Baldwin, Eric P. Xing, Fahad Shahbaz Khan
Under review
Code / Paper