Multi-Modal Document Understanding for Low-Resource Languages
June 2025 - Present | Independent Collaboration
Supervisor(s): Dr. Md Mofijul Islam, Dr. AKM Mahbubur Rahman, Dr. Aman Chadha
- Collecting, annotating, and curating government forms in South Asian low-resource languages for document understanding tasks such as key information extraction, classification, and entity linking.
- Building a synthetic document generation pipeline to mitigate data scarcity in low-resource settings and ensure demographic and regional representativeness.
Synthetic Data Based Agentic Visual Question Answering (VQA)
Link, May 2024 - Present | Independent Collaboration
Supervisor(s): Dr. AKM Mahbubur Rahman, Dr. Md Mofijul Islam, Dr. Aman Chadha
- Building a large-scale synthetic question-answer-rationale dataset for interpretable VQA tasks.
- Applying zero-shot, one-shot, and few-shot Chain-of-Thought (CoT)-style dynamic prompting to diversify the synthetic question-answering data.
Textual Context Integration for Indoor Sign Detection
Link, Jan 2023 - May 2023 | Wichita State University
Supervisor(s): Dr. Vinod Namboodiri
- Developed multi-modal fusion-based transformer models to detect indoor signs for blind and visually impaired (BVI) users.
- Prototyped a multi-modal chat interface for cooperative guided movement of people with low or no vision in indoor settings.
Human-in-the-Loop ML in Computational Healthcare
Link, Jan 2019 - Dec 2019 | University of Dhaka
Supervisor(s): Dr. Md Mofijul Islam, Md Mahmudur Rahman
- Investigated interpretability and domain-awareness in pulmonary disease detection using CNNs.
- Studied the joint effects of trainable filters and customized, domain-specific non-trainable filters on the disease detection capability of CNNs.