I teach machines to see, understand, and solve real-world problems.
I started exploring AI through computer vision, fascinated by the idea that machines could learn from pixels and patterns. Over time, that curiosity turned into a research path focused on making visual AI work in messy, real-world conditions.
After my sophomore year, I began working on gait recognition for surveillance, where faces are often unreliable and walking patterns can become a stronger identity cue. That experience pushed me toward visual recognition problems where AI has to work under noise, occlusion, limited data, and real deployment constraints. I later explored the same challenge in mammography classification and orchard-scale object detection, where models must understand complex visual patterns instead of only performing well on clean benchmarks.
Today, my focus is reliable visual intelligence: building models that can see, interpret, and reason over complex visual data. I am especially interested in vision-language models, multimodal learning, and trustworthy visual AI, with the goal of moving these systems toward safer and more useful real-world understanding.
Interests
Computer Vision
Vision-Language Models
Multimodal Learning
RL for Visual Intelligence
Algorithms
Medical Image Processing
Education
B.Tech in Electronics and Communications Engineering
Delhi Technological University
2022 – 2026
A compact gait-recognition model that learns short-term walking cycles and longer motion context, improving silhouette-based identity matching when faces, clothing, views, and video conditions are unreliable.
Mohammed Asad, Dr. Ajai Kumar Gautam, Priyanshu Dhiman, Rishi Raj Prajapati
A controlled orchard-vision benchmark comparing modern object detectors for apple detection under leaf clutter, dense clusters, illumination shifts, and partial occlusion.
Mohammed Asad, Mohit Bajpai, Dr. Sudhir Singh, Dr. Rahul Kataria
A hybrid mammography model that combines CNN-based lesion understanding with efficient global context modeling for benign-malignant ROI classification.
Built an orchard-scale apple detection study focused on visual reliability under occlusion, lighting variation, dense fruit clusters, and leaf-cluttered backgrounds. Associated paper accepted at ICICV 2026.
Formulated single-class apple detection on AppleBBCH81 for real-world orchard imagery.
Benchmarked one-stage, two-stage, and transformer-based detectors using COCO-style mAP, precision, recall, F1-score, and precision-recall analysis.
Studied edge deployment feasibility for Raspberry Pi-class devices, focusing on models that remain useful outside clean benchmark settings.
Research Intern
MMDA Lab, Delhi Technological University
Jun 2024 – Nov 2025
New Delhi, India
Developed a silhouette-based gait recognition framework for identity matching when face cues are weak or unavailable. The work focuses on temporal modeling under view, clothing, carrying, noise, and long-sequence variation.
Designed a CNN + Temporal KAN architecture with learnable temporal dynamics and gated long-term memory.
Modeled local gait cycles and broader motion context for robust identity matching across surveillance conditions.
Validated the framework on CASIA-B using gallery-probe matching across normal walking, bag-carrying, and clothing-change settings.
Research Intern
CALIBRE Lab, Delhi Technological University
Jun 2025 – Oct 2025
New Delhi, India
Worked on mammography ROI classification, focusing on benign-malignant lesion understanding under class imbalance and variation in texture, scale, and appearance.
Built a hybrid EfficientNetV2-M + Vision Mamba pipeline for local lesion representation and efficient global context modeling.
Evaluated the system on CBIS-DDSM lesion crops against CNN and Transformer baselines.
Connected model design with practical diagnostic constraints, where subtle visual patterns and long-range context both matter.
Computer Vision Engineer – Intern
DeepSight AI Labs Pvt. Ltd.
Aug 2025 – Feb 2026
Gurugram, India
Worked on applied AI systems for enterprise data access, internal document intelligence, forecasting, and workflow automation.
Developed an agentic AI system for querying structured, unstructured, and operational data through natural language.
Built prediction, NLP, and automation components for internal decision-support workflows.
Contributed to production-oriented tools where reliable retrieval and usable outputs mattered more than demo performance.
Research Collaborator
Collaboration with Purdue University and Penn State University
Jan 2025 – Mar 2025
Remote
Explored reward-guided adaptation of Stable Diffusion v1.5 for task-specific image generation, using reinforcement learning to steer visual quality beyond the base diffusion objective.
Designed a parameter-efficient RL fine-tuning setup using LoRA layers in U-Net attention blocks.
Used reward scoring and PPO-style policy updates to guide latent diffusion outputs.
Analyzed the difference between moderate reward optimization and over-optimization in generated images.
Contact
Feel free to reach out for research collaborations, internships, or academic opportunities.