Mohammed Asad – Computer Vision Researcher

I teach machines to see, understand, and solve real-world problems.

I started exploring AI through computer vision, fascinated by the idea that machines could learn from pixels and patterns. Over time, that curiosity turned into a research path focused on making visual AI work in messy, real-world conditions.

After my sophomore year, I began working on gait recognition for surveillance, where faces are often unreliable and walking patterns can become a stronger identity cue. That experience pushed me toward visual recognition problems where AI has to work under noise, occlusion, limited data, and real deployment constraints. I later explored the same challenge in mammography classification and orchard-scale object detection, where models must understand complex visual patterns instead of only performing well on clean benchmarks.

Today, my focus is reliable visual intelligence: building models that can see, interpret, and reason over complex visual data. I am especially interested in vision-language models, multimodal learning, and trustworthy visual AI, with the goal of making these systems safer and more useful in real-world settings.

Interests

Computer Vision
Vision Language Models
Multimodal Learning
RL for Visual Intelligence
Algorithms
Medical Image Processing

Education

B.Tech in Electronics and Communications Engineering

Delhi Technological University

2022 – 2026

Publications

Selected research papers and preprints.

2026

Preprint 2026

Gait Recognition with Temporal Kolmogorov–Arnold Networks

Mohammed Asad, Dr. Dinesh Kumar Vishwakarma

A compact gait-recognition model that learns short-term walking cycles and longer motion context, improving silhouette-based identity matching when faces are unreliable and clothing, viewpoint, and video conditions vary.

ICICV conference 2026

A Comparative Study of Modern Object Detectors for Robust Apple Detection in Orchard Imagery

Mohammed Asad, Dr. Ajai Kumar Gautam, Priyanshu Dhiman, Rishi Raj Prajapati

A controlled orchard-vision benchmark comparing modern object detectors for apple detection under leaf clutter, dense clusters, illumination shifts, and partial occlusion.

Preprint 2026

A Hybrid Architecture for Breast Cancer Classification in Mammography

Mohammed Asad, Mohit Bajpai, Dr. Sudhir Singh, Dr. Rahul Kataria

A hybrid mammography model that combines CNN-based lesion understanding with efficient global context modeling for benign-malignant ROI classification.

Experience

Research, industry, and academic positions.

Bachelor's Thesis Researcher

Delhi Technological University

Sep 2025 – Present

New Delhi, India

Conducted an orchard-scale apple detection study focused on visual reliability under occlusion, lighting variation, dense fruit clusters, and leaf-cluttered backgrounds. The associated paper was accepted at ICICV 2026.

Formulated single-class apple detection on AppleBBCH81 for real-world orchard imagery.
Benchmarked one-stage, two-stage, and transformer-based detectors using COCO-style mAP, precision, recall, F1-score, and precision-recall analysis.
Studied edge deployment feasibility for Raspberry Pi-class devices, focusing on models that remain useful outside clean benchmark settings.

Research Intern

MMDA Lab, Delhi Technological University

Jun 2024 – Nov 2025

New Delhi, India

Developed a silhouette-based gait recognition framework for identity matching when face cues are weak or unavailable. The work focused on temporal modeling under variation in viewpoint, clothing, carrying condition, noise, and sequence length.

Designed a CNN + Temporal KAN architecture with learnable temporal dynamics and gated long-term memory.
Modeled local gait cycles and broader motion context for robust identity matching across surveillance conditions.
Validated the framework on CASIA-B using gallery-probe matching across normal walking, bag-carrying, and clothing-change settings.

Research Intern

CALIBRE Lab, Delhi Technological University

Jun 2025 – Oct 2025

New Delhi, India

Worked on mammography ROI classification, focusing on benign-malignant lesion understanding under variation in texture, scale, appearance, and class balance.

Built a hybrid EfficientNetV2-M + Vision Mamba pipeline for local lesion representation and efficient global context modeling.
Evaluated the system on CBIS-DDSM lesion crops against CNN and Transformer baselines.
Connected model design with practical diagnostic constraints, where subtle visual patterns and long-range context both matter.

Computer Vision Engineer – Intern

DeepSight AI Labs Pvt. Ltd.

Aug 2025 – Feb 2026

Gurugram, India

Worked on applied AI systems for enterprise data access, internal document intelligence, forecasting, and workflow automation.

Developed an agentic AI system for querying structured, unstructured, and operational data through natural language.
Built prediction, NLP, and automation components for internal decision-support workflows.
Contributed to production-oriented tools where reliable retrieval and usable outputs mattered more than demo performance.

Research Collaborator

Purdue University and Penn State University

Jan 2025 – Mar 2025

Remote

Explored reward-guided adaptation of Stable Diffusion v1.5 for task-specific image generation, using reinforcement learning to steer visual quality beyond the base diffusion objective.

Designed a parameter-efficient RL fine-tuning setup using LoRA layers in U-Net attention blocks.
Used reward scoring and PPO-style policy updates to guide latent diffusion outputs.
Analyzed the difference between moderate reward optimization and over-optimization in generated images.

Contact

Feel free to reach out for research collaborations, internships, or academic opportunities.