I teach machines to see, understand, and solve real-world problems.

I started exploring AI through computer vision, fascinated by the idea that machines could learn from pixels and patterns. Over time, that curiosity turned into a research path focused on making visual AI work in messy, real-world conditions.

After my sophomore year, I began working on gait recognition for surveillance, where faces are often unreliable and walking patterns can become a stronger identity cue. That experience pushed me toward visual recognition problems where AI has to work under noise, occlusion, limited data, and real deployment constraints. I later explored the same challenge in mammography classification and orchard-scale object detection, where models must understand complex visual patterns instead of only performing well on clean benchmarks.

Today, my focus is reliable visual intelligence: building models that can see, interpret, and reason over complex visual data. I am especially interested in vision-language models, multimodal learning, and trustworthy visual AI, with the goal of moving these systems toward safer and more useful real-world understanding.

Interests

  • Computer Vision
  • Vision Language Models
  • Multimodal Learning
  • RL for Visual Intelligence
  • Algorithms
  • Medical Image Processing

Education

  • B.Tech in Electronics and Communications Engineering, Delhi Technological University, 2022 – 2026

Publications

Selected research papers and preprints.

2026

Preprint 2026

Gait Recognition with Temporal Kolmogorov–Arnold Networks

Mohammed Asad, Dr. Dinesh Kumar Vishwakarma

A compact gait-recognition model that learns short-term walking cycles and longer motion context, improving silhouette-based identity matching when faces, clothing, views, and video conditions are unreliable.

Paper

Experience

Research, industry, and academic positions.


Bachelor's Thesis Researcher

Delhi Technological University

Sep 2025 – Present · New Delhi, India

Conducted an orchard-scale apple detection study focused on visual reliability under occlusion, lighting variation, dense fruit clusters, and leaf-cluttered backgrounds. The associated paper was accepted at ICICV 2026.

  • Formulated single-class apple detection on AppleBBCH81 for real-world orchard imagery.
  • Benchmarked one-stage, two-stage, and transformer-based detectors using COCO-style mAP, precision, recall, F1-score, and precision-recall analysis.
  • Studied edge deployment feasibility for Raspberry Pi-class devices, focusing on models that remain useful outside clean benchmark settings.
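
As a rough illustration of the precision, recall, and F1-score metrics named above, here is a minimal Python sketch for a single image and a single class. It is not the study's evaluation code: the function names, the greedy score-ordered matching rule, and the 0.5 IoU threshold are assumptions for illustration, and full COCO-style mAP additionally averages over several IoU thresholds.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_prf(predictions, ground_truth, iou_threshold=0.5):
    """Return (precision, recall, f1) for one image.

    predictions: list of (score, box); ground_truth: list of boxes.
    Hypothetical helper: each prediction, taken in descending score order,
    is matched to the unmatched ground-truth box with the highest IoU.
    """
    matched = set()
    tp = 0
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            tp += 1
    fp = len(predictions) - tp   # detections with no matching apple
    fn = len(ground_truth) - tp  # apples that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy example: two labelled apples, one correct detection, one false alarm.
gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60))]
print(detection_prf(preds, gt))
```

Sweeping a confidence cutoff over the predictions and recomputing these values at each cutoff gives the points of a precision-recall curve.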

Research Intern

MMDA Lab, Delhi Technological University

Jun 2024 – Nov 2025 · New Delhi, India

Developed a silhouette-based gait recognition framework for identity matching when face cues are weak or unavailable. The work focuses on temporal modeling under view, clothing, carrying, noise, and long-sequence variation.

  • Designed a CNN + Temporal KAN architecture with learnable temporal dynamics and gated long-term memory.
  • Modeled local gait cycles and broader motion context for robust identity matching across surveillance conditions.
  • Validated the framework on CASIA-B using gallery-probe matching across normal walking, bag-carrying, and clothing-change settings.
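
Gallery-probe matching, as mentioned above, can be sketched in a few lines of Python. This is a generic illustration and not the framework's code: the embeddings are toy vectors, `rank1_accuracy` is a hypothetical helper, and cosine similarity is an assumed choice of metric (the page does not say which distance was used).

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def rank1_accuracy(gallery, probes):
    """Fraction of probes whose most similar gallery entry has the same ID.

    gallery, probes: lists of (subject_id, embedding) pairs.
    """
    if not probes:
        return 0.0
    correct = 0
    for probe_id, probe_emb in probes:
        # Pick the gallery entry closest to this probe.
        best_id, _ = max(
            gallery, key=lambda g: cosine_similarity(probe_emb, g[1])
        )
        correct += int(best_id == probe_id)
    return correct / len(probes)


# Toy example: two enrolled subjects, three probe sequences.
gallery = [("001", [1.0, 0.0]), ("002", [0.0, 1.0])]
probes = [
    ("001", [0.9, 0.1]),  # matched correctly
    ("002", [0.2, 0.8]),  # matched correctly
    ("001", [0.1, 0.9]),  # confused with subject 002
]
print(rank1_accuracy(gallery, probes))
```

In a CASIA-B-style protocol, the gallery would typically hold normal-walking sequences, with a separate accuracy reported for each probe condition.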

Research Intern

CALIBRE Lab, Delhi Technological University

Jun 2025 – Oct 2025 · New Delhi, India

Worked on mammography ROI classification, focusing on benign-malignant lesion understanding under variation in texture, scale, appearance, and class balance.

  • Built a hybrid EfficientNetV2-M + Vision Mamba pipeline for local lesion representation and efficient global context modeling.
  • Evaluated the system on CBIS-DDSM lesion crops against CNN and Transformer baselines.
  • Connected model design with practical diagnostic constraints, where subtle visual patterns and long-range context both matter.

Computer Vision Engineer – Intern

DeepSight AI Labs Pvt. Ltd.

Aug 2025 – Feb 2026 · Gurugram, India

Worked on applied AI systems for enterprise data access, internal document intelligence, forecasting, and workflow automation.

  • Developed an agentic AI system for querying structured, unstructured, and operational data through natural language.
  • Built prediction, NLP, and automation components for internal decision-support workflows.
  • Contributed to production-oriented tools where reliable retrieval and usable outputs mattered more than demo performance.

Research Collaborator

Collaboration with Purdue University & Penn State University

Jan 2025 – Mar 2025 · Remote

Explored reward-guided adaptation of Stable Diffusion v1.5 for task-specific image generation, using reinforcement learning to steer visual quality beyond the base diffusion objective.

  • Designed a parameter-efficient RL fine-tuning setup using LoRA layers in U-Net attention blocks.
  • Used reward scoring and PPO-style policy updates to guide latent diffusion outputs.
  • Analyzed the difference between moderate reward optimization and over-optimization in generated images.
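
For readers unfamiliar with PPO-style updates, the standard clipped surrogate objective can be sketched as follows. This is a generic textbook form and not the project's training code: `ppo_clipped_objective` is a hypothetical helper, the inputs are toy scalars standing in for per-sample log-probabilities and reward-derived advantages, and `clip_eps=0.2` is an assumed default.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximized).

    For each sample, the probability ratio new/old is clipped to
    [1 - clip_eps, 1 + clip_eps], and the more pessimistic of the
    clipped and unclipped terms is kept.
    """
    terms = []
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)


# A large policy change with positive advantage is capped at 1 + clip_eps,
# so a single update gains nothing by pushing the ratio further.
print(ppo_clipped_objective([0.5], [0.0], [1.0]))
```

The clipping limits how far one update can move the policy toward higher reward, which is one common guard against the over-optimization noted above.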

Contact

Feel free to reach out for research collaborations, internships, or academic opportunities.