👨‍💻 About Me

Hi, I'm Mudit Jain, an engineer at heart with expertise in 3D/2D Machine Learning, SLAM, Computer Vision, GPU programming, and Embedded Systems.

Currently, my work centers on Multimodal AI, particularly fusing LiDAR and camera data for 2D and 3D detection and tracking of static and dynamic objects. My interests include applied ML and CV research, real-time SLAM systems, and advancing computer vision capabilities for autonomous vehicles.

📰 News & Updates

2025: Serving as a reviewer for IEEE IV 2025.

2024: Joined Qualcomm as Senior Deep Learning Engineer in Multimodal AI.

2024: Mentored a project at Google Summer of Code on 3D Reconstruction.

2024: Serving as a reviewer for WACV 2025.

2021: Joined Qualcomm as Senior Machine Learning Engineer, XR Research.

2021: Graduated with Master's degree from University of California San Diego.

2019: Started Master's in Electrical and Computer Engineering at UCSD.

2019: Joined DroneLab at UCSD as Graduate Student Researcher.

2017: Promoted to Embedded System Software Engineer II at NVIDIA.

2016: Joined NVIDIA as Embedded System Software Engineer I.

2016: Graduated with Bachelor's degree from BITS Pilani.

2016: Selected for Google Summer of Code as Developer for RTEMS.

2015: Joined NVIDIA as an Intern.

2014: Joined Srujana Innovation Center as an Intern.

2012: Started Bachelor's in Electronics and Communication at BITS Pilani.

💼 Work History

  • Senior Deep Learning Engineer - Multimodal AI Qualcomm Jun 2024 - Present

    Developing Multimodal AI (LiDAR + Camera) for static and dynamic object detection and tracking. Building Vision Foundation model-based networks for various autonomous tasks.

    Managers: Senthil Yogamani, Dr. Varun Ravi Kumar

    San Diego, California

  • Developer & Mentor - 3D Reconstruction Google Summer of Code May 2024 - Sep 2024

    Mentored a project on converting unconstrained video to Gaussian Splats with the OpenCV organization. Worked with Gary Bradski, Founder & President of the OpenCV foundation.

    Manager: Gary Bradski

    San Diego, California

  • Senior Machine Learning Engineer Qualcomm Aug 2021 - Jun 2024

    Led the development of optimized visual odometry solutions for AR/VR/MR use cases as part of the XR Research Computer Vision team, which focuses on spatial computing technologies.

    Manager: Vasudev Bhaskaran

    San Diego, California

  • Graduate Student Researcher DroneLab, University of California San Diego Sep 2019 - May 2021

    Designed and deployed an attention-based CNN on 600 cameras for wildfire detection (ALERTWildfire initiative). Developed a 4K video processing pipeline on NVIDIA AGX Xavier using DeepStream SDK and TensorRT.

    Manager: Dr. Falko Kuester

    San Diego, California

  • Graduate Teaching Assistant University of California San Diego Jan 2020 - Mar 2020

    Taught Art of Product Engineering (ECE 140A) covering end-to-end software development and hardware integration.

    San Diego, California

  • Embedded System Software Engineer II NVIDIA Oct 2017 - Jul 2019

    Designed I2C virtualization per the ISO 26262 functional safety standard for ARM-based SoCs (Xavier/Parker). Optimized the bootloader and implemented an OS-agnostic GPCDMA library for automotive applications.

    Bengaluru, India

  • Embedded System Software Engineer I NVIDIA Jul 2016 - Oct 2017

    Bengaluru, India

  • Developer Google Summer of Code Apr 2016 - Jul 2016

    Ported the FreeBSD SDMMC driver to RTEMS and added a DMA library for the Raspberry Pi BSP.

    Bengaluru, India · Remote

  • Intern NVIDIA Jul 2015 - Dec 2015

    Developed production tools for automotive customers to create OS firmware and boot targets.

    Bengaluru, India

  • Intern Srujana Innovation Center Dec 2014 - Apr 2015

    Developed a low-cost wearable VR headset and the Pupil+ platform for eye diagnosis (MIT Media Lab collaboration).

    Hyderabad, India

🎓 Education

  • Master's degree in Electrical and Computer Engineering University of California San Diego (UCSD) 2019-2021

    Specialization in Machine Learning and Data Science

    Courses: Linear Algebra, Probability and Statistics, Statistical Learning, Visual Learning, Computer Vision I & III, GPU Programming, Deep Learning and Applications

  • Bachelor of Engineering (B.E.) in Electronics and Communication Birla Institute of Technology and Science, Pilani 2012-2016

    Minor in Data Science with focus on Embedded Systems and Computer Vision

🛠️ Skills

💻 Programming Languages

C++ [6+ years] · Python [6+ years]

🧠 Technical Knowledge Domains

Multimodal Large-Scale Deep Learning [Ray, Kubernetes, PyTorch] · Classical Computer Vision [C++, OpenCV] · 3D Computer Vision · Machine Learning [PyTorch, JAX] · SLAM [ORB-SLAM, VINS-Mono] · Non-linear Optimization [Eigen, g2o, Ceres, GTSAM] · Bundle Adjustment · Camera Calibration · Pose Graph Optimization · IMU Preintegration · Bayesian Inference · Embedded Systems · SIMD Programming [CUDA] · Model Optimization [TensorRT] · 3D Reconstruction [NeRFs, Gaussian Splatting] · 2D/3D Object Detection

🚀 Projects

DINOv2 Object Detection

Implementation of object detection using DINOv2 self-supervised vision transformers, enabling state-of-the-art zero-shot detection capabilities.

Bundle Adjustment in the Large

Optimization methods for large-scale bundle adjustment in 3D reconstruction challenges, focusing on efficiency and scalability.

Depth Estimation

Monocular depth estimation techniques for 3D scene understanding from 2D images using deep learning approaches.

Custom CUDA Implementation for Multi-Agent Reinforcement Learning

Accelerated Q-table updates and reward policies for multi-agent Q-learning using CUDA, achieving 100% training accuracy and 99.8% test accuracy in under 4 minutes on a 46×46 grid with 512 agents.

University of California San Diego • Jan 2021 - Mar 2021

Speeding up Mario RL with Custom Torch C++ Extensions

Developed custom CUDA kernels for linear, pooling, ReLU, and convolutional layers to accelerate the training and inference of a CNN for a Double Q-learning based RL agent playing Mario.

University of California San Diego • Jan 2021 - Mar 2021

ALERTWildfire Plume Detection

Deployed an ensemble neural network model across 610 cameras in California for early wildfire detection, using a modified Mask R-CNN with focal loss and an EfficientNet-based segmentation model with SCSE attention.

Drone Lab - UCSD • Jul 2020 - Jan 2021

Domain Adaptation for Semantic Segmentation

Trained OCNet on Cityscapes dataset and used CycleGAN-based domain adaptation to generate real-world-like data from gaming data, improving model performance through expanded training data.

University of California San Diego • Sep 2019 - Dec 2019

Image Denoising using Deep CNNs

Implemented and compared DnCNN, UDnCNN, and DUDnCNN architectures for image denoising, achieving up to 99.85% accuracy with a U-Net that uses dilated convolutions.

University of California San Diego • Sep 2019 - Dec 2019

3D Reconstruction of the Anterior Segment of the Eye

Developed an image processing pipeline and GUI for 3D eye model reconstruction by projecting patterns on the anterior segment and applying PCA and object tracking techniques.

Srujana Innovation Center and MIT Media Lab

🏆 Honors & Awards

100% Tuition Scholarship

DroneLab, University of California San Diego

Full tuition coverage for exceptional research contributions and academic merit.

Telangana Overseas Scholarship

Government of Telangana • University of California San Diego

Prestigious government scholarship awarded to exceptional students for pursuing graduate studies abroad.

MCN Scholarship

Birla Institute of Technology and Science, Pilani

Merit-based scholarship recognizing academic excellence and leadership potential.

📝 Blogs

Bird's-Eye View Perception

Comprehensive overview of Bird's-Eye View (BEV) perception techniques for autonomous driving, covering image-based, LiDAR-based, and multi-modal approaches.

Learning C++

Guide to mastering the C++ programming language, from fundamentals to advanced concepts, with practical examples and best practices for efficient code development.

Attention Mechanisms in 2D and 3D Vision Transformers

Detailed exploration of attention mechanisms in modern vision transformers, analyzing their application in both 2D and 3D computer vision tasks.

Adapters for Vision Foundation Models

Analysis of parameter-efficient adaptation methods for vision models, including visual adapters, prompt tuning, and task-specific fine-tuning approaches.