ST0096: Internship - Multimodal Tracking and Imaging
MERL is seeking a motivated intern to assist in developing hardware and algorithms for multimodal imaging applications. The project involves integrating radar, camera, and depth sensors across a variety of sensing scenarios. The ideal candidate should have experience with FMCW radar and/or depth sensing, and be proficient in Python and general scripting. Familiarity with optical tracking of humans and experience with hardware prototyping are desired. Good knowledge of computational imaging and/or radar imaging methods is a plus.
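For candidates unfamiliar with FMCW processing, the sketch below illustrates the core idea the posting assumes: a point target at range R produces a beat tone at frequency 2·R·S/c (S the chirp slope), so a range FFT recovers the range. All parameters (bandwidth, chirp duration, sample rate, target range) are illustrative and not tied to MERL hardware.

```python
import numpy as np

# Minimal sketch of FMCW range estimation on a simulated beat signal.
# All parameters are illustrative, not tied to any MERL hardware.
c = 3e8          # speed of light (m/s)
B = 4e9          # chirp bandwidth (Hz)
T = 40e-6        # chirp duration (s)
S = B / T        # chirp slope (Hz/s)
fs = 10e6        # ADC sample rate (Hz)
R_true = 2.5     # simulated target range (m)

# A point target at range R produces a beat tone at f_b = 2*R*S/c.
t = np.arange(0, T, 1 / fs)
f_beat = 2 * R_true * S / c
beat = np.exp(2j * np.pi * f_beat * t)     # noiseless complex beat signal

# Range FFT: the peak bin maps back to range via the chirp slope.
N = 4096
spectrum = np.abs(np.fft.fft(beat, n=N))
f_est = np.argmax(spectrum[: N // 2]) * fs / N
R_est = f_est * c / (2 * S)
print(f"estimated range: {R_est:.2f} m")   # ~2.5 m
```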
Required Specific Experience
- Experience with Python and Python-based deep learning frameworks.
- Experience with FMCW radar and/or depth sensors.
- Research Areas: Computer Vision, Machine Learning, Signal Processing, Computational Sensing
- Host: Petros Boufounos
ST0174: Internship - Sensor Reasoning Models
The Computational Sensing team at MERL is seeking a highly motivated intern to conduct fundamental research on sensor reasoning models: algorithms that can understand, explain, and act on multi-sensor data (e.g., RF, infrared, LiDAR, event camera) through text, visual, and multimodal reasoning. Ideal candidates will be comfortable bridging modern perception (detection/segmentation/tracking) with higher-level reasoning capabilities. Experience with text, visual, and multimodal reasoning is highly preferred. The intern will work closely with MERL researchers to develop novel algorithms, design experiments using MERL’s in-house testbeds, and prepare results for patents and publication. The internship is expected to last 3 months, with a flexible start date from October 2025 onward.
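As a rough illustration of what “aligning or conditioning LLMs/VLMs on sensor outputs” can look like in practice (see the requirements below), here is a minimal PyTorch sketch that projects a radar range-azimuth heatmap into a language model’s token-embedding space as prefix tokens. All module names, dimensions, and shapes are hypothetical, not MERL’s actual models.

```python
import torch
import torch.nn as nn

# Minimal sketch of conditioning a language model on radar heatmaps by
# projecting CNN features into the token-embedding space ("prefix" tokens).
# All dimensions and module names are illustrative.
class RadarPrefixEncoder(nn.Module):
    def __init__(self, embed_dim=768, num_prefix_tokens=8):
        super().__init__()
        self.backbone = nn.Sequential(            # tiny CNN over a 1-channel heatmap
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((num_prefix_tokens, 1)),
        )
        self.proj = nn.Linear(64, embed_dim)       # map channels -> LM embedding dim

    def forward(self, heatmap):                    # heatmap: (B, 1, H, W)
        feats = self.backbone(heatmap)             # (B, 64, num_prefix_tokens, 1)
        feats = feats.squeeze(-1).transpose(1, 2)  # (B, num_prefix_tokens, 64)
        return self.proj(feats)                    # (B, num_prefix_tokens, embed_dim)

# Usage: prepend the radar prefix to text token embeddings before the LM.
enc = RadarPrefixEncoder()
radar = torch.randn(2, 1, 128, 128)                  # batch of range-azimuth heatmaps
text_emb = torch.randn(2, 16, 768)                   # stand-in for tokenized text embeddings
lm_input = torch.cat([enc(radar), text_emb], dim=1)  # (2, 24, 768)
```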
Required Specific Experience
- Reasoning with sensor data: Demonstrated work in text, visual, and multimodal reasoning (e.g., VQA over sensor streams, temporal/spatio-temporal reasoning, chain-of-thought, instruction following).
- LLMs & VLMs for sensor perception: Experience aligning or conditioning LLMs/VLMs on sensor outputs (e.g., point clouds, radar heatmaps, BEV features).
- Perception foundations: Solid understanding of state-of-the-art transformer-based (e.g., DETR) and diffusion-based (e.g., DiffusionDet) detection frameworks.
- Datasets & evaluation: Hands-on experience with open large-scale multi-sensor datasets (e.g., nuScenes, Waymo Open Dataset, Argoverse) and open radar datasets (e.g., MMVR, HIBER, RT-Pose, K-Radar). Ability to design reasoning-centric benchmarks (e.g., QA over multi-sensor inputs, temporal prediction).
- Proficiency in Python and deep learning frameworks (PyTorch/JAX), plus experience with GPU cluster job scheduling and scalable data pipelines.
- Proven publication record in top-tier venues such as CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML (or equivalent).
- Knowledge of sensor fundamentals (RF, infrared, LiDAR, event camera); for radar, familiarity with FMCW, MIMO, Doppler signatures, radar point clouds/heatmaps, and raw ADC waveforms.
- Familiarity with MERL’s recent radar perception research, e.g., TempoRadar, SIRA, MMVR, RETR.
- Research Areas: Artificial Intelligence, Computational Sensing, Machine Learning
- Host: Perry Wang