Tim K. Marks

- Phone: 617-621-7524
- Email:
Position:
- Research / Technical Staff
- Senior Principal Research Scientist, Team Leader
Education:
- Ph.D., University of California San Diego, 2006
Research Areas:
- Computer Vision
- Artificial Intelligence
- Machine Learning
- Speech & Audio
- Robotics
- Human-Computer Interaction
- Signal Processing
External Links:
- Tim's Quick Links
Biography
Prior to joining MERL's Imaging Group in 2008, Tim did postdoctoral research in robotic Simultaneous Localization and Mapping in collaboration with NASA's Jet Propulsion Laboratory. His research at MERL spans a variety of areas in computer vision and machine learning, including face recognition under variations in pose and lighting, and robotic vision and touch-based registration for industrial automation.
-
Recent News & Events
-
NEWS: MERL presenting 8 papers at ICASSP 2022
Date: May 22, 2022 - May 27, 2022
Where: Singapore
MERL Contacts: Anoop Cherian; Chiori Hori; Toshiaki Koike-Akino; Jonathan Le Roux; Tim K. Marks; Philip V. Orlik; Kuan-Chuan Peng; Pu (Perry) Wang; Gordon Wichern
Research Areas: Artificial Intelligence, Computer Vision, Signal Processing, Speech & Audio
Brief: MERL researchers are presenting 8 papers at the IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP), which is being held in Singapore from May 22-27, 2022. A week of virtual presentations also took place earlier this month.
Topics to be presented include recent advances in speech recognition, audio processing, scene understanding, computational sensing, and classification.
ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event attracts more than 2000 participants each year.
-
NEWS: MERL work on scene-aware interaction featured in IEEE Spectrum
Date: March 1, 2022
MERL Contacts: Anoop Cherian; Chiori Hori; Jonathan Le Roux; Tim K. Marks; Anthony Vetro
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning, Speech & Audio
Brief: MERL's research on scene-aware interaction was recently featured in an IEEE Spectrum article. The article, titled "At Last, A Self-Driving Car That Can Explain Itself" and authored by MERL Senior Principal Research Scientist Chiori Hori and MERL Director Anthony Vetro, gives an overview of MERL's efforts towards developing a system that can analyze multimodal sensing information for highly natural and intuitive interaction with humans through context-dependent generation of natural language. The technology recognizes contextual objects and events based on multimodal sensing information, such as images and video captured with cameras, audio information recorded with microphones, and localization information measured with LiDAR.
Scene-Aware Interaction for car navigation, one target application that the article focuses on, will provide drivers with intuitive route guidance. Scene-Aware Interaction technology is expected to have wide applicability, including human-machine interfaces for in-vehicle infotainment, interaction with service robots in building and factory automation systems, systems that monitor the health and well-being of people, surveillance systems that interpret complex scenes for humans and encourage social distancing, support for touchless operation of equipment in public areas, and much more. MERL's Scene-Aware Interaction Technology had previously been featured in a Mitsubishi Electric Corporation Press Release.
IEEE Spectrum is the flagship magazine and website of the IEEE, the world’s largest professional organization devoted to engineering and the applied sciences. IEEE Spectrum has a circulation of over 400,000 engineers worldwide, making it one of the leading science and engineering magazines.
-
Awards
-
AWARD: MERL Researchers win Best Paper Award at ICCV 2019 Workshop on Statistical Deep Learning in Computer Vision
Date: October 27, 2019
Awarded to: Abhinav Kumar, Tim K. Marks, Wenxuan Mou, Chen Feng, Xiaoming Liu
MERL Contact: Tim K. Marks
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
Brief: MERL researcher Tim Marks, former MERL interns Abhinav Kumar and Wenxuan Mou, and MERL consultants Professor Chen Feng (NYU) and Professor Xiaoming Liu (MSU) received the Best Oral Paper Award at the IEEE/CVF International Conference on Computer Vision (ICCV) 2019 Workshop on Statistical Deep Learning in Computer Vision (SDL-CV), held in Seoul, Korea. Their paper, entitled "UGLLI Face Alignment: Estimating Uncertainty with Gaussian Log-Likelihood Loss," describes a method that, given an image of a face, estimates not only the locations of facial landmarks but also the uncertainty of each landmark location estimate.
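The core idea of such an uncertainty-aware loss can be sketched in a few lines. This is a simplified illustration with a diagonal covariance, not the paper's actual implementation; the function name and simplifications are ours:

```python
import math

def landmark_gaussian_nll(pred_xy, true_xy, sigma_xy):
    """Negative log-likelihood of a true landmark location under a
    predicted 2D Gaussian with diagonal covariance (sigma_x, sigma_y).

    Minimizing this loss trades localization error off against the
    predicted uncertainty: a large error is cheap only if the network
    also predicts a large sigma, which is itself penalized by the
    log-determinant term.
    """
    dx = true_xy[0] - pred_xy[0]
    dy = true_xy[1] - pred_xy[1]
    sx, sy = sigma_xy
    # log|Sigma| for a diagonal covariance = log(sx^2) + log(sy^2)
    log_det = math.log(sx * sx) + math.log(sy * sy)
    # Squared Mahalanobis distance under the diagonal Gaussian
    maha = (dx * dx) / (sx * sx) + (dy * dy) / (sy * sy)
    return 0.5 * (maha + log_det) + math.log(2.0 * math.pi)
```

For a fixed localization error, this loss is minimized when the predicted sigma matches the error magnitude, which is what lets a network trained with it report calibrated per-landmark uncertainty.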
-
Internships with Tim
-
CV1920: Conditional Diffusion Models in Computer Vision
We seek a highly motivated intern to conduct original research in conditional diffusion models for computer vision tasks, with applications including image editing, multimodal generation, and image-to-image translation. The successful candidate will collaborate with MERL researchers to design and implement new models, conduct experiments, and prepare results for publication. The candidate should be a PhD student (or postdoc) in computer vision and machine learning with a strong publication record. Strong programming skills, experience developing and implementing new models in deep learning platforms such as PyTorch, and broad knowledge of machine learning and deep learning methods are expected. Previous experience with diffusion models is required.
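For context, the mechanics that any such diffusion model builds on can be sketched with the closed-form forward noising process. This is a toy scalar illustration under a linear beta schedule; all names are ours, and in a real (conditional) model a network conditioned on the timestep and on the conditioning signal (text, class label, or source image) would predict the noise rather than it being known:

```python
import math

def alpha_bar(t, T=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) for a linear beta schedule,
    i.e. the signal-retention factor of the forward process at step t."""
    ab = 1.0
    for s in range(t):
        beta = beta_start + (beta_end - beta_start) * s / (T - 1)
        ab *= 1.0 - beta
    return ab

def q_sample(x0, t, eps, T=1000):
    """Closed-form forward diffusion:
    x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    ab = alpha_bar(t, T)
    return math.sqrt(ab) * x0 + math.sqrt(1.0 - ab) * eps

def predict_x0(xt, t, eps, T=1000):
    """Invert the forward process given the noise eps. Training teaches
    a conditioned network to predict eps, so that sampling can run this
    inversion step by step from pure noise."""
    ab = alpha_bar(t, T)
    return (xt - math.sqrt(1.0 - ab) * eps) / math.sqrt(ab)
```

The closed form means training can sample a random timestep directly instead of simulating the whole noising chain, which is what makes diffusion training tractable.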
-
CV1922: Vital Signs from Video Using Computer Vision and Machine Learning
MERL is seeking a highly motivated intern to conduct original research in estimating vital signs such as heart rate, heart rate variability, and blood pressure from video of a person. The successful candidate will use the latest methods in deep learning, computer vision, and signal processing to derive and implement new models, collect data, conduct experiments, and prepare results for publication, all in collaboration with MERL researchers. The candidate should be a PhD student in computer vision with a strong publication record and experience in computer vision, signal processing, machine learning, and health monitoring. Strong programming skills (Python, PyTorch, MATLAB, etc.) are required.
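As background, a common remote photoplethysmography (rPPG) baseline that such work builds on reads the pulse rate off the dominant frequency of a skin-region intensity trace. A minimal sketch with a naive DFT follows (our own illustration of the general technique, not MERL's method; names and the band limits are assumptions):

```python
import math

def heart_rate_bpm(signal, fps, f_min=0.7, f_max=4.0):
    """Estimate heart rate from a 1-D intensity trace (e.g. the mean
    green-channel value of a face region per frame) by locating the
    dominant DFT frequency inside the plausible pulse band
    [f_min, f_max] Hz (roughly 42-240 beats per minute)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC component
    best_freq, best_power = 0.0, -1.0
    for k in range(1, n // 2):
        freq = k * fps / n
        if not (f_min <= freq <= f_max):
            continue
        # Naive O(n) evaluation of one DFT bin (fine for short traces)
        re = sum(centered[i] * math.cos(2 * math.pi * k * i / n) for i in range(n))
        im = sum(centered[i] * math.sin(2 * math.pi * k * i / n) for i in range(n))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq * 60.0  # Hz -> beats per minute
```

On a synthetic 1.2 Hz pulse sampled at 30 fps this recovers 72 bpm; real systems add skin-region tracking, illumination compensation, and learned models on top of this frequency-analysis core.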
-
-
MERL Publications
- "Audio-Visual Scene-Aware Dialog and Reasoning Using Audio-Visual Transformers with Joint Student-Teacher Learning", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), April 2022, pp. 7732-7736. BibTeX TR2022-019 PDF
@inproceedings{Shah2022apr,
  author = {Shah, Ankit Parag and Geng, Shijie and Gao, Peng and Cherian, Anoop and Hori, Takaaki and Marks, Tim K. and Le Roux, Jonathan and Hori, Chiori},
  title = {Audio-Visual Scene-Aware Dialog and Reasoning Using Audio-Visual Transformers with Joint Student-Teacher Learning},
  booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
  year = 2022,
  pages = {7732--7736},
  month = apr,
  publisher = {IEEE},
  issn = {1520-6149},
  isbn = {978-1-6654-0540-9},
  url = {https://www.merl.com/publications/TR2022-019}
}
- "Overview of Audio Visual Scene-Aware Dialog with Reasoning Track for Natural Language Generation in DSTC10", The 10th Dialog System Technology Challenge Workshop at AAAI, February 2022. BibTeX TR2022-016 PDF
@inproceedings{Hori2022feb,
  author = {Hori, Chiori and Shah, Ankit Parag and Geng, Shijie and Gao, Peng and Cherian, Anoop and Hori, Takaaki and Le Roux, Jonathan and Marks, Tim K.},
  title = {Overview of Audio Visual Scene-Aware Dialog with Reasoning Track for Natural Language Generation in DSTC10},
  booktitle = {The 10th Dialog System Technology Challenge Workshop at AAAI},
  year = 2022,
  month = feb,
  url = {https://www.merl.com/publications/TR2022-016}
}
- "(2.5+1)D Spatio-Temporal Scene Graphs for Video Question Answering", AAAI Conference on Artificial Intelligence, DOI: 10.1609/aaai.v36i1.19922, February 2022, pp. 444-453. BibTeX TR2022-014 PDF Video Presentation
@inproceedings{Cherian2022feb,
  author = {Cherian, Anoop and Hori, Chiori and Marks, Tim K. and Le Roux, Jonathan},
  title = {(2.5+1)D Spatio-Temporal Scene Graphs for Video Question Answering},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  year = 2022,
  pages = {444--453},
  month = feb,
  doi = {10.1609/aaai.v36i1.19922},
  url = {https://www.merl.com/publications/TR2022-014}
}
- "MOST-GAN: 3D Morphable StyleGAN for Disentangled Face Image Manipulation", AAAI Conference on Artificial Intelligence, DOI: 10.1609/aaai.v36i2.20091, February 2022, pp. 1962-1971. BibTeX TR2022-011 PDF Video
@inproceedings{Medin2022feb,
  author = {Medin, Safa C. and Egger, Bernhard and Cherian, Anoop and Wang, Ye and Tenenbaum, Joshua B. and Liu, Xiaoming and Marks, Tim K.},
  title = {MOST-GAN: 3D Morphable StyleGAN for Disentangled Face Image Manipulation},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  year = 2022,
  pages = {1962--1971},
  month = feb,
  doi = {10.1609/aaai.v36i2.20091},
  url = {https://www.merl.com/publications/TR2022-011}
}
- "InSeGAN: A Generative Approach to Segmenting Identical Instances in Depth Images", IEEE International Conference on Computer Vision (ICCV), October 2021, pp. 10023-10032. BibTeX TR2021-097 PDF Video Data Software Presentation
@inproceedings{Cherian2021oct,
  author = {Cherian, Anoop and Pais, Goncalo and Jain, Siddarth and Marks, Tim K. and Sullivan, Alan},
  title = {InSeGAN: A Generative Approach to Segmenting Identical Instances in Depth Images},
  booktitle = {IEEE International Conference on Computer Vision (ICCV)},
  year = 2021,
  pages = {10023--10032},
  month = oct,
  publisher = {CVF},
  url = {https://www.merl.com/publications/TR2021-097}
}
-
Other Publications
- "Gamma-SLAM: Visual SLAM in unstructured environments using variance grid maps", Journal of Field Robotics, Vol. 26, No. 1, pp. 26-51, 2009. BibTeX
@article{marks2009gamma,
  author = {Marks, Tim K and Howard, Andrew and Bajracharya, Max and Cottrell, Garrison W and Matthies, Larry H},
  title = {Gamma-SLAM: Visual SLAM in unstructured environments using variance grid maps},
  journal = {Journal of Field Robotics},
  year = 2009,
  volume = 26,
  number = 1,
  pages = {26--51},
  publisher = {Wiley Online Library}
}
- "NIMBLE: A kernel density model of saccade-based visual memory", Journal of Vision, Vol. 8, No. 14, 2008. BibTeX
@article{barrington2008nimble,
  author = {Barrington, Luke and Marks, Tim K and Hsiao, Janet Hui-wen and Cottrell, Garrison W},
  title = {NIMBLE: A kernel density model of saccade-based visual memory},
  journal = {Journal of Vision},
  year = 2008,
  volume = 8,
  number = 14,
  publisher = {Association for Research in Vision and Ophthalmology}
}
- "Gamma-SLAM: Using stereo vision and variance grid maps for SLAM in unstructured environments", IEEE International Conference on Robotics and Automation (ICRA), 2008, pp. 3717-3724. BibTeX
@inproceedings{marks2008gamma,
  author = {Marks, Tim K and Howard, Andrew and Bajracharya, Max and Cottrell, Garrison W and Matthies, Larry},
  title = {Gamma-SLAM: Using stereo vision and variance grid maps for SLAM in unstructured environments},
  booktitle = {Robotics and Automation, 2008. ICRA 2008. IEEE International Conference on},
  year = 2008,
  pages = {3717--3724},
  organization = {IEEE}
}
- "SUN: A Bayesian framework for saliency using natural statistics", Journal of Vision, Vol. 8, No. 7, 2008. BibTeX
@article{zhang2008sun,
  author = {Zhang, Lingyun and Tong, Matthew H and Marks, Tim K and Shan, Honghao and Cottrell, Garrison W},
  title = {SUN: A Bayesian framework for saliency using natural statistics},
  journal = {Journal of Vision},
  year = 2008,
  volume = 8,
  number = 7,
  publisher = {Association for Research in Vision and Ophthalmology}
}
- "Gamma-SLAM: Stereo visual SLAM in unstructured environments using variance grid maps", IROS Visual SLAM Workshop, 2007. BibTeX
@article{marks2007gamma,
  author = {Marks, Tim K and Howard, Andrew and Bajracharya, Max and Cottrell, Garrison W and Matthies, Larry},
  title = {Gamma-SLAM: Stereo visual SLAM in unstructured environments using variance grid maps},
  journal = {IROS visual SLAM workshop},
  year = 2007,
  publisher = {Citeseer}
}
- "Joint tracking of pose, expression, and texture using conditionally Gaussian filters", Advances in Neural Information Processing Systems, Vol. 17, pp. 889-896, 2005. BibTeX
@article{marks2005joint,
  author = {Marks, Tim K and Hershey, John and Roddey, J Cooper and Movellan, Javier R},
  title = {Joint tracking of pose, expression, and texture using conditionally Gaussian filters},
  journal = {Advances in neural information processing systems},
  year = 2005,
  volume = 17,
  pages = {889--896}
}
- "3D tracking of morphable objects using conditionally Gaussian nonlinear filters", IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW'04), 2004, pp. 190-190. BibTeX
@inproceedings{marks20043d,
  author = {Marks, Tim K and Hershey, John and Roddey, J Cooper and Movellan, Javier R},
  title = {3d tracking of morphable objects using conditionally gaussian nonlinear filters},
  booktitle = {Computer Vision and Pattern Recognition Workshop, 2004. CVPRW'04. Conference on},
  year = 2004,
  pages = {190--190},
  organization = {IEEE}
}
- "Diffusion networks, products of experts, and factor analysis", Proc. Int. Conf. on Independent Component Analysis, pp. 481-485, 2001. BibTeX
@article{marks2001diffusion,
  author = {Marks, Tim K and Movellan, Javier R},
  title = {Diffusion networks, products of experts, and factor analysis},
  journal = {Proc. Int. Conf. on Independent Component Analysis},
  year = 2001,
  pages = {481--485},
  publisher = {Citeseer}
}
-
MERL Issued Patents
-
Title: "Scene-Aware Video Encoder System and Method"
Inventors: Cherian, Anoop; Hori, Chiori; Le Roux, Jonathan; Marks, Tim; Sullivan, Alan
Patent No.: 11,582,485
Issue Date: Feb 14, 2023 -
Title: "Low-latency Captioning System"
Inventors: Hori, Chiori; Hori, Takaaki; Cherian, Anoop; Marks, Tim; Le Roux, Jonathan
Patent No.: 11,445,267
Issue Date: Sep 13, 2022 -
Title: "System and Method for a Dialogue Response Generation System"
Inventors: Hori, Chiori; Cherian, Anoop; Marks, Tim; Hori, Takaaki
Patent No.: 11,264,009
Issue Date: Mar 1, 2022 -
Title: "System and Method for Remote Measurements of Vital Signs"
Inventors: Marks, Tim; Mansour, Hassan; Nowara, Ewa; Nakamura, Yudai; Veeraraghavan, Ashok N.
Patent No.: 11,259,710
Issue Date: Mar 1, 2022 -
Title: "Image Processing System and Method for Landmark Location Estimation with Uncertainty"
Inventors: Marks, Tim; Kumar, Abhinav; Mou, Wenxuan; Feng, Chen; Liu, Xiaoming
Patent No.: 11,127,164
Issue Date: Sep 21, 2021 -
Title: "Method and System for Determining 3D Object Poses and Landmark Points using Surface Patches"
Inventors: Jones, Michael J.; Marks, Tim; Papazov, Chavdar
Patent No.: 10,515,259
Issue Date: Dec 24, 2019 -
Title: "Method and System for Multi-Modal Fusion Model"
Inventors: Hori, Chiori; Hori, Takaaki; Hershey, John R.; Marks, Tim
Patent No.: 10,417,498
Issue Date: Sep 17, 2019 -
Title: "Method and System for Detecting Actions in Videos"
Inventors: Jones, Michael J.; Tuzel, Oncel; Marks, Tim; Singh, Bharat
Patent No.: 10,242,266
Issue Date: Mar 26, 2019 -
Title: "Method and System for Detecting Actions in Videos using Contour Sequences"
Inventors: Jones, Michael J.; Marks, Tim; Kulkarni, Kuldeep
Patent No.: 10,210,391
Issue Date: Feb 19, 2019 -
Title: "Method for Estimating Locations of Facial Landmarks in an Image of a Face using Globally Aligned Regression"
Inventors: Tuzel, Oncel; Marks, Tim; Tambe, Salil
Patent No.: 9,633,250
Issue Date: Apr 25, 2017 -
Title: "Method for Generating Representations of Polylines Using Piecewise Fitted Geometric Primitives"
Inventors: Brand, Matthew E.; Marks, Tim; MV, Rohith
Patent No.: 9,613,443
Issue Date: Apr 4, 2017 -
Title: "Method for Determining Similarity of Objects Represented in Images"
Inventors: Jones, Michael J.; Marks, Tim; Ahmed, Ejaz
Patent No.: 9,436,895
Issue Date: Sep 6, 2016 -
Title: "Method for Detecting 3D Geometric Boundaries in Images of Scenes Subject to Varying Lighting"
Inventors: Marks, Tim; Tuzel, Oncel; Porikli, Fatih M.; Thornton, Jay E.; Ni, Jie
Patent No.: 9,418,434
Issue Date: Aug 16, 2016 -
Title: "Method for Factorizing Images of a Scene into Basis Images"
Inventors: Tuzel, Oncel; Marks, Tim; Porikli, Fatih M.; Ni, Jie
Patent No.: 9,384,553
Issue Date: Jul 5, 2016 -
Title: "Method and System for Tracking People in Indoor Environments using a Visible Light Camera and a Low-Frame-Rate Infrared Sensor"
Inventors: Marks, Tim; Jones, Michael J.; Kumar, Suren
Patent No.: 9,245,196
Issue Date: Jan 26, 2016 -
Title: "Method for Detecting and Tracking Objects in Image Sequences of Scenes Acquired by a Stationary Camera"
Inventors: Marks, Tim; Jones, Michael J.; MV, Rohith
Patent No.: 9,213,896
Issue Date: Dec 15, 2015 -
Title: "Method and System for Segmenting Moving Objects from Images Using Foreground Extraction"
Inventors: Veeraraghavan, Ashok N.; Marks, Tim; Taguchi, Yuichi
Patent No.: 8,941,726
Issue Date: Jan 27, 2015 -
Title: "Camera-Based 3D Climate Control"
Inventors: Marks, Tim; Jones, Michael J.
Patent No.: 8,929,592
Issue Date: Jan 6, 2015 -
Title: "Method and System for Registering an Object with a Probe Using Entropy-Based Motion Selection and Rao-Blackwellized Particle Filtering"
Inventors: Taguchi, Yuichi; Marks, Tim; Hershey, John R.
Patent No.: 8,510,078
Issue Date: Aug 13, 2013 -
Title: "Localization in Industrial Robotics Using Rao-Blackwellized Particle Filtering"
Inventors: Marks, Tim; Taguchi, Yuichi
Patent No.: 8,219,352
Issue Date: Jul 10, 2012 -
Title: "Method for Synthetically Relighting Images of Objects"
Inventors: Jones, Michael J.; Marks, Tim; Kumar, Ritwik
Patent No.: 8,194,072
Issue Date: Jun 5, 2012