TR2024-041

SIRA: Scalable Inter-frame Relation and Association for Radar Perception


    •  Yataka, R., Wang, P., Boufounos, P.T., Takahashi, R., "SIRA: Scalable Inter-frame Relation and Association for Radar Perception", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2024.
      @inproceedings{Yataka2024jun,
        author = {Yataka, Ryoma and Wang, Pu and Boufounos, Petros T. and Takahashi, Ryuhei},
        title = {SIRA: Scalable Inter-frame Relation and Association for Radar Perception},
        booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
        year = 2024,
        month = jun,
        url = {https://www.merl.com/publications/TR2024-041}
      }
  • Research Areas: Computational Sensing, Machine Learning, Signal Processing

Abstract:

Conventional radar feature extraction faces limitations due to low spatial resolution, noise, multipath reflection, the presence of ghost targets, and motion blur. Such limitations can be exacerbated by nonlinear object motion, particularly from an ego-centric viewpoint. Addressing these challenges hinges on exploiting temporal feature relations over an extended horizon and enforcing spatial motion consistency for effective association. To this end, this paper proposes SIRA (Scalable Inter-frame Relation and Association) with two designs. First, inspired by Swin Transformer, we introduce extended temporal relation, generalizing the existing temporal relation layer from two consecutive frames to multiple inter-frames with temporally regrouped window attention for scalability. Second, we propose motion consistency track with the concept of a pseudo-tracklet generated from observational data for better trajectory prediction and subsequent object association. Our approach achieves 58.11 mAP@0.5 for oriented object detection and 47.79 MOTA for multiple object tracking on the Radiate dataset, surpassing the previous state of the art by margins of +4.11 mAP@0.5 and +9.94 MOTA, respectively.
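
To make the first design concrete, the following is a minimal, hypothetical sketch of temporally regrouped window attention in PyTorch: features from T frames are split into spatial windows, the windows are regrouped so that each one spans all T frames, and standard multi-head self-attention is applied inside each regrouped window. The module name, tensor layout, and window size are illustrative assumptions, not the authors' implementation.

    # Sketch of temporally regrouped window attention (assumed layout, not the paper's code).
    import torch
    import torch.nn as nn

    class TemporallyRegroupedWindowAttention(nn.Module):
        """Self-attention over windows that span multiple frames (hypothetical)."""

        def __init__(self, dim: int, num_heads: int, window: int):
            super().__init__()
            self.window = window  # tokens per spatial window
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (B, T, N, C) = batch, frames, tokens per frame, channels
            B, T, N, C = x.shape
            assert N % self.window == 0
            # Regroup: split each frame into spatial windows, then fold the
            # temporal axis into the window so attention sees all T frames.
            w = x.view(B, T, N // self.window, self.window, C)
            w = w.permute(0, 2, 1, 3, 4).reshape(B * (N // self.window), T * self.window, C)
            out, _ = self.attn(w, w, w)  # attention within each temporal window
            out = out.view(B, N // self.window, T, self.window, C)
            return out.permute(0, 2, 1, 3, 4).reshape(B, T, N, C)

    # Toy usage: 4 frames of 64 tokens each, windows of 16 tokens.
    layer = TemporallyRegroupedWindowAttention(dim=32, num_heads=4, window=16)
    y = layer(torch.randn(2, 4, 64, 32))
    print(y.shape)  # torch.Size([2, 4, 64, 32])

Stacking such layers with shifted window partitions, as in Swin Transformer, would let information propagate across windows; the sketch keeps a single fixed partition for brevity.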
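
Similarly, a minimal sketch of the pseudo-tracklet idea behind motion consistency track: a short trajectory assembled from past observations is extrapolated with a simple motion model, and new detections are associated to the track whose prediction they match within a gate. The least-squares linear motion model and the greedy gating below are illustrative assumptions, not the paper's exact formulation.

    # Sketch of pseudo-tracklet prediction and gated association (assumed motion model).
    import numpy as np

    def predict_next(pseudo_tracklet: np.ndarray) -> np.ndarray:
        """Extrapolate the next (x, y) by a least-squares linear fit over time."""
        t = np.arange(len(pseudo_tracklet))
        # Fit x(t) and y(t) independently with a degree-1 polynomial.
        coeffs = [np.polyfit(t, pseudo_tracklet[:, d], deg=1) for d in range(2)]
        return np.array([np.polyval(c, len(pseudo_tracklet)) for c in coeffs])

    def associate(tracklets, detections: np.ndarray, gate: float = 2.0):
        """Greedy nearest-neighbor association gated by motion consistency."""
        matches, used = [], set()
        for i, trk in enumerate(tracklets):
            pred = predict_next(trk)
            dists = np.linalg.norm(detections - pred, axis=1)
            j = int(np.argmin(dists))
            if dists[j] < gate and j not in used:
                matches.append((i, j))
                used.add(j)
        return matches

    # Toy usage: one pseudo-tracklet moving +1 m per frame in x.
    trk = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
    dets = np.array([[3.1, 0.1], [10.0, 5.0]])
    print(associate([trk], dets))  # [(0, 0)]: detection 0 fits the predicted motion

In a full tracker the prediction would typically be filtered (e.g., by a Kalman filter) and the assignment solved globally (e.g., Hungarian matching); the greedy version above only illustrates the motion-consistency gate.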