TR2025-138

Multimodal Diffusion Bridge with Attention-Based SAR Fusion for Satellite Image Cloud Removal


    •  Hu, Y., Lohit, S., Kamilov, U., Marks, T.K., "Multimodal Diffusion Bridge with Attention-Based SAR Fusion for Satellite Image Cloud Removal", IEEE Transactions on Geoscience and Remote Sensing, September 2025.
      @article{Hu2025sep2,
        author = {Hu, Yuyang and Lohit, Suhas and Kamilov, Ulugbek and Marks, Tim K.},
        title = {{Multimodal Diffusion Bridge with Attention-Based SAR Fusion for Satellite Image Cloud Removal}},
        journal = {IEEE Transactions on Geoscience and Remote Sensing},
        year = 2025,
        month = sep,
        url = {https://www.merl.com/publications/TR2025-138}
      }
  • Research Areas: Artificial Intelligence, Computer Vision, Machine Learning

Abstract:

Deep learning has achieved some success in removing clouds from optical satellite images by fusing them with synthetic aperture radar (SAR) images. Recently, diffusion models have emerged as powerful tools for cloud removal, delivering higher-quality estimates than earlier methods by sampling from the distribution of cloud-free images. However, diffusion models have limitations that can degrade performance. In particular, they initiate sampling from pure Gaussian noise, which complicates the sampling trajectory. Moreover, current methods often fuse SAR and optical data inadequately; simply concatenating these disparate modalities at the input stage typically yields suboptimal results. To address these limitations, we propose Diffusion Bridges for Cloud Removal (DB-CR), which directly bridges between the cloudy and cloud-free image distributions.
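To illustrate what bridging between the cloudy and cloud-free distributions can mean, the following is a minimal sketch of sampling an intermediate state from a Brownian bridge pinned at a cloud-free image (t = 0) and its cloudy counterpart (t = 1). The abstract does not give the bridge formulation used in DB-CR, so the linear mean, the `sigma` scale, the function name `bridge_sample`, and the random patches are all illustrative assumptions.

```python
import numpy as np

def bridge_sample(x_clean, x_cloudy, t, sigma=1.0, rng=None):
    """Draw x_t from a Brownian bridge pinned at x_clean (t=0) and x_cloudy (t=1).

    Assumed form (not taken from the paper):
        x_t = (1 - t) * x_clean + t * x_cloudy + sigma * sqrt(t * (1 - t)) * eps
    """
    rng = np.random.default_rng() if rng is None else rng
    mean = (1.0 - t) * x_clean + t * x_cloudy
    std = sigma * np.sqrt(t * (1.0 - t))  # zero at both endpoints
    return mean + std * rng.standard_normal(x_clean.shape)

# Hypothetical 4-band 8x8 patches standing in for real satellite data.
rng = np.random.default_rng(0)
x_clean = rng.random((4, 8, 8))
x_cloudy = rng.random((4, 8, 8))

x_mid = bridge_sample(x_clean, x_cloudy, t=0.5, rng=rng)
x_start = bridge_sample(x_clean, x_cloudy, t=0.0, rng=rng)
x_end = bridge_sample(x_clean, x_cloudy, t=1.0, rng=rng)
```

Because the noise scale vanishes at both endpoints, sampling in such a scheme would start from the cloudy observation itself rather than from pure Gaussian noise, which is the contrast with standard diffusion models drawn in the abstract.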
In addition, we propose a novel multimodal diffusion bridge architecture with a two-branch backbone for multimodal image restoration, incorporating an efficient backbone and dedicated cross-modality fusion blocks to effectively extract and fuse features from SAR and optical images. By formulating cloud removal as a diffusion-bridge problem and leveraging this tailored architecture, DB-CR achieves high-fidelity results while being computationally efficient. We evaluated DB-CR on the SEN12MS-CR cloud-removal dataset, demonstrating that it achieves state-of-the-art results.
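As a rough illustration of attention-based fusion across modalities, the sketch below lets optical feature tokens attend to SAR feature tokens with single-head cross-attention and adds the result back as a residual. The abstract does not describe the layout of the fusion blocks in DB-CR, so the choice of optical features as queries, the single head, the residual connection, and all names and shapes here are assumptions for illustration only.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(opt_feat, sar_feat, w_q, w_k, w_v):
    """Fuse SAR tokens into optical tokens with single-head cross-attention.

    opt_feat: (N, c_opt) optical tokens, used as queries (assumed design)
    sar_feat: (M, c_sar) SAR tokens, used as keys and values
    w_q: (c_opt, d), w_k: (c_sar, d), w_v: (c_sar, c_opt)
    """
    q = opt_feat @ w_q
    k = sar_feat @ w_k
    v = sar_feat @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (N, M), rows sum to 1
    return opt_feat + attn @ v, attn                # residual fusion

# Hypothetical sizes: 16 optical tokens, 16 SAR tokens.
rng = np.random.default_rng(0)
c_opt, c_sar, d = 32, 8, 16
opt_feat = rng.standard_normal((16, c_opt))
sar_feat = rng.standard_normal((16, c_sar))
w_q = rng.standard_normal((c_opt, d)) * 0.1
w_k = rng.standard_normal((c_sar, d)) * 0.1
w_v = rng.standard_normal((c_sar, c_opt)) * 0.1

fused, attn = cross_attention_fuse(opt_feat, sar_feat, w_q, w_k, w_v)
```

Unlike concatenating the two modalities at the input, this kind of block lets each optical location weight the SAR features it draws on, which is one plausible reading of the dedicated fusion blocks mentioned above.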


  • Related Publication

  •  Hu, Y., Lohit, S., Kamilov, U., Marks, T.K., "Multimodal Diffusion Bridge with Attention-Based SAR Fusion for Satellite Image Cloud Removal", arXiv, April 2025.
    @article{Hu2025apr,
      author = {Hu, Yuyang and Lohit, Suhas and Kamilov, Ulugbek and Marks, Tim K.},
      title = {{Multimodal Diffusion Bridge with Attention-Based SAR Fusion for Satellite Image Cloud Removal}},
      journal = {arXiv},
      year = 2025,
      month = apr,
      url = {https://arxiv.org/abs/2504.03607}
    }