TR2018-178

Sem-GAN: Semantically-Consistent Image-to-Image Translation

- Cherian, A., Sullivan, A., "Sem-GAN: Semantically-Consistent Image-to-Image Translation", IEEE Winter Conference on Applications of Computer Vision (WACV), DOI: 10.1109/WACV.2019.00196, January 2019.
  BibTeX:
    @inproceedings{Cherian2019jan,
      author = {Cherian, Anoop and Sullivan, Alan},
      title = {Sem-GAN: Semantically-Consistent Image-to-Image Translation},
      booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)},
      year = 2019,
      month = jan,
      doi = {10.1109/WACV.2019.00196},
      url = {https://www.merl.com/publications/TR2018-178}
    }
MERL Contact:
- Anoop Cherian
Research Areas:

Artificial Intelligence, Computer Vision, Machine Learning

Abstract:

Unpaired image-to-image translation is the problem of mapping an image in the source domain to one in the target domain, without requiring corresponding image pairs. To ensure the translated images are realistically plausible, recent works, such as Cycle-GAN, demand this mapping to be invertible. While this requirement demonstrates promising results when the domains are unimodal, its performance is unpredictable in a multi-modal scenario such as in an image segmentation task. This is because invertibility does not necessarily enforce semantic correctness. To this end, we present a semantically-consistent GAN framework, dubbed Sem-GAN, in which the semantics are defined by the class identities of image segments in the source domain as produced by a semantic segmentation algorithm. Our proposed framework includes consistency constraints on the translation task that, together with the GAN loss and the cycle-constraints, enforce that the images when translated will inherit the appearances of the target domain, while (approximately) maintaining their identities from the source domain. We present experiments on several image-to-image translation tasks and demonstrate that Sem-GAN improves the quality of the translated images significantly, sometimes by more than 20% on the FCN score. Further, we show that semantic segmentation models, trained with synthetic images translated via Sem-GAN, lead to significantly better segmentation results than other variants.
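
The abstract names three ingredients (a GAN loss, cycle constraints, and semantic consistency constraints) but does not give their form. The following is only a minimal sketch of how such terms might be combined, assuming the semantic term is a per-pixel cross-entropy between the classes predicted on the translated image and the class labels of the source image. The function names, the weights `lambda_cyc` and `lambda_sem`, and the toy data are hypothetical and are not taken from the paper.

```python
import math


def semantic_consistency_loss(pred_probs, source_labels, eps=1e-12):
    """Mean per-pixel cross-entropy (an assumed form of the semantic term).

    pred_probs:    per-pixel class probabilities predicted on the translated image
    source_labels: per-pixel class indices from the source image
    """
    total = 0.0
    for probs, label in zip(pred_probs, source_labels):
        # Penalize low probability on the class the pixel had in the source.
        total += -math.log(max(probs[label], eps))
    return total / len(source_labels)


def total_loss(gan_loss, cycle_loss, sem_loss, lambda_cyc=10.0, lambda_sem=1.0):
    """Weighted sum of the three terms; the weights are illustrative assumptions."""
    return gan_loss + lambda_cyc * cycle_loss + lambda_sem * sem_loss


# Toy example: 3 pixels, 2 classes (0 = road, 1 = car).
source_labels = [0, 1, 1]
consistent = [[0.9, 0.1], [0.2, 0.8], [0.1, 0.9]]  # translation kept the classes
flipped = [[0.1, 0.9], [0.8, 0.2], [0.9, 0.1]]     # translation swapped the classes

loss_ok = semantic_consistency_loss(consistent, source_labels)
loss_bad = semantic_consistency_loss(flipped, source_labels)
```

In this toy setting, a translation that swaps class identities receives a larger semantic penalty than one that preserves them, which is the kind of failure that invertibility alone would not rule out.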
