publication . Preprint . Other literature type . Conference object . 2018

Sem-GAN: Semantically-Consistent Image-to-Image Translation

Anoop Cherian; Alan Sullivan
Open Access English
  • Published: 11 Jul 2018
Abstract
Unpaired image-to-image translation is the problem of mapping an image in the source domain to one in the target domain, without requiring corresponding image pairs. To ensure the translated images are realistically plausible, recent works such as Cycle-GAN demand that this mapping be invertible. While this requirement yields promising results when the domains are unimodal, its performance is unpredictable in a multi-modal scenario such as an image segmentation task. This is because invertibility does not necessarily enforce semantic correctness. To this end, we present a semantically-consistent GAN framework, dubbed Sem-GAN, in which the semantics a...
Subjects
ACM Computing Classification System: ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION
free text keywords: Computer Science - Computer Vision and Pattern Recognition, Computer vision, Semantics, Artificial intelligence, business.industry, business, Correctness, Task analysis, Image segmentation, Pattern recognition, Invertible matrix, law.invention, law, Segmentation, Image translation, Computer science