[CycleGAN] Unpaired image-to-image translation using cycle-consistent adversarial networks

Paper Summary

1. Paper Bibliography

Title

- Unpaired image-to-image translation using cycle-consistent adversarial networks

Authors

- Zhu, Jun-Yan, et al.

Publication / Conference

- Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017

Year

- 2017

2. Problems & Motivations

Problems identified in the paper and related work

Image-to-Image Translation

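- Image-to-image translation [29, 20] takes a particular characteristic from one image and makes it appear in another. Examples include converting a grayscale image to color, an image to semantic labels, or an edge map to a photograph. These tasks require learning a mapping between input and output images, and earlier work mostly did this in a supervised way. Supervised training needs image pairs as training data, and such paired data is hard and expensive to collect.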

Neural Style Transfer

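- Neural style transfer [11, 21, 48, 10] is another way to perform image-to-image translation: it synthesizes a new image by combining the content of one image with the style of another. This is done by matching the Gram matrix statistics of pre-trained deep features. (CycleGAN, in contrast, learns a mapping between two domains rather than between two specific images.)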

[10] L. A. Gatys, M. Bethge, A. Hertzmann, and E. Shechtman. Preserving color in neural artistic style transfer. arXiv preprint arXiv:1606.05897, 2016.

[11] L. A. Gatys, A. S. Ecker, and M. Bethge. Image style transfer using convolutional neural networks. In CVPR, 2016.

[20] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. In CVPR, 2017.

[21] J. Johnson, A. Alahi, and L. Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In ECCV, pages 694–711. Springer, 2016.

[29] J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In CVPR, pages 3431–3440, 2015.

[48] D. Ulyanov, V. Lebedev, A. Vedaldi, and V. Lempitsky. Texture networks: Feed-forward synthesis of textures and stylized images. In Int. Conf. on Machine Learning (ICML), 2016.

3. Proposed Solutions

Solutions proposed in the paper

Learn mapping function between two domains

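- The goal of CycleGAN is to learn mapping functions that translate between two domains X and Y, given data from each.

- As shown in Fig. 3, the model has two mapping functions (G, F) and two discriminators (Dx, Dy). Dx distinguishes {x} from {F(y)}, and Dy distinguishes {y} from {G(x)}.

{x}: images from domain X (real)
{y}: images from domain Y (real)
{G(x)}: images x translated to look like domain Y (fake)
{F(y)}: images y translated to look like domain X (fake)

- Two loss functions are needed:

Adversarial loss: matches the distribution of generated images to the data distribution of the target domain
Cycle consistency loss: prevents the learned mappings G and F from contradicting each other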

3.1 Adversarial Loss

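- An adversarial loss is applied to both mapping functions.

- For the mapping function G: X -> Y and its discriminator Dy, G tries to generate images G(x) that look like images from domain Y, while Dy tries to distinguish real samples y from fake samples G(x).

- Similarly, for F: Y -> X, F tries to generate images F(y) that look like images from domain X, while Dx tries to distinguish real samples x from fake samples F(y).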
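Written out in the notation of the original paper (D_X and D_Y correspond to Dx and Dy above, and p_data denotes the data distribution of each domain), the two adversarial losses take the standard GAN form:

```latex
\begin{aligned}
\mathcal{L}_{\mathrm{GAN}}(G, D_Y, X, Y) &= \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\big[\log D_Y(y)\big]
  + \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log\big(1 - D_Y(G(x))\big)\big] \\
\mathcal{L}_{\mathrm{GAN}}(F, D_X, Y, X) &= \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D_X(x)\big]
  + \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\big[\log\big(1 - D_X(F(y))\big)\big]
\end{aligned}
```

Each generator tries to minimize its objective while the matching discriminator tries to maximize it, i.e. min over G and max over D_Y for the first line, and min over F and max over D_X for the second.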

3.2 Cycle Consistency Loss

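- An optimal G maps the distribution of domain X onto that of domain Y. However, infinitely many mappings G can achieve this, and without further constraints G may drift toward producing a single output that easily fools the discriminator instead of diverse results (mode collapse). The translation therefore should not run in one direction only; it should be "cycle consistent". Training G and F simultaneously with an added cycle consistency loss is what makes unpaired image-to-image translation possible.

- Forward cycle consistency (Fig. 3(b)): an image x from domain X should return to the original after going through the translation cycle, i.e. x -> G(x) -> F(G(x)) ≈ x.

- Backward cycle consistency (Fig. 3(c)): an image y from domain Y should likewise return to the original after being translated, i.e. y -> F(y) -> G(F(y)) ≈ y.

- The total cycle consistency loss is the sum of the forward and backward reconstruction errors.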
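Following the paper, which measures both reconstruction errors with an L1 norm, the cycle consistency loss can be written as:

```latex
\mathcal{L}_{\mathrm{cyc}}(G, F) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\lVert F(G(x)) - x \rVert_1\big]
  + \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\big[\lVert G(F(y)) - y \rVert_1\big]
```

The first term is the forward cycle (x back to x) and the second is the backward cycle (y back to y).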

3.3 Full Objective

- The full loss function combines the two adversarial losses with the cycle consistency loss.
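As a rough numerical sketch of how the terms fit together: the full objective adds the adversarial loss for (G, Dy), the adversarial loss for (F, Dx), and the cycle consistency loss weighted by a coefficient λ (the paper reports λ = 10). The generators minimize this quantity while the discriminators maximize it. The toy example below is an illustration only, not the authors' implementation: scalars stand in for images, and every function and variable name (`gan_loss`, `cycle_loss`, `full_objective`, the hand-written `G`, `F`, `D_X`, `D_Y`) is a hypothetical stand-in. It only evaluates the objective and does no training.

```python
import math

# Toy 1-D "domains" (assumption): X samples lie near 0, Y samples lie near 5.
xs = [-0.5, 0.0, 0.5]
ys = [4.5, 5.0, 5.5]

def G(x):
    """Hypothetical generator X -> Y."""
    return x + 5.0

def F(y):
    """Hypothetical generator Y -> X."""
    return y - 5.0

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def D_Y(v):
    """Hypothetical discriminator: probability that v is a real Y sample."""
    return sigmoid(v - 2.5)

def D_X(v):
    """Hypothetical discriminator: probability that v is a real X sample."""
    return sigmoid(2.5 - v)

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def gan_loss(gen, disc, sources, targets):
    """E[log D(target)] + E[log(1 - D(gen(source)))]."""
    real_term = mean(math.log(disc(t)) for t in targets)
    fake_term = mean(math.log(1.0 - disc(gen(s))) for s in sources)
    return real_term + fake_term

def cycle_loss(g, f, xs, ys):
    """Forward (x -> g -> f) plus backward (y -> f -> g) L1 reconstruction error."""
    forward = mean(abs(f(g(x)) - x) for x in xs)
    backward = mean(abs(g(f(y)) - y) for y in ys)
    return forward + backward

def full_objective(g, f, d_x, d_y, xs, ys, lam=10.0):
    """Two adversarial terms plus the lambda-weighted cycle consistency term."""
    return (gan_loss(g, d_y, xs, ys)
            + gan_loss(f, d_x, ys, xs)
            + lam * cycle_loss(g, f, xs, ys))

consistent = full_objective(G, F, D_X, D_Y, xs, ys)
# A "collapsed" generator that ignores its input and always outputs 5.0.
collapsed = full_objective(lambda x: 5.0, F, D_X, D_Y, xs, ys)

print(cycle_loss(G, F, xs, ys))  # 0.0, because F exactly inverts G here
print(collapsed > consistent)    # True: the cycle term penalizes the collapse
```

In this toy setting the collapsed generator looks almost as good as the consistent one to the discriminator, so it is the cycle term that separates the two, which is the role the cycle consistency loss is meant to play.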

4. Results

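Summary of results

Successful cases

- Because CycleGAN learns a translation between domains, it can be applied to a wide range of tasks.

Failure cases

- Tasks that require geometric changes, rather than changes in color or texture, tend to fail.

- It does not work well on content it was not trained on: for horse -> zebra, the model saw horses during training but not people. (This happens because such inputs differ from the distribution of the training dataset.)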

Google Scholar Link

https://scholar.google.co.kr/scholar?hl=ko&as_sdt=0%2C5&q=Unpaired+Image-to-Image+Translation+using+Cycle-Consistent+Adversarial+Networks&btnG=

GitHub

https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix
