DeepSIM: Image Shape Manipulation from a Single Augmented Training Sample (ICCV 2021 Oral) Review

Paper title: DeepSIM: Image Shape Manipulation from a Single Augmented Training Sample (ICCV 2021 Oral)

Abstract

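DeepSIM is a generative model for conditional image manipulation based on a single image. The paper reports that augmentation with thin-plate splines (TPS) is effective for single-image training. The proposed network learns a mapping between a primitive representation of the image and the realistic image. The authors report that it outperforms the previous state of the art.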

Introduction

Image manipulation means transforming a given image into a desired form; Photoshop-style editing is the familiar example. Advances in deep learning have had a large impact on image manipulation tasks, but these methods still required a large number of input-output samples. The images people want to manipulate often feature designs for which data is hard to collect, and a model trained on a large dataset often fails to preserve the characteristics of the original image.

This motivated research on training generative models from a single image. This paper introduces DeepSIM, a network trained on a single image pair. The method applies to many image manipulation tasks, including shape warping, object rearrangement, object removal, object addition, and creation of painted and photorealistic animated clips.

Briefly, a primitive representation is first created for the single target image. Once the network has been trained on this one image pair, the user edits the primitive representation and obtains an image in the real image domain that reflects the manipulation.

The contributions of the paper are as follows.

- A conditional generator trained on a single image pair can perform a range of general manipulation tasks.

- The paper shows that TPS augmentation is effective for single-image manipulation.

- The method achieves outstanding visual performance.

Related Works

I'll briefly cover about five pieces of related work before moving on.

Classical image manipulation method

A few notable image manipulation techniques include Poisson Image Editing, Seam Carving, PatchMatch, ShiftMap, and Image Analogies. Without deep learning, learning the relationship between a primitive image and a realistic image was very difficult.

Deep conditional generative models

Image-to-image translation maps an image from a source domain to a target domain. Most image-to-image translation methods use GANs, but they need large datasets. Conditional generation models usually rely on crop and flip augmentations. This paper, in contrast, is described as the first to bring TPS into image manipulation.
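To make the crop and flip augmentation concrete, here is a minimal NumPy sketch. It is not taken from the paper or from any particular codebase; the function name `paired_crop_flip` and its parameters are hypothetical. The one point it illustrates is that, for a conditional model, the same random crop and flip must be applied to both the input (primitive) and the target image so the pair stays aligned.

```python
import numpy as np

def paired_crop_flip(image, primitive, crop_hw, rng=None):
    """Apply one random crop and one random horizontal flip to both arrays.

    Hypothetical helper for illustration only. `image` and `primitive`
    must share the same height and width.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    ch, cw = crop_hw
    # Draw the crop window once and reuse it for both arrays.
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    img = image[top:top + ch, left:left + cw]
    prim = primitive[top:top + ch, left:left + cw]
    # Flip both or neither, so the pair stays pixel-aligned.
    if rng.random() < 0.5:
        img, prim = img[:, ::-1], prim[:, ::-1]
    return img, prim
```

These transforms only shift or mirror the content; they never change the shapes in the image, which is the gap TPS augmentation is meant to fill in this paper.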

Single Image generation

Most deep learning approaches need large datasets, but methods that train on a single image are also emerging. They are used in areas such as Deep Image Prior, retargeting, and super-resolution. The two I want to highlight are SinGAN and TuiGAN.

SinGAN: performance on conditional manipulation is weak.

TuiGAN: a conditional unsupervised image-to-image method based on a single image pair. It has to be retrained for every new pair.

In contrast, DeepSIM uses a single aligned image pair to train a single generator that can be used for multiple manipulations without retraining, and it can make significantly more elaborate changes to images, including to large objects in the scene.

Method

A conditional generative adversarial network (cGAN) is trained on a single image pair, with the single image augmented by TPS. The method has three goals:

1) Single image training

2) Fidelity: the output should reflect the primitive representation

3) Appearance: the output image should appear to come from the same distribution as the training image

The model is built on Pix2pixHD, a cGAN model.
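The TPS augmentation at the heart of this method can be sketched in code. The following is a minimal NumPy sketch, not the authors' implementation: the function names, the 3x3 control grid, the `max_shift` range, and nearest-neighbor sampling are all assumptions made for illustration. It fits a thin-plate spline that moves a grid of control points by small random offsets, then warps the image and its primitive representation with the same mapping so the training pair stays aligned.

```python
import numpy as np

def tps_kernel(r2):
    # Radial basis U = r^2 * log(r^2), with U(0) = 0. This equals the usual
    # r^2 * log(r) up to a constant factor absorbed by the fitted weights.
    safe = np.where(r2 == 0, 1.0, r2)
    return np.where(r2 == 0, 0.0, r2 * np.log(safe))

def fit_tps(src, dst):
    """Solve for TPS parameters mapping control points src (N,2) to dst (N,2)."""
    n = src.shape[0]
    d2 = ((src[:, None, :] - src[None, :, :]) ** 2).sum(-1)
    P = np.hstack([np.ones((n, 1)), src])
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = tps_kernel(d2)
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst
    return np.linalg.solve(A, b)  # rows: N kernel weights, then 3 affine terms

def apply_tps(params, src, pts):
    """Evaluate the fitted spline at pts (M,2)."""
    d2 = ((pts[:, None, :] - src[None, :, :]) ** 2).sum(-1)
    P = np.hstack([np.ones((pts.shape[0], 1)), pts])
    return tps_kernel(d2) @ params[:-3] + P @ params[-3:]

def random_tps_pair(image, primitive, grid=3, max_shift=0.1, rng=None):
    """Warp image and primitive with one shared random TPS (assumed settings)."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    g = np.linspace(0.0, 1.0, grid)
    ctrl = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)  # (x, y) in [0, 1]
    shifted = ctrl + rng.uniform(-max_shift, max_shift, ctrl.shape)
    # Backward mapping: for each output pixel, find where to sample the input.
    params = fit_tps(shifted, ctrl)
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel() / (w - 1), ys.ravel() / (h - 1)], axis=-1)
    src = apply_tps(params, shifted, pts)
    sx = np.clip(np.rint(src[:, 0] * (w - 1)), 0, w - 1).astype(int)
    sy = np.clip(np.rint(src[:, 1] * (h - 1)), 0, h - 1).astype(int)

    def warp(a):
        # Nearest-neighbor lookup; works for (H, W) and (H, W, C) arrays.
        return a[sy, sx].reshape(a.shape)

    return warp(image), warp(primitive)
```

Calling `random_tps_pair(image, primitive)` at every training step would yield a fresh, smoothly deformed pair from the single original pair. Nearest-neighbor sampling is used here because it keeps label-like primitives (such as segmentation maps) valid; a real pipeline would likely use bilinear interpolation for the photo.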
