
Learning to Generate Images

Jun-Yan Zhu
Ph.D. at UC Berkeley, Postdoc at MIT CSAIL

Computer Vision before 2012

Features → Clustering → Pooling → Classification → "Cat"

[LeCun et al, 1998], [Krizhevsky et al, 2012]

Computer Vision Now

Deep Net → "Cat"
(replacing the hand-crafted pipeline: Features → Clustering → Pooling → Classification)

Deep Learning for Computer Vision

[Zhao et al., 2017][Güler et al., 2018][Redmon et al., 2018]

Object detection Human understanding Autonomous driving

[Chart: Top-5 accuracy on the ImageNet benchmark [Deng et al. 2009], 2010–2017; accuracy climbs from roughly 70% toward the high 90s, with deep nets driving the gains from 2012 onward.]

Can Deep Learning Help Graphics?

Modeling → Texturing → Lighting → Rendering → "Cat"

Deep Net as a critic: Good/Bad

Can Deep Learning Help Graphics?

Selecting the most attractive expressions

Photos

101 Photos [Zhu et al., SIGGRAPH Asia 2014]

Selecting the most realistic composites

Most realistic composites

Least realistic composites [Zhu et al. ICCV 2015]

Deep Net → "Cat"
(replacing Modeling → Texturing → Lighting → Rendering)

Can Deep Learning Help Graphics?

Generating images is hard!

A Deep Net replacing the Modeling → Texturing → Lighting → Rendering pipeline could at first generate images of only 28x28 pixels.

Generative Adversarial Networks (GANs)

[Goodfellow et al. 2014]

© aleju/cat-generator

z (random code) → Generator G → G(z) (fake image)

[Goodfellow et al. 2014]

A two-player game:
• G tries to generate fake images that can fool D.
• D tries to detect fake images.

[Goodfellow et al. 2014]

z (random code) → Generator G → G(z) (fake image) → Discriminator D → real (1) or fake (0)?
Here D outputs fake (0.1).

Learning objective (GANs)

[Goodfellow et al. 2014]

z (random code) → Generator G → fake image G(z) → Discriminator D → fake (0.1)
real image x → Discriminator D → real (0.9)

As G trains against D, the fake images become harder to reject: D's score on G(z) rises, e.g. from fake (0.1) to fake (0.3).
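Written out in the standard form from [Goodfellow et al. 2014], the two-player game on the slide is the minimax objective:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
+ \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

D maximizes the value (detect fakes); G minimizes it (fool D).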

Limitations of GANs

• No user control.

• Low resolution and quality.

Random code → Output  vs.  User input → Output

Contributions

Co-authors: Phillip Isola, Taesung Park, Ting-Chun Wang, Richard Zhang, Tinghui Zhou, Ming-Yu Liu, Andrew Tao, Jan Kautz, Bryan Catanzaro, Alexei A. Efros

Goals: Improve Control, Quality, and Resolution

Timeline: GANs (2014) → pix2pix (2016) → CycleGAN (2017) → pix2pixHD (2018)

• Conditional on user inputs.
• Learning without pairs.
• High quality and resolution.


Learning objective (GANs)

[Goodfellow et al. 2014]

z (random code) → Generator G → output image G(z) → Discriminator D → real or fake?

Learning objective (pix2pix)

[Isola, Zhu, Zhou, Efros, 2016]

Input image x → Generator G → output image G(x) → Discriminator D

A discriminator that looks at the output alone is easy to fool: a realistic G(x) is marked "Real ✓", and so is a realistic output that ignores the input ("Real too ✓"). The pix2pix discriminator therefore judges the input-output pair: is (x, G(x)) a real or fake pair?
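In the paper, this conditional adversarial loss is combined with an L1 reconstruction term against the ground-truth output y (written here from [Isola et al. 2016], with the noise input z omitted for brevity):

```latex
\mathcal{L}_{cGAN}(G, D) =
  \mathbb{E}_{x,y}\bigl[\log D(x, y)\bigr]
+ \mathbb{E}_{x}\bigl[\log\bigl(1 - D(x, G(x))\bigr)\bigr]
\qquad
\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y}\bigl[\lVert y - G(x) \rVert_1\bigr]
```

```latex
G^* = \arg\min_G \max_D \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```

The L1 term keeps the output close to the ground truth; the adversarial term keeps it sharp and realistic.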

#edges2cats [Christopher Hesse]

Ivy Tasi @ivymyt

@gods_tail

@matthematician

https://affinelayer.com/pixsrv/

Vitaly Vidmirov @vvid

Input: Sketch → Output: Photo | Input: Grayscale → Output: Color

[Isola, Zhu, Zhou, Efros, 2016]

Automatic Colorization with pix2pix

Data from [Russakovsky et al. 2015]

Interactive Colorization

[Zhang*, Zhu*, Isola, Geng, Lin, Yu, Efros, 2017]

Edges → Images


Edges from [Xie & Tu, 2015]

Sketches → Images


Trained on Edges → Images

Data from [Eitz, Hays, Alexa, 2012]

Input Output Groundtruth

Data from [maps.google.com]

Input Output Groundtruth

Data from [maps.google.com]

Paired data vs. unpaired data

Goals: Improve Control, Quality, and Resolution

Timeline: GANs (2014) → pix2pix (2016) → CycleGAN (2017) → pix2pixHD (2018)

• Conditional on user inputs.
• Learning without pairs.
• High quality and resolution.

Cycle-Consistent Adversarial Networks

[Zhu*, Park*, Isola, and Efros, 2017]

[Mark Twain, 1903]


Cycle Consistency Loss

Forward cycle: x → G(x) → F(G(x)); reconstruction error ‖F(G(x)) − x‖₁; adversarial loss on D_Y(G(x)).
Backward cycle: y → F(y) → G(F(y)); reconstruction error ‖G(F(y)) − y‖₁; adversarial loss on D_X(F(y)).

[Zhu*, Park*, Isola, and Efros, 2017]. See also [Yi et al., 2017], [Kim et al., 2017].
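As a toy numerical sketch of the two reconstruction terms (not the authors' code: G and F here are hypothetical invertible linear maps standing in for the two generator networks):

```python
import numpy as np

# Hypothetical stand-ins for the two generators: G maps domain X -> Y,
# F maps Y -> X. Here they are exact inverses, so the cycle loss is zero;
# in CycleGAN both are deep networks and the loss is driven down by training.
def G(x):
    return 2.0 * x + 1.0

def F(y):
    return (y - 1.0) / 2.0

def cycle_consistency_loss(x, y):
    forward = np.mean(np.abs(F(G(x)) - x))   # ||F(G(x)) - x||_1
    backward = np.mean(np.abs(G(F(y)) - y))  # ||G(F(y)) - y||_1
    return forward + backward

x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])
print(cycle_consistency_loss(x, y))  # exact inverses -> 0.0
```

A non-invertible G (one that collapses inputs) cannot make both terms small, which is exactly what the loss penalizes.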

Horse → Zebra

Orange → Apple

Collection Style Transfer

Monet Van Gogh

Cezanne Ukiyo-e

Photograph © Alexei Efros

Monet’s paintings → photographic style

Why CycleGAN works

Style and Content Separation

Separating Style and Content with Bilinear Models

[Tenenbaum and Freeman, 2000]

Content

Style

Adversarial Loss: change the style

Cycle Consistency Loss: preserve the content

Two empirical assumptions:
• Content is easy to keep.
• Style is easy to change.

Paired Separation Unpaired Separation

Neural Style Transfer [Gatys et al. 2015]

Style and content:
• Content: feature difference
• Style: Gram matrix difference
• Both losses are hard-coded.

Input | Style image I | Style image II | CycleGAN (entire collection)
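The hard-coded Gatys-style losses can be sketched in plain numpy (a minimal illustration, not the original implementation; the feature maps here are random stand-ins for CNN activations):

```python
import numpy as np

def gram_matrix(features):
    # features: (channels, height*width) activations from one CNN layer
    c, n = features.shape
    return features @ features.T / n  # channel co-activation statistics

def style_loss(feat_generated, feat_style):
    # Style = difference of Gram matrices (texture statistics, layout-free)
    g_gen, g_sty = gram_matrix(feat_generated), gram_matrix(feat_style)
    return float(np.mean((g_gen - g_sty) ** 2))

def content_loss(feat_generated, feat_content):
    # Content = direct feature difference (layout-preserving)
    return float(np.mean((feat_generated - feat_content) ** 2))

rng = np.random.default_rng(0)
f = rng.standard_normal((64, 32 * 32))  # stand-in for a conv layer's output
print(style_loss(f, f), content_loss(f, f))  # identical features -> 0.0 0.0
```

CycleGAN replaces these fixed losses with a learned adversarial loss over an entire collection of style images.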

Photo → Van Gogh

horse → zebra

Input Style image I CycleGANStyle image II Entire collection

Cycle Loss upper bounds Conditional Entropy

[Figure: mappings with high conditional entropy vs. low conditional entropy]

"ALICE: Towards Understanding Adversarial Learning for Joint Distribution Matching" [Li et al., NIPS 2017]. Also see [Tiao et al., 2018], "CycleGAN as Approximate Bayesian Inference".

Customizing Gaming Experience

Grand Theft Auto V (GTA5) | Street-view images in German cities

Data from [Richter et al., 2016], [Cordts et al, 2016]

Customizing Gaming Experience

Input: GTA5 CG → Output: image with German street-view style

Domain Adaptation with CycleGAN

See Judy Hoffman’s talk at 14:30, “Adversarial Domain Adaptation”

Method                                    meanIoU   Per-pixel accuracy
Oracle (train and test on real)           60.3      93.1
Train on CG, test on real                 17.9      54.0
FCN in the wild [previous SOTA]           27.1      -
Train on CycleGAN data, test on real      34.8      82.8

Failure case

Failure case

Open Source CycleGAN and pix2pix

• Among the most popular GitHub research projects since 2017.

• Among the most cited papers in Graphics/CV/ML since 2017.

CycleGAN results by students

CycleGAN in Classes

© Alena Harley, FastAI

Input photo

© Roger Grosse, UoT

MS emoji Apple emoji MS emoji Stained glass art

Applications and Extensions

• Attribute Editing [Lu et al.], arXiv:1705.09966 (Low-res | Bald | Bangs)
• Object Editing [Liang et al.], arXiv:1708.00315 (Mask | Input | Output)
• Font/Character Transfer [Ignatov et al.], arXiv:1801.08624 (Input | Output)
• Data generation [Wang et al.], arXiv:1707.03124 (samples by CycleWGAN)

Photo Enhancement

WESPE: Weakly Supervised Photo Enhancer for Digital Cameras. arXiv:1709.01118
Andrey Ignatov, Nikolay Kobyshev, Kenneth Vanhoey, Radu Timofte, Luc Van Gool

Image Dehazing

Cycle-Dehaze: Enhanced CycleGAN for Single Image Dehazing. CVPRW 2018
Deniz Engin*, Anıl Genc*, Hazım Kemal Ekenel

Unsupervised Motion Retargeting

Neural Kinematic Networks for Unsupervised Motion Retargetting. CVPR 2018 (oral)
Ruben Villegas, Jimei Yang, Duygu Ceylan, Honglak Lee

Applications Beyond Computer Vision

• Medical Imaging and Biology [Wolterink et al., 2017]

• Voice conversion [Fang et al., 2018, Kaneko et al., 2017]

• Cryptography [CipherGAN: Gomez et al., ICLR 2018]

• Robotics
• NLP: unsupervised machine translation.
• NLP: text style transfer.

Input MR | Generated CT | Ground-truth CT

© itok_msi

Latest from #CycleGAN

Input dog → Output cat | Input cat → Output dog

CycleGAN for Customized Gaming (battle royale games)

Fortnite input → PUBG style → final result (low-res: 256p/512p)

© Chintan Trivedi

Goals: Improve Control, Quality, and Resolution

Timeline: GANs (2014) → pix2pix (2016) → CycleGAN (2017) → pix2pixHD (2018)

• Conditional on user inputs.
• Learning without pairs.
• High quality and resolution.

[Semantic label map: Car, Road, Tree, Sidewalk, Building]

The Curse of Dimensionality

pix2pix output

[Wang, Liu, Zhu, Tao, Kautz, Catanzaro, 2018]

Coarse-to-fine:
• Low-res generator G1 → high-res generator G2
• Low-res discriminator D1 and high-res discriminator D2, each judging real/fake?

Image Pyramid [Burt and Adelson, 1987]. Also see [Zhang et al., 2017], [Karras et al., 2018].

pix2pixHD
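A minimal sketch of the image-pyramid idea behind the multi-scale discriminators (plain numpy average pooling; the actual pix2pixHD model uses learned convolutional networks at each scale):

```python
import numpy as np

def downsample(img):
    # 2x average pooling: one level of a Burt-Adelson-style image pyramid
    h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
    img = img[:h, :w]  # crop to even dimensions
    return (img[0::2, 0::2] + img[1::2, 0::2]
          + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

# Each discriminator (D1 at low res, D2 at high res, ...) sees one level.
img = np.zeros((2048, 1024))
pyramid = [img]
for _ in range(2):
    pyramid.append(downsample(pyramid[-1]))
print([p.shape for p in pyramid])  # [(2048, 1024), (1024, 512), (512, 256)]
```

Judging real/fake at several scales lets the coarse discriminator enforce global structure while the fine one enforces high-resolution detail.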

[Semantic label map: Car, Road, Tree, Sidewalk, Building]

pix2pixHD: 2048×1024

pix2pixHD for sketch → face

Timeline: GANs (2014) → pix2pix (2016) → CycleGAN (2017) → pix2pixHD (2018)

Improve Control, Quality, and Resolution: still a long way to go.

• Learning to generate images from trillions of photos.
• Help more people tell their own visual stories.

Thank You!

© LynnHo
