GAN

Training tricks:

17 tricks for training GANs: https://github.com/soumith/ganhacks

  • soft labels (label smoothing): replace the hard target 1 with ~0.9 and 0 with ~0.3

  • train the discriminator more often (e.g., 2×) than the generator

  • use labels if available: add an auxiliary classification task

  • normalize inputs to [-1, 1]

  • use tanh as the generator's output activation

  • use batchnorm (but not in the generator's output layer or the discriminator's input layer)

  • sample z from a spherical (Gaussian) distribution instead of a uniform one

  • leaky ReLU

  • stability tricks from RL (e.g., experience replay)
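Two of these tricks fit in a few lines. A minimal numpy sketch of label smoothing and input normalization (function names and the exact smoothing ranges are illustrative; ganhacks suggests random targets in roughly [0.7, 1.2] for real and [0.0, 0.3] for fake):

```python
import numpy as np

def soft_labels(n, real=True, rng=None):
    """Label smoothing: draw soft targets instead of hard 1/0 labels."""
    rng = np.random.default_rng(0) if rng is None else rng
    if real:
        return rng.uniform(0.7, 1.0, size=n)   # soft "real" targets near 0.9
    return rng.uniform(0.0, 0.3, size=n)       # soft "fake" targets near 0

def normalize_images(x):
    """Map uint8 pixels [0, 255] to [-1, 1], matching a tanh generator output."""
    return x.astype(np.float32) / 127.5 - 1.0
```

The [-1, 1] range matters because the discriminator then sees real images and tanh-generated fakes on the same scale.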

Tricks from BigGAN [1]:

  • class-conditional BatchNorm

  • spectral normalization

  • orthogonal initialization

  • truncated prior (truncation trick to trade off fidelity against variety)

  • orthogonal regularization on the weights to improve model smoothness
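The truncation trick can be sketched by resampling latent components that fall outside a threshold; a numpy sketch (the function name is illustrative, and BigGAN applies this to a trained generator's prior):

```python
import numpy as np

def truncated_z(n, dim, threshold=1.0, rng=None):
    """Truncation trick sketch: resample any z component whose magnitude
    exceeds `threshold`. Smaller thresholds favor fidelity over variety."""
    rng = np.random.default_rng(0) if rng is None else rng
    z = rng.standard_normal((n, dim))
    mask = np.abs(z) > threshold
    while mask.any():                          # redraw out-of-range entries
        z[mask] = rng.standard_normal(mask.sum())
        mask = np.abs(z) > threshold
    return z
```

Sweeping `threshold` at sampling time traces out the fidelity/variety trade-off without retraining.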

More tricks:

  • gradient penalty [8]

  • unrolling [9] and packing [10]
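The gradient penalty [8] penalizes the critic's input-gradient norm at random interpolates of real and fake samples. A deep critic needs autodiff for this; as a self-contained sketch, the toy below uses a linear critic D(x) = x·w, whose input gradient is just w (names are illustrative):

```python
import numpy as np

def gradient_penalty_linear(w, x_real, x_fake, rng=None):
    """WGAN-GP-style penalty E[(||grad_x D(x_hat)|| - 1)^2] for a linear
    critic D(x) = x @ w; real implementations use autodiff instead."""
    rng = np.random.default_rng(0) if rng is None else rng
    eps = rng.uniform(size=(x_real.shape[0], 1))
    x_hat = eps * x_real + (1 - eps) * x_fake   # interpolate real/fake pairs
    grad = np.broadcast_to(w, x_hat.shape)      # dD/dx is w everywhere here
    grad_norm = np.linalg.norm(grad, axis=1)
    return float(np.mean((grad_norm - 1.0) ** 2))
```

A unit-norm w gives zero penalty, which is exactly the 1-Lipschitz target the penalty pushes the critic toward.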

Famous GANs:

  • LSGAN: replace the cross-entropy loss with a least-squares loss

  • Wasserstein GAN: replace the discriminator with a critic function

  • LAPGAN: coarse-to-fine generation using a Laplacian pyramid

  • SeqGAN: generate discrete sequences

  • E-GAN [2]: place GAN training under the framework of genetic evolution

  • GAN Dissection [3]: use interventions to probe causality

  • CoGAN [4]: two generators and discriminators softly share parameters

  • DCGAN [5]

  • Progressive GAN [6]

  • Style-based GAN [7]

  • StackGAN [17]

  • Self-Attention GAN [18]

  • BigGAN [20]

  • LOGAN [19]

  • Conditioned on label vector: conditional GAN [14], CVAE-GAN [16]

  • Conditioned on a single image: pix2pix [11]; high-resolution pix2pix [12] (add coarse-to-fine strategy); BicycleGAN [13] (combination of cVAE-GAN and cLR-GAN); DAGAN [15]

  • StyleGAN-XL [23]

  • StyleGAN-T [22]

  • GigaGAN [21]
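The LSGAN objective at the top of this list is simple enough to write out; a numpy sketch with illustrative function names (d_real/d_fake are the discriminator's raw scores on real and generated batches):

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    """LSGAN discriminator loss: least squares toward targets 1 (real)
    and 0 (fake) instead of cross-entropy on sigmoid outputs."""
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

def lsgan_g_loss(d_fake):
    """Generator pushes fake scores toward the 'real' target 1."""
    return 0.5 * np.mean((d_fake - 1.0) ** 2)
```

Because the quadratic loss keeps penalizing confidently classified samples by their distance to the target, it yields stronger gradients than saturated cross-entropy.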

Measurement:

Results: besides qualitative inspection, there are quantitative metrics such as the Inception Score (IS) and the Fréchet Inception Distance (FID).
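FID models real and generated Inception activations as two Gaussians and computes the Fréchet distance ||μ1−μ2||² + Tr(Σ1 + Σ2 − 2(Σ1Σ2)^½). A numpy sketch given precomputed statistics (the trace of the matrix square root is taken from the eigenvalues of Σ1Σ2; production code typically uses `scipy.linalg.sqrtm` instead):

```python
import numpy as np

def fid(mu1, sigma1, mu2, sigma2):
    """Frechet Inception Distance between two Gaussians fitted to
    Inception activations, from their means and covariances."""
    diff = mu1 - mu2
    # product of two PSD matrices has nonnegative real eigenvalues
    # (up to numerical noise), so Tr((S1 S2)^0.5) = sum of their sqrts
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2 * tr_sqrt)
```

Identical statistics give FID 0; lower is better.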

Stability: for the training stability of the generator and discriminator, refer to [1].

Tutorial and Survey:

References

[1] Brock, Andrew, Jeff Donahue, and Karen Simonyan. “Large scale GAN training for high fidelity natural image synthesis.” arXiv preprint arXiv:1809.11096 (2018).

[2] Wang, Chaoyue, et al. “Evolutionary generative adversarial networks.” arXiv preprint arXiv:1803.00657 (2018).

[3] Bau, David, et al. “GAN Dissection: Visualizing and understanding generative adversarial networks.” arXiv preprint arXiv:1811.10597 (2018).

[4] Liu, Ming-Yu, and Oncel Tuzel. “Coupled generative adversarial networks.” NIPS, 2016.

[5] Radford, Alec, Luke Metz, and Soumith Chintala. “Unsupervised representation learning with deep convolutional generative adversarial networks.” arXiv preprint arXiv:1511.06434 (2015).

[6] Karras, Tero, et al. “Progressive growing of GANs for improved quality, stability, and variation.” arXiv preprint arXiv:1710.10196 (2017).

[7] Karras, Tero, Samuli Laine, and Timo Aila. “A style-based generator architecture for generative adversarial networks.” arXiv preprint arXiv:1812.04948 (2018).

[8] Gulrajani, Ishaan, et al. “Improved training of Wasserstein GANs.” NIPS, 2017.

[9] Metz, Luke, et al. “Unrolled generative adversarial networks.” arXiv preprint arXiv:1611.02163 (2016).

[10] Lin, Zinan, et al. “PacGAN: The power of two samples in generative adversarial networks.” NeurIPS, 2018.

[11] Isola, Phillip, et al. “Image-to-image translation with conditional adversarial networks.” CVPR, 2017.

[12] Wang, Ting-Chun, et al. “High-resolution image synthesis and semantic manipulation with conditional GANs.” CVPR, 2018.

[13] Zhu, Jun-Yan, et al. “Toward multimodal image-to-image translation.” NIPS, 2017.

[14] Mirza, Mehdi, and Simon Osindero. “Conditional generative adversarial nets.” arXiv preprint arXiv:1411.1784 (2014).

[15] Antoniou, Antreas, Amos Storkey, and Harrison Edwards. “Data augmentation generative adversarial networks.” arXiv preprint arXiv:1711.04340 (2017).

[16] Bao, Jianmin, et al. “CVAE-GAN: Fine-grained image generation through asymmetric training.” ICCV, 2017.

[17] Zhang, Han, et al. “StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks.” ICCV, 2017.

[18] Zhang, Han, et al. “Self-attention generative adversarial networks.” arXiv preprint arXiv:1805.08318 (2018).

[19] Wu, Yan, et al. “LOGAN: Latent optimisation for generative adversarial networks.” arXiv preprint arXiv:1912.00953 (2019).

[20] Brock, Andrew, Jeff Donahue, and Karen Simonyan. “Large scale GAN training for high fidelity natural image synthesis.” arXiv preprint arXiv:1809.11096 (2018).

[21] Kang, Minguk, et al. “Scaling up GANs for text-to-image synthesis.” arXiv preprint arXiv:2303.05511 (2023).

[22] Sauer, Axel, et al. “StyleGAN-T: Unlocking the power of GANs for fast large-scale text-to-image synthesis.” arXiv preprint arXiv:2301.09515 (2023).

[23] Sauer, Axel, Katja Schwarz, and Andreas Geiger. “StyleGAN-XL: Scaling StyleGAN to large diverse datasets.” ACM SIGGRAPH 2022 Conference Proceedings, 2022.