  1. re-sampling
  2. synthetic samples: generate more samples for the minority classes
  3. re-weighting
  4. few-shot learning
  5. decoupling representation and classifier learning: use normal sampling in the feature-learning stage and re-sampling in the classifier-learning stage
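
As a concrete illustration of re-weighting (strategy 3), the sketch below computes per-class loss weights with the "effective number of samples" heuristic from the class-balanced loss line of work; the function name and toy label distribution are mine, not taken from any reference listed here.

```python
import numpy as np

def class_weights(labels, beta=0.999):
    """Per-class weights via the 'effective number of samples'
    heuristic: w_c is proportional to (1 - beta) / (1 - beta**n_c),
    where n_c is the sample count of class c."""
    counts = np.bincount(labels)
    eff_num = (1.0 - np.power(beta, counts)) / (1.0 - beta)
    w = 1.0 / eff_num
    return w * len(w) / w.sum()  # normalize so weights average to 1

# imbalanced toy labels: class 0 dominates
labels = np.array([0] * 90 + [1] * 9 + [2] * 1)
w = class_weights(labels)
```

As beta approaches 1 this recovers inverse-frequency weighting, and as beta approaches 0 it becomes uniform, so beta interpolates between the two extremes.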

Big Names: Judea Pearl [Tutorial] [slides] [textbook], James Robins [Textbook] [slides]

Tutorials and Workshops: NeurIPS 2018 Workshop on Causal Learning, KDD 2020 Tutorial on Causal Inference Meets Machine Learning

Material: MILA Course

Causality and disentanglement: [5] [6]

Counterfactual and disentanglement: [7]

Reference

[1] Chalupka K, Perona P, Eberhardt F. Visual causal feature learning. arXiv preprint arXiv:1412.2309, 2014.

[2] Lopez-Paz D, Nishihara R, Chintala S, et al. Discovering causal signals in images. CVPR, 2017.

[3] Bau D, Zhu J Y, Strobelt H, et al. GAN Dissection: Visualizing and Understanding Generative Adversarial Networks. arXiv preprint arXiv:1811.10597, 2018.

[4] Bernhard Schölkopf: CAUSALITY FOR MACHINE LEARNING. arXiv preprint arXiv:1911.10500, 2019.

[5] Kim, Hyemi, et al. “Counterfactual Fairness with Disentangled Causal Effect Variational Autoencoder.” arXiv preprint arXiv:2011.11878 (2020).

[6] Shen, Xinwei, et al. “Disentangled Generative Causal Representation Learning.” arXiv preprint arXiv:2010.02637 (2020).

[7] Yue, Zhongqi, et al. “Counterfactual Zero-Shot and Open-Set Visual Recognition.” arXiv preprint arXiv:2103.00887 (2021).

[8] Schölkopf, Bernhard, et al. “Towards causal representation learning.” arXiv preprint arXiv:2102.11107 (2021).

  • watermark removal: ICA [4], inpainting [5]

  • watermarks consistent across a collection of images: multi-image matting and reconstruction [3]

  • Survey papers on watermarking: [1] [2]

Reference

  1. Podilchuk, Christine I., and Edward J. Delp. “Digital watermarking: algorithms and applications.” IEEE signal processing Magazine 18.4 (2001): 33-46.
  2. Potdar, Vidyasagar M., Song Han, and Elizabeth Chang. “A survey of digital image watermarking techniques.” INDIN '05: 3rd IEEE International Conference on Industrial Informatics. IEEE, 2005.
  3. Dekel, Tali, et al. “On the effectiveness of visible watermarks.” CVPR, 2017.

First deep learning approach for video harmonization [1]

[1] Haozhi Huang, Senzhe Xu, Junxiong Cai, Wei Liu, Shimin Hu. “Temporally Coherent Video Harmonization Using Adversarial Networks.” arXiv, 2018.

Advanced VAE

  1. VQ-VAE [1], VQ-VAE-2 [2]. Accelerating the auto-regressive prior: [4] [5]

  2. NVAE [3]
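
The defining operation of VQ-VAE [1] is quantizing each encoder output to its nearest codebook entry. Below is a minimal NumPy sketch of that lookup; the straight-through gradient estimator and the codebook/commitment losses used in training are omitted, and the tiny codebook is made up for illustration.

```python
import numpy as np

def vector_quantize(z, codebook):
    """Map each encoder output vector to its nearest codebook entry.
    z: (N, D) encoder outputs; codebook: (K, D) learned embeddings.
    Returns discrete indices (N,) and the quantized vectors (N, D)."""
    # squared Euclidean distance between every z and every codebook entry
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)
    return idx, codebook[idx]

codebook = np.array([[0.0, 0.0], [1.0, 1.0]])   # toy 2-entry codebook
z = np.array([[0.1, -0.1], [0.9, 1.2]])          # toy encoder outputs
idx, zq = vector_quantize(z, codebook)
```

The discrete indices are what the auto-regressive prior (or the parallel decoders of [4] [5]) models.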

References

[1] Oord, Aaron van den, Oriol Vinyals, and Koray Kavukcuoglu. “Neural discrete representation learning.” arXiv preprint arXiv:1711.00937 (2017).
[2] Razavi, Ali, Aaron van den Oord, and Oriol Vinyals. “Generating diverse high-fidelity images with vq-vae-2.” Advances in neural information processing systems. 2019.
[3] Vahdat, Arash, and Jan Kautz. “Nvae: A deep hierarchical variational autoencoder.” arXiv preprint arXiv:2007.03898 (2020).
[4] Bond-Taylor, Sam, et al. “Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes.” arXiv preprint arXiv:2111.12701 (2021).
[5] Huiwen Chang, Han Zhang, Lu Jiang, Ce Liu, William T. Freeman, “MaskGIT: Masked Generative Image Transformer”, arXiv preprint arXiv:2202.04200.

  • StyleGAN of All Trades [1]
  • StyleGAN v1 [5]
  • StyleGAN v2 [6]: removes the blob-shaped artifacts that resemble water droplets
  • StyleGAN v3 [2]: solves the aliasing (texture sticking) issue, in which fine details appear glued to image coordinates instead of to the surfaces of the depicted objects
  • StyleGAN-XL [3]: extends StyleGAN to large, diverse datasets
  • 3D StyleGAN [4]

Image editing using StyleGAN

InsetGAN [7]

Reference

[1] Chong, Min Jin, Hsin-Ying Lee, and David Forsyth. “StyleGAN of All Trades: Image Manipulation with Only Pretrained StyleGAN.” arXiv preprint arXiv:2111.01619 (2021).

[2] Karras, Tero, et al. “Alias-free generative adversarial networks.” Thirty-Fifth Conference on Neural Information Processing Systems. 2021.

[3] Sauer, Axel, Katja Schwarz, and Andreas Geiger. “Stylegan-xl: Scaling stylegan to large diverse datasets.” arXiv preprint arXiv:2202.00273 (2022).

[4] Xiaoming Zhao, Fangchang Ma, David Güera, Zhile Ren, Alexander G. Schwing, Alex Colburn. “Generative Multiplane Images: Making a 2D GAN 3D-Aware”.

[5] Karras, Tero, Samuli Laine, and Timo Aila. “A style-based generator architecture for generative adversarial networks.” Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019.

[6] Karras, Tero, et al. “Analyzing and improving the image quality of stylegan.” Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020.

[7] Frühstück, Anna, et al. “Insetgan for full-body image generation.” CVPR, 2022.

  • SAM [1]

  • FastSAM [2]: first generate proposals and then select target proposals

  • High-quality SAM [3]

  • Semantic-SAM [4]: assign semantic labels

Reference

[1] Kirillov, Alexander, et al. “Segment anything.” arXiv preprint arXiv:2304.02643 (2023).

[2] Zhao, Xu, et al. “Fast Segment Anything.” arXiv preprint arXiv:2306.12156 (2023).

[3] Ke, Lei, et al. “Segment Anything in High Quality.” arXiv preprint arXiv:2306.01567 (2023).

[4] Li, Feng, et al. “Semantic-SAM: Segment and Recognize Anything at Any Granularity.” arXiv preprint arXiv:2307.04767 (2023).

Translate one or multiple instances in an image: [1]

Reference

[1] Mo, Sangwoo, Minsu Cho, and Jinwoo Shin. “Instagan: Instance-aware image-to-image translation.” arXiv preprint arXiv:1812.10889 (2018).

Optimization-based:

  • texture synthesis

    • Texture synthesis using convolutional neural networks. [pdf]
  • feature inversion

    • Understanding Deep Image Representations by Inverting Them.
  • style transfer = feature inversion + texture synthesis

    • Image style transfer using convolutional neural networks. [pdf] [code] (no training; inference is slow)

    • Perceptual Losses for Real-Time Style Transfer and Super-Resolution. [pdf] (trains a network per style, taking the style image and content image as inputs; real-time inference; belongs to one-to-one image mapping)

    • Texture Networks: Feed-forward Synthesis of Textures and Stylized Images. [pdf]

    • A learned representation for artistic style. [pdf] (trains a unified network for multiple styles)
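
The texture-synthesis half of the "style transfer = feature inversion + texture synthesis" equation matches Gram matrices of convolutional features. A minimal sketch, assuming the features have already been extracted from some network (the shapes and function names are mine):

```python
import numpy as np

def gram_matrix(features):
    """Texture statistics used in neural style transfer.
    features: (C, H, W) feature maps from one conv layer.
    Returns the (C, C) matrix of channel correlations,
    normalized by the number of spatial positions."""
    C, H, W = features.shape
    F = features.reshape(C, H * W)
    return F @ F.T / (H * W)

def style_loss(f_generated, f_style):
    """Mean squared difference of Gram matrices at one layer,
    as in optimization-based style transfer (Gatys et al.)."""
    G, A = gram_matrix(f_generated), gram_matrix(f_style)
    return ((G - A) ** 2).mean()

f = np.arange(48.0).reshape(3, 4, 4)  # toy stand-in for conv features
G = gram_matrix(f)
```

In the optimization-based methods this loss is minimized over the pixels of the generated image; the feed-forward variants instead train a network to minimize it in one pass.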

Feedforward-based:

  • super-resolution

    • Learning a deep convolutional network for image super-resolution. [pdf]

    • Accurate Image Super-Resolution Using Very Deep Convolutional Networks [pdf] [code] (VGG-like network learns the residual)

    • Accelerating the Super-Resolution Convolutional Neural Network. [pdf] (hourglass structure, deconv)

    • Deeply-recursive convolutional network for image super-resolution. [pdf]

    • Photo-realistic single image super-resolution using a generative adversarial network. [pdf] (content loss + adversarial loss)
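
A recurring idea in the super-resolution papers above (explicit in the VDSR entry) is residual learning: the network predicts only the high-frequency residual, which is added back onto the interpolated low-resolution input. A toy sketch, with a made-up "network" standing in for a trained model:

```python
import numpy as np

def residual_sr(lr_upsampled, predict_residual):
    """Residual learning for super-resolution: the model predicts
    only the missing detail on top of a (e.g. bicubic) upsampled input."""
    return lr_upsampled + predict_residual(lr_upsampled)

def toy_residual(x):
    # hypothetical stand-in for a trained residual network:
    # a mild contrast-boosting correction
    return 0.1 * (x - x.mean())

img = np.linspace(0.0, 1.0, 16).reshape(4, 4)  # toy upsampled input
sr = residual_sr(img, toy_residual)
```

Learning the (mostly zero) residual is easier than regressing the full image, which is why very deep SR networks train stably with this formulation.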

  • inpainting or hole-filling

    • Deep Image Inpainting. [pdf]
    • Context Encoders: Feature Learning by Inpainting [pdf] [code]
  • colorization

    • Colorful image colorization. [pdf] [code]

    • Learning Representations for Automatic Colorization. [pdf] [code]

  • denoising

    • Image Restoration Using Very Deep Convolutional Encoder-Decoder Networks with Symmetric Skip Connections [pdf] [code] (conv and deconv)
  • decompression

    • Compression Artifacts Reduction by a Deep Convolutional Network [pdf]
  • dehaze/deraining

    • Dehazenet: An end-to-end system for single image haze removal [pdf]
  • demosaicking

    • Deep joint demosaicking and denoising [pdf]
  • image harmonization

  • domain adaptation
    • Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks. [pdf] [code]
  • general image-to-image translation

    • paired training data

      • Image-to-Image Translation with Conditional Adversarial Nets. [pdf] [code] (pix2pix)

      • High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs. [pdf]: extends pix2pix with a coarse-to-fine strategy (pix2pixHD)

    • unpaired training data

      • Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. [pdf] [code] (CycleGAN)

      • DualGAN: Unsupervised Dual Learning for Image-to-Image Translation [pdf]

      • Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. [pdf] (DiscoGAN)

Surveys

Partial and Gated Convolution

  • partial convolution [1]: hard gating with a single-channel, non-learnable mask

  • gated convolution [2]: soft gating with a multi-channel, learnable mask
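
The contrast between the two can be sketched at the level of a single output pixel; the scalar-weight formulation below is my simplification (real layers apply this per channel across a sliding window):

```python
import numpy as np

def partial_conv_step(x, mask, w, b=0.0):
    """One output pixel of a partial convolution (hard gating).
    x, mask, w: same-shaped patches; mask is binary (1 = valid pixel).
    The response is renormalized by the fraction of valid pixels,
    and the updated mask is 1 if any input pixel was valid."""
    valid = mask.sum()
    if valid == 0:
        return 0.0, 0.0              # patch is all hole: output 0, mask stays 0
    y = (w * x * mask).sum() * (mask.size / valid) + b
    return y, 1.0                    # hard mask update

def gated_conv_step(x, w_feat, w_gate):
    """One output pixel of a gated convolution (soft gating):
    a learned sigmoid gate scales the feature response, so the
    'mask' is soft and trained end-to-end."""
    feat = np.tanh((w_feat * x).sum())
    gate = 1.0 / (1.0 + np.exp(-(w_gate * x).sum()))
    return feat * gate

x = np.ones((3, 3)); w = np.ones((3, 3))
mask = np.zeros((3, 3)); mask[0, :] = 1.0   # top row valid, rest is hole
y, new_mask = partial_conv_step(x, mask, w)
```

Note how the partial convolution's renormalization (9/3 here) compensates for the missing pixels, while the gated convolution learns its own gating from data.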

Filling Priority

filling priority [3]: priority is the product of a confidence term (a measure of the amount of reliable information surrounding the pixel) and a data term (a function of the strength of isophotes hitting the fill front). The patch to fill next is selected by priority, similar to patch-based texture synthesis.

<img src="http://bcmi.sjtu.edu.cn/~niuli/github_images/bO5YXEQ.jpg" width="40%"> 

Diverse image inpainting

  • random vector: use a random vector to generate diverse and plausible outputs [6]

  • attribute vector: use target attribute values to guide image inpainting [7]

  • autoregressive models: [11] [12]

Auxiliary Information

  • Semantics

    • enforce the inpainted result to have the expected semantics [8]
    • first inpaint the semantic map, then use the completed semantic map as guidance [9]
    • guide feature learning in the decoder [10]
    • semantic-aware attention [13]
  • Edges

    • inpaint the edge map and use the completed edge map to help image inpainting [4] [5]

Frequency Domain

  • using a frequency map as network input [14]
  • Fourier convolution: LaMa [15]
  • wavelets [16]
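
The core of LaMa's fast Fourier convolution is a spectral transform: FFT, a learned pointwise mix in the frequency domain, inverse FFT, which gives every output pixel an image-wide receptive field. The sketch below replaces LaMa's learned 1x1 convolution over stacked spectral channels with scalar weights, purely for illustration:

```python
import numpy as np

def fourier_unit(x, w_real, w_imag):
    """Simplified spectral transform of a fast Fourier convolution.
    x: (H, W) feature map; w_real, w_imag: scalar weights standing in
    for the learned pointwise transform over spectral channels."""
    X = np.fft.rfft2(x)                            # to frequency domain
    X = w_real * X.real + 1j * (w_imag * X.imag)   # pointwise mix
    return np.fft.irfft2(X, s=x.shape)             # back to spatial domain

x = np.arange(16.0).reshape(4, 4)
y = fourier_unit(x, 1.0, 1.0)   # unit weights act as the identity
```

Because each frequency coefficient depends on all pixels, even a shallow stack of such units can reason about structures far larger than a conventional kernel, which is what makes LaMa robust to large masks.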

Bridging Inpainting and Generation

[17]

Transformer

[12] [18] [19]

Diffusion Model

[20] [21] [22] [23]

References

  1. Liu, Guilin, et al. “Image inpainting for irregular holes using partial convolutions.” ECCV, 2018.
  2. Yu, Jiahui, et al. “Free-form image inpainting with gated convolution.” ICCV, 2019.
  3. Criminisi, Antonio, Patrick Pérez, and Kentaro Toyama. “Region filling and object removal by exemplar-based image inpainting.” TIP, 2004.
  4. Nazeri, Kamyar, et al. “Edgeconnect: Generative image inpainting with adversarial edge learning.” arXiv preprint arXiv:1901.00212 (2019).
  5. Xiong, Wei, et al. “Foreground-aware image inpainting.” CVPR, 2019.
  6. Zheng, Chuanxia, Tat-Jen Cham, and Jianfei Cai. “Pluralistic image completion.” CVPR, 2019.
  7. Chen, Zeyuan, et al. “High resolution face completion with multiple controllable attributes via fully end-to-end progressive generative adversarial networks.” arXiv preprint arXiv:1801.07632 (2018).
  8. Li, Yijun, et al. “Generative face completion.” CVPR, 2017.
  9. Song, Yuhang, et al. “Spg-net: Segmentation prediction and guidance network for image inpainting.” arXiv preprint arXiv:1805.03356 (2018).
  10. Liao, Liang, et al. “Guidance and evaluation: Semantic-aware image inpainting for mixed scenes.” arXiv preprint arXiv:2003.06877 (2020).
  11. Peng, Jialun, et al. “Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE.” CVPR, 2021.
  12. Wan, Ziyu, et al. “High-Fidelity Pluralistic Image Completion with Transformers.” arXiv preprint arXiv:2103.14031 (2021).
  13. Liao, Liang, et al. “Image inpainting guided by coherence priors of semantics and textures.” CVPR, 2021.
  14. Roy, Hiya, et al. “Image inpainting using frequency domain priors.” arXiv preprint arXiv:2012.01832 (2020).
  15. Suvorov, Roman, et al. “Resolution-robust Large Mask Inpainting with Fourier Convolutions.” WACV, 2022.
  16. Yu, Yingchen, et al. “WaveFill: A Wavelet-based Generation Network for Image Inpainting.” ICCV, 2021.
  17. Zhao, Shengyu, et al. “Large scale image completion via co-modulated generative adversarial networks.” ICLR (2021).
  18. Zheng, Chuanxia, et al. “Bridging global context interactions for high-fidelity image completion.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.
  19. Li, Wenbo, et al. “Mat: Mask-aware transformer for large hole image inpainting.” Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2022.
  20. Lugmayr, Andreas, et al. “Repaint: Inpainting using denoising diffusion probabilistic models.” CVPR, 2022.
  21. Rombach, Robin, et al. “High-resolution image synthesis with latent diffusion models.” CVPR, 2022.
  22. Li, Wenbo, et al. “SDM: Spatial Diffusion Model for Large Hole Image Inpainting.” arXiv preprint arXiv:2212.02963 (2022).
  23. Wang, Su, et al. “Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting.” arXiv preprint arXiv:2212.06909 (2022).