  1. DeepFashion: http://mmlab.ie.cuhk.edu.hk/projects/DeepFashion.html (attribute, bounding box, landmark)

  2. Colorful-Fashion: https://sites.google.com/site/fashionparsing/home (pixel-level color-category label)

  3. CCP (Clothing Co-Parsing): https://github.com/bearpaw/clothing-co-parsing (parsing label)

  4. Fashionista: http://vision.is.tohoku.ac.jp/~kyamagu/research/clothing_parsing/ (parsing label)

  5. HPW (Human Parsing in the Wild): https://github.com/lemondan/HumanParsing-Dataset (parsing label)

  6. ModaNet: https://github.com/eBay/modanet (polygon annotations)

  1. re-sampling
  2. synthetic samples: generate more samples for minority classes
  3. re-weighting
  4. few-shot learning
  5. decoupling representation and classifier learning: use normal sampling in the feature learning stage and use re-sampling in the classifier learning stage.
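As a rough illustration of items 1 and 3 above, the sketch below computes inverse-frequency loss weights (re-weighting) and class-balanced sampling probabilities (re-sampling) for a toy label list. The function names, the particular normalization, and the 90/10 head/tail split are assumptions chosen for illustration; they are not taken from any specific paper in these notes.

```python
from collections import Counter
import random


def class_weights(labels):
    """Inverse-frequency loss weights (re-weighting).

    Uses total / (n_classes * count), so the weights summed over all
    samples equal the number of samples.
    """
    counts = Counter(labels)
    n_classes = len(counts)
    total = len(labels)
    return {c: total / (n_classes * n) for c, n in counts.items()}


def balanced_sample_probs(labels):
    """Per-sample probabilities that draw every class equally often (re-sampling)."""
    counts = Counter(labels)
    n_classes = len(counts)
    return [1.0 / (n_classes * counts[y]) for y in labels]


def resample(labels, k, seed=0):
    """Draw k sample indices with class-balanced probabilities."""
    rng = random.Random(seed)
    probs = balanced_sample_probs(labels)
    return rng.choices(range(len(labels)), weights=probs, k=k)


# Toy long-tailed dataset: 90 "head" samples, 10 "tail" samples (hypothetical).
labels = ["head"] * 90 + ["tail"] * 10
weights = class_weights(labels)
probs = balanced_sample_probs(labels)
indices = resample(labels, k=1000)
tail_fraction = sum(labels[i] == "tail" for i in indices) / len(indices)
```

In a decoupled setup (item 5), a balanced sampler like this would be used only when re-training the classifier, while the feature extractor is trained with plain uniform sampling.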

Big Names: Judea Pearl [Tutorial] [slides] [textbook], James Robins [Textbook] [slides]

Tutorial:

Workshop: NIPS 2018 Workshop on Causal Learning, KDD 2020 Tutorial on Causal Inference Meets Machine Learning

Material: MILA Course

Causality and disentanglement: [5] [6]

Counterfactual and disentanglement: [7]

Reference

[1] Chalupka K, Perona P, Eberhardt F. Visual causal feature learning. arXiv preprint arXiv:1412.2309, 2014.

[2] Lopez-Paz D, Nishihara R, Chintala S, et al. Discovering causal signals in images. CVPR, 2017.

[3] Bau D, Zhu J Y, Strobelt H, et al. GAN Dissection: Visualizing and Understanding Generative Adversarial Networks. arXiv preprint arXiv:1811.10597, 2018.

[4] Schölkopf B. Causality for machine learning. arXiv preprint arXiv:1911.10500, 2019.

[5] Kim, Hyemi, et al. “Counterfactual Fairness with Disentangled Causal Effect Variational Autoencoder.” arXiv preprint arXiv:2011.11878 (2020).

[6] Shen, Xinwei, et al. “Disentangled Generative Causal Representation Learning.” arXiv preprint arXiv:2010.02637 (2020).

[7] Yue, Zhongqi, et al. “Counterfactual Zero-Shot and Open-Set Visual Recognition.” arXiv preprint arXiv:2103.00887 (2021).

[8] Schölkopf, Bernhard, et al. “Towards causal representation learning.” arXiv preprint arXiv:2102.11107 (2021).

  • watermark removal: ICA [4], inpainting [5]

  • watermarks consistent across a collection of images: multi-image matting and reconstruction [3]

  • Survey papers on watermarking: [1] [2]

Reference

  1. Podilchuk, Christine I., and Edward J. Delp. “Digital watermarking: algorithms and applications.” IEEE Signal Processing Magazine 18.4 (2001): 33-46.
  2. Potdar, Vidyasagar M., Song Han, and Elizabeth Chang. “A survey of digital image watermarking techniques.” INDIN’05. 2005 3rd IEEE International Conference on Industrial Informatics, 2005. IEEE, 2005.
  3. Dekel, Tali, et al. “On the effectiveness of visible watermarks.” CVPR, 2017.

First deep learning approach for video harmonization [1]

[1] Haozhi Huang, Senzhe Xu, Junxiong Cai, Wei Liu, Shimin Hu, “Temporally Coherent Video Harmonization Using Adversarial Networks”, arXiv, 2018.

Advanced VAE

  1. VQ-VAE [1], VQ-VAE-2 [2]. Accelerating auto-regressive generation: [4] [5]

  2. NVAE [3]

References

[1] Oord, Aaron van den, Oriol Vinyals, and Koray Kavukcuoglu. “Neural discrete representation learning.” arXiv preprint arXiv:1711.00937 (2017).
[2] Razavi, Ali, Aaron van den Oord, and Oriol Vinyals. “Generating diverse high-fidelity images with vq-vae-2.” Advances in neural information processing systems. 2019.
[3] Vahdat, Arash, and Jan Kautz. “Nvae: A deep hierarchical variational autoencoder.” arXiv preprint arXiv:2007.03898 (2020).
[4] Bond-Taylor, Sam, et al. “Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes.” arXiv preprint arXiv:2111.12701 (2021).
[5] Huiwen Chang, Han Zhang, Lu Jiang, Ce Liu, William T. Freeman, “MaskGIT: Masked Generative Image Transformer”, arXiv preprint arXiv:2202.04200.

  • StyleGAN of all trades [1]
  • StyleGANv1 [5]
  • StyleGANv2 [6]: removes the blob-shaped artifacts that resemble water droplets.
  • StyleGANv3 [2]: solves the aliasing (texture sticking) issue, that is, details appearing to be glued to image coordinates instead of to the surfaces of depicted objects.
  • StyleGAN-XL [3]: extends StyleGAN to large, diverse datasets
  • 3D StyleGAN [4]

Image editing using StyleGAN

InsetGAN [7]

Reference

[1] Chong, Min Jin, Hsin-Ying Lee, and David Forsyth. “StyleGAN of All Trades: Image Manipulation with Only Pretrained StyleGAN.” arXiv preprint arXiv:2111.01619 (2021).

[2] Karras, Tero, et al. “Alias-free generative adversarial networks.” Thirty-Fifth Conference on Neural Information Processing Systems. 2021.

[3] Sauer, Axel, Katja Schwarz, and Andreas Geiger. “Stylegan-xl: Scaling stylegan to large diverse datasets.” arXiv preprint arXiv:2202.00273 (2022).

[4] Xiaoming Zhao, Fangchang Ma, David Güera, Zhile Ren, Alexander G. Schwing, Alex Colburn. “Generative Multiplane Images: Making a 2D GAN 3D-Aware”.

[5] Karras, Tero, Samuli Laine, and Timo Aila. “A style-based generator architecture for generative adversarial networks.” Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019.

[6] Karras, Tero, et al. “Analyzing and improving the image quality of stylegan.” Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020.

[7] Frühstück, Anna, et al. “Insetgan for full-body image generation.” CVPR, 2022.

  • SAM [1]

  • FastSAM [2]: first generate proposals and then select target proposals

  • High-quality SAM [3]

  • Semantic-SAM [4]: assign semantic labels

Reference

[1] Kirillov, Alexander, et al. “Segment anything.” arXiv preprint arXiv:2304.02643 (2023).

[2] Zhao, Xu, et al. “Fast Segment Anything.” arXiv preprint arXiv:2306.12156 (2023).

[3] Ke, Lei, et al. “Segment Anything in High Quality.” arXiv preprint arXiv:2306.01567 (2023).

[4] Li, Feng, et al. “Semantic-SAM: Segment and Recognize Anything at Any Granularity.” arXiv preprint arXiv:2307.04767 (2023).

Translate one or multiple instances in an image: [1]

Reference

[1] Mo, Sangwoo, Minsu Cho, and Jinwoo Shin. “Instagan: Instance-aware image-to-image translation.” arXiv preprint arXiv:1812.10889 (2018).

Optimization-based:

  • texture synthesis

    • Texture synthesis using convolutional neural networks. [pdf]
  • feature inversion

    • Understanding Deep Image Representations by Inverting Them.
  • style transfer = feature inversion + texture synthesis

    • Image style transfer using convolutional neural networks. [pdf] [code] (no training, test is slow)

    • Perceptual Losses for Real-Time Style Transfer and Super-Resolution. [pdf] (trains one network per style, using the style image and content images as inputs; runs in real time at test; a one-to-one image mapping)

    • Texture Networks: Feed-forward Synthesis of Textures and Stylized Images. [pdf]

    • A learned representation for artistic style. [pdf] (train a unified network for multiple styles)
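The relation "style transfer = feature inversion + texture synthesis" can be sketched as an objective with two terms: a content term that compares feature maps directly (feature inversion) and a style term that compares Gram matrices of feature maps (texture synthesis). The snippet below is a minimal sketch on hand-written toy features; in the optimization-based papers the features come from a pretrained CNN and the image is updated by gradient descent, neither of which is shown here. The function names, the normalization by the number of positions, and the weights `alpha` and `beta` are assumptions.

```python
def gram_matrix(features):
    """Channel-by-channel correlations of a feature map.

    features: list of C channels, each a flat list of N spatial activations.
    Spatial layout is discarded, which is why the Gram matrix captures texture.
    """
    n = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
            for fi in features]


def style_loss(feat_gen, feat_style):
    """Squared difference between Gram matrices (texture synthesis term)."""
    g_gen, g_style = gram_matrix(feat_gen), gram_matrix(feat_style)
    return sum((x - y) ** 2
               for row_gen, row_style in zip(g_gen, g_style)
               for x, y in zip(row_gen, row_style))


def content_loss(feat_gen, feat_content):
    """Squared difference between raw features (feature inversion term)."""
    return sum((x - y) ** 2
               for ch_gen, ch_content in zip(feat_gen, feat_content)
               for x, y in zip(ch_gen, ch_content))


def total_loss(feat_gen, feat_content, feat_style, alpha=1.0, beta=1.0):
    """Weighted sum of both terms; alpha and beta are illustrative weights."""
    return (alpha * content_loss(feat_gen, feat_content)
            + beta * style_loss(feat_gen, feat_style))


# Toy features: 2 channels, 4 spatial positions (hypothetical values).
features = [[1, 2, 3, 4], [0, 1, 0, 1]]
# The same activations with the spatial positions reversed in every channel.
shuffled = [[4, 3, 2, 1], [1, 0, 1, 0]]

same_texture = style_loss(shuffled, features)
different_content = content_loss(shuffled, features)
```

Reversing the spatial positions leaves the Gram matrix unchanged, so the style term treats the two feature maps as the same texture while the content term still tells them apart.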

Feedforward-based:

  • super-resolution

    • Learning a deep convolutional network for image super-resolution. [pdf]

    • Accurate Image Super-Resolution Using Very Deep Convolutional Networks [pdf] [code] (VGG-style deep network that learns the residual)

    • Accelerating the Super-Resolution Convolutional Neural Network. [pdf] (hourglass structure, deconv)

    • Deeply-recursive convolutional network for image super-resolution. [pdf]

    • Photo-realistic single image super-resolution using a generative adversarial network. [pdf] (content loss, adversarial loss)

  • inpainting or hole-filling

    • Deep Image Inpainting. [pdf]
    • Context Encoders: Feature Learning by Inpainting [pdf] [code]
  • colorization

    • Colorful image colorization. [pdf] [code]

    • Learning Representations for Automatic Colorization. [pdf] [code]

  • denoising

    • Image Restoration Using Very Deep Convolutional Encoder-Decoder Networks with Symmetric Skip Connections [pdf] [code] (conv and deconv)
  • decompression

    • Compression Artifacts Reduction by a Deep Convolutional Network [pdf]
  • dehaze/deraining

    • Dehazenet: An end-to-end system for single image haze removal [pdf]
  • demosaicking

    • Deep joint demosaicking and denoising [pdf]
  • image harmonization

  • domain adaptation
    • Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks. [pdf] [code]
  • general image-to-image translation

    • paired training data

      • Image-to-Image Translation with Conditional Adversarial Networks. [pdf] [code] (pix2pix)

      • High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs. [pdf]: extends pix2pix with a coarse-to-fine strategy (pix2pixHD).

    • unpaired training data

      • Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. [pdf][code] (CycleGAN)

      • DualGAN: Unsupervised Dual Learning for Image-to-Image Translation. [pdf]
      • Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. [pdf] (DiscoGAN)
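The unpaired methods listed above rely on some form of cycle consistency: translating an image to the other domain and back should recover the original. The snippet below is a minimal sketch of that term with an L1 distance, using toy element-wise functions in place of the two generator networks. The names `G`, `F`, `F_bad` and the toy data are assumptions for illustration; the adversarial losses that these methods also use are not shown.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(G, F, xs, ys):
    """Average of |F(G(x)) - x| over domain X plus |G(F(y)) - y| over domain Y."""
    forward = sum(l1_distance(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(l1_distance(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward


# Toy "generators" (hypothetical stand-ins for neural networks).
def G(x):
    """Maps domain X to domain Y."""
    return [2.0 * v + 1.0 for v in x]


def F(y):
    """Maps domain Y back to domain X; the exact inverse of G."""
    return [(v - 1.0) / 2.0 for v in y]


def F_bad(y):
    """A mapping that ignores the structure of G (identity)."""
    return list(y)


xs = [[0.0, 0.5], [1.0, 2.0]]  # unpaired samples from domain X
ys = [[1.0, 3.0], [5.0, 7.0]]  # unpaired samples from domain Y

loss_good = cycle_consistency_loss(G, F, xs, ys)
loss_bad = cycle_consistency_loss(G, F_bad, xs, ys)
```

The loss is zero when the two mappings invert each other and grows when they do not, which is what lets training proceed without paired examples.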

Surveys
