Disentangled Representation

Methods:

The goal of disentangled representation learning [4] is to extract the explanatory factors underlying the input distribution and to produce a more meaningful representation. The literature variously speaks of disentangling codes, encodings, representations, latent factors, or latent variables, and an attribute may be encoded in a single latent dimension or spread across multiple dimensions.

A mathematical definition of disentangled representations is given in [11].

A survey of disentangled representation learning is given in [19].

  • Unsupervised disentanglement

    InfoGAN [5] adopts the GAN framework and learns disentangled representations in an unsupervised manner by maximizing the mutual information between a subset of the latent variables and the generated observation. The latent variables are encouraged to be mutually independent under the independence assumption [6].
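
In practice InfoGAN maximizes a variational lower bound on that mutual information via an auxiliary Q-network that tries to recover the code from the generated sample. A minimal numpy sketch of the bound (the function name is illustrative, and raw logits stand in for the Q-network):

```python
import numpy as np

rng = np.random.default_rng(0)

def mi_lower_bound(c_true, q_logits):
    """Variational lower bound on I(c; G(z, c)) used by InfoGAN
    (up to the constant entropy H(c)): E_c[log Q(c | x)].
    c_true: (n,) integer codes; q_logits: (n, k) outputs of the
    auxiliary Q-network evaluated on the generated samples."""
    p = np.exp(q_logits - q_logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)            # softmax over k categories
    return np.log(p[np.arange(len(c_true)), c_true] + 1e-12).mean()

# toy check: a Q that recovers the code yields a tighter (higher) bound
c = rng.integers(0, 10, size=256)
good = mi_lower_bound(c, np.eye(10)[c] * 5.0)        # Q peaked on the true code
bad = mi_lower_bound(c, rng.normal(size=(256, 10)))  # uninformative Q
```

Maximizing this bound with respect to both the generator and Q pushes the generated images to carry recoverable information about the chosen latent subset.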

  • Supervised disentanglement

    Swap attribute representations under the supervision of attribute annotations, as in Dual Swap Disentangling [7] (semi-supervised) and DNA-GAN [8].
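
The swap itself is simple once the latent is chunked per attribute; a toy sketch (the function name is illustrative, and the encoder/decoder around it is omitted), assuming one equal-length chunk per annotated attribute as in DNA-GAN:

```python
import numpy as np

def swap_attribute(z_a, z_b, piece, n_pieces):
    """Swap the `piece`-th of `n_pieces` equal latent chunks between two
    encodings; decoding the results should swap only that attribute."""
    z_a, z_b = z_a.copy(), z_b.copy()
    k = len(z_a) // n_pieces
    sl = slice(piece * k, (piece + 1) * k)
    z_a[sl], z_b[sl] = z_b[sl].copy(), z_a[sl].copy()
    return z_a, z_b

# swapping chunk 1 of 4 exchanges exactly dimensions 2-3
za, zb = np.zeros(8), np.ones(8)
sa, sb = swap_attribute(za, zb, piece=1, n_pieces=4)
```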

  • Disentangled representations for domain adaptation: factor the representation into a class/domain-invariant part and a class/domain-specific part [9][10][12][13]
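
A common recipe in these works is to partition each latent into a domain-invariant content code and a domain-specific style code, then translate by recombining them. A toy sketch (names and the flat-concatenation layout are illustrative):

```python
import numpy as np

def translate(z_src, z_tgt, n_content):
    """Combine the domain-invariant content code of the source with the
    domain-specific style code of the target (illustrative split)."""
    content = z_src[:n_content]    # what the image depicts
    style = z_tgt[n_content:]      # how the target domain renders it
    return np.concatenate([content, style])

z_a = np.arange(8.0)          # source-domain encoding
z_b = np.arange(8.0) + 100    # target-domain encoding
z_ab = translate(z_a, z_b, n_content=4)   # source content, target style
```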

  • Instance-level disentanglement [14][15], FUNIT [16], COCO-FUNIT [17]

  • Closed-form disentanglement [18]: after the model is trained, perform an eigendecomposition on the generator weights to obtain orthogonal semantic directions.
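
In the closed-form setting of [18] (SeFa), the directions come from the generator's first affine layer alone: the unit latent directions n maximizing ||A n|| are the eigenvectors of AᵀA. A numpy sketch, with a random matrix standing in for the trained weight:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a trained generator's first affine layer weight;
# SeFa would use the real, trained matrix here.
A = rng.normal(size=(512, 128))

# Unit directions n maximizing ||A n|| are the eigenvectors of A^T A
# (equivalently the right singular vectors of A), hence orthogonal.
eigvals, eigvecs = np.linalg.eigh(A.T @ A)
order = np.argsort(eigvals)[::-1]     # strongest direction first
directions = eigvecs[:, order]
```

No extra training or sampling is needed, which is why the method is called closed-form.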

Disentanglement metrics:

  • disentanglement metric score [1]
  • perceptual path length, linear separability [2]
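
Of these, the perceptual path length of [2] is easy to sketch: sample points along a latent interpolation and average the perceptual distance between images a small epsilon apart, scaled by 1/eps². In this self-contained toy, squared L2 stands in for the LPIPS perceptual distance used in the paper, and the "generator" is the identity map, for which the metric reduces exactly to ||z1 - z0||²:

```python
import numpy as np

rng = np.random.default_rng(0)

def path_length(g, z0, z1, eps=1e-4, n_samples=100):
    """Perceptual path length along the linear interpolation z0 -> z1:
    E_t[ d(g(lerp(t)), g(lerp(t + eps))) / eps^2 ].  Squared L2 stands
    in here for the LPIPS perceptual distance used in [2]."""
    lerp = lambda t: z0 + t * (z1 - z0)
    t = rng.uniform(0.0, 1.0 - eps, size=(n_samples, 1))
    d = ((g(lerp(t)) - g(lerp(t + eps))) ** 2).sum(axis=-1)
    return (d / eps**2).mean()

z0, z1 = rng.normal(size=64), rng.normal(size=64)
# identity "generator": every step contributes exactly ||z1 - z0||^2
ppl = path_length(lambda z: z, z0, z1)
```

A lower path length indicates a smoother, more disentangled latent space, since small latent steps then cause proportionally small perceptual changes.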

References

[1] Higgins, Irina, et al. “beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework.” ICLR, 2017.

[2] Karras, Tero, Samuli Laine, and Timo Aila. “A style-based generator architecture for generative adversarial networks.” CVPR, 2019.

[4] Bengio, Yoshua, Aaron Courville, and Pascal Vincent. “Representation Learning: A Review and New Perspectives.” IEEE TPAMI, 2013.

[5] Chen, Xi, et al. “InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets.” NeurIPS, 2016.

[6] Brakel, Philemon, and Yoshua Bengio. “Learning Independent Features with Adversarial Nets for Non-linear ICA.” arXiv preprint, 2017.

[7] Feng, Zunlei, et al. “Dual Swap Disentangling.” NeurIPS, 2018.

[8] Xiao, Taihong, Jiapeng Hong, and Jinwen Ma. “DNA-GAN: Learning Disentangled Representations from Multi-Attribute Images.” ICLR Workshop, 2018.

[9] Gonzalez-Garcia, Abel, Joost van de Weijer, and Yoshua Bengio. “Image-to-Image Translation for Cross-Domain Disentanglement.” NeurIPS, 2018.

[10] Lee, Hsin-Ying, et al. “Diverse Image-to-Image Translation via Disentangled Representations.” ECCV, 2018.

[11] Higgins, Irina, et al. “Towards a definition of disentangled representations.” arXiv preprint arXiv:1812.02230 (2018).

[12] Gabbay, Aviv, and Yedid Hoshen. “Demystifying Inter-Class Disentanglement.” arXiv preprint arXiv:1906.11796 (2019).

[13] Hadad, Naama, Lior Wolf, and Moni Shahar. “A two-step disentanglement method.” CVPR, 2018.

[14] Shen, Zhiqiang, et al. “Towards instance-level image-to-image translation.” CVPR, 2019.

[15] Mo, Sangwoo, Minsu Cho, and Jinwoo Shin. “InstaGAN: Instance-aware Image-to-Image Translation.” ICLR, 2019.

[16] Liu, Ming-Yu, et al. “Few-shot unsupervised image-to-image translation.” ICCV, 2019.

[17] Saito, Kuniaki, Kate Saenko, and Ming-Yu Liu. “COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder.” arXiv preprint arXiv:2007.07431 (2020).

[18] Shen, Yujun, and Bolei Zhou. “Closed-Form Factorization of Latent Semantics in GANs.” arXiv preprint arXiv:2007.06600 (2020).

[19] Wang, Xin, Hong Chen, Siao Tang, Zihao Wu, and Wenwu Zhu. “Disentangled Representation Learning.” arXiv preprint, 2022.