A survey on disentangled representation learning [1]
Disentangled Diffusion Models
- [2]: edits images by adjusting the text embedding; learns optimal combination coefficients between the source and target embeddings, with a separate coefficient for each diffusion time step.
- [3]: learns a disentangled gradient field, predicting gradients conditioned on individual latent factors.
- [4]: encodes an image into a semantically meaningful subcode plus a stochastic code that captures the remaining low-level details.
- [5]: predicts a direction of change in the latent h-space (the U-Net bottleneck), which behaves as a semantic latent space.
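The per-timestep embedding combination in [2] can be illustrated with a toy sketch. This is not the authors' implementation; the function name, the NumPy stand-ins for text embeddings, and the specific coefficient schedule are all assumptions chosen for illustration. The core idea shown is a convex combination of a source and a target prompt embedding whose mixing coefficient varies across denoising steps:

```python
import numpy as np

def blend_embeddings(e_src, e_tgt, lambdas):
    """Per-timestep convex combination of two text embeddings (toy sketch).

    e_src, e_tgt: (D,) stand-ins for source / target prompt embeddings.
    lambdas: (T,) coefficients in [0, 1], one per diffusion step;
             in [2] these would be learned, here they are hand-picked.
    Returns: (T, D) conditioning vector to use at each denoising step.
    """
    lambdas = np.clip(np.asarray(lambdas, dtype=float), 0.0, 1.0)[:, None]
    return lambdas * e_src + (1.0 - lambdas) * e_tgt

# Toy example: 4 timesteps, 3-dimensional embeddings.
e_src = np.array([1.0, 0.0, 0.0])
e_tgt = np.array([0.0, 1.0, 0.0])
lam = np.array([1.0, 0.75, 0.25, 0.0])  # weight source early, target late
cond = blend_embeddings(e_src, e_tgt, lam)  # shape (4, 3)
```

Varying the coefficient over time is the point: early steps (which fix coarse structure) can stay close to the source embedding while later steps drift toward the target, which is what lets a single attribute change without disturbing the rest of the image.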
References
[1] Wang, Xin, Hong Chen, Si'ao Tang, Zihao Wu, and Wenwu Zhu. "Disentangled Representation Learning."
[2] Wu, Qiucheng, et al. “Uncovering the disentanglement capability in text-to-image diffusion models.” CVPR, 2023.
[3] Yang, Tao, et al. “DisDiff: Unsupervised Disentanglement of Diffusion Probabilistic Models.” arXiv preprint arXiv:2301.13721 (2023).
[4] Preechakul, Konpat, et al. “Diffusion autoencoders: Toward a meaningful and decodable representation.” CVPR, 2022.
[5] Kwon, Mingi, Jaeseok Jeong, and Youngjung Uh. “Diffusion models already have a semantic latent space.” arXiv preprint arXiv:2210.10960 (2022).