Thank you for the post. How can we measure this disentanglement? The dataset doesn't have labels that I could use for a t-SNE visualization. Consider the following case:
I trained two VAEs, A and B, and got two latent spaces. After PCA, the eigenvalues for A are all almost equal, around 1. For B, they vary widely, from 1 to 100, which means most of the information is concentrated in a few directions. Does this mean that A disentangled the information better than B, because it distributed the information across more dimensions?
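For concreteness, the PCA comparison described above could be computed roughly as follows. This is only a sketch: the names `z_a` and `z_b` are hypothetical, and random arrays stand in for the latent codes that the two encoders would actually produce.

```python
import numpy as np

def latent_eigenvalues(z):
    """Eigenvalues of the covariance of latent codes z (n_samples, n_dims), largest first."""
    cov = np.cov(z, rowvar=False)        # (n_dims, n_dims) covariance matrix
    eigvals = np.linalg.eigvalsh(cov)    # symmetric matrix, eigenvalues in ascending order
    return eigvals[::-1]

rng = np.random.default_rng(0)

# Stand-in data (assumption, not real VAE outputs):
# A has roughly equal variance in every direction,
# B has variances ranging from 1 to 100.
z_a = rng.normal(size=(5000, 8))
z_b = rng.normal(size=(5000, 8)) * np.sqrt(np.linspace(1, 100, 8))

eig_a = latent_eigenvalues(z_a)
eig_b = latent_eigenvalues(z_b)

print("A eigenvalues:", eig_a.round(2))
print("B eigenvalues:", eig_b.round(2))
# Share of total variance carried by the leading direction of each model
print("A top share:", eig_a[0] / eig_a.sum())
print("B top share:", eig_b[0] / eig_b.sum())
```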
If the mean and variance of each dimension of the latent vector for A are closer to those of a normal distribution, does that mean A disentangles better?
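The per-dimension check could look something like the sketch below. It assumes the reference is a standard normal N(0, 1) and treats each latent dimension as Gaussian, so that the gap can be summarized with the closed-form KL divergence; both are assumptions made for illustration, and `z_a` and `z_b` are again hypothetical stand-ins for real latent codes.

```python
import numpy as np

def per_dim_stats(z):
    """Per-dimension mean, variance, and KL(N(mean, var) || N(0, 1)) for codes z (n_samples, n_dims)."""
    mean = z.mean(axis=0)
    var = z.var(axis=0, ddof=1)
    # Closed-form KL between two univariate Gaussians, with N(0, 1) as the reference.
    # Assumption: each marginal is approximately Gaussian.
    kl = 0.5 * (var + mean**2 - 1.0 - np.log(var))
    return mean, var, kl

rng = np.random.default_rng(0)

# Stand-in data (assumption, not real VAE outputs)
z_a = rng.normal(0.0, 1.0, size=(5000, 8))   # close to N(0, 1) in every dimension
z_b = rng.normal(0.5, 3.0, size=(5000, 8))   # shifted mean, larger spread

mean_a, var_a, kl_a = per_dim_stats(z_a)
mean_b, var_b, kl_b = per_dim_stats(z_b)

print("A per-dimension KL to N(0, 1):", kl_a.round(4))
print("B per-dimension KL to N(0, 1):", kl_b.round(4))
```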
I am looking for metrics that measure disentanglement on datasets without labels.
Thank you