1. Dimension Reduction
1.1 Motivation
- Find a low-dimensional representation of the data while preserving its essential content.
1.2 Linear Autoencoder
The linear autoencoder is obtained by identifying the encoder $F$ and the decoder $G$ with linear maps.
Linear encoder: $F(\mathbf{x}) = \mathbf{W}\mathbf{x}$ with $\mathbf{W} \in \mathbb{R}^{m \times n}$
Linear decoder: $G(\mathbf{z}) = \mathbf{V}\mathbf{z}$ with $\mathbf{V} \in \mathbb{R}^{n \times m}$
This leads to the following empirical risk under the squared loss:
$$\mathcal{R}(\mathbf{V}, \mathbf{W}) = \frac{1}{2s} \sum_{i=1}^{s} \left\| \mathbf{x}_i - \mathbf{V}\mathbf{W}\mathbf{x}_i \right\|^2$$
- Linearity comes with a powerful arsenal of analytic tools.
- The composition of linear maps is again linear, hence the reconstruction map $G \circ F$ is represented by the single matrix $\mathbf{V}\mathbf{W}$.
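A minimal numpy sketch of these definitions; the dimensions, the random data, and all variable names are illustrative choices, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, s = 5, 2, 100          # input dim, bottleneck dim, sample size (illustrative)
X = rng.normal(size=(s, n))  # rows are data points x_i

W = rng.normal(size=(m, n))  # encoder weights: z = W x
V = rng.normal(size=(n, m))  # decoder weights: x_hat = V z

Z = X @ W.T                  # encoded representations
X_hat = Z @ V.T              # reconstructions V W x_i

# empirical risk: mean squared reconstruction error (with the factor 1/2 above)
risk = 0.5 * np.mean(np.sum((X - X_hat) ** 2, axis=1))
print(risk)
```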
Def 1.2.1 Centering
The data is centered if
$$\frac{1}{s} \sum_{i=1}^{s} \mathbf{x}_i = \mathbf{0},$$
i.e. the sample mean vanishes.
For centered data and the squared loss, the optimal affine reconstruction maps are linear, i.e. the bias terms can be set to zero without loss of generality.
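As a minimal sketch, centering amounts to subtracting the column means (the data matrix here is made up for illustration):

```python
import numpy as np

X = np.array([[1., 2.], [3., 4.], [5., 0.]])  # illustrative data, rows = points
Xc = X - X.mean(axis=0)                       # subtract the sample mean
print(Xc.mean(axis=0))                        # ~ [0., 0.]: the data is centered
```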
The weight matrices of a linear autoencoder are non-identifiable: for any invertible $\mathbf{A} \in \mathbb{R}^{m \times m}$, the reparametrization $(\mathbf{V}, \mathbf{W}) \mapsto (\mathbf{V}\mathbf{A}^{-1}, \mathbf{A}\mathbf{W})$ leaves the reconstruction map $\mathbf{V}\mathbf{W}$ unchanged. One therefore needs to be careful not to over-interpret the found representation.
The reconstruction map $\mathbf{V}\mathbf{W}$ of a linear autoencoder has rank at most $m$: the bottleneck layer constitutes a rank constraint.
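A quick numerical check of both facts, under the same illustrative setup as above (the random matrix A serves as the reparametrization; it is almost surely invertible):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 2
W = rng.normal(size=(m, n))
V = rng.normal(size=(n, m))

# non-identifiability: reparametrize with an invertible A without
# changing the reconstruction map V W
A = rng.normal(size=(m, m))           # almost surely invertible
W2, V2 = A @ W, V @ np.linalg.inv(A)
print(np.allclose(V @ W, V2 @ W2))    # True: same reconstruction map

# rank constraint: the composed map has rank at most m
print(np.linalg.matrix_rank(V @ W))   # 2 (= m) here, never more than m
```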
1.3 Projection
Orthogonal Projection
For a given subspace $U$, the optimal reconstruction map $\mathbf{P}$ is the matrix representing the orthogonal projection $\Pi_U$. Hence the optimal linear autoencoder represents a projection.
The optimal weight matrix of the autoencoder with tied parameter matrices $\mathbf{V} = \mathbf{W}^\top$ satisfies $\mathbf{W}\mathbf{W}^\top = \mathbf{I}_m$, i.e. the rows of $\mathbf{W}$ form an orthonormal basis, and the reconstruction map $\mathbf{P} = \mathbf{W}^\top\mathbf{W}$ is the orthogonal projection onto their span.
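A sketch checking this characterization numerically, assuming we construct orthonormal rows via a QR decomposition (an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 5, 2

Q, _ = np.linalg.qr(rng.normal(size=(n, m)))  # Q has orthonormal columns
W = Q.T                                       # so W has orthonormal rows

P = W.T @ W                                   # tied-weight reconstruction map
print(np.allclose(W @ W.T, np.eye(m)))        # True: W W^T = I_m
print(np.allclose(P @ P, P))                  # True: P is idempotent
print(np.allclose(P.T, P))                    # True: P is symmetric
# idempotent + symmetric = orthogonal projection onto the row space of W
```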
1.4 PCA
For centered data, the optimal autoencoder represents the projection $\mathbf{P}$ which maximizes the variance of the projected data. This is because orthogonal projections satisfy $\|\mathbf{x}\|^2 = \|\mathbf{P}\mathbf{x}\|^2 + \|\mathbf{x} - \mathbf{P}\mathbf{x}\|^2$, so minimizing the reconstruction error is equivalent to maximizing the variance captured by the projection.
Sufficient Statistics
The optimal projection is fully determined by the covariance matrix of the (centered) data, i.e. by
$$\boldsymbol{\Sigma} = \frac{1}{s} \sum_{i=1}^{s} \mathbf{x}_i \mathbf{x}_i^\top.$$
The rows of the optimal $\mathbf{W}$ are the $m$ principal eigenvectors of $\boldsymbol{\Sigma}$.
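Putting the pieces together, a sketch of PCA as the optimal linear autoencoder, assuming the standard route via the eigendecomposition of the empirical covariance (data and variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, s = 5, 2, 1000
X = rng.normal(size=(s, n)) @ rng.normal(size=(n, n))  # correlated raw data
X = X - X.mean(axis=0)                                 # center the data

Sigma = (X.T @ X) / s                       # empirical covariance matrix
eigvals, eigvecs = np.linalg.eigh(Sigma)    # eigenvalues in ascending order
W = eigvecs[:, ::-1][:, :m].T               # top-m principal directions as rows

P = W.T @ W                                 # optimal projection of rank m
X_hat = X @ P                               # reconstructions (P is symmetric)
err = np.mean(np.sum((X - X_hat) ** 2, axis=1))  # reconstruction error
var = np.mean(np.sum(X_hat ** 2, axis=1))        # variance of projections
print(err + var, np.trace(Sigma))           # equal: the Pythagoras identity
```

The printed equality illustrates the equivalence used above: the captured variance and the reconstruction error always sum to the total variance, so maximizing one minimizes the other.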