Details: |
Abstract
We revisit the problem of finding principal components of
multivariate datasets that lie on an embedded nonlinear Riemannian
manifold within a higher-dimensional space. Our aim is to extend the
geometric interpretation of PCA while capturing non-geodesic
forms of variation in the data. We introduce the concept of a principal
sub-manifold: a manifold that passes through the center of the data and that,
at any point, extends in the direction of the highest curvature in the
space spanned by the eigenvectors of the local tangent space PCA. Compared to
the recent work on the case where the sub-manifold has dimension one
(Panaretos et al. 2014), which is essentially a curve on the manifold that
attempts to capture one-dimensional variation, the current setting is much more
general. The principal sub-manifold is therefore
an extension of the principal flow that captures higher-dimensional
variation in the data. We show that the principal sub-manifold yields the usual principal
components in Euclidean space. By means of examples, we illustrate how to
find, use and interpret a principal sub-manifold, and we extend its use to
shape analysis. (This is joint work with Tung Pham.)
|