Abstract
Principal component analysis (PCA) is a ubiquitous technique for dimension reduction of multivariate data. Regularization of PCA becomes essential when the data are high-dimensional, as in functional PCA and sparse PCA. The standard textbook treatment of PCA maximizes the variance of a standardized linear combination of the variables. A more general perspective views PCA as fitting a low-rank approximation to the data matrix. I shall first take this low-rank-approximation perspective and describe a general regularization framework for PCA that leads to alternative approaches for its regularized siblings. This perspective can then be extended to a framework that incorporates supervision into (regularized) PCA when auxiliary information relevant to dimension reduction is available. Finally, I shall introduce a general asymptotic framework for studying the consistency properties of PCA. The framework includes several existing domains of asymptotics as special cases and makes it possible to investigate the connections and transitions among them. The methods will be demonstrated with applications from bioinformatics, neuroimaging, and business analytics.
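
As a small illustration of the two perspectives mentioned above, the following sketch checks numerically that, for unregularized PCA, the variance-maximization view and the low-rank-approximation view select the same subspace. It is only a minimal example under stated assumptions: the toy data, the choice k = 2, and all variable names are hypothetical and are not taken from the talk, and none of the regularized or supervised extensions are implemented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 observations of 5 variables with unequal scales.
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])
Xc = X - X.mean(axis=0)  # column-center the data matrix

k = 2  # number of components (an assumption for this illustration)

# View 1: variance maximization. The top-k eigenvectors of the sample
# covariance matrix give the directions of largest variance.
cov = Xc.T @ Xc / (Xc.shape[0] - 1)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
V_var = eigvecs[:, ::-1][:, :k]          # reorder to descending, keep k

# View 2: low-rank approximation. The truncated SVD gives the best rank-k
# approximation of the centered data matrix in Frobenius norm.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
V_svd = Vt[:k].T
X_lowrank = (U[:, :k] * s[:k]) @ Vt[:k]

# Compare the two subspaces through their projection matrices, which
# sidesteps the sign ambiguity of individual eigenvectors.
P_var = V_var @ V_var.T
P_svd = V_svd @ V_svd.T
same_subspace = np.allclose(P_var, P_svd)

# The squared approximation error equals the sum of the discarded
# squared singular values.
residual = np.linalg.norm(Xc - X_lowrank) ** 2
```

Comparing projection matrices rather than the loading vectors themselves is a deliberate choice here, since each principal direction is only determined up to sign.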