Improving Performance and Accuracy of Local PCA

Václav Gassenbauer, Jaroslav Křivánek, Kadi Bouatouch, Christian Bouville, Mickaël Ribardiere
Computer Graphics Forum (Proceedings of PG 2011) 30(7):1903-1910, 2011
Local Principal Component Analysis (LPCA) is a popular technique for dimensionality reduction and data compression of the large data sets encountered in computer graphics. The LPCA algorithm is a variant of k-means clustering in which the repeated classification of high-dimensional data points to their nearest cluster leads to long execution times. This paper focuses on improving the efficiency and accuracy of LPCA. We propose a novel SortCluster LPCA algorithm that significantly reduces the cost of the point-cluster classification stage, achieving a speed-up of up to a factor of 20. To improve the approximation accuracy, we investigate different initialization schemes for LPCA and find that the k-means++ algorithm [AV07] yields the best results, albeit at a high computational cost. We show that ideas similar to those behind the efficiency of our SortCluster LPCA algorithm can be used to accelerate k-means++. The resulting initialization algorithm is faster than purely random seeding while producing a substantially more accurate data approximation.
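
For readers unfamiliar with the seeding scheme mentioned above, the following is a minimal sketch of standard, unaccelerated k-means++ seeding as commonly described: each new seed is drawn with probability proportional to its squared distance from the nearest seed already chosen. It is not the accelerated variant proposed in the paper, and the names `sq_dist` and `kmeanspp_seeds` are hypothetical, chosen only for this illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seeds(points, k, rng=None):
    """Pick k seeds from `points` using plain k-means++ sampling.

    Illustrative sketch only; this is the baseline scheme, not the
    accelerated initialization described in the abstract.
    """
    rng = rng or random.Random(0)
    # First seed: chosen uniformly at random.
    seeds = [rng.choice(points)]
    # d2[i] holds the squared distance from points[i] to its nearest seed.
    d2 = [sq_dist(p, seeds[0]) for p in points]
    while len(seeds) < k:
        if sum(d2) == 0.0:
            # Every point coincides with a seed; fall back to uniform choice.
            idx = rng.randrange(len(points))
        else:
            # Sample an index with probability proportional to d2.
            idx = rng.choices(range(len(points)), weights=d2)[0]
        seeds.append(points[idx])
        # Refresh nearest-seed distances against the newly added seed.
        d2 = [min(old, sq_dist(p, seeds[-1])) for old, p in zip(d2, points)]
    return seeds


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(kmeanspp_seeds(data, 3))
```

Each round rescans every point to update its nearest-seed distance, so the cost grows with the number of points times the number of seeds; this is the kind of repeated point-cluster distance evaluation whose expense the abstract refers to.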