
Sparse Correspondence Analysis

Gilbert Saporta 1 Ruiping Liu 2 Ndeye Niang Keita 1 Huiwen Wang 2 
1 CEDRIC - MSDMA - CEDRIC. Méthodes statistiques de data-mining et apprentissage
CEDRIC - Centre d'études et de recherche en informatique et communications
Abstract : Since the introduction of the lasso in regression, various sparse methods have been developed for the unsupervised setting, such as sparse PCA, which combines feature selection and dimension reduction. Their appeal is that they simplify the interpretation of the pseudo principal components, since each is expressed as a linear combination of only a small number of variables. The drawbacks are, on the one hand, the difficulty of choosing the number of non-zero coefficients in the absence of a criterion and, on the other hand, the loss of orthogonality properties for the components and/or the loadings. In this paper we are interested in sparse variants of correspondence analysis (CA) for large contingency tables such as document-term matrices. We use the fact that CA is both a PCA (or a weighted SVD) and a canonical analysis to develop column-sparse CA and doubly sparse CA (sparse in both rows and columns).
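The weighted-SVD view of CA mentioned in the abstract can be illustrated with a minimal NumPy sketch. The first function computes classical CA as the SVD of the standardized residuals of the contingency table. The second shows one generic way to obtain a column-sparse first axis, by alternating a power step with soft-thresholding; this is an assumption made for illustration and is not claimed to be the algorithm of the paper. The function names, the penalty `lam` and the small table `N` are all hypothetical.

```python
import numpy as np


def correspondence_analysis(N):
    """Classical CA of a contingency table N via the SVD of standardized residuals."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    # Standardized residuals: D_r^{-1/2} (P - r c^T) D_c^{-1/2}
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sing, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * sing) / np.sqrt(r)[:, None]     # principal row coordinates
    col_coords = (Vt.T * sing) / np.sqrt(c)[:, None]  # principal column coordinates
    return S, sing, row_coords, col_coords


def soft_threshold(x, lam):
    """Shrink each entry of x toward zero by lam; entries below lam become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def sparse_first_axis(S, lam, n_iter=100):
    """Illustrative column-sparse rank-1 approximation of S (assumed scheme,
    not the paper's): u is a unit vector, v is soft-thresholded and unnormalized."""
    U, _, _ = np.linalg.svd(S, full_matrices=False)
    u = U[:, 0]                          # start from the ordinary CA solution
    v = np.zeros(S.shape[1])
    for _ in range(n_iter):
        v = soft_threshold(S.T @ u, lam)
        if not v.any():                  # penalty too large: every column dropped
            break
        u = S @ v
        u /= np.linalg.norm(u)
    return u, v


# Hypothetical 3 x 3 contingency table, for illustration only.
N = np.array([[20, 5, 2],
              [3, 15, 4],
              [1, 6, 18]])
S, sing, row_coords, col_coords = correspondence_analysis(N)
u, v = sparse_first_axis(S, lam=0.05)
print("squared singular values (principal inertias):", sing ** 2)
print("non-zero column loadings on the sparse axis:", np.count_nonzero(v))
```

With `lam = 0` the sketch reduces to the ordinary first CA axis, and increasing `lam` sets more column loadings exactly to zero. Choosing `lam`, and hence the number of non-zero coefficients, is precisely the difficulty the abstract points out.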
Document type :
Conference papers
Contributor : Gilbert Saporta
Submitted on : Wednesday, December 9, 2020 - 11:03:44 AM
Last modification on : Wednesday, September 28, 2022 - 5:50:54 AM


  • HAL Id : hal-02471317, version 1



Gilbert Saporta, Ruiping Liu, Ndeye Niang Keita, Huiwen Wang. Sparse Correspondence Analysis. ASMDA 2019. 18th Conference of the Applied Stochastic Models and Data Analysis International Society, Jun 2019, Florence, Italy. ⟨hal-02471317⟩


