Conference Papers, Year: 2019

Sparse Correspondence Analysis

Gilbert Saporta, Ruiping Liu, Ndeye Niang Keita, Huiwen Wang

Abstract

Since the introduction of the lasso in regression, various sparse methods have been developed for unsupervised settings, such as sparse PCA, which combines feature selection with dimension reduction. Their appeal is that they simplify the interpretation of the pseudo principal components, since each one is expressed as a linear combination of only a small number of variables. They have two drawbacks: the number of non-zero coefficients is difficult to choose in the absence of a criterion, and the components and/or the loadings lose their orthogonality properties. In this paper we are interested in sparse variants of correspondence analysis (CA) for large contingency tables such as document-term matrices. We use the fact that CA is both a PCA (or a weighted SVD) and a canonical analysis to develop column-sparse CA and doubly sparse CA, in which both rows and columns are sparse.
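The statement that CA is a weighted SVD can be illustrated with a minimal sketch of ordinary (non-sparse) CA. This is not the sparse method of the paper: the function name `correspondence_analysis` and the toy table are assumptions made for illustration, and the sketch assumes every row and column of the table has a non-zero total.

```python
import numpy as np

def correspondence_analysis(N, n_components=2):
    """Ordinary (non-sparse) CA of a contingency table via a weighted SVD.

    Hypothetical helper for illustration; assumes all row and column
    totals of N are strictly positive.
    """
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                 # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)               # row masses
    c = P.sum(axis=0)               # column masses
    # Standardized residuals: D_r^{-1/2} (P - r c^T) D_c^{-1/2}
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    U, sv, Vt = U[:, :n_components], sv[:n_components], Vt[:n_components]
    # Principal coordinates: singular vectors rescaled by the masses
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    return row_coords, col_coords, sv

# Toy 4 x 3 contingency table (made-up counts)
N = np.array([[10, 2, 1],
              [3, 8, 2],
              [1, 3, 9],
              [4, 4, 4]])
rows, cols, sv = correspondence_analysis(N)
```

The squared singular values are the principal inertias; summed over all non-trivial axes they equal the chi-square statistic of the table divided by its grand total. A sparse variant would replace the plain SVD step with a penalized decomposition, which this sketch does not attempt.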
Saporta_sparseASMDA2019V2.pdf (1.96 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-02471317, version 1 (09-12-2020)

Identifiers

  • HAL Id: hal-02471317, version 1

Cite

Gilbert Saporta, Ruiping Liu, Ndeye Niang Keita, Huiwen Wang. Sparse Correspondence Analysis. ASMDA 2019. 18th Conference of the Applied Stochastic Models and Data Analysis International Society, Jun 2019, Florence, Italy. ⟨hal-02471317⟩