Conference paper, Year: 2019

Sparse Correspondence Analysis

Abstract

Since the introduction of the lasso in regression, various sparse methods have been developed in an unsupervised context, such as sparse PCA, which combines feature selection and dimension reduction. Their interest is to simplify the interpretation of the pseudo principal components, since each is expressed as a linear combination of only a small number of variables. The disadvantages lie, on the one hand, in the difficulty of choosing the number of non-zero coefficients in the absence of a criterion and, on the other hand, in the loss of orthogonality properties for the components and/or the loadings. In this paper we are interested in sparse variants of correspondence analysis (CA) for large contingency tables such as document-term matrices. We use the fact that CA is both a PCA (or a weighted SVD) and a canonical analysis to develop column-sparse CA and doubly sparse CA (sparse in both rows and columns).
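As a rough illustration of the "CA as a weighted SVD" view mentioned in the abstract, the sketch below builds the matrix of standardized residuals of a contingency table, whose ordinary SVD yields classical CA, and then computes one column-sparse axis by soft-thresholding inside an alternating power iteration. This is not the authors' algorithm: the thresholding scheme is a generic sparse rank-one SVD heuristic swapped in for illustration, and the function names, the penalty `lam`, and the toy table `N` are all assumptions. The table is assumed to have strictly positive row and column totals.

```python
import numpy as np

def ca_standardized_residuals(N):
    """Return S = D_r^{-1/2} (P - r c^T) D_c^{-1/2} and the row/column masses.

    The plain SVD of S gives classical CA (weighted SVD view).
    Assumes every row and column total of N is strictly positive.
    """
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)
    return S, r, c

def soft_threshold(x, lam):
    """Shrink each entry toward zero by lam; small entries become exactly 0."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def sparse_rank_one(S, lam, n_iter=200, tol=1e-10):
    """Illustrative column-sparse rank-one fit S ~ u v^T (hypothetical helper).

    u is kept at unit norm; v is soft-thresholded at each step, so some
    columns receive a zero loading. With lam = 0 this reduces to the
    leading singular pair of S.
    """
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    u = U[:, 0]
    v = s[0] * Vt[0]
    for _ in range(n_iter):
        v_new = soft_threshold(S.T @ u, lam)
        if not np.any(v_new):            # penalty too large: every loading is zero
            return u, v_new
        u = S @ v_new
        u /= np.linalg.norm(u)
        converged = np.linalg.norm(v_new - v) < tol
        v = v_new
        if converged:
            break
    return u, v

# Toy document-term counts (made-up numbers, for illustration only).
N = np.array([[10.0, 2.0, 1.0, 7.0],
              [3.0, 9.0, 8.0, 2.0],
              [6.0, 4.0, 5.0, 6.0]])
S, r, c = ca_standardized_residuals(N)
u_dense, v_dense = sparse_rank_one(S, lam=0.0)
u_sparse, v_sparse = sparse_rank_one(S, lam=0.05)
```

Larger values of `lam` push more column loadings to exactly zero. Choosing `lam` is the same difficulty the abstract raises about fixing the number of non-zero coefficients, and successive sparse axes obtained this way are generally no longer orthogonal.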
File: Saporta_sparseASMDA2019V2.pdf (1.96 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-02471317 , version 1 (09-12-2020)

Identifiers

  • HAL Id : hal-02471317 , version 1

Cite

Gilbert Saporta, Ruiping Liu, Ndeye Niang Keita, Huiwen Wang. Sparse Correspondence Analysis. ASMDA 2019. 18th Conference of the Applied Stochastic Models and Data Analysis International Society, Jun 2019, Florence, Italy. ⟨hal-02471317⟩