URL: https://hal-cnam.archives-ouvertes.fr/hal-02470100
Title: Spatial partial least squares autoregression: Algorithm and applications
Authors:
Wang, Huiwen (BUAA - Beihang University)
Gu, Jie (BUAA - Beihang University)
Wang, Shanshan (BUAA - Beihang University)
Saporta, Gilbert (CEDRIC - MSDMA - CEDRIC. Méthodes statistiques de data-mining et apprentissage; CEDRIC - Centre d'études et de recherche en informatique et communications; ENSIIE - Ecole Nationale Supérieure d'Informatique pour l'Industrie et l'Entreprise; CNAM - Conservatoire National des Arts et Métiers [CNAM]; HESAM - HESAM Université - Communauté d'universités et d'établissements Hautes écoles Sorbonne Arts et métiers université)
Publisher: HAL CCSD
Year: 2019
Subject: [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST]
Contributor: Rigaux, Philippe
Record dates: 2020-02-07 08:13:46; 2022-09-28 05:51:40; 2020-02-07 08:13:46
Language: en
Document type: Journal articles
DOI: 10.1016/j.chemolab.2018.12.001

Abstract: Partial least squares (PLS) is an efficient multivariate statistical data analysis method and has been shown to be particularly useful for difficult problems in practical applications, e.g., ill-conditioned regression problems where the predictor variables are highly collinear or the sample size is much smaller than the number of covariates. Most work on PLS and its extensions has been developed under the assumption that all observations are mutually independent. However, this independence assumption may be violated in practice, especially when data are collected with a network dependence structure among observations, as in sociology and spatial economics. Relatively few works address PLS with a network dependence structure. Here, we propose a spatial partial least squares autoregression (SPLSAR) to solve this problem, incorporating a spatial autoregressive parameter and a spatial weight matrix into PLS to accommodate the dependence structure between individuals. The proposed method is therefore more flexible: it inherits the advantages of PLS and also handles network dependence. Moreover, we propose an efficient algorithm for parameter estimation in SPLSAR. The finite-sample performance of the proposed method is evaluated via an extensive simulation study and a real data analysis. The results show that SPLSAR achieves better and more robust performance than classical PLS and SAR in terms of parameter estimation precision and out-of-sample prediction accuracy.
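
To make the idea in the abstract concrete, the following is a minimal sketch of one way a spatial autoregressive parameter and a spatial weight matrix could be combined with PLS. It is not the authors' algorithm, which the abstract does not spell out. It assumes the standard SAR form y = rho * W y + X beta + e, a grid search over rho, a PLS1 (NIPALS) fit on the spatially filtered response y - rho * W y, and selection of rho by the concentrated Gaussian log-likelihood. The function names `pls1_coef` and `fit_sar_pls`, the ring-shaped network, and all simulated values are hypothetical and chosen only for illustration.

```python
import numpy as np

def pls1_coef(X, y, n_comp):
    """PLS1 (NIPALS) regression coefficients for centered X and y."""
    Xk = X.copy()
    p = X.shape[1]
    W = np.zeros((p, n_comp))   # weight vectors
    P = np.zeros((p, n_comp))   # X loadings
    q = np.zeros(n_comp)        # y loadings
    for a in range(n_comp):
        w = Xk.T @ y
        w /= np.linalg.norm(w)
        t = Xk @ w                      # score vector
        tt = t @ t
        P[:, a] = Xk.T @ t / tt
        q[a] = y @ t / tt
        W[:, a] = w
        Xk = Xk - np.outer(t, P[:, a])  # deflate X
    return W @ np.linalg.solve(P.T @ W, q)

def fit_sar_pls(X, y, Wn, n_comp, rho_grid):
    """Grid search over rho; PLS1 fit on the filtered response (illustrative)."""
    n = len(y)
    eigs = np.linalg.eigvals(Wn)
    Xc = X - X.mean(axis=0)
    best = None
    for rho in rho_grid:
        z = y - rho * (Wn @ y)          # spatially filtered response
        zc = z - z.mean()
        beta = pls1_coef(Xc, zc, n_comp)
        resid = zc - Xc @ beta
        sigma2 = resid @ resid / n
        # Concentrated Gaussian log-likelihood, up to an additive constant:
        # log|det(I - rho*W)| - (n/2) * log(sigma^2)
        loglik = np.sum(np.log(np.abs(1 - rho * eigs))) - 0.5 * n * np.log(sigma2)
        if best is None or loglik > best[0]:
            best = (loglik, rho, beta)
    return best[1], best[2]

# Simulated example (assumed setup): ring network, row-normalized weights.
rng = np.random.default_rng(0)
n, p = 200, 3
Wn = np.zeros((n, n))
for i in range(n):
    Wn[i, (i - 1) % n] = 0.5
    Wn[i, (i + 1) % n] = 0.5
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])
rho_true = 0.5
noise = 0.5 * rng.normal(size=n)
y = np.linalg.solve(np.eye(n) - rho_true * Wn, X @ beta_true + noise)

rho_hat, beta_hat = fit_sar_pls(X, y, Wn, n_comp=p,
                                rho_grid=np.linspace(-0.9, 0.9, 37))
print(rho_hat, beta_hat)
```

With `n_comp` equal to the number of predictors, the PLS step coincides with ordinary least squares, so this sketch reduces to a grid-search SAR fit; choosing fewer components is where PLS would help with collinear or high-dimensional predictors.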