Abstract: When applied to massive data, current implementations of clusterwise regression methods either have prohibitive computational costs or produce models that are difficult to interpret. We introduce a new implementation, Micro-Batch Clusterwise Partial Least Squares (mb-CW-PLS), which consists of two main improvements: (a) a scalable and distributed computational framework and (b) a micro-batch clusterwise regression using buckets (micro-clusters). With these improvements, we are able to produce interpretable regression models in the presence of multicollinearity within a reasonable time frame.
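To make the idea of clusterwise regression on buckets concrete, here is a minimal single-machine sketch. It is not the mb-CW-PLS implementation: it assumes the buckets (micro-clusters) are already formed, swaps PLS for ordinary least squares on one predictor, and leaves out the distributed framework. The names `fit_line`, `sse` and `clusterwise_on_buckets`, and the toy data, are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]   # one observation (x, y)
Line = Tuple[float, float]    # local model (intercept, slope)


def fit_line(points: List[Point]) -> Line:
    """Ordinary least squares for y = a + b*x (assumed stand-in for a PLS fit)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y - slope * mean_x, slope


def sse(points: List[Point], line: Line) -> float:
    """Sum of squared residuals of the points under a given line."""
    a, b = line
    return sum((y - (a + b * x)) ** 2 for x, y in points)


def clusterwise_on_buckets(buckets: List[List[Point]], k: int, max_iter: int = 50):
    """Alternate between fitting one model per cluster and moving whole buckets."""
    labels = [i % k for i in range(len(buckets))]  # arbitrary deterministic start
    models: List[Line] = [(0.0, 0.0)] * k
    history: List[float] = []                      # total loss after each fit
    for _ in range(max_iter):
        # Fit step: one regression per cluster, on all points of its buckets.
        for c in range(k):
            members = [p for b, lab in zip(buckets, labels) if lab == c for p in b]
            if members:                            # an empty cluster keeps its old model
                models[c] = fit_line(members)
        history.append(sum(sse(b, models[lab]) for b, lab in zip(buckets, labels)))
        # Assignment step: each bucket moves, as a unit, to its best-fitting model.
        new_labels = [min(range(k), key=lambda c: sse(b, models[c])) for b in buckets]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, models, history


# Toy data: regime A follows y = 2x + 1, regime B follows y = 2x + 11.
buckets = [
    [(0, 1), (1, 3), (2, 5)],       # A
    [(3, 7), (4, 9), (5, 11)],      # A
    [(0, 11), (1, 13), (2, 15)],    # B
    [(6, 13), (7, 15), (8, 17)],    # A
    [(3, 17), (4, 19), (5, 21)],    # B
]
labels, models, history = clusterwise_on_buckets(buckets, k=2)
print(labels, models, history)
```

Reassigning buckets rather than individual observations is what keeps the assignment step cheap in this sketch: its cost grows with the number of buckets, not the number of rows. On this toy data the loop separates the two regimes after one reassignment, but in general this kind of alternating scheme can stop at a local optimum that depends on the starting labels.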
https://hal-cnam.archives-ouvertes.fr/hal-02471601
Contributor: Ndeye Niang
Submitted on: Sunday, February 9, 2020 - 5:55:21 PM
Last modification on: Monday, February 21, 2022 - 3:38:18 PM
Long-term archiving on: Sunday, May 10, 2020 - 1:24:06 PM