https://hal-cnam.archives-ouvertes.fr/hal-03134518
Gilbert Saporta and Michel Béra (CEDRIC - MSDMA, Conservatoire National des Arts et Métiers, HESAM Université), Editorial, 2005. DOI: 10.1002/asmb.543

In the last 30 years, many algorithms have been developed throughout the AI/Computer Science community (artificial intelligence, neural networks, machine learning) and in Statistics (for instance, Ridge and Principal Component regression, PLS, Generalized Additive Models, etc.) to solve what was called the curse of high dimensionality, where Nature's embodiment of a problem, be it qualitative or quantitative, can only be described by a very large number of attributes and observed through a relatively limited number of samples. Somewhat controversially, this effort has driven the emergence of Machine Learning as a scientific community.
As years went by, powerful methodologies came into existence, and a pure mathematical background, now known as Statistical Learning Theory (SLT), was derived, mainly in the 1990s. It may come as a surprise that many popular SLT developments are very close to rather old work in Multivariate Data Analysis, such as Principal Components, Multidimensional Scaling, kernel-based non-linear attribute geometries, and non-parametric estimation (density, regression). Regularization techniques and control of the VC-dimension are two good examples of this new way of dealing with stochastically ill-posed problems, where the dimensionality of the feature space is too high. Today's SLT efficiency, in terms of computing complexity, allows for huge scalability in database and problem sizes (millions of events and thousands of attributes are now in common usage). SLT is applied successfully in various fields: finance, bioinformatics, marketing, with an obvious and strong intersection with what is called the Data Mining process. This special issue addresses both theoretical and applied aspects of SLT and its interfaces with statistics, classical multidimensional methods and specific applied fields. Its main purpose, hopefully, is to show how many bridges already exist between classical statistics and SLT, a powerful source of cross-fertilization for future research. The issue features nine papers, briefly described below. These papers were first presented at a November 2002 conference, 'Statistical Learning, Theory and Applications', held in the Amphithéâtre Grégoire at the Conservatoire National des Arts et Métiers (CNAM), Paris, which drew more than 250 attendees. The conference was co-sponsored by CNAM, the Société Française de Statistique (SFdS), and KXEN Inc.
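To make the regularization idea concrete, here is a minimal sketch (a toy one-dimensional illustration, not drawn from the issue itself, assuming centered data) of how a ridge penalty shrinks an ordinary least-squares slope, stabilizing the estimate when the data are scarce or noisy:

```python
def ridge_slope(xs, ys, lam):
    """One-dimensional ridge regression on centered data.

    Minimizes sum((y - b*x)^2) + lam * b^2, whose closed-form
    solution is b = sum(x*y) / (sum(x^2) + lam).  With lam = 0
    this is the ordinary least-squares slope; lam > 0 shrinks the
    estimate toward zero, trading a little bias for less variance.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Centered toy data: the unpenalized slope is 2.0.
xs = [-1.0, 0.0, 1.0]
ys = [-2.0, 0.0, 2.0]
print(ridge_slope(xs, ys, 0.0))  # 2.0 (ordinary least squares)
print(ridge_slope(xs, ys, 2.0))  # 1.0 (slope shrunk by the penalty)
```

The same shrinkage principle, generalized to many dimensions and to kernel feature spaces, is what makes the high-dimensional problems discussed above well-posed.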
In Supervised classification and tunnel vision, David Hand analyses the dilemma that users of complex SLT methods for supervised classification face: maintaining interpretability while achieving excellent performance in terms of misclassification rate. He provides an illustration of a new method for resolving this dilemma.

The use of kernel-based geometry has now become a major tool for SLT: in A Tutorial on ν-Support Vector Machines, Pai-Hsuen Chen, Chih-Jen Lin and Bernhard Schölkopf describe the main ideas of statistical learning theory, support vector machines and kernel feature spaces, with particular emphasis on the so-called ν-SVM.

Very large-scale learning problems remain mainly unsolved: for instance, learning to recognize objects in arbitrary scenes using TV broadcasts as a data source. Current learning algorithms do not scale well enough to allow a timely treatment of such massive data. In Learning VERY large data sets, Léon Bottou and Yann Le Cun show that simple on-line