New optimization models for optimal classification trees - Archive ouverte HAL Access content directly
Preprints, Working Papers, ... (Working Paper) Year :

New optimization models for optimal classification trees

Nouveaux modèles d'optimisation pour la construction optimale d'arbres de classification

, (1) , (1)
1
Zacharie Alès
Valentine Huré
Amélie Lambert

Abstract

Interpretability is a growing concept in Machine Learning. Decision-making algorithms are more and more used in healthcare, finance or other high stakes contexts. Therefore, the need for algorithms whose decisions are understandable is of the utmost importance. Intrinsically interpretable classifiers such as decision trees are often seen as less accurate than black box models such as neural networks. For decision trees, state-of-the-art methods are recursive heuristics (e.g. CART) that may fail to find underlying characteristics in datasets. Recently, linear formulations were introduced to model the problem of the construction of the best decision tree for a given dataset. Notably, a MIO formulation, introduced by Bertsimas and al., has shown better accuracy than CART. However this model does not scale up to datasets with more than 1000 data points. Our work focuses on improvements of MIOs that speed up their resolution in order to handle larger problems. We present a quadratic formulation of the MIP devised by Bertsimas and al. as well as its linearization and another that extends a flow-formulation (from binary dataset to real-value dataset). We prove that our new formulations have stronger continuous relaxation than the MIP introduced by Bertsimas and al.. Finally, our experiments show that they have a significantly smaller resolution time than the MIP of Bertsimas and al. while maintaining or improving performances on test sets.
Fichier principal
Vignette du fichier
New_optimization_models_for_optimal_classification_trees (1).pdf (1.16 Mo) Télécharger le fichier
Origin : Files produced by the author(s)

Dates and versions

hal-03865931 , version 1 (22-11-2022)

Identifiers

  • HAL Id : hal-03865931 , version 1

Cite

Zacharie Alès, Valentine Huré, Amélie Lambert. New optimization models for optimal classification trees. 2022. ⟨hal-03865931⟩
0 View
0 Download

Share

Gmail Facebook Twitter LinkedIn More