Servane Gey (MAP5, université Paris Descartes)

Influence measures for classification trees

Friday, April 26, 2013, 9:30 - 10:30

Meeting room, Espace Turing


Measuring the influence of individual observations on the results of classification trees is useful for detecting atypical individuals. To define this influence, we propose criteria that measure the sensitivity of the Classification And Regression Trees (CART) analysis. The proposed criteria are based on predictions and use jackknife trees, that is, trees built on the sample with one observation removed. The analysis is extended to the sequences of pruned subtrees produced by CART, which yields specific notions of influence. Using the framework of influence functions, distributional results are derived.
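As a rough illustration of a prediction-based jackknife measure, the sketch below refits a tree with each observation left out in turn and counts how many predicted labels change relative to the tree built on the full sample. It is a minimal sketch under stated assumptions, not necessarily one of the criteria presented in the talk: the tree is reduced to a single split on one numeric feature chosen by Gini impurity, and the names `fit_stump`, `predict` and `jackknife_influence` are hypothetical.

```python
from collections import Counter


def majority(labels):
    """Most frequent label; ties are broken by taking the smallest label."""
    counts = Counter(labels)
    top = max(counts.values())
    return min(label for label, count in counts.items() if count == top)


def gini(labels):
    """Gini impurity of a list of labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def fit_stump(xs, ys):
    """Fit a one-split tree (assumption: a stand-in for a full CART tree).

    Returns (threshold, left_label, right_label).
    """
    values = sorted(set(xs))
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if best is None or score < best[0]:
            best = (score, t, majority(left), majority(right))
    if best is None:
        # All feature values are identical: fall back to a constant predictor.
        label = majority(ys)
        return (float("inf"), label, label)
    return best[1:]


def predict(stump, x):
    threshold, left_label, right_label = stump
    return left_label if x <= threshold else right_label


def jackknife_influence(xs, ys):
    """For each i, count the predictions that change when observation i is removed."""
    reference = fit_stump(xs, ys)
    reference_pred = [predict(reference, x) for x in xs]
    influence = []
    for i in range(len(xs)):
        # Jackknife tree: refit on the sample without observation i.
        tree_i = fit_stump(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        changed = sum(predict(tree_i, x) != p for x, p in zip(xs, reference_pred))
        influence.append(changed)
    return influence


if __name__ == "__main__":
    # Toy data (made up for illustration): one feature, two classes.
    print(jackknife_influence([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1]))
```

Observations with a large count would be candidates for closer inspection as atypical individuals; with a full CART implementation, the same loop could in principle be repeated for each subtree in the pruning sequence.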

A real dataset relating the administrative classification of cities surrounding Paris, France, to the characteristics of their tax revenue distribution is analyzed using the new influence-based tools.