Charles Bouveyron (Université Paris 1)

Classification of high-dimensional data: challenges, open problems, and some recent solutions

Friday, January 25, 2013, 11:00 - 12:00

Meeting room, Espace Turing


The Fisher-EM algorithm was recently proposed for the
simultaneous visualization and clustering of high-dimensional data. It is
based on a mixture model which fits the data into a latent discriminative
subspace with a low intrinsic dimension. From a practical point of view, the
Fisher-EM algorithm turns out to outperform other subspace clustering methods
in most situations. The convergence of the Fisher-EM algorithm is also studied:
in particular, the algorithm is proved to converge under weak conditions in the
general case. It is also shown that Fisher's criterion can be used as a
stopping criterion for the algorithm to improve the clustering accuracy and
that the Fisher-EM algorithm usually converges faster than both the EM and CEM
algorithms. Finally, a sparse extension of the Fisher-EM algorithm is proposed
by adding an L1 constraint in the F step. In particular, this makes it possible
to select the original variables that are discriminative.
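
As a rough illustration of the Fisher criterion mentioned in the abstract, the
sketch below computes the classical one-dimensional version: the ratio of
between-class to within-class scatter of the data projected onto a single
direction. This is only an assumption-laden toy example, not the criterion or
F step actually used by the Fisher-EM algorithm (which works on a
multi-dimensional subspace); the function name and the sample data are
hypothetical.

```python
def fisher_criterion(points, labels, direction):
    """Classical 1-D Fisher criterion (illustrative sketch only).

    Returns the between-class scatter divided by the within-class scatter
    of `points` projected onto `direction`.
    """
    # Project each point onto the candidate direction (dot product).
    projected = [sum(x * u for x, u in zip(p, direction)) for p in points]
    overall_mean = sum(projected) / len(projected)

    # Group the projected values by cluster label.
    groups = {}
    for value, label in zip(projected, labels):
        groups.setdefault(label, []).append(value)

    between = 0.0
    within = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        # Scatter of the cluster mean around the overall mean.
        between += len(values) * (mean - overall_mean) ** 2
        # Scatter of the cluster members around their own mean.
        within += sum((v - mean) ** 2 for v in values)

    # Perfectly compact clusters give an unbounded ratio.
    if within == 0.0:
        return float("inf")
    return between / within


# Hypothetical toy data: two clusters separated along the first axis only.
points = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0), (5.0, 1.0)]
labels = [0, 0, 1, 1]

score_axis_1 = fisher_criterion(points, labels, (1.0, 0.0))
score_axis_2 = fisher_criterion(points, labels, (0.0, 1.0))
```

On this toy data the first axis separates the two clusters and gets a higher
score than the second axis, which carries no cluster information. A
discriminative subspace in the sense of the abstract is one whose directions
score highly on a criterion of this kind.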