* Van Hanh Nguyen (Laboratoire Statistique et Génome, Université d'Evry et Université Paris-Sud 11) - MAP5-UMR 8145


Existence of an asymptotically efficient estimator of the proportion of true null hypotheses in a multiple testing setting

Friday, March 9, 2012, 9:45 - 11:00

Meeting room, Espace Turing

One important problem in the multiple testing context is the estimation of the proportion theta of true null hypotheses.

This proportion appears in a semiparametric mixture model with two components: a uniform distribution on the interval [0,1] and a nonparametric density f. A large number of estimators of this proportion exist under different identifiability assumptions, but their rates of convergence and asymptotic efficiency have only been partly studied.
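For concreteness, the mixture model described above can be written as follows. The symbols g (the common density of the observed p-values) and X_1, ..., X_n (the p-values themselves) are notation assumed here for illustration; only theta and f come from the abstract.

```latex
% Assumed notation: X_1, ..., X_n are i.i.d. p-values with density g on [0,1].
X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} g,
\qquad
g(x) = \theta \, \mathbf{1}_{[0,1]}(x) + (1 - \theta) \, f(x),
\qquad x \in [0,1], \quad \theta \in [0,1].
```

The uniform component corresponds to p-values under true null hypotheses, and f is the unknown density of p-values under the alternatives.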

We shall focus here on two different categories of identifiability assumptions previously introduced in the literature: in the first case, f vanishes on a set with positive Lebesgue measure (a subcase is obtained when this set is an interval), and in the second case, the set of points where f vanishes has Lebesgue measure zero. We first improve the consistency results for the estimator proposed by Celisse & Robin (2010) by establishing its almost sure convergence as well as its root-n-consistency, under the assumption that f vanishes on an interval. To our knowledge, this is the first result proving that the parametric rate of convergence may be achieved by a consistent estimator of the proportion theta in this semiparametric setup. We also compute a lower bound on the local asymptotic minimax (LAM) quadratic risk of any estimator in the first case. Then, we shall discuss the existence of asymptotically efficient estimators of the proportion theta in the sense of a convolution theorem. In the first case, we conjecture that no root-n-consistent estimator is efficient. In the second case, we prove that the efficient information for estimating theta is zero. Hence, in this case, the LAM quadratic risk is not finite and there is no regular estimator of the proportion theta.
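To illustrate why the parametric rate is plausible when f vanishes on an interval, here is a minimal simulation sketch. It is not the Celisse & Robin (2010) estimator; it uses a simple threshold estimator and assumes, for illustration only, that f vanishes on a known interval (lam_star, 1]. Under that assumption the probability that a p-value exceeds lam is theta * (1 - lam) for any lam >= lam_star, so the rescaled empirical frequency is an unbiased, root-n-consistent estimate of theta. The function names, the Beta-shaped alternative density, and the parameter values are all hypothetical choices.

```python
import random


def simulate_p_values(n, theta, lam_star, rng):
    """Draw n p-values from theta * U[0,1] + (1 - theta) * f.

    Assumption for this sketch: f is a Beta(0.5, 2) density rescaled to
    [0, lam_star], so f vanishes on the interval (lam_star, 1].
    """
    p_values = []
    for _ in range(n):
        if rng.random() < theta:
            # True null hypothesis: p-value is uniform on [0, 1].
            p_values.append(rng.random())
        else:
            # Alternative: p-value concentrated near 0, never above lam_star.
            p_values.append(lam_star * rng.betavariate(0.5, 2.0))
    return p_values


def threshold_estimator(p_values, lam):
    """Estimate theta as #{p_i > lam} / (n * (1 - lam)).

    Unbiased only when f vanishes on (lam, 1]; otherwise it overestimates.
    """
    count_above = sum(1 for p in p_values if p > lam)
    return count_above / (len(p_values) * (1.0 - lam))


rng = random.Random(2012)  # fixed seed so the run is reproducible
p_values = simulate_p_values(n=200_000, theta=0.7, lam_star=0.5, rng=rng)
theta_hat = threshold_estimator(p_values, lam=0.5)
print(round(theta_hat, 3))
```

In practice the interval on which f vanishes is unknown, which is why data-driven procedures such as the one studied in the talk are needed; the sketch only shows the idealized case where the threshold is given.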