Vittorio Perduca (MAP5, Université Paris Descartes)

Phenotype simulation under a disease model and applications to power analysis of GWAs

Friday, February 17, 2012, 9:30 - 10:45

Meeting room, Espace Turing

Genome-wide association studies (GWAs) are widely used to investigate
the connection between genotypic and phenotypic variation with respect
to a given trait (e.g. a given disease). Assessing the statistical power
of such studies is crucial. Power is empirically estimated by simulating
realistic samples under a disease model H1. For this purpose, the gold
standard consists of simulating the genotypes given the observed
phenotypes (case or control), thus ensuring that the total number of
cases stays unchanged. This method is implemented in the reference
software Hapgen.
I will present an alternative approach for simulating samples under H1
that does not require generating new genotypes for each simulation but
only phenotypes. This method is based on a backward sampling algorithm
and makes it possible to simulate new phenotypic datasets under the
constraints that a) the phenotypes are in accordance with the
corresponding observed genotypes under the chosen model H1; b) the total
number of cases is the same as in the observed dataset.
I will show that our backward sampling algorithm outperforms other
standard approaches such as a simple rejection algorithm and MCMC.
Moreover, our algorithm is faster than Hapgen. Finally, I will discuss
the results of a power analysis on a fictitious GWA study we conducted
on real data from the 1000 Genomes Project.
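
To make constraints a) and b) concrete, here is a minimal sketch of one way
to draw phenotypes conditionally on a fixed number of cases. It is an
illustration under stated assumptions, not necessarily the algorithm
presented in the talk: it assumes that, under H1, individuals are
independent and that individual i is a case with a known probability
probs[i] given its observed genotype. The names forward_table and
sample_phenotypes are hypothetical. A forward pass tabulates the
distribution of the running number of cases, and a backward pass then
samples each phenotype given the number of cases still to be placed.

```python
import random


def forward_table(probs):
    """Return F with F[i][k] = P(Y_1 + ... + Y_i = k).

    The Y_j are assumed to be independent Bernoulli(probs[j-1]) variables.
    """
    n = len(probs)
    F = [[0.0] * (n + 1) for _ in range(n + 1)]
    F[0][0] = 1.0
    for i, p in enumerate(probs, start=1):
        for k in range(i + 1):
            # Individual i is a control (k unchanged) or a case (k - 1 -> k).
            F[i][k] = F[i - 1][k] * (1.0 - p)
            if k > 0:
                F[i][k] += F[i - 1][k - 1] * p
    return F


def sample_phenotypes(probs, n_cases, rng=random):
    """Draw 0/1 phenotypes from the model, conditioned on sum == n_cases."""
    n = len(probs)
    if not 0 <= n_cases <= n:
        raise ValueError("n_cases must lie between 0 and len(probs)")
    F = forward_table(probs)
    if F[n][n_cases] == 0.0:
        raise ValueError("the requested number of cases has probability 0")
    y = [0] * n
    k = n_cases  # cases still to be placed among individuals 1..i
    for i in range(n, 0, -1):
        if k == 0:
            break  # all remaining individuals are controls
        # P(Y_i = 1 | Y_1 + ... + Y_i = k)
        p_case = probs[i - 1] * F[i - 1][k - 1] / F[i][k]
        if rng.random() < p_case:
            y[i - 1] = 1
            k -= 1
    return y


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical per-individual case probabilities under H1.
    example_probs = [0.1, 0.4, 0.35, 0.8, 0.05, 0.6]
    print(sample_phenotypes(example_probs, n_cases=3))
```

Every draw contains exactly n_cases cases, so no sample is rejected. This
sketch stores a table of size O(n^2) and works with raw probabilities; for
cohorts of realistic size one would presumably work in log space or with
rescaling to avoid underflow.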

This is joint work with Gregory Nuel (MAP5, Université Paris Descartes),
Christine Sinoquet and Raphael Mourad (Université de Nantes).