Mahendra Mariadassou (INRA Jouy-en-Josas, MaIAGE laboratory)

Variational Inference for Probabilistic Poisson PCA: application to metagenomics

Friday, March 9, 2018, 9:30 - 10:30

Council Room (Salle du conseil), Espace Turing

Many application domains, such as ecology or genomics, have to deal with multivariate non-Gaussian observations. A typical example is the joint observation of the abundances of a set of species across a series of sites, with the aim of understanding how these species co-vary. The Gaussian setting provides a canonical way to model such dependencies, but it does not generally apply to such data.

We consider here the multivariate exponential family framework, for which we introduce a generic hierarchical model with multivariate Gaussian latent variables. This model can be seen as an extension of probabilistic Principal Component Analysis (pPCA) to non-Gaussian settings and enables us to account for covariates and offsets at little additional cost.

Unlike in the purely Gaussian setting, the likelihood is generally not tractable in this framework. We resort instead to a variational approximation for parameter inference and solve the corresponding optimization problem using gradient descent. The formal expression of the gradient depends on the exponential family at hand and does not always have an analytical form. However, the coordinates of the gradient depend only on one-dimensional integrals of smooth functions, which can be evaluated efficiently using Gauss-Hermite quadrature.
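
As an illustration of the quadrature step, here is a minimal sketch of how a one-dimensional Gaussian expectation E[f(Z)], with Z ~ N(mean, std^2), can be approximated with Gauss-Hermite nodes. The function name `gauss_hermite_expectation`, the number of nodes, and the test function are assumptions made for this example and are not taken from the talk.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss


def gauss_hermite_expectation(f, mean, std, n_nodes=20):
    """Approximate E[f(Z)] for Z ~ N(mean, std**2) (hypothetical helper)."""
    # Nodes and weights for the weight function exp(-x**2).
    nodes, weights = hermgauss(n_nodes)
    # The change of variable z = mean + sqrt(2) * std * x maps the Gaussian
    # density onto the exp(-x**2) weight, up to a factor 1 / sqrt(pi).
    z = mean + np.sqrt(2.0) * std * nodes
    return float(np.sum(weights * f(z)) / np.sqrt(np.pi))


# Sanity check on a case with a known answer: E[exp(Z)] = exp(mean + std**2 / 2).
approx = gauss_hermite_expectation(np.exp, mean=0.3, std=0.5)
exact = float(np.exp(0.3 + 0.5**2 / 2))
```

For smooth integrands such as this one, a few dozen nodes are typically enough for the approximation to agree with the exact value to near machine precision.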

We then focus on the case of the Poisson-lognormal model, for which both the variational approximation and its gradient have closed-form expressions, in the context of microbial ecology. We illustrate the importance of accounting for offsets and covariates on two datasets. Finally, we sketch some promising extensions of the framework, most notably to the inference of co-occurrence networks.
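
To see why the Poisson-lognormal case admits closed forms, consider a single count y with y | Z ~ Poisson(exp(offset + Z)) and a Gaussian variational distribution Z ~ N(m, s^2). The expected log-likelihood then only involves E[exp(Z)] = exp(m + s^2/2), the lognormal mean. The sketch below compares that closed form with a Gauss-Hermite approximation. It is a simplified one-dimensional illustration without covariates; the function names and numerical values are assumptions for this example, not the parametrization used in the talk.

```python
import numpy as np
from math import lgamma
from numpy.polynomial.hermite import hermgauss


def poisson_term_closed_form(y, offset, m, s):
    """E_q[log p(y | Z)] with y | Z ~ Poisson(exp(offset + Z)), Z ~ N(m, s**2)."""
    # The only non-linear term, E[exp(Z)], equals exp(m + s**2 / 2).
    return float(y * (offset + m) - np.exp(offset + m + 0.5 * s**2) - lgamma(y + 1))


def poisson_term_quadrature(y, offset, m, s, n_nodes=30):
    """Same expectation, approximated by Gauss-Hermite quadrature."""
    nodes, weights = hermgauss(n_nodes)
    z = m + np.sqrt(2.0) * s * nodes
    log_lik = y * (offset + z) - np.exp(offset + z) - lgamma(y + 1)
    return float(np.sum(weights * log_lik) / np.sqrt(np.pi))


# Hypothetical values chosen only to compare the two computations.
closed = poisson_term_closed_form(y=7, offset=1.0, m=0.4, s=0.6)
numeric = poisson_term_quadrature(y=7, offset=1.0, m=0.4, s=0.6)
```

The two values should agree up to quadrature error, which is what makes quadrature unnecessary in this particular model.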

Chiquet, Julien, Mahendra Mariadassou, and Stéphane Robin. 2017. “Variational Inference for Probabilistic Poisson PCA.”