Marie Verbanck (Université de Paris, EA 7537 – Biostatistique, Traitement et Modélisation des données biologiques)

The landscape of horizontal pleiotropy in human genetic variation: HOPS – a per-genetic variant quantitative score of horizontal pleiotropy and MR-PRESSO – a horizontal pleiotropy detection method in causal inference

Friday, March 6, 2020, 9:30 - 10:30 am

Council room (Salle du conseil), Espace Turing


Horizontal pleiotropy, where one genetic variant has independent effects on multiple traits, is important for our understanding of the genetic architecture of human phenotypes. However, the extent of horizontal pleiotropy in human genetic variation is currently unknown.

First, we aimed to develop a score that quantifies the amount of horizontal pleiotropy for a given genetic variant using summary statistics (estimated effect sizes and associated standard errors for a large number of traits). We propose the HOrizontal Pleiotropy Score (HOPS). Variant Z-scores are gathered in a matrix (genetic variants x traits), and a Mahalanobis whitening procedure is applied to decorrelate the traits. For each variant, two quantities are calculated and tested: i) the total magnitude of effect (the sum of squared Z-scores, tested using a chi-squared distribution with degrees of freedom equal to the total number of traits); ii) the number of independent traits affected (the number of traits with |Z-score| greater than 2, tested using a binomial distribution with probability 0.05 and number of trials equal to the total number of traits). Two additional corrections can be applied in the HOPS pipeline to account for linkage disequilibrium (a variant's Z-score depends on neighboring variants) and polygenicity (a high number of variants causal to a trait), both of which can induce serendipitous horizontal pleiotropy. HOPS was validated using simulations and applied to 1,564 medical phenotypes measured in 337,119 individuals from the UK Biobank. HOPS detected a significant excess of horizontal pleiotropy. This signal was pervasive throughout the human genome and across a wide range of traits, but was especially prominent in regions of high linkage disequilibrium and among highly polygenic traits. We identified thousands of variants with extreme horizontal pleiotropy, a majority of which had never been reported in any published study.
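The two per-variant statistics described above can be illustrated with a minimal sketch. It is not the HOPS implementation: the function names `whiten` and `hops_like_scores` are hypothetical, the whitening matrix is estimated from the sample covariance of the Z-scores (an assumption), and the linkage disequilibrium and polygenicity corrections are left out.

```python
import numpy as np
from scipy import stats


def whiten(Z):
    """Mahalanobis whitening: multiply by the inverse square root of the
    trait covariance so that the whitened trait columns are decorrelated.
    Z has shape (variants, traits). Hypothetical helper, not from HOPS."""
    C = np.cov(Z, rowvar=False)                 # trait-by-trait covariance
    evals, evecs = np.linalg.eigh(C)            # C is symmetric
    evals = np.clip(evals, 1e-8, None)          # guard against tiny eigenvalues
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return Z @ W


def hops_like_scores(Z, threshold=2.0, p_exceed=0.05):
    """Per-variant magnitude and trait-count statistics with their p-values."""
    Zw = whiten(Z)
    n_traits = Zw.shape[1]
    # i) total magnitude of effect: sum of squared whitened Z-scores,
    #    compared with a chi-squared distribution (df = number of traits)
    magnitude = np.sum(Zw ** 2, axis=1)
    p_magnitude = stats.chi2.sf(magnitude, df=n_traits)
    # ii) number of traits with |Z| above the threshold, compared with
    #     a binomial distribution: P(X >= n_hits)
    n_hits = np.sum(np.abs(Zw) > threshold, axis=1)
    p_count = stats.binom.sf(n_hits - 1, n_traits, p_exceed)
    return magnitude, p_magnitude, n_hits, p_count


# Simulated example: correlated null Z-scores plus one planted pleiotropic variant
rng = np.random.default_rng(0)
mixing = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]])
Z = rng.standard_normal((2000, 3)) @ mixing
Z[0] = [9.0, -9.0, 9.0]
magnitude, p_mag, n_hits, p_count = hops_like_scores(Z)
print(p_mag[0], n_hits[0], p_count[0])
```

In this toy setting the planted variant should receive a very small magnitude p-value, while most null variants should not. The data are simulated purely for illustration.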

Second, we focused on Mendelian randomization (MR), a method to infer the causal effect of a risk factor on a disease using genetic variants as instrumental variables. MR using summary statistics relies on a weighted linear regression of the disease on the risk factor. For the genetic variants associated with the risk factor, the effect sizes on both the risk factor and the disease are fed directly into the regression model, weighted according to the standard errors of the variant effects on the disease. The slope of the regression provides an estimate of the causal effect of the risk factor on the disease. A crucial assumption of MR is the absence of horizontal pleiotropy, meaning that the genetic variants must have no effect on the disease other than the effect mediated by the risk factor tested for causality. Violation of the 'no horizontal pleiotropy' assumption can cause severe bias in MR. However, the extent and impact of horizontal pleiotropy in MR are unknown. That is why we developed the Mendelian randomization pleiotropy residual sum and outlier (MR-PRESSO) test. MR-PRESSO assumes that horizontal pleiotropy can produce outlier genetic variants in the MR regression, and therefore provides a test to detect horizontal pleiotropic outliers. We validated MR-PRESSO using simulations and applied it, along with several other MR tests, to 4,250 pairs of complex traits and diseases. We found that horizontal pleiotropy (i) was detectable in over 48% of significant causal relationships in MR; (ii) introduced distortions in the causal estimates in MR that ranged on average from –131% to 201%; (iii) induced false-positive causal relationships in up to 10% of relationships.
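The regression and the outlier idea described above can be sketched as follows. This is an illustration only, not the MR-PRESSO test: it assumes inverse-variance weights (1 / standard error squared) and a regression through the origin, and it simply ranks variants by a leave-one-out squared residual instead of comparing residual sums with a simulated null distribution. The function names are hypothetical.

```python
import numpy as np


def ivw_slope(beta_x, beta_y, se_y):
    """Weighted regression through the origin of the variant effects on the
    disease (beta_y) on the variant effects on the risk factor (beta_x).
    Assumption: weights are the inverse variances 1 / se_y**2."""
    w = 1.0 / se_y ** 2
    return np.sum(w * beta_x * beta_y) / np.sum(w * beta_x ** 2)


def leave_one_out_residuals(beta_x, beta_y, se_y):
    """Squared standardized residual of each variant relative to the slope
    fitted without that variant. Large values suggest outliers."""
    n = len(beta_x)
    contrib = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        slope_i = ivw_slope(beta_x[keep], beta_y[keep], se_y[keep])
        contrib[i] = ((beta_y[i] - slope_i * beta_x[i]) / se_y[i]) ** 2
    return contrib


# Simulated example: 30 variants, true causal slope 0.5, one pleiotropic variant
rng = np.random.default_rng(0)
beta_x = rng.uniform(0.05, 0.2, size=30)
se_y = np.full(30, 0.01)
beta_y = 0.5 * beta_x + rng.normal(0.0, se_y)
beta_y[3] += 0.2                      # extra effect not mediated by the risk factor

contrib = leave_one_out_residuals(beta_x, beta_y, se_y)
suspect = int(np.argmax(contrib))
keep = np.arange(30) != suspect
print("slope with all variants:", ivw_slope(beta_x, beta_y, se_y))
print("suspected outlier index:", suspect)
print("slope without it:", ivw_slope(beta_x[keep], beta_y[keep], se_y[keep]))
```

Comparing the slope with and without the flagged variant gives a rough sense of how a single pleiotropic instrument can distort the causal estimate.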

These findings suggest that horizontal pleiotropy is pervasive in human genetic variation. This has significant implications for our understanding of the genetic architecture of complex traits and diseases, and for causal inference between complex traits and diseases using Mendelian randomization.

Part of this work was done while at the Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, USA.