
Adjusting Relatedness for Family Data in Collapsing Test of Rare Variants

Qunyuan Zhang, Doyoung Chung, Ingrid Borecki, Michael A. Province. Division of Statistical Genomics, Washington University School of Medicine, St. Louis, Missouri, USA. IGES, Sept. 2011, Heidelberg.


Presentation Transcript


  1. Adjusting Relatedness for Family Data in Collapsing Test of Rare Variants. Qunyuan Zhang, Doyoung Chung, Ingrid Borecki, Michael A. Province. Division of Statistical Genomics, Washington University School of Medicine, St. Louis, Missouri, USA. IGES, Sept. 2011, Heidelberg. Contact: Qunyuan Zhang, qunyuan@wustl.edu

  2. Introduction Advances in sequencing technologies have made it easier to identify rare variants (RVs). Family data, which are potentially enriched for RVs within pedigrees, may be a rich source for detecting associations between RVs and complex human traits. Most RV testing methods developed in recent years, however, are data-driven, permutation-based collapsing methods. They cannot be applied to family data as they stand, because permuting subjects directly ignores and destroys the family structure.

  3. Purpose To deal with relatedness in family data, we propose a mixed model-based procedure that incorporates family information into collapsing analysis within a permutation test, denoted MMPT (Mixed Model-based Permutation Test).

  4. Statistical Model To deal with family structure, we generalize the collapsing test as a weighted sum score test based on a linear mixed model:

     $$Y = \alpha + \beta \sum_{i=1}^{m} w_i g_i + Z\gamma + \varepsilon \qquad (1)$$

     Y is the observed trait, α the intercept, β the collective effect coefficient, m the number of RVs in the genetic unit of interest (usually a gene), w_i the weight of variant i, g_i the number (0, 1 or 2) of minor alleles of variant i, and ε the residual. The Σ w_i g_i part of the model is the weighted sum score of multiple variants. Z is the design matrix corresponding to γ, and γ follows a multivariate normal distribution N(0, G). Here G is the variance-covariance matrix of γ, which can be decomposed as G = 2σ²K, where K is the kinship matrix and σ² is the additive polygenic variance.
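Integrating out the random effect γ gives the marginal form of model (1), which makes explicit how the kinship matrix enters the trait covariance. This is a short sketch rather than part of the original slide: it assumes independent residuals, ε ~ N(0, σ_e² I), and the symbols S, σ_e² and V are introduced here only for illustration.

```latex
S = \sum_{i=1}^{m} w_i g_i, \qquad
Y \sim N\!\left(\alpha \mathbf{1} + \beta S,\; V\right), \qquad
V = Z G Z^{\top} + \sigma_e^{2} I
  = 2\sigma^{2}\, Z K Z^{\top} + \sigma_e^{2} I .
```

With one observation per subject, Z is the identity matrix and V reduces to 2σ²K + σ_e² I, so related subjects have correlated traits even when β = 0.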

  5. Weighted Sum Scores In terms of weighting, most existing collapsing methods can be viewed as special cases of model (1). For example, Morgenthaler and Thilly's CAST is equivalent to setting w_i = 1 for all RVs; Li and Leal's CMC sets w_i = 1 for all RVs but caps the sum at 1; Madsen and Browning's WSS calculates w_i from the allele frequency in controls; Han and Pan's aSum test recodes genotypes (equivalent to choosing w_i = 1 or -1) according to a pre-defined p-value cutoff; and Zhang et al.'s PWST and SPWST define w_i as a rescaled left-tailed p-value.
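The weighting schemes above can be illustrated with a small sketch. It covers only the three simplest cases (CAST, CMC, WSS) as the slide describes them. The function names, the `is_control` mask and the exact smoothing used for the WSS weights are assumptions made for illustration, not taken from the presentation.

```python
import numpy as np

def cast_score(G):
    """CAST-style score as described above: w_i = 1 for every rare variant."""
    return G.sum(axis=1)

def cmc_score(G):
    """CMC-style score: w_i = 1, with the sum capped at 1 (carrier indicator)."""
    return np.minimum(G.sum(axis=1), 1)

def wss_score(G, is_control):
    """WSS-style score: weights derived from the allele frequency in controls.

    The smoothed frequency and the 1/sqrt(n q (1 - q)) weight are an assumed
    form of the Madsen-Browning weighting; check the paper before relying on it.
    """
    n = G.shape[0]
    n_u = is_control.sum()                     # number of controls
    m_u = G[is_control].sum(axis=0)            # minor-allele counts in controls
    q = (m_u + 1.0) / (2.0 * n_u + 2.0)        # smoothed control frequency
    w = 1.0 / np.sqrt(n * q * (1.0 - q))       # rarer variants get larger weights
    return G @ w

# Rows are subjects, columns are rare variants, entries are minor-allele counts.
G = np.array([[0, 0], [1, 1], [2, 0], [0, 1]])
is_control = np.array([True, True, False, False])   # hypothetical control mask
```

Each function returns one score per subject, which plays the role of the Σ w_i g_i term in model (1).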

  6. MMPT: Mixed Model-based Permutation Test Since WSS, aSum, PWST and SPWST are data-driven, permutation-based tests, we apply model (1) to them by permuting the weighted sum score part while keeping the subject IDs of the remaining components fixed. [Figure: model (1), with the weighted sum score term labeled "Permuted" and the remaining components labeled "Non-permuted, subject IDs fixed".]
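The permutation scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes one observation per subject (Z is the identity), variance components that are already known (in practice they would be estimated), and a score vector `s` that is computed once. For data-driven weights, the weights would presumably be recomputed inside each permutation. The names `gls_stat`, `mmpt_pvalue`, `sigma2_g` and `sigma2_e` are hypothetical.

```python
import numpy as np

def gls_stat(y, s, V_inv):
    """Squared GLS z-statistic for s in y = a + b*s + error, Var(error) = V."""
    X = np.column_stack([np.ones(len(s)), s])
    A = X.T @ V_inv @ X
    beta = np.linalg.solve(A, X.T @ V_inv @ y)
    var_b = np.linalg.inv(A)[1, 1]             # variance of the slope estimate
    return beta[1] ** 2 / var_b

def mmpt_pvalue(y, s, K, sigma2_g, sigma2_e, n_perm=999, seed=0):
    """Permutation p-value with the kinship covariance held fixed."""
    rng = np.random.default_rng(seed)
    n = len(y)
    # Assumed covariance: V = 2*sigma_g^2*K + sigma_e^2*I (Z taken as identity).
    V = 2.0 * sigma2_g * K + sigma2_e * np.eye(n)
    V_inv = np.linalg.inv(V)
    t_obs = gls_stat(y, s, V_inv)
    hits = 0
    for _ in range(n_perm):
        s_perm = rng.permutation(s)            # only the score is shuffled
        # y and V keep their original subject order, so the family
        # structure encoded in K is never broken by the permutation.
        if gls_stat(y, s_perm, V_inv) >= t_obs:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

The design point is that the trait and the covariance matrix stay attached to the original subject IDs in every permutation. Only the collapsed genotype score moves between subjects.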

  7. Data We used the 200 replicates of 697 subjects from 8 extended families simulated for Genetic Analysis Workshop (GAW) 17 [Almasy et al., 2011], with the quantitative trait Q2 as the target trait. For each gene, variants with minor allele frequency (MAF) below 0.01 were collapsed into a single variable using each weighting method (CMC, WSS, aSum, PWST and SPWST). The kinship matrix K was calculated from the pedigree data. The Genetic Analysis Workshop (GAW) 17 is supported by NIH Grant R01 GM031575. Preparation of the GAW 17 simulated data set was supported in part by NIH R01 MH059490 and used sequencing data from the 1000 Genomes Project (www.1000genomes.org).
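For readers unfamiliar with how a kinship matrix is obtained from a pedigree, the standard recursive definition can be sketched as below. The presentation does not say which software was used, so this is only an illustration. The function name and the `(id, father, mother)` input format are assumptions, and parents must be listed before their children.

```python
import numpy as np

def kinship_matrix(pedigree):
    """Kinship coefficients from a pedigree.

    pedigree: list of (id, father_id, mother_id) tuples, with parents listed
    before their children and None for the parents of founders.
    """
    index = {pid: k for k, (pid, _, _) in enumerate(pedigree)}
    n = len(pedigree)
    K = np.zeros((n, n))
    for j, (_, fa, mo) in enumerate(pedigree):
        f = index.get(fa)                      # None for founders
        m = index.get(mo)
        for i in range(j):
            # Kinship with an earlier subject is the average of that
            # subject's kinship with each parent of j.
            kf = K[i, f] if f is not None else 0.0
            km = K[i, m] if m is not None else 0.0
            K[i, j] = K[j, i] = 0.5 * (kf + km)
        # Self-kinship is (1 + inbreeding) / 2, where the inbreeding
        # coefficient equals the kinship between the two parents.
        inbreeding = K[f, m] if (f is not None and m is not None) else 0.0
        K[j, j] = 0.5 * (1.0 + inbreeding)
    return K

# A nuclear family: two founders and two full siblings.
ped = [("F", None, None), ("M", None, None), ("C1", "F", "M"), ("C2", "F", "M")]
K = kinship_matrix(ped)
```

For this family the parent-child and sibling kinship coefficients are 0.25 and each self-kinship is 0.5, so 2K has ones on the diagonal, matching the G = 2σ²K decomposition in model (1).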

  8. Results (1): Q-Q Plots of -log10(P) under the Null [Figure: two Q-Q panels. Panel 1: CMC non-permutation test, ignoring family structure; type-1 error is inflated. Panel 2: CMC non-permutation test, modeling family structure via the mixed model; the inflation is corrected.]
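The coordinates of a Q-Q plot of -log10(P) can be computed as in the sketch below. It assumes that under the null the p-values are uniform on (0, 1) and uses the (rank - 0.5)/n plotting positions, which is one common convention and not necessarily the one used for the slides. The function name is hypothetical.

```python
import numpy as np

def qq_points(pvalues):
    """Return (expected, observed) -log10(p) pairs for a Q-Q plot."""
    p = np.sort(np.asarray(pvalues, dtype=float))    # smallest p-value first
    n = len(p)
    # Expected quantiles of a uniform(0, 1) sample, using (rank - 0.5) / n.
    expected = -np.log10((np.arange(1, n + 1) - 0.5) / n)
    observed = -np.log10(p)
    return expected, observed

expected, observed = qq_points([0.5, 0.05, 0.005])
```

Points lying above the diagonal (observed larger than expected) indicate more small p-values than a valid null test should produce, which is the type-1 error inflation the panels refer to.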

  9. Results (2): Q-Q Plots under the Null Permutation test, ignoring family structure: inflation of type-1 error. [Figure: Q-Q panels for WSS, aSum, PWST and SPWST.]

  10. Results (3): Q-Q Plots under the Null Mixed model-based permutation test (MMPT), modeling family structure: inflation corrected. [Figure: Q-Q panels for WSS, aSum, PWST and SPWST.]

  11. Conclusions Ignoring relatedness between subjects in family data may substantially inflate type-1 error in collapsing tests of rare variants. Directly modeling kinship with a mixed model can correct the inflation of non-data-driven collapsing tests (e.g. CMC). Directly applying data-driven, permutation-based methods (e.g. WSS, aSum, PWST and SPWST) to family data may also substantially inflate type-1 error. The inflation of these methods can be corrected by the proposed MMPT method, which incorporates kinship information into the permutation test.

  12. Main References
     Almasy LA, Dyer TD, Peralta JM, Kent JW Jr, Charlesworth JC, Curran JE, Blangero J. 2011. Genetic Analysis Workshop 17 mini-exome simulation. BMC Proc 5(Suppl 8):
     Han F, Pan W. 2010. A data-adaptive sum test for disease association with multiple common or rare variants. Hum Hered 70(1):42-54.
     Li B, Leal SM. 2008. Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. Am J Hum Genet 83(3):311-21.
     Madsen BE, Browning SR. 2009. A groupwise association test for rare mutations using a weighted sum statistic. PLoS Genet 5(2):e1000384.
     Morgenthaler S, Thilly WG. 2007. A strategy to discover genes that carry multi-allelic or mono-allelic risk for common diseases: a cohort allelic sums test (CAST). Mutat Res 615(1-2):28-56.
     Zhang Q, Irvin MR, Arnett DK, Province MA, Borecki I. 2011. Genet Epidemiol, doi: 10.1002/gepi.20618
