WISARD[wɪzərd]
Workbench for Integrated Superfast Association study with Related Data

Family-based Gene-level Association Tests

Available statistics

Family-based samples are correlated, so gene-level analysis with family-based samples should account for the phenotypic and genetic correlations between family members. In the presence of population stratification, family-based gene-level association analysis is the same as gene-level association analysis with population-based samples; see the gene-level analysis under population stratification page for the statistics.

A set file listing the variants that belong to the same gene must be provided. The format of the set file is described on the gene-level analysis page.

Statistical power is affected by several factors, including the definition of the set and the homogeneity of the effects of the rare variants on the phenotype. The most efficient statistic depends on these factors, so several statistics should be considered at the same time.

Summary for available statistics:

  1. Methods which are efficient when the effects of rare variants are homogeneous
    • PEDCMC test (fast): an extension of the CMC test to family-based samples; can be applied to dichotomous phenotypes.
    • Collapsing test (fast): can be applied to quantitative phenotypes. May be efficient if the number of rare alleles is related to the disease risk.
  2. Method which is less sensitive to the definition of the group
    • FamVT test (moderate): can be applied to quantitative phenotypes. May be efficient if rarer variants have stronger effects on disease.
  3. Methods which are efficient if rare variants with both positive and negative effects on disease are grouped into a single set
    • SKAT test (fast): can be applied to quantitative phenotypes. May be efficient if rare variants with positive and negative effects on the phenotype are grouped as a set.
    • FB-SKAT (moderate): an extension of SKAT, introduced by Ionita-Laza et al., 2014, that takes family structure into account in the SKAT-based analysis.
    • rv-TDT (slow): an extension of the Transmission Disequilibrium Test (TDT) to a set of rare variants, introduced by He et al., 2014.
  4. Methods which combine the collapsing test and the SKAT test, and are robust to heterogeneity in the effects of rare variants
    • SKAT-o test (moderate): can be applied to quantitative phenotypes. May be efficient if rare variants with positive and negative effects on the phenotype are grouped as a set.
    • FARVAT (fast): an extension of MQLS to SKAT-o, useful for case-control designs. It can be modified for quantitative phenotypes and for scenarios where there are covariate effects on dichotomous phenotypes.
    • MFARVAT (slow): an extension of FARVAT for joint analysis of multiple phenotypes and genotypes.


PEDCMC

The PEDCMC test was suggested by Zhu and Xiong (Am J Hum Genet 2012). It is an extension of the CMC test to family-based samples and can be applied to dichotomous phenotypes. Covariate effects cannot be adjusted for.

Example code

  • Calculating the PEDCMC test with the --genetest and --pedcmc options, with familial kinship
  • Analysis with PEDCMC, with familial kinship:
    C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --kinship --pedcmc --out res_pedcmc
  • Calculating the PEDCMC test with the --genetest and --pedcmc options, with genetic relationship matrix
  • Analysis with PEDCMC, with genetic relationship matrix:
    C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --pedcmc --out res_pedcmc
    NOTE!
    This type of analysis requires a large number of common (MAF > 5%) markers (say, over 10,000)!


Collapsing-based test

The collapsing-based test was suggested by Morris and Zeggini (Genet Epi 2010) and can be applied to family-based gene-level tests for quantitative phenotypes. A linear mixed model is applied, with the variance-covariance matrix parameterized by the kinship coefficient matrix.
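To illustrate the collapsing idea, the following Python sketch reduces the rare variants of one gene to a single per-individual score (the number of rare alleles carried). It is only an illustration and not WISARD's implementation: the function names, the 0/1/2 allele-count coding, the absence of missing genotypes, and the frequency threshold are all assumptions made for this example. Fitting the linear mixed model on the resulting score is not shown.

```python
from typing import List


def allele_frequencies(genotypes: List[List[int]]) -> List[float]:
    """Frequency of the counted allele at each variant.

    genotypes[i][j] is the allele count (0, 1 or 2) of individual i at
    variant j. Missing genotypes are not handled in this sketch.
    """
    n_individuals = len(genotypes)
    n_variants = len(genotypes[0])
    return [
        sum(row[j] for row in genotypes) / (2 * n_individuals)
        for j in range(n_variants)
    ]


def collapse_rare_alleles(genotypes: List[List[int]],
                          freq_threshold: float = 0.01) -> List[int]:
    """Per-individual count of alleles at variants rarer than the threshold.

    The threshold is a hypothetical choice for illustration only.
    """
    freqs = allele_frequencies(genotypes)
    rare = [j for j, f in enumerate(freqs) if f < freq_threshold]
    return [sum(row[j] for j in rare) for row in genotypes]


# Toy data: 6 individuals, 3 variants of one gene.
toy_genotypes = [
    [1, 0, 1],
    [0, 0, 2],
    [0, 1, 1],
    [0, 0, 0],
    [0, 0, 1],
    [0, 0, 1],
]
# A loose threshold is used because the toy sample is tiny.
scores = collapse_rare_alleles(toy_genotypes, freq_threshold=0.1)
print(scores)
```

In this toy data only the first two variants fall below the threshold, so each individual's score counts alleles at those two variants only.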

Example code

  • Quantitative phenotypes
  • Collapsing test:
    C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --indep --genetest --out res_collapsing


FamVT

The VT test was suggested by Price et al. (Am J Hum Genet 2010). This idea was applied to Scoreseq (Lin and Tang, Am J Hum Genet 2011) to consider the various thresholds for family-based gene-level analysis. It is a score test and can be applied to quantitative phenotypes. Phenotypes are assumed to be normally distributed.

Final p-values for the VT test are calculated with a numerical algorithm. If the number of variants is too large, p-values are calculated by Monte Carlo simulation, and the number of iterations should be chosen according to the significance level. For instance, if you are interested in the 0.05 significance level, we suggest at least (1/0.05) × 10 = 200 iterations. Note that the maximum number of iterations is limited to 2^32 - 1.
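The rule of thumb above can be written as a small helper. This is a sketch in Python, not part of WISARD: the function name and the `factor` parameter are hypothetical, and the only facts taken from the text are the (1/significance level) × 10 suggestion and the 2^32 - 1 upper limit.

```python
import math

# Upper limit on the number of iterations, as stated in the text.
MAX_ITERATIONS = 2**32 - 1


def suggested_iterations(alpha: float, factor: int = 10) -> int:
    """Suggested minimum number of iterations: (1 / alpha) * factor, capped.

    alpha is the significance level of interest, e.g. 0.05.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    # Round before taking the ceiling so that floating-point noise
    # (e.g. 10000000.000000004) does not add a spurious extra iteration.
    needed = math.ceil(round(factor / alpha, 6))
    return min(needed, MAX_ITERATIONS)


print(suggested_iterations(0.05))
print(suggested_iterations(0.01))
```

For very small significance levels the suggested count exceeds the limit, and the helper returns 2^32 - 1 instead.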

  • Example code with familial kinship matrix estimation
  • Analysis with FamVT with the default number of permutations:
    C:\Users\WISARD> wisard --bed test_miss0.bed --sampvar test_miss0_phen.txt --pname height --set test_gene.txt --genetest --kinship --famvt --out res_famvt
    Analysis with FamVT with more permutations:
    C:\Users\WISARD> wisard --bed test_miss0.bed --sampvar test_miss0_phen.txt --pname height --set test_gene.txt --genetest --kinship --famvt --nperm 10000 --out res_famvt


SKAT

The SNP-set/Sequence Kernel Association Test (SKAT) was suggested by Wu et al. (Am J Hum Genet 2011) and can be applied to family-based gene-level tests with quantitative phenotypes. A linear mixed model must be used for quantitative phenotypes, with the variance-covariance matrix parameterized by the kinship coefficient matrix. SKAT cannot be used for family-based gene-level tests with dichotomous phenotypes.
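As a sketch of what a variance-covariance matrix parameterized by the kinship coefficient matrix commonly looks like in a linear mixed model, consider the standard polygenic formulation below. The notation is an assumption made for illustration and is not taken from the WISARD documentation: y is the phenotype vector, X the covariate matrix, Φ the kinship coefficient matrix, and σ_g², σ_e² the genetic and residual variance components.

```latex
\begin{aligned}
y &= X\beta + g + \varepsilon, \\
g &\sim N\!\left(0,\; 2\sigma_g^2 \Phi\right), \qquad
\varepsilon \sim N\!\left(0,\; \sigma_e^2 I\right), \\
\operatorname{Var}(y) &= 2\sigma_g^2 \Phi + \sigma_e^2 I .
\end{aligned}
```

Under this formulation, relatives with a larger kinship coefficient have more strongly correlated phenotypes, which is how the family structure enters the test.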

  • Example code
  • Analysis with SKAT for family data:
    C:\Users\WISARD> wisard --bed test_miss0.bed --sampvar test_miss0_phen.txt --pname height --set test_gene.txt --genetest --kinship --skat --out res_skat


SKAT-o

The SNP-set/Sequence Kernel Association Test-optimal (SKAT-o) is an extension of SKAT and was suggested by Lee et al. (Biostatistics 2012). SKAT-o can be applied to family-based gene-level tests with quantitative phenotypes. A linear mixed model must be used for quantitative phenotypes, with the variance-covariance matrix parameterized by the kinship coefficient matrix. SKAT-o cannot be used for family-based gene-level tests with dichotomous phenotypes.

  • Example code
  • Analysis with SKAT-o for family data:
    C:\Users\WISARD> wisard --bed test_miss0.bed --sampvar test_miss0_phen.txt --pname height --set test_gene.txt --genetest --kinship --skato --out res_skato


FARVAT

FARVAT was proposed by Choi et al. (2014) and is an extension of MQLS for rare variant association analysis. It is an optimized family-based gene-level association test for dichotomous traits, and the kinship coefficient matrix should be incorporated as the genetic correlation matrix. If there are covariate effects or the phenotypes are quantitative, FARVAT can be modified, but some power loss is expected. Its properties are similar to those of SKAT-o.

  • Example code
  • Analysis with FARVAT:
    C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --farvat --out res_farvat
  • Dichotomous phenotypes, using familial kinship
  • FARVAT:
    C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --farvat --kinship --out res_farvat_di
    If there is only a single variant within a gene, the optimal test of FARVAT is not calculated.
  • Covariate adjustment for dichotomous phenotypes
  • FARVAT with adjustment for the covariate effect of height:
    C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --farvat --kinship --sampvar test_miss0_phen.txt --cname height --out res_farvat_dicov
  • Quantitative phenotypes, covariate adjustment in two steps
  • Step 1) Residual estimation:
    C:\Users\WISARD> wisard --bed test_miss0.bed --makeblup --sampvar test_miss0_phen.txt --pname sbp --cname age,height --out test_lmm
    Step 2) FARVAT for quantitative phenotypes:
    C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --pheno test_miss0_phen.txt --pname sbp --cname age,height --genetest --blup test_lmm.SD.blup --est test_lmm.poly.est.res --farvat --out res_farvat_qu


MFARVAT

MFARVAT is an extension of FARVAT. If there are covariate effects or the phenotypes are quantitative, MFARVAT needs to be modified in the same way as FARVAT. Depending on the assumption made about the relationship between phenotypes, MFARVAT provides two types of statistics: homogeneous (--mfhom) or heterogeneous (--mfhet).

  • Dichotomous phenotypes with the heterogeneous assumption, using familial kinship
  • MFARVAT:
    C:\Users\WISARD> wisard --bed test_miss0.bed --pheno test_miss0_phen.txt --pname t2d,hypertens --set test_gene.txt --genetest --mfhet --kinship --out res_mfarvat_di
    If there is only a single variant within a gene, the optimal test of MFARVAT is not calculated.
  • Covariate adjustment for dichotomous phenotypes with the same assumption
  • MFARVAT with adjustment for the covariate effects of age and height:
    C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --mfhet --kinship --sampvar test_miss0_phen.txt --pname t2d,hypertens --cname age,height --out res_mfarvat_dicov
  • Quantitative phenotypes with the homogeneous assumption, covariate adjustment in two steps
  • Step 1) Residual estimation:
    C:\Users\WISARD> wisard --bed test_miss0.bed --makeblup --sampvar test_miss0_phen.txt --pname sbp,dbp --cname age,height --out res_lmm_multi
    Step 2) MFARVAT for quantitative phenotypes:
    C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --sampvar test_miss0_phen.txt --pname sbp,dbp --cname age,height --genetest --blup res_lmm_multi.SD.blup --est res_lmm_multi.poly.est.res --mfhom --out res_mfarvat_qu
    NOTE!
    Under the heterogeneous assumption (--mfhet), the computational complexity differs significantly from that under the homogeneous assumption (--mfhom) as the number of phenotypes increases!


rv-TDT and FB-SKAT

rv-TDT is a rare variant extension of the Transmission Disequilibrium Test (TDT). Since this test requires a permutation scheme, it may take moderately longer than the other tests in WISARD. Note that because it is an extension of the TDT, it is applicable only to dichotomous phenotypes and family-based datasets.

FB-SKAT is a class of family-based association tests that includes as particular cases the burden test and the variance-component test (SKAT). Furthermore, these family-based tests correspond directly to existing population-based tests. Because of a limitation of our implementation, FB-SKAT is available only together with rv-TDT.

NOTE!
This analysis is supported from version 1.1.0.6
  • Default command
  • Perform rv-TDT with the default number of permutations (1,000):
    C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --rvtdt --out res_rvtdt
  • Smaller number of permutations (takes much less time but gives less accurate results)
  • Perform rv-TDT with 100 permutations:
    C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --rvtdt --out res_rvtdt --nperm 100
  • Perform FB-SKAT at the same time as rv-TDT
  • Perform rv-TDT and FB-SKAT with the default number of permutations (1,000):
    C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --rvtdt --fbskat --out res_rvtdt_and_fbskat
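The number of permutations limits how small a permutation p-value can be, which is one reason results from few permutations are coarse. The Python sketch below uses a common empirical p-value estimator, (b + 1) / (N + 1), where b is the number of permuted statistics at least as large as the observed one and N is the number of permutations. Whether WISARD uses exactly this estimator is an assumption, and the function names are hypothetical.

```python
from typing import Sequence


def empirical_p_value(observed: float, permuted: Sequence[float]) -> float:
    """Permutation p-value using the (b + 1) / (N + 1) estimator."""
    # b: permuted statistics at least as extreme as the observed one.
    exceed = sum(1 for stat in permuted if stat >= observed)
    return (exceed + 1) / (len(permuted) + 1)


def smallest_attainable_p(n_perm: int) -> float:
    """Smallest p-value this estimator can return with n_perm permutations."""
    return 1 / (n_perm + 1)


# Toy example: one of four permuted statistics exceeds the observed value.
print(empirical_p_value(3.0, [1.0, 2.0, 3.5, 0.5]))
# Resolution of the p-value for 100 versus 1,000 permutations.
print(smallest_attainable_p(100), smallest_attainable_p(1000))
```

With this estimator, 100 permutations cannot produce a p-value below 1/101, while 1,000 permutations reach 1/1001, so stricter significance levels need more permutations.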




Last modified : 2017-09-08 08:52:25