WISARD official site


Gene-level Association with Population Stratification

Available statistics

Population stratification invalidates the results of rare-variant association analysis, so statistics derived under the assumption of no population stratification cannot be applied.

A set file that lists the variants belonging to each gene must be provided; its format is described on the gene-level analysis page.

Statistical power depends on several factors, including how the set is defined and how homogeneous the effects of the rare variants on the phenotype are. The most efficient statistic differs according to these factors, so several statistics should be considered together.

Summary of available statistics:

Methods that are efficient when the effects of rare variants are homogeneous

CMC test (fast): can be applied to dichotomous phenotypes. May be efficient if the presence or absence of rare alleles (rather than their number) is related to disease risk.
PEDCMC test (fast): an extension of the CMC test for correlated samples. Under population stratification, it may be more efficient than the CMC test with genomic control.
Collapsing test (fast): can be applied to dichotomous and quantitative phenotypes. May be efficient if the number of rare alleles is related to disease risk.
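
The difference between the presence/absence coding and the allele-count coding can be illustrated with a minimal Python sketch. The function names and the toy genotypes are hypothetical and are not part of WISARD; genotypes are assumed to be coded as the number of rare alleles (0, 1, or 2) at each variant of a gene.

```python
# Genotypes are assumed to be coded as the number of rare alleles (0, 1, or 2).

def cmc_indicator(genotypes):
    """Return 1 if the subject carries at least one rare allele in the gene, else 0."""
    return int(any(g > 0 for g in genotypes))


def burden_count(genotypes):
    """Return the total number of rare alleles the subject carries in the gene."""
    return sum(genotypes)


# Hypothetical subjects with three rare variants in one gene.
subjects = {
    "s1": [0, 0, 0],
    "s2": [1, 0, 0],
    "s3": [1, 1, 2],
}

for name, geno in subjects.items():
    print(name, cmc_indicator(geno), burden_count(geno))
```

Subjects s2 and s3 receive the same presence/absence code but different allele counts, which is why the two codings can differ in power depending on how rare alleles relate to disease risk.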

Method that is less sensitive to the definition of rare variants

FamVT test (moderate): can be applied to quantitative phenotypes. May be efficient if rarer variants have stronger effects on disease.

Method that is efficient when rare variants with both positive and negative effects on disease are grouped into a single set

SKAT test (fast): can be applied to dichotomous and quantitative phenotypes. May be efficient if rare variants with positive and negative effects on the phenotype are grouped into a set.

Methods that combine the collapsing test and the SKAT test, and are robust to heterogeneity in the effects of rare variants

SKAT-o test (moderate): can be applied to dichotomous and quantitative phenotypes. May be efficient if rare variants with positive and negative effects on the phenotype are grouped into a set.
FARVAT (fast): an extension of MQLS to SKAT-o that is useful for case-control designs. It can be applied to dichotomous and quantitative phenotypes.

CMC test

The Combined Multivariate and Collapsing (CMC) test was originally proposed for case-control designs in the absence of population stratification. Under population stratification, results from the CMC test are not valid, and the genomic control approach should be applied. Fisher's exact method for the CMC test cannot be used under population stratification.
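
As an illustration of the genomic control idea, the sketch below estimates an inflation factor from a collection of test statistics and rescales them. It is a generic sketch, not WISARD's implementation: it assumes each statistic follows a 1-df chi-square distribution under the null hypothesis, and the function names are hypothetical.

```python
import math
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Genomic-control lambda: observed median over the null median, floored at 1."""
    return max(1.0, median(chi2_stats) / CHI2_1DF_MEDIAN)


def chi2_1df_pvalue(x):
    """Upper-tail p-value of a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(x / 2.0))


def genomic_control(chi2_stats):
    """Return lambda and the p-values of the statistics after dividing by lambda."""
    lam = inflation_factor(chi2_stats)
    return lam, [chi2_1df_pvalue(x / lam) for x in chi2_stats]
```

Flooring the factor at 1 is a common convention so that statistics are never inflated by the correction; whether WISARD applies the same convention is not stated here.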

Example code

Calculating the CMC test with the --genetest option, adjusting for population stratification with genomic control

CMC test
C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genoctrl --ibs --genetest --out res_collapsing


PEDCMC test

The PEDCMC test was proposed by Zhu and Xiong (Am J Hum Genet 2012). It is an extension of the CMC test for correlated samples and can be applied to dichotomous phenotypes. Covariate effects cannot be adjusted for. It is robust against population stratification and, under population stratification, may be more efficient than the CMC test with genomic control adjustment.

Example code

Calculating the PEDCMC test with the --genetest and --pedcmc options under population stratification

Analysis with PEDCMC
C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --pedcmc --out res_pedcmc


Collapsing-based test

The collapsing-based test was proposed by Morris and Zeggini (Genet Epi 2010) and can be applied to dichotomous and quantitative phenotypes. Under population stratification, PC scores should be calculated from the genetic relationship matrix and included as covariates in the logistic regression for dichotomous phenotypes. For quantitative phenotypes, a linear mixed model should be applied with the variance-covariance matrix parameterized by the IBS matrix.

Example code

Dichotomous phenotypes

Collapsing test
C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --ibs --genetest --out res_collapsing

Quantitative phenotypes

FamVT test

The VT test was proposed by Price et al (Am J Hum Genet 2010). The idea was applied in Scoreseq (Lin and Tang, Am J Hum Genet 2011) to consider various thresholds in gene-level analysis under population stratification. It is a score test and can be applied to quantitative phenotypes. Phenotypes are assumed to be normally distributed.

Final p-values for the VT test are calculated with a numerical algorithm. If the number of variants is too large, p-values are calculated by Monte Carlo simulation, and the number of iterations should be chosen according to the significance level. For instance, for a significance level of 0.05, we suggest at least 1/0.05 * 10 = 200 iterations. Note that the maximum number of iterations is 2^32 - 1.
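
The rule of thumb above (at least ten times the reciprocal of the significance level, up to the stated maximum) can be written as a small helper. This is only a sketch; the function name and the `factor` parameter are hypothetical and are not WISARD options.

```python
import math

MAX_ITERATIONS = 2**32 - 1  # upper limit stated in the text


def suggested_iterations(alpha, factor=10):
    """Smallest iteration count of at least factor / alpha, capped at the limit."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    # Round first so floating-point noise does not push the ceiling up by one.
    needed = math.ceil(round(factor / alpha, 6))
    return min(MAX_ITERATIONS, needed)


print(suggested_iterations(0.05))
```

Under the same rule, 10,000 iterations correspond to a significance level of 0.001, so stricter gene-level thresholds call for proportionally more iterations.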

Example code

Calculating FamVT

Calculating p-values with FamVT test
C:\Users\WISARD> wisard --ped test_miss0.ped --pheno test_miss0_phen.txt --pname height --genetest --ibs --famvt --set test_gene.txt --nperm 10000 --out res_vt_10Kperm

NOTE!

To estimate IBS properly, more than 10,000 common (MAF > 5%) variants are recommended!
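
A quick way to check this recommendation before running the analysis is to count the variants whose minor allele frequency exceeds 5%. The sketch below assumes genotypes are already loaded as lists coded 0/1/2 with `None` for missing calls; the function names are hypothetical, and reading the PED/BED files is not shown.

```python
def minor_allele_frequency(genotypes):
    """MAF of one variant from genotypes coded 0/1/2; None marks a missing call."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def enough_common_variants(variants, maf_cutoff=0.05, minimum=10_000):
    """Count variants with MAF above the cutoff and check the count exceeds the minimum."""
    n_common = sum(1 for g in variants if minor_allele_frequency(g) > maf_cutoff)
    return n_common, n_common > minimum


# Toy input: one common variant and one monomorphic variant across four subjects.
toy_variants = [[0, 1, 2, None], [0, 0, 0, 0]]
print(enough_common_variants(toy_variants))
```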


SKAT

The SNP-set/Sequence Kernel Association Test (SKAT) was proposed by Wu et al (Am J Hum Genet 2011) and can be applied to dichotomous and quantitative phenotypes. Under population stratification, a linear mixed model must be used for quantitative phenotypes, with the variance-covariance matrix parameterized by the IBS matrix. For dichotomous phenotypes, PC scores should be calculated from the genetic relationship matrix and included as covariates in the logistic regression.
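
To show why SKAT remains sensitive when variants in a set act in opposite directions, the sketch below compares a burden-type statistic with a SKAT-type statistic on toy data. It is a simplified illustration only: there are no covariates, no mixed model, and no p-value calculation, and the function names and data are hypothetical.

```python
def variant_scores(genotypes, phenotype):
    """Score S_j = sum_i g_ij * (y_i - mean(y)) for each variant j."""
    mean_y = sum(phenotype) / len(phenotype)
    n_variants = len(genotypes[0])
    return [
        sum(row[j] * (y - mean_y) for row, y in zip(genotypes, phenotype))
        for j in range(n_variants)
    ]


def burden_statistic(scores, weights):
    """Square of the weighted sum of scores: opposite signs cancel."""
    return sum(w * s for w, s in zip(weights, scores)) ** 2


def skat_statistic(scores, weights):
    """Weighted sum of squared scores: opposite signs do not cancel."""
    return sum((w * s) ** 2 for w, s in zip(weights, scores))


# Rows are subjects, columns are variants. The carrier of variant 1 has a high
# phenotype and the carrier of variant 2 has a low phenotype.
genotypes = [[1, 0], [0, 1], [0, 0], [0, 0]]
phenotype = [2.0, -2.0, 0.0, 0.0]
weights = [1.0, 1.0]

scores = variant_scores(genotypes, phenotype)
print(burden_statistic(scores, weights), skat_statistic(scores, weights))
```

In this toy data the two variant scores have equal magnitude and opposite sign, so the burden-type statistic is zero while the SKAT-type statistic is not.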

Example code

Dichotomous phenotypes with PC adjustment

SKAT test
C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --indep --pc2cov --pca --skat --out res_skato

Quantitative phenotypes with IBS estimation

SKAT test
C:\Users\WISARD> wisard --bed test_miss0.bed --pheno test_miss0_phen.txt --pname height --set test_gene.txt --genetest --ibs --skat --out res_skato

SKAT-o

The SNP-set/Sequence Kernel Association Test-optimal (SKAT-o) is an extension of SKAT proposed by Lee et al (Biostatistics 2012). It is a mixture of a burden-type test and SKAT, and can be applied to dichotomous and quantitative phenotypes. Under population stratification, a linear mixed model must be used for quantitative phenotypes, with the variance-covariance matrix parameterized by the IBS matrix. For dichotomous phenotypes, PC scores should be calculated from the genetic relationship matrix and included as covariates in the logistic regression.
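
The mixture can be sketched as a family of statistics Q_rho = (1 - rho) * Q_SKAT + rho * Q_burden indexed by a mixing weight rho between 0 and 1. The Python sketch below only forms this family on a grid; the grid values and the function name are assumptions for illustration, and the step that turns the family into a single p-value is omitted.

```python
def skat_o_family(q_skat, q_burden, rhos=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Return {rho: Q_rho} with Q_rho = (1 - rho) * Q_SKAT + rho * Q_burden."""
    for rho in rhos:
        if not 0.0 <= rho <= 1.0:
            raise ValueError(f"rho must lie in [0, 1], got {rho}")
    return {rho: (1.0 - rho) * q_skat + rho * q_burden for rho in rhos}


# rho = 0 recovers the SKAT-type statistic, rho = 1 the burden-type statistic.
print(skat_o_family(q_skat=8.0, q_burden=0.0))
```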

Example code

Dichotomous phenotypes with PC adjustment

SKAT-o test for dichotomous phenotype
C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --indep --pc2cov --pca --skato --out res_skato

NOTE!

SKAT-o test cannot be calculated if a gene contains only a single variant.

Quantitative phenotypes with IBS estimation

SKAT-o test for continuous phenotype
C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --ibs --skato --out res_skato

FARVAT

FARVAT was proposed by Choi et al (2014) and is an extension of MQLS for rare-variant association analysis. It is an optimized gene-level association test for dichotomous traits; under population stratification, the genetic relationship matrix should be incorporated as the genetic correlation matrix. If there are covariate effects or the phenotype is quantitative, FARVAT can be modified, but some loss of power is expected. Its properties are similar to those of SKAT-o.

Example code

Dichotomous phenotype, using GRM

FARVAT for dichotomous phenotype
C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --farvat --prevalence 0.12 --out res_farvat_di

NOTE!

--prevalence is required when running FARVAT without covariate adjustment!

If a gene contains only a single variant, the optimal FARVAT test is not calculated.

NOTE!

To estimate the GRM properly, more than 10,000 common (MAF > 5%) variants are recommended!

Dichotomous phenotype with multi-ethnic population, using GRM

To run FARVAT on a multi-ethnic dataset without covariate adjustment, a prevalence must be assigned to each population; see the example below. Note that every population must be included in the --prevalence assignment.

Dichotomous phenotype FARVAT with multi-ethnic dataset, with population-wise prevalence
C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --farvat --sampvar test_miss0_phen.txt --prevalence ASIA=0.12,EUROPE=0.08,AMERICA=0.15

NOTE!

The column name for population assignment must be POP_COUNT in this case!
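
Because every population in the dataset must appear in the --prevalence assignment, it can help to check the assignment string against the population labels before launching the run. The sketch below is a stand-alone check, not WISARD's own parser; the function names are hypothetical, and the population labels are assumed to have been read from the POP_COUNT column already.

```python
def parse_prevalence(spec):
    """Parse 'ASIA=0.12,EUROPE=0.08' into {'ASIA': 0.12, 'EUROPE': 0.08}."""
    prevalence = {}
    for item in spec.split(","):
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected NAME=VALUE, got {item!r}")
        p = float(value)
        if not 0.0 < p < 1.0:
            raise ValueError(f"prevalence for {name!r} must be between 0 and 1")
        prevalence[name.strip()] = p
    return prevalence


def missing_populations(sample_populations, prevalence):
    """Populations present in the samples but absent from the assignment."""
    return sorted(set(sample_populations) - set(prevalence))


assigned = parse_prevalence("ASIA=0.12,EUROPE=0.08,AMERICA=0.15")
# Hypothetical labels taken from the POP_COUNT column of the sample file.
labels = ["ASIA", "EUROPE", "AMERICA", "ASIA"]
print(missing_populations(labels, assigned))
```

An empty list means every population in the sample file has a prevalence assigned.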

Covariate adjustment for dichotomous phenotypes

FARVAT with adjustment for the covariate effects of age and height
C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --farvat --sampvar test_miss0_phen.txt --cname age,height --out res_farvat_dicov

Quantitative phenotypes, two-step covariate adjustment

Step 1) Residual estimation
C:\Users\WISARD> wisard --bed test_miss0.bed --makeblup --sampvar test_miss0_phen.txt --pname sbp --cname age,height --out test_lmm

Step 2) FARVAT for quantitative phenotypes
C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --sampvar test_miss0_phen.txt --pname sbp --cname age,height --blup test_lmm.SD.blup --est test_lmm.poly.est.res --farvat --out res_farvat_qu


Last modified : 2017-09-09 13:47:20