Family-based samples are correlated, so gene-level analysis of family-based samples should account for the phenotypic and genetic correlations between family members.
In the presence of population stratification, family-based gene-level association analysis is the same as gene-level association analysis with population-based
samples; see the gene-level analysis under population stratification page for the statistics.
A set file listing the variants that belong to the same gene should be provided; the format
of the set file is described on the gene-level analysis page.
Statistical power is affected by several factors, including the definition of the set and
the homogeneity of the effects of the rare variants on the phenotype.
The most efficient statistic depends on the characteristics of these factors, so
several statistics should be considered at the same time.
Summary of available statistics:
Methods that are efficient when the effects of rare variants are homogeneous:
  PEDCMC test (fast): an extension of the CMC test to family-based samples; can be applied to dichotomous phenotypes.
  Collapsing test (fast): can be applied to quantitative phenotypes. May be efficient if the number of rare alleles is related to disease risk.
Method that is less sensitive to the definition of the group:
  FamVT test (moderate): can be applied to quantitative phenotypes. May be efficient if rarer variants have stronger effects on disease.
Methods that are efficient if rare variants with both positive and negative effects on disease are grouped into a single set:
  SKAT test (fast): can be applied to quantitative phenotypes. May be efficient if rare variants with positive and negative effects on the phenotype are grouped as a set.
  FB-SKAT (moderate): an extension of SKAT, introduced by Ionita-Laza et al., 2014, that takes family structure into account.
  rv-TDT (slow): an extension of the Transmission Disequilibrium Test (TDT) to a set of rare variants, introduced by He et al., 2014.
Methods that combine the collapsing test and the SKAT test, and are robust to heterogeneity in the effects of rare variants:
  SKAT-o test (moderate): can be applied to quantitative phenotypes. May be efficient if rare variants with positive and negative effects on the phenotype are grouped as a set.
  FARVAT (fast): an extension of MQLS to SKAT-o, useful for case-control designs. It can be modified for quantitative phenotypes and for scenarios where there are covariate effects on dichotomous phenotypes.
  MFARVAT (slow): an extension of FARVAT for joint analysis of multiple phenotypes and genotypes.
The PEDCMC test was proposed by Zhu and Xiong (Am J Hum Genet 2012). It is an extension of the CMC test to family-based samples and
can be applied to dichotomous phenotypes. Covariate effects cannot be adjusted for.
Example code
Calculating the PEDCMC test with the --genetest and --pedcmc options, using familial kinship
Calculating the PEDCMC test with the --genetest and --pedcmc options, using a genetic relationship matrix
NOTE!
This type of analysis requires a large number of common (MAF > 5%) markers (say, over 10,000)!
The collapsing-based test was proposed by Morris and Zeggini (Genet Epidemiol 2010) and can be applied to family-based gene-level tests for quantitative phenotypes.
A linear mixed model is applied, with the variance-covariance matrix parameterized by the kinship coefficient matrix.
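To make the idea of a collapsing test under a kinship-parameterized covariance concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not WISARD's implementation: the function names `nuclear_family_kinship` and `burden_gls` are hypothetical, the variance components are treated as known rather than estimated, and the genotype and phenotype values are made up for the example.

```python
import numpy as np

def nuclear_family_kinship(n_children):
    """Kinship matrix for two unrelated parents followed by their children."""
    n = 2 + n_children
    phi = np.full((n, n), 0.25)   # parent-child and sib-sib kinship
    phi[0, 1] = phi[1, 0] = 0.0   # the two parents are unrelated
    np.fill_diagonal(phi, 0.5)    # self-kinship without inbreeding
    return phi

def burden_gls(y, genotypes, phi, sigma_g2, sigma_e2):
    """GLS effect estimate and Wald z-statistic for a rare-allele burden score.

    y         : phenotype vector, shape (n,)
    genotypes : rare-allele counts, shape (n, m) for m variants in the gene
    phi       : kinship coefficient matrix, shape (n, n)
    sigma_g2, sigma_e2 : genetic and residual variances (assumed known here)
    """
    burden = genotypes.sum(axis=1)                     # collapse variants into one score
    X = np.column_stack([np.ones_like(y), burden])     # intercept + burden
    V = sigma_g2 * 2.0 * phi + sigma_e2 * np.eye(len(y))
    Vinv_X = np.linalg.solve(V, X)                     # V^{-1} X
    XtVinvX = X.T @ Vinv_X                             # X' V^{-1} X
    beta = np.linalg.solve(XtVinvX, Vinv_X.T @ y)      # GLS estimate
    cov_beta = np.linalg.inv(XtVinvX)
    z = beta[1] / np.sqrt(cov_beta[1, 1])
    return beta[1], z

# Two independent nuclear families with two children each (illustrative data).
phi = np.kron(np.eye(2), nuclear_family_kinship(2))
genotypes = np.array([[0, 0, 0], [1, 0, 0], [1, 0, 0], [0, 0, 0],
                      [0, 0, 0], [0, 1, 1], [0, 1, 0], [0, 0, 1]])
y = np.array([0.1, 1.2, 0.9, -0.2, 0.0, 2.1, 1.0, 1.3])
effect, z = burden_gls(y, genotypes, phi, sigma_g2=0.5, sigma_e2=0.5)
print(effect, z)
```

In practice the variance components would be estimated from the data, and the kinship matrix would come from the pedigree or from genome-wide markers.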
The VT test was proposed by Price et al. (Am J Hum Genet 2010).
This idea was applied to Scoreseq (Lin and Tang, Am J Hum Genet 2011) to consider various thresholds for family-based gene-level analysis.
It is a score test and can be applied to quantitative phenotypes.
Phenotypes are assumed to be normally distributed.
Final p-values for the VT test are calculated with a numerical algorithm.
If the number of variants is too large, p-values are calculated by
Monte Carlo simulation, and the number of iterations should be chosen according to the significance level.
For instance, for a significance level of 0.05, we suggest at least 10 * (1/0.05) = 200 iterations.
Note that the maximum number of iterations is limited to 2^32 - 1.
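The rule of thumb above can be written as a small helper. This is only a sketch of the arithmetic in the text; the function name `suggested_iterations` and the `factor` parameter are hypothetical and do not correspond to any WISARD option.

```python
import math

MAX_ITERATIONS = 2**32 - 1  # upper limit stated in the text

def suggested_iterations(alpha, factor=10):
    """Minimum suggested number of Monte Carlo iterations for level alpha.

    Implements the rule "at least factor * (1 / alpha)", capped at the
    maximum number of iterations.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    # The small tolerance keeps floating-point noise from rounding up by one.
    needed = math.ceil(factor / alpha - 1e-9)
    return min(needed, MAX_ITERATIONS)

print(suggested_iterations(0.05))
print(suggested_iterations(2.5e-6))  # e.g. a gene-level Bonferroni threshold
```

Stricter significance levels, such as genome-wide gene-level thresholds, quickly require millions of iterations, which is why the cap matters.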
Example code with familial kinship matrix estimation
The SNP-set/Sequence Kernel Association Test (SKAT) was proposed by Wu et al. (Am J Hum Genet 2011) and can be applied to family-based gene-level tests with quantitative phenotypes.
A linear mixed model must be used for quantitative phenotypes, with the variance-covariance matrix parameterized
by the kinship coefficient matrix. It cannot be used for family-based gene-level tests with dichotomous phenotypes.
The SNP-set/Sequence Kernel Association Test-optimal (SKAT-o) is an extension of SKAT and was proposed by Lee et al. (Biostatistics 2012).
SKAT-o can be applied to family-based gene-level tests with quantitative phenotypes.
A linear mixed model must be used for quantitative phenotypes, with the variance-covariance matrix parameterized
by the kinship coefficient matrix. It cannot be used for family-based gene-level tests with dichotomous phenotypes.
FARVAT was proposed by Choi et al. (2014) and is an extension of MQLS to rare variant association analysis.
FARVAT is an optimized family-based gene-level association test for dichotomous traits, and
the kinship coefficient matrix should be incorporated as the genetic correlation matrix.
If there are covariate effects or the phenotypes are quantitative, FARVAT
can be modified, but some power loss is expected. Its properties are similar to those of SKAT-o.
Example code
Dichotomous phenotypes, using familial kinship
If there is a single variant within a gene, an optimal test of FARVAT is not calculated.
Covariate adjustment for dichotomous phenotypes
Quantitative phenotypes, covariate adjustment with two-step
MFARVAT is an extension of FARVAT. If there are covariate effects or the phenotypes are quantitative,
MFARVAT needs to be modified in the same way as FARVAT.
Depending on the assumed relationship between phenotypes, MFARVAT provides two types of statistics:
homogeneous (--mfhom) or heterogeneous (--mfhet).
Dichotomous phenotypes with the heterogeneous assumption, using familial kinship
If there is a single variant within a gene, an optimal test of MFARVAT is not calculated.
Covariate adjustment for dichotomous phenotypes with the same assumption
Quantitative phenotypes with homogeneous assumption, covariate adjustment with two-step
NOTE!
With the heterogeneous assumption (--mfhet), the computational complexity differs substantially from that under the homogeneous assumption (--mfhom) as the number of phenotypes increases!
rv-TDT is a rare-variant extension of the Transmission Disequilibrium Test (TDT).
Because this test requires a permutation scheme, it may take moderately longer than the other tests in WISARD.
Note that because this test is an extension of the TDT, it is applicable only to dichotomous phenotypes and family-based datasets.
FB-SKAT is a class of family-based association tests that includes as particular cases the burden test and the variance-component test (SKAT).
Furthermore, these family-based tests correspond directly to existing population-based tests.
Because of a limitation of our implementation, FB-SKAT is provided only together with rv-TDT.
NOTE!
This analysis is supported from version 1.1.0.6
Default command
Smaller number of permutations (takes much less time but reports less accurate results)