Epistasis analysis is computationally intensive, and few statistics are available.
The most efficient statistic depends on the ascertainment condition, the type of phenotype (dichotomous/quantitative),
the presence of covariates and the presence or absence of population stratification.
The meaning of gene x gene interaction is not yet clearly understood, and statistical gene x gene interaction is
often confused with biological gene x gene interaction.
Statistical interaction is defined as a departure from additivity in a linear model on a selected
measurement scale. To infer biological interactions, the statistically modeled interaction and main effect
terms should not be interpreted separately (Wang et al. Nat Genet 2010).
Summary of available statistics:
BOOST: useful for case-control designs. Its results are approximately similar to those of logistic regression, and covariate effects cannot be adjusted. Both statistical and biological interaction can be detected.
MDR: useful for case-control designs. Covariate effects cannot be adjusted. It tests the marginal genetic effect and the gene x gene statistical interaction effect, and is useful for detecting biological interaction.
Generalized MDR (GMDR): an extension of MDR to quantitative phenotypes that can adjust for covariate effects. It tests the marginal genetic effect and the gene x gene statistical interaction effect, and is useful for detecting biological interaction.
FastEpistasis (PLINK): the FastEpistasis method was originally implemented in PLINK to test SNP-SNP interaction relatively quickly. Note, however, that in an exhaustive search of SNP-SNP interactions it is slower than MDR or BOOST.
HiSCom-GGI: Hierarchical Structural Component analysis for Gene-Gene Interaction investigates gene-gene interaction using the natural variant-gene-phenotype hierarchy. Unlike the other analyses, it provides variant-wise effects, gene-level effects and interaction effects.
BOolean Operation based Screening and Testing
The BOolean Operation based Screening and Testing (BOOST) method, proposed by Wan et al. (Am J Hum Genet 2010),
can evaluate all two-way combinations in a genome-wide case/control dataset in feasible time.
BOOST results are approximately similar to those from logistic regression, but the distribution used for p-value calculation is not clear.
Therefore, it is recommended to reanalyze the most significant results from BOOST with logistic regression.
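The reanalysis step can be illustrated outside WISARD. The following is a minimal sketch, not WISARD's implementation: it assumes both SNPs are coded as three-level genotype factors, so that the likelihood ratio test of the interaction terms in a logistic regression equals the test of "no three-way interaction" in the 3 x 3 x 2 genotype-by-status table, with 4 degrees of freedom. The function name interaction_lrt and the example counts are hypothetical.

```python
import math

def interaction_lrt(table, iters=200):
    """Likelihood-ratio test for SNP x SNP interaction on a 3 x 3 x 2 table.

    table[i][j][k] = number of subjects with genotype i at the first SNP,
    genotype j at the second SNP and status k (0 = control, 1 = case).
    Returns (G2, p), where p uses a chi-square distribution with 4 df.
    """
    cells = [(i, j, k) for i in range(3) for j in range(3) for k in range(2)]
    # Fitted counts under the model without interaction, obtained by
    # iterative proportional fitting on the three two-way margins.
    fit = {c: 1.0 for c in cells}
    margins = [
        lambda c: (c[0], c[1]),  # SNP A x SNP B
        lambda c: (c[0], c[2]),  # SNP A x status
        lambda c: (c[1], c[2]),  # SNP B x status
    ]
    for _ in range(iters):
        for key in margins:
            obs_sum, fit_sum = {}, {}
            for c in cells:
                i, j, k = c
                obs_sum[key(c)] = obs_sum.get(key(c), 0.0) + table[i][j][k]
                fit_sum[key(c)] = fit_sum.get(key(c), 0.0) + fit[c]
            for c in cells:
                total = fit_sum[key(c)]
                fit[c] = fit[c] * obs_sum[key(c)] / total if total > 0 else 0.0
    g2 = 0.0
    for i, j, k in cells:
        n = table[i][j][k]
        if n > 0:
            g2 += 2.0 * n * math.log(n / fit[(i, j, k)])
    g2 = max(g2, 0.0)  # guard against tiny negative rounding error
    p = math.exp(-g2 / 2.0) * (1.0 + g2 / 2.0)  # chi-square upper tail, 4 df
    return g2, p

# Hypothetical counts, indexed as table[genotype A][genotype B][status]
table = [
    [[40, 20], [30, 30], [10, 10]],
    [[30, 30], [20, 40], [10, 20]],
    [[10, 10], [10, 20], [5, 30]],
]
g2, p = interaction_lrt(table)
print(f"G2 = {g2:.2f}, p = {p:.3g}")
```

This sketch does not handle covariates. When covariates must be adjusted for, a regression package that fits the logistic model directly is the more appropriate tool.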
Example code
Calculating BOOST with WISARD
Reporting only results whose statistic exceeds a given threshold
BOOST provides a measure that corresponds to the likelihood ratio test for the logistic/log-linear model,
and a threshold on this measure is used to screen for significant statistical interactions.
WISARD sets this threshold with the --thrboost option, and the default value is 30.
A value of 30 corresponds to an unadjusted p-value of $4.89\times 10^{-6}$; if a larger value is set, fewer interactions
will be reported.
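The relation between the threshold and the unadjusted p-value can be checked with a few lines of Python. This sketch assumes the measure is referred to a chi-square distribution with 4 degrees of freedom (two SNPs with three genotypes each). Under that assumption a threshold of 30 gives about 4.89e-06, which matches the figure quoted above. The helper name chi2_sf_4df is hypothetical.

```python
import math

def chi2_sf_4df(x):
    """Upper-tail probability of a chi-square variable with 4 degrees of freedom."""
    # For 4 df the tail has the closed form exp(-x/2) * (1 + x/2).
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)

# A larger threshold means a smaller p-value cutoff, hence fewer reported pairs.
for threshold in (20, 30, 40):
    print(threshold, f"{chi2_sf_4df(threshold):.3g}")
```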
Multifactor Dimensionality Reduction (MDR) was introduced by Ritchie et al. (Am J Hum Genet 2001)
to identify gene-gene interactions for a binary trait in genotype data. The original MDR cannot adjust for covariate effects,
but it was extended to the generalized MDR (GMDR), which allows covariate adjustment and the analysis of quantitative phenotypes
(Lou et al. Am J Hum Genet 2007).
WISARD can run both MDR and GMDR.
Running MDR
By default, it investigates first-order combinations, that is, single variants. Hence, to find a real "interaction", an additional parameter specifying the order of the combination (--order) is required.
Moreover, MDR in WISARD reports the results of ALL tested interactions by default. An additional parameter, --top, limits the number of outputs after sorting the results in descending order of the comparison measure (by default, Balanced Accuracy (BA)).
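The idea behind the order, the ranking and the balanced accuracy can be sketched in plain Python. This is a simplified illustration, not WISARD's code: it omits cross-validation, labels a genotype cell as high-risk when its case/control ratio exceeds the overall ratio, and uses hypothetical function names and toy data. The order and top arguments only mimic the roles described above.

```python
from collections import Counter
from itertools import combinations

def mdr_balanced_accuracy(genotypes, status, snps):
    """Balanced accuracy of the MDR high-risk/low-risk rule for one SNP combination."""
    cases, controls = Counter(), Counter()
    for g, y in zip(genotypes, status):
        cell = tuple(g[s] for s in snps)
        (cases if y == 1 else controls)[cell] += 1
    n_case, n_ctrl = sum(cases.values()), sum(controls.values())
    # A cell is high-risk if its case/control ratio exceeds the overall ratio
    # (written without division so that empty cells cause no error).
    high_risk = {c for c in set(cases) | set(controls)
                 if cases[c] * n_ctrl > controls[c] * n_case}
    sensitivity = sum(n for c, n in cases.items() if c in high_risk) / n_case
    specificity = sum(n for c, n in controls.items() if c not in high_risk) / n_ctrl
    return 0.5 * (sensitivity + specificity)

def mdr_rank(genotypes, status, order=1, top=None):
    """Score every combination of the given order and keep the best `top` results."""
    n_snps = len(genotypes[0])
    results = [(mdr_balanced_accuracy(genotypes, status, combo), combo)
               for combo in combinations(range(n_snps), order)]
    results.sort(key=lambda r: r[0], reverse=True)
    return results[:top]

# Toy data: status depends only on the joint genotype of SNP 0 and SNP 1.
genotypes = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
status = [a ^ b for a, b, c in genotypes]

print(mdr_rank(genotypes, status, order=1))         # no single SNP is informative
print(mdr_rank(genotypes, status, order=2, top=1))  # the pair (0, 1) separates the classes
```

In this toy example every first-order result has a balanced accuracy of 0.5, while the second-order search ranks the pair (0, 1) first. This illustrates why the order must be raised above the default to detect a pure interaction.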
Although MDR can detect genetic interactions effectively,
identifying high-order interactions with an exhaustive approach can be impractical
because of the enormous number of possible high-order combinations.
To overcome this limitation, we implemented a new approach called hierarchical MDR,
an efficient method for detecting high-order interactions in large-scale genetic data.
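The growth in the number of combinations can be made concrete with a short calculation. The sketch below assumes a hypothetical panel of 500,000 variants: an exhaustive search then covers about 1.25 x 10^11 pairs, and the count grows by several orders of magnitude with each additional order.

```python
import math

def n_combinations(n_variants, order):
    """Number of distinct variant combinations tested in an exhaustive search."""
    return math.comb(n_variants, order)

n_variants = 500_000  # hypothetical genome-wide panel size
for order in (1, 2, 3, 4):
    print(order, f"{n_combinations(n_variants, order):.3e}")
```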
MDR is powerful for identifying gene-gene interactions, but its computational burden is prohibitive for large-scale genetic analysis.
This problem is alleviated by block-based MDR, and the gene is a reasonable choice of block.
WISARD provides the gene-level MDR method originally proposed by Oh et al. (BMC Bioinformatics 2012).
Example code
Running gene-level MDR
Gene-level MDR requires a gene-set file, supplied with the --set option, that defines the variants belonging to each gene.
Details of the gene-set file can be found on the gene-level analysis page.
Options for GMDR and MDR such as --cv, --top N and --order can also be used for gene-level MDR.