WISARD[wɪzərd] Workbench for Integrated Superfast Association study with Related Data
This section describes how WISARD computes linkage disequilibrium (LD), which measures the amount of nonrandomness in the genotype distribution between two loci. WISARD can calculate r2 and D' (by default) or correlation-based r2 (with --ldcor) as LD measures, and several ways to pair variants for LD are available. The examples in this section use example data with 11 variants (SNP1 to SNP11), 6 on chromosome 1 and 5 on chromosome 2.
By default, WISARD computes r2 and D' when the --ld option is given. These can be replaced with correlation-based r2 by adding the --ldcor option, which can be combined with the LD pairing options described below.
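To make the three measures concrete, the following Python sketch computes D' and r2 from haplotype frequencies using the standard textbook definitions, and a correlation-based r2 as the squared Pearson correlation of genotype dosages. It is only an illustration under those assumptions, not WISARD's implementation; the function names and the choice to report the absolute value of D' are hypothetical.

```python
def ld_from_haplotype_freqs(p_ab, p_a, p_b):
    """Return (D, |D'|, r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b
    # D' rescales D by its largest possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


def correlation_r2(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic variant carries no LD information
    return sxy * sxy / (sxx * syy)


if __name__ == "__main__":
    # Complete LD: allele A always travels with allele B.
    print(ld_from_haplotype_freqs(0.5, 0.5, 0.5))
    # Linkage equilibrium: p_ab equals p_a * p_b, so D is zero.
    print(ld_from_haplotype_freqs(0.25, 0.5, 0.5))
    print(correlation_r2([0, 1, 2, 1], [0, 1, 2, 1]))
```

The haplotype-based measures need phased or estimated haplotype frequencies, whereas the correlation-based version works directly on unphased genotype dosages.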
Exhaustive two-way pairs of variants are all possible pairs of variants; if there are m variants, 0.5m(m-1) LD values are calculated. For the example data, there are 55 exhaustive two-way pairs. WISARD generates exhaustive two-way LD with --ld, and this is the default pairing. Note that the computational burden of exhaustive two-way LD increases quadratically with the number of variants.
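The pair count can be checked with a short Python sketch. It only enumerates pairs and does not compute LD; the helper name is hypothetical, and the variant names follow the SNP1 to SNP11 naming used in this section.

```python
from itertools import combinations


def exhaustive_pairs(variants):
    """Return every unordered pair of distinct variants."""
    return list(combinations(variants, 2))


variants = [f"SNP{i}" for i in range(1, 12)]  # 11 variants
pairs = exhaustive_pairs(variants)
n_variants = len(variants)
print(len(pairs), n_variants * (n_variants - 1) // 2)  # 55 55

# Doubling the number of variants roughly quadruples the number of pairs.
for m in (11, 22, 44):
    print(m, m * (m - 1) // 2)
```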
WISARD can also calculate LD for all pairs of variants within each chromosome. For the example data, there are 6 variants on chromosome 1 and 5 variants on chromosome 2, so 15+10 = 25 LD values are calculated. This is done by adding the --ldchr option.
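A minimal Python sketch of the per-chromosome pairing is given here. The assignment of SNP1 to SNP6 to chromosome 1 and SNP7 to SNP11 to chromosome 2 is an assumption inferred from the pair lists in this section, and the function name is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def within_chromosome_pairs(chrom_of):
    """Group variants by chromosome and pair them only within each group."""
    by_chrom = defaultdict(list)
    for variant, chrom in chrom_of.items():
        by_chrom[chrom].append(variant)
    return {chrom: list(combinations(names, 2)) for chrom, names in by_chrom.items()}


# Assumed layout: SNP1..SNP6 on chromosome 1, SNP7..SNP11 on chromosome 2.
chrom_of = {f"SNP{i}": (1 if i <= 6 else 2) for i in range(1, 12)}
pairs_by_chrom = within_chromosome_pairs(chrom_of)
print({chrom: len(p) for chrom, p in pairs_by_chrom.items()})  # {1: 15, 2: 10}
```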
Exhaustive window-wise pairs are pairs of variants whose distance (in base pairs) is less than a specified value.
They are calculated by adding the --ldsize option, with the window size given in base pairs.
For the example data, adding the --ldsize 1000 option yields LD for the following pairs of variants:
- SNP1 and SNP1, SNP2, SNP3, SNP4
- SNP2 and SNP2, SNP3, SNP4
- SNP3 and SNP3, SNP4
- SNP4 and SNP4, SNP5
- SNP5 and SNP5
- SNP6 and SNP6
- SNP7 and SNP7, SNP8, SNP9
- SNP8 and SNP8, SNP9
- SNP9 and SNP9
- SNP10 and SNP10
- SNP11 and SNP11
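The window rule can be sketched in Python as follows. The base-pair positions are hypothetical values chosen so that a 1000 bp window reproduces the distinct pairs listed above; they are not taken from the example data. The sketch also assumes that pairs are formed only within a chromosome, which is consistent with the list, and it omits each variant's pairing with itself.

```python
def window_pairs(variants, max_dist):
    """Return pairs on the same chromosome that are less than max_dist bp apart.

    variants: list of (name, chromosome, position) tuples
    """
    pairs = []
    for i, (name1, chrom1, pos1) in enumerate(variants):
        for name2, chrom2, pos2 in variants[i + 1:]:
            if chrom1 == chrom2 and abs(pos2 - pos1) < max_dist:
                pairs.append((name1, name2))
    return pairs


# Hypothetical positions (bp), picked only to match the listed pairs.
example = [
    ("SNP1", 1, 100), ("SNP2", 1, 300), ("SNP3", 1, 600),
    ("SNP4", 1, 1000), ("SNP5", 1, 1900), ("SNP6", 1, 3500),
    ("SNP7", 2, 200), ("SNP8", 2, 500), ("SNP9", 2, 900),
    ("SNP10", 2, 2500), ("SNP11", 2, 4000),
]

for pair in window_pairs(example, 1000):
    print(pair)
```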
WISARD can calculate LD for a fixed number of consecutive variants by using the --ldbin option. With "--ldbin k", LD between each variant and its k consecutive variants is calculated.
Under the above command, the following pairs of variants are considered:
- SNP1 and SNP1, SNP2, SNP3
- SNP2 and SNP2, SNP3, SNP4
- SNP3 and SNP3, SNP4, SNP5
- SNP4 and SNP4, SNP5, SNP6
- SNP5 and SNP5, SNP6
- SNP6 and SNP6
- SNP7 and SNP7, SNP8, SNP9
- SNP8 and SNP8, SNP9, SNP10
- SNP9 and SNP9, SNP10, SNP11
- SNP10 and SNP10, SNP11
- SNP11 and SNP11
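The consecutive pairing can be sketched in Python as follows. The sketch assumes k = 2 and that pairing does not cross chromosome boundaries; both are inferred from the pairs listed above rather than stated by WISARD. The function name is hypothetical, and each variant's pairing with itself is omitted.

```python
def consecutive_pairs(variants_by_chrom, k):
    """Pair each variant with the next k variants on the same chromosome.

    variants_by_chrom: dict mapping chromosome -> variant names in map order
    """
    pairs = []
    for names in variants_by_chrom.values():
        for i, name in enumerate(names):
            for other in names[i + 1:i + 1 + k]:
                pairs.append((name, other))
    return pairs


# Assumed layout: SNP1..SNP6 on chromosome 1, SNP7..SNP11 on chromosome 2.
layout = {
    1: [f"SNP{i}" for i in range(1, 7)],
    2: [f"SNP{i}" for i in range(7, 12)],
}

for pair in consecutive_pairs(layout, 2):
    print(pair)
```

Unlike the window rule, the number of pairs here depends only on variant order, not on base-pair distance, so the cost grows linearly with the number of variants for a fixed k.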
NOTE!
The argument of --ldvar can be a path to a file that contains one variant name per line.
Highly correlated variants can be clumped into a single variable with WISARD, which is also useful for rare variants. WISARD supports this functionality with the --makeclgeno option. To define the variants that are clumped into a single variable, an additional option is required. Note that an additional option is also required to specify the threshold used to clump variants into a single variable.