WISARD[wɪzərd]
Workbench for Integrated Superfast Association study with Related Data

Summary Statistics

Quality control (QC) is a critical part of genetic association analysis: because a very large number of variants are tested, even a small percentage of biased results can produce a large number of false positives. QC has two aspects: QC related to genotyping chips (i.e. issues in making genotype calls from intensity measurements) and downstream QC. The summary statistics supported by WISARD serve downstream QC, i.e. data-cleaning procedures that can be applied once genotype calls are available.

Summary statistics can be grouped into variant-based and subject-based statistics.

Variant-based statistics

Various variant-based statistics can be calculated with WISARD:
  • Variant summary: WISARD can find monotone (monomorphic) variants, singletons and doubletons, and identify family-specific variants, that is, variants observed in a single family. Family-specific variants may point to causal variants for family-specific phenotypes and provide useful information for family-based analysis. WISARD also provides other summary measures for each variant.
  • Minor allele frequency: WISARD can estimate the frequency of the minor allele. For family-based samples, it can be computed from founders only or from all individuals.
  • Hardy-Weinberg equilibrium: WISARD tests whether Hardy-Weinberg equilibrium (HWE) holds in the sample. The HWE test is often used for variant QC because most variants are expected to be in HWE, and WISARD can use the test result to filter out variants that deviate from it.
  • Variant-specific missingness rate / genotype call rate: WISARD can calculate, for each variant, the proportion of genotypes with non-missing data (the call rate); the missingness rate is its complement.
  • Ts/Tv ratio: The ratio of the number of transition substitutions to the number of transversions. WISARD computes this measure over windows of a specified size or per chromosome.
  • Variant-specific Mendelian error: WISARD calculates the proportion of incorrectly inherited alleles for each variant. A high proportion indicates genotyping error or a misspecified familial relationship.
  • Linkage disequilibrium: LD measures the nonrandom association between alleles at two variants. WISARD calculates LD for (1) all possible pairs of variants, (2) all possible pairs of variants within each chromosome, (3) all possible pairs of variants within each window, (4) pairs of adjacent variants, and (5) pairs with one variant fixed.
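
To make the definitions above concrete, the following is a minimal Python sketch of several of the variant-based statistics. It is not WISARD's implementation and does not use WISARD options. It assumes a toy encoding in which each genotype is the count of one allele (0, 1 or 2) and None marks a missing call. All function names are hypothetical. The HWE statistic is the textbook 1-df chi-square, and LD is approximated by the squared correlation of genotype counts, a common stand-in when phase is unknown; the estimators WISARD uses may differ.

```python
# Illustrative only: toy encodings and hypothetical helper names, not WISARD code.
# A variant is a list of genotypes across subjects: 0, 1, 2 = allele count, None = missing.

TRANSITIONS = {frozenset("AG"), frozenset("CT")}  # purine<->purine, pyrimidine<->pyrimidine


def call_rate(genos):
    """Proportion of genotypes with non-missing data (1 - missingness rate)."""
    return sum(g is not None for g in genos) / len(genos)


def minor_allele_freq(genos):
    """Frequency of the rarer allele among called genotypes."""
    called = [g for g in genos if g is not None]
    p = sum(called) / (2 * len(called))  # frequency of the counted allele
    return min(p, 1 - p)


def hwe_chisq(genos):
    """Chi-square statistic comparing observed genotype counts with HWE expectations."""
    called = [g for g in genos if g is not None]
    n = len(called)
    p = sum(called) / (2 * n)
    q = 1 - p
    observed = [called.count(k) for k in (0, 1, 2)]
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    # Skip classes with zero expectation (monomorphic variants).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def ld_r2(a, b):
    """Squared correlation of genotype counts over subjects called at both variants."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x, _ in pairs)
    vy = sum((y - my) ** 2 for _, y in pairs)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when either variant does not vary
    return cov * cov / (vx * vy)


def ts_tv_ratio(substitutions):
    """Ts/Tv over an iterable of (ref, alt) single-base substitutions."""
    ts = tv = 0
    for ref, alt in substitutions:
        if frozenset((ref.upper(), alt.upper())) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    return ts / tv if tv else float("nan")
```

For example, call_rate and minor_allele_freq can be applied to each variant in turn to build a per-variant QC table, and ld_r2 can be applied to whichever set of pairs is of interest (all pairs, pairs within a window, adjacent pairs, and so on).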

Subject-based statistics

Various subject-based statistics can be calculated with WISARD:
  • Family summary: WISARD can summarize family-related measures such as the number of families, parent-offspring pairs, siblings, etc.
  • Subject-specific missingness rate / individual call rate: WISARD can calculate, for each individual, the proportion of genotypes with non-missing data (the call rate); the missingness rate is its complement.
  • Subject-specific Mendelian error: WISARD can calculate the proportion of incorrectly inherited alleles for each individual. A high proportion indicates genotyping error or a misspecified familial relationship.
  • Gender check: WISARD automatically checks for pedigree errors (misspecified gender for a father or mother); this check cannot be deactivated.
  • Inbred check: WISARD automatically checks for the presence of inbred individuals; this check cannot be deactivated.
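
The per-subject call rate and Mendelian error measures can be sketched in the same spirit. The Python below is an illustration only, not WISARD's implementation. It assumes autosomal biallelic variants, genotypes coded as allele counts (0, 1, 2) with None for missing calls, and a single father-mother-child trio. The function names are hypothetical. For simplicity it counts inconsistent trio genotypes rather than individual alleles, so its values need not match WISARD's output.

```python
# Illustrative only: toy encodings and hypothetical helper names, not WISARD code.
# A subject is a list of genotypes across variants: 0, 1, 2 = allele count, None = missing.

# Allele counts (0 or 1) a parent with a given genotype can transmit to a child.
TRANSMIT = {0: {0}, 1: {0, 1}, 2: {1}}


def subject_call_rate(genos):
    """Proportion of a subject's genotypes with non-missing data."""
    return sum(g is not None for g in genos) / len(genos)


def is_mendelian_consistent(father, mother, child):
    """True if the child genotype can arise from one allele of each parent."""
    return any(f + m == child for f in TRANSMIT[father] for m in TRANSMIT[mother])


def mendelian_error_rate(father, mother, child):
    """Fraction of fully called variants at which the trio is inconsistent."""
    checked = errors = 0
    for f, m, c in zip(father, mother, child):
        if None in (f, m, c):
            continue  # skip variants with any missing call in the trio
        checked += 1
        if not is_mendelian_consistent(f, m, c):
            errors += 1
    return errors / checked if checked else float("nan")
```

A child with a high mendelian_error_rate against the recorded parents is a candidate for either genotyping error or a misspecified relationship, which matches the interpretation given in the list above.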


Last modified : 2014-01-29 12:18:01