WISARD[wɪzərd]
Workbench for Integrated Superfast Association study with Related Data

Gene-level Summary

  • Related options : --genesummary, --set, --setconsec, --setoverlap, --setrandom, --setspan, --gmapsummary, --genesize, --genemiss, --selgene, --remgene

WISARD provides several functions for gene-level summarization of a dataset.

  • Loading gene information
    • Loading gene from pre-defined gene definition file
    • Generating gene mapping automatically based on the physical position
    • Generating gene mapping automatically using gene information
  • Summary from gene-variant mapping
    • Reporting actually mapped gene-variant definition
    • Find out the distribution of allele across variants in the gene
  • Selecting the subset of genes
    • Using only the genes having specific number of variants
    • Using only the genes by gene-wise genotyping rate

Loading gene information

Loading gene from pre-defined gene definition file

Generating gene mapping automatically based on the physical position

Generating gene mapping automatically using gene information

Summary from gene-variant mapping

Reporting actually mapped gene-variant definition

Whatever the gene-variant definition specifies, the set of variants actually mapped to each gene depends on the dataset. To identify which variants in the given dataset are actually mapped to each gene, use --genesummary.

Perform a basic gene-level test using the gene-variant definition, and generate a mapping report:
C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --genesummary --out res_gene

The above command generates [prefix].summary.gene.lst in the following format.

summary.gene.lst: a summary of the analyzed gene-variant information (TSV)

Column  Format                 Modifier   Description
CHR     string                 NONE       Chromosome of the gene
NAME    string                 NONE       Retrieved gene name
COUNT   non-negative integer   NONE       Number of the gene's variants present in the dataset
START   non-negative integer   NONE       Minimum position among the variants belonging to the gene
END     non-negative integer   NONE       Maximum position among the variants belonging to the gene
LIST    string                 --verbose  List of the variants belonging to the gene
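
As an illustration of how this report could be consumed downstream, the following Python sketch parses such a table and derives the distance between the first and last mapped variant of each gene. It assumes a tab-delimited file with a header row carrying the column names above; the sample rows are made up for illustration, and the LIST column is left out because it only appears with --verbose.

```python
import csv
import io

# Hypothetical contents of a summary.gene.lst file. The values are invented
# for illustration, and the presence of a header row is an assumption.
SAMPLE = (
    "CHR\tNAME\tCOUNT\tSTART\tEND\n"
    "1\tGENE12\t4\t10500\t18200\n"
    "2\tGENE14\t7\t30100\t52900\n"
)


def read_gene_summary(handle):
    """Parse a gene summary table into a list of dicts with typed fields."""
    rows = []
    for rec in csv.DictReader(handle, delimiter="\t"):
        start = int(rec["START"])
        end = int(rec["END"])
        rows.append({
            "chr": rec["CHR"],
            "name": rec["NAME"],
            "count": int(rec["COUNT"]),
            "start": start,
            "end": end,
            # Distance between the first and last mapped variant of the gene.
            "span": end - start,
        })
    return rows


genes = read_gene_summary(io.StringIO(SAMPLE))
for g in genes:
    print(g["name"], g["count"], g["span"])
```

To read a real report, the io.StringIO object can be replaced by an open file handle, for example open("res_gene.summary.gene.lst", newline="").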

Find out the distribution of allele across variants in the gene

After identifying genes significantly associated with a dichotomous trait, using the given dataset and gene-variant definition, the allele frequencies of the variants in those genes can be investigated further. WISARD provides this function through the --gmapsummary option.

NOTE!
A single dichotomous phenotype must be assigned to use this option!
Generate a summary of the allele distribution of variants for each gene, using all genes:
C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --gmapsummary --out res_gmap

The above command generates [prefix].summary.gmap.lst in the following format.

summary.gmap.lst: a summary of the variant distribution across case-control samples for each gene mapping (TSV)

Column   Format                 Modifier  Description
CHR      string                 NONE      Chromosome of the gene
NAME     string                 NONE      Retrieved gene name
VARIANT  string                 NONE      Name of the variant belonging to the gene
MINOR    string                 NONE      Minor allele of the variant
CASE0    non-negative integer   NONE      Number of case samples that are major homozygotes
CASE1    non-negative integer   NONE      Number of case samples that are heterozygotes
CASE2    non-negative integer   NONE      Number of case samples that are minor homozygotes
CTRL0    non-negative integer   NONE      Number of control samples that are major homozygotes
CTRL1    non-negative integer   NONE      Number of control samples that are heterozygotes
CTRL2    non-negative integer   NONE      Number of control samples that are minor homozygotes
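
The genotype counts in this report are enough to compute the frequency of the minor allele separately in cases and controls. The Python sketch below shows the arithmetic for one variant, assuming diploid genotypes; the counts are made up, and the function name is only illustrative, not part of WISARD.

```python
def minor_allele_freq(n_major_hom, n_het, n_minor_hom):
    """Frequency of the minor allele from genotype counts of diploid samples."""
    n_samples = n_major_hom + n_het + n_minor_hom
    if n_samples == 0:
        return float("nan")
    # Each heterozygote carries one minor allele, each minor homozygote two,
    # and every sample contributes two alleles in total.
    return (n_het + 2 * n_minor_hom) / (2 * n_samples)


# Invented counts for a single variant, keyed by the report's column names.
row = {"CASE0": 60, "CASE1": 30, "CASE2": 10,
       "CTRL0": 80, "CTRL1": 18, "CTRL2": 2}

case_maf = minor_allele_freq(row["CASE0"], row["CASE1"], row["CASE2"])
ctrl_maf = minor_allele_freq(row["CTRL0"], row["CTRL1"], row["CTRL2"])
print(f"case MAF = {case_maf:.3f}, control MAF = {ctrl_maf:.3f}")
```

With these counts the cases carry 30 + 2 × 10 = 50 minor alleles out of 200, and the controls carry 18 + 2 × 2 = 22 out of 200.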

Because the above command processes every gene in the gene-variant definition, it can be a very large task, and it is inefficient when only a small number of genes are of interest. In that case, use --selgene.

Generate the gene map summary only for the genes GENE12 and GENE14:
C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --gmapsummary --selgene "GENE12|GENE14" --out res_gmap_subset

Selecting the subset of genes

WISARD can also restrict the analyses to a subset of the gene-variant definition. The following options are available.

Using only the genes having specific number of variants

One of the most intuitive filters that can be applied to genes is the number of variants each gene includes. This can be done with the --genesize option.

NOTE!
This option requires a range-type parameter!
Perform basic gene-level tests only for genes having more than 5 variants:
C:\Users\WISARD> wisard --bed test_miss0.bed --set test_gene.txt --genetest --genesize ">5" --out res_genetest_over5
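
The effect of such a size filter can be pictured with a small Python sketch. It is not WISARD's implementation: the mapping and the helper name are hypothetical, and it assumes that the count refers to variants actually present in the dataset.

```python
# Invented gene -> variant mapping, standing in for a gene-variant definition
# after variants absent from the dataset have been dropped (an assumption).
gene_to_variants = {
    "GENE1": ["rs1", "rs2", "rs3"],
    "GENE2": ["rs4", "rs5", "rs6", "rs7", "rs8", "rs9"],
    "GENE3": ["rs10", "rs11", "rs12", "rs13", "rs14"],
}


def genes_with_more_than(mapping, threshold):
    """Keep only the genes whose variant count is strictly above threshold."""
    return {gene: variants for gene, variants in mapping.items()
            if len(variants) > threshold}


# Mirrors the ">5" range: a gene with exactly 5 variants is excluded.
kept = genes_with_more_than(gene_to_variants, 5)
print(sorted(kept))
```

Only GENE2, which has six variants, passes here; GENE3 has exactly five and is dropped because the comparison is strict.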

Using only the genes by gene-wise genotyping rate

Although WISARD provides several options for filtering variants by genotype calling rate, the calling rate can also be considered at the gene level. To filter by gene-level calling rate, use the --genemiss option.

Perform basic gene-level tests only for genes whose missing rate is under 1%:
C:\Users\WISARD> wisard --bed test_miss2.bed --set test_gene.txt --genetest --genemiss 0.01 --out res_genetest_1per
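
This page does not spell out how the gene-wise missing rate is computed. One plausible reading, used in the Python sketch below purely as an assumption, is the fraction of missing genotype calls over all variants of a gene and all samples. The data, the helper name, and the toy threshold of 0.2 are all invented for illustration.

```python
# Genotypes per variant across four samples; None marks a missing call.
genotypes = {
    "rs1": [0, 1, None, 2],
    "rs2": [0, 0, 0, 1],
    "rs3": [None, None, 1, 0],
    "rs4": [2, 1, 0, 0],
}
gene_to_variants = {
    "GENE1": ["rs1", "rs2"],
    "GENE2": ["rs3"],
    "GENE3": ["rs4"],
}


def gene_missing_rate(variants, genotypes):
    """Fraction of missing calls over all variants x samples of one gene.

    This definition is an assumption, not WISARD's documented formula.
    """
    calls = [g for v in variants for g in genotypes[v]]
    if not calls:
        return float("nan")
    return sum(g is None for g in calls) / len(calls)


rates = {gene: gene_missing_rate(variants, genotypes)
         for gene, variants in gene_to_variants.items()}

# Keep genes whose missing rate is strictly under the threshold.
threshold = 0.2
kept = [gene for gene, rate in rates.items() if rate < threshold]
print(rates, kept)
```

Under this definition GENE1 has one missing call out of eight, GENE2 two out of four, and GENE3 none, so GENE1 and GENE3 pass the toy threshold.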


Last modified : 2017-09-13 16:18:15