WISARD [wɪzərd]
Workbench for Integrated Superfast Association study with Related Data

Inputs

This section describes:

  • Inputs for WISARD
    • Accepted formats for each input
      • Dataset file
      • Phenotype/covariates/sample information file
      • Gene/set and gene-set/pathway definition file
      • Selection file
      • Relatedness file
      • Sequence file

    Inputs for WISARD

    WISARD runs on two things: options and inputs. Inputs are the files required to perform a specific task. For example, a genotype file is required to perform a regression analysis of a genetic dataset. If the user also wants to investigate an association between variants and another phenotype that is stored in an external file, that file becomes another input to WISARD.

    Below is the list of possible inputs for WISARD.

    • Dataset file(s) to analyze or manipulate
    • Phenotype/covariates/sample information file to incorporate into the analysis
    • Gene or set definition file that enumerates the mapping between genes/sets and variants
    • Gene-set or pathway definition file that enumerates the mapping between gene-sets/pathways and genes/sets
    • Selection file that lists the samples/variants to select or filter
    • Relatedness file that defines relatedness across samples
    • Sequence file that helps to rearrange the order of variants/samples for data manipulation

    Each input has a specific input type and accepts a number of predefined formats. Depending on the input type, these predefined formats can be adjusted by specifying additional options (modifiers).

    Accepted formats for each input

    Dataset file

    WISARD accepts the following genetic, dosage, and expression dataset formats.

    • Genetic dataset format
      • PLINK PED format (.ped and .map)
      • Binary PED format (.bed, .bim and .fam)
      • Number-coded genotype format (.raw)
      • Long genotype format (.lgen)
      • Transposed PED format (.tped and .tfam)
      • Variant Call Format (.vcf)
      • Binary VCF format (.bcf)
      • Other general (CSV/TSV) genotype format (.tsv, .csv or .txt)
    • Dosage dataset format
      • BEAGLE dosage format
      • MaCH dosage format
      • GEN dosage format
      • Binary GEN dosage format
      • Other general (CSV/TSV) dosage format (.tsv, .csv or .txt)
    • Expression dataset format
      • Gene Expression Omnibus (GEO) experiment format
      • Other general (CSV/TSV) expression format (.tsv, .csv or .txt)
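    Several of the genetic formats above consist of more than one file, so it can be useful to check that a file set is complete before starting a run. The following is a minimal Python sketch of such a check, not a description of how WISARD itself detects formats. The extensions come from the list above; the function name guess_dataset_format and the format labels are hypothetical and chosen only for illustration.

```python
from pathlib import Path

# Extensions are taken from the format list above.
# The labels are descriptive strings for this sketch only (an assumption,
# not identifiers used by WISARD).
REQUIRED_EXTENSIONS = {
    "PLINK PED": {".ped", ".map"},
    "Binary PED": {".bed", ".bim", ".fam"},
    "Transposed PED": {".tped", ".tfam"},
    "Number-coded genotype": {".raw"},
    "Long genotype": {".lgen"},
    "VCF": {".vcf"},
    "Binary VCF": {".bcf"},
}

# These extensions are shared by the general genotype, dosage and
# expression formats, so the extension alone cannot tell them apart.
GENERAL_EXTENSIONS = {".tsv", ".csv", ".txt"}


def guess_dataset_format(paths):
    """Return a format label if all of its required files are present, else None."""
    found = {Path(p).suffix.lower() for p in paths}
    for label, required in REQUIRED_EXTENSIONS.items():
        if required <= found:  # every required extension was supplied
            return label
    if found & GENERAL_EXTENSIONS:
        return "General (CSV/TSV)"
    return None


if __name__ == "__main__":
    print(guess_dataset_format(["study.bed", "study.bim", "study.fam"]))
    print(guess_dataset_format(["study.ped"]))  # .map is missing
```

    A set that is missing one of its companion files, such as a .ped file without its .map file, is reported as None rather than guessed.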

    Phenotype/covariates/sample information file

    For a detailed explanation with a working example, see this page.

    Gene/set and gene-set/pathway definition file

    WISARD supports several gene/set and gene-set/pathway definition file formats to minimize the work of converting existing files.

    • Gene, location-based
      • RefSeq format, as provided by the UCSC Genome Browser
      • Simple location format with chromosome, name, start position and end position
    • Gene, paired
      • One-to-many format with one gene and all of its mapped variants on each line
      • One-to-one format with one gene and one variant on each line
      • Variant list format, in which each list of variants is enclosed by its gene identifier
    • Gene-set/pathway
      • One-to-many format with one gene-set/pathway and all of its mapped genes on each line
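    The one-to-many and one-to-one paired formats carry the same gene-to-variant mapping in two layouts. The Python sketch below illustrates the difference by reading both into the same dictionary. It assumes whitespace-delimited fields with the gene identifier first, which this page does not specify, and the gene and variant names are made up for the example.

```python
from collections import defaultdict


def parse_one_to_many(lines):
    """Read lines of the form: GENE VARIANT1 VARIANT2 ... (assumed layout)."""
    mapping = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        gene, variants = fields[0], fields[1:]
        mapping[gene].extend(variants)
    return dict(mapping)


def parse_one_to_one(lines):
    """Read lines of the form: GENE VARIANT (assumed layout)."""
    mapping = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        if len(fields) != 2:
            raise ValueError(f"expected 'gene variant', got: {line!r}")
        gene, variant = fields
        mapping[gene].append(variant)
    return dict(mapping)


if __name__ == "__main__":
    # Hypothetical identifiers, used only to show the two layouts.
    one_to_many = ["GENE_A var1 var2", "GENE_B var3"]
    one_to_one = ["GENE_A var1", "GENE_A var2", "GENE_B var3"]
    print(parse_one_to_many(one_to_many))
    print(parse_one_to_one(one_to_one))
```

    Both calls build the same gene-to-variant dictionary, so either layout can be converted to the other by writing that dictionary back out.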

    Selection file

    For a detailed explanation with a working example, see this page and this page.

    Relatedness file

    For a detailed explanation of what a relatedness file is, how to generate one, and how to use it, see this page.

    Sequence file



    Last modified : 2017-05-26 10:44:18