WISARD [wɪzərd]: Workbench for Integrated Superfast Association study with Related Data
This section describes how covariates are handled in WISARD.
WISARD provides several ways to incorporate covariates into the analyses. In brief, two ways are possible.
For details on reading covariates from a file, see the Phenotype section.
Covariates can be modified with an expression. For example, the code below keeps age unchanged but squares height, producing the covariates 'age' and 'height^2'.
NOTE! Factor-type covariates cannot be modified!
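To make the effect of such an expression concrete, here is a minimal Python sketch of the same transformation. It is not WISARD's expression syntax or implementation; the sample values and the function name `transform_covariates` are assumptions made for illustration only.

```python
# Hypothetical per-sample covariates (age in years, height in cm);
# these values are illustrative and do not come from any WISARD file.
samples = [
    {"age": 31.0, "height": 170.0},
    {"age": 45.0, "height": 162.0},
]


def transform_covariates(rows):
    """Keep 'age' unchanged and replace 'height' with its square."""
    return [
        {"age": row["age"], "height^2": row["height"] ** 2}
        for row in rows
    ]


transformed = transform_covariates(samples)
for row in transformed:
    print(row)
```

Each output row holds exactly two numeric covariates, 'age' and 'height^2', matching the description above.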
Covariates are assigned by combining two options: --sampvar and --cname. An example using the sample covariates above is shown below.
The above command retrieves covariates from test_miss0_phen.txt and takes the AGE column as a covariate.
If multiple covariates must be assigned, there are two ways to do so. The first is to give multiple column names separated by the comma (,) character. For example, the command below retrieves covariates from test_miss0_phen.txt and takes height, weight and age as covariates.
NOTE! There must be no whitespace around the separator, because the command prompt treats any whitespace as a boundary between separate parameters!
NOTE! The hyphen (-) character can be used in a column name, but such names can produce errors!
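The whitespace warning above can be illustrated with a short Python sketch. It uses the standard-library `shlex.split` to imitate how a POSIX shell splits a command line into arguments; it does not reproduce WISARD's own option parser, and the helper name `covariate_names` is an assumption made for illustration.

```python
import shlex


def covariate_names(command_line):
    """Tokenize like a POSIX shell, then read the single argument that
    follows --cname as a comma-separated list of column names."""
    args = shlex.split(command_line)
    value = args[args.index("--cname") + 1]
    # Drop empty pieces left behind by a trailing comma.
    return [name for name in value.split(",") if name]


# No whitespace: the whole list arrives as one argument.
good = covariate_names("--cname height,weight,age")

# Whitespace after the first comma: the shell ends the argument at
# 'height,' and the remaining names become a separate parameter.
bad = covariate_names("--cname height, weight,age")

print(good)
print(bad)
```

In the second call only `height` reaches the option, because the shell has already split the rest into a different argument before the program sees it.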
WISARD supports assigning covariates from multiple files. See the example below.
The above code loads two sample variable files, test_miss0_phen.txt and test_miss0_phen2.txt, then loads age and sbp from test_miss0_phen.txt and bmi from test_miss0_phen2.txt. Below are some precautions for loading multiple sample variable files.
NOTE! Column names MUST be unique across sample variable files!
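Before running an analysis with several sample variable files, the uniqueness requirement can be checked ahead of time. The following Python sketch is one way to do that under stated assumptions: the column lists are supplied by hand (the file format is not parsed here), names are compared exactly and case-sensitively, and the function name `find_duplicate_columns` is hypothetical.

```python
def find_duplicate_columns(files):
    """Map each column name that occurs in more than one file to the
    list of files that contain it.

    `files` maps a file name to the list of its column names.
    """
    first_seen = {}   # column name -> first file it was seen in
    duplicates = {}   # column name -> all files containing it
    for filename, columns in files.items():
        for column in columns:
            if column in first_seen and first_seen[column] != filename:
                duplicates.setdefault(column, [first_seen[column]]).append(filename)
            else:
                first_seen.setdefault(column, filename)
    return duplicates


# Column layout taken from the example above.
ok_layout = {
    "test_miss0_phen.txt": ["age", "sbp"],
    "test_miss0_phen2.txt": ["bmi"],
}
# Hypothetical layout in which 'age' appears in both files.
clashing_layout = {
    "test_miss0_phen.txt": ["age", "sbp"],
    "test_miss0_phen2.txt": ["bmi", "age"],
}

print(find_duplicate_columns(ok_layout))
print(find_duplicate_columns(clashing_layout))
```

An empty result means every column name belongs to exactly one file.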
WISARD supports two kinds of covariates: numeric and factor. While processing the covariate file, WISARD automatically determines the type of each assigned covariate and reports the type it determined.
NOTE! Please read the log file carefully, because WISARD can make the wrong decision if a numeric value is written erroneously or contains stray extra characters!
There are two ways to retrieve a covariate as a factor: --cname and --fname. With --cname, the covariate type is determined automatically from its contents: a covariate that consists only of non-numeric literals is regarded as a factor covariate. However, this does not work when a categorical covariate with numeric codes must be treated as a factor. In that case, use --fname to assign the covariate as a factor.
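The type decision described above can be sketched in Python as follows. This is an illustration, not WISARD's actual rule: it assumes a column is numeric only when every value parses as a number and is a factor otherwise, it ignores missing-value codes, and the `force_factor` argument merely stands in for the effect of --fname.

```python
def is_number(text):
    """Return True if `text` can be parsed as a floating-point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def covariate_type(values, force_factor=False):
    """Classify a covariate column as 'numeric' or 'factor'."""
    if force_factor:
        return "factor"
    return "numeric" if all(is_number(v) for v in values) else "factor"


print(covariate_type(["23", "41.5", "37"]))            # all values parse
print(covariate_type(["Busan", "Daegu", "Busan"]))     # non-numeric literals
print(covariate_type(["1", "2", "3"]))                 # numeric codes
print(covariate_type(["1", "2", "3"], force_factor=True))
print(covariate_type(["23", "4l.5", "37"]))            # typo: letter 'l' for '1'
```

The last call shows why the log deserves a careful read: under this assumed rule a single mistyped value is enough to turn an intended numeric covariate into a factor.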
For factor-type covariates, WISARD applies contrast encoding, which requires a baseline. By default, the baseline is the level that appears first for the given factor. If a specific baseline is required, it can be set for the factor with the --baseline option. However, the --baseline option is subject to some constraints.
NOTE! If all samples having the specified [VALUE] are excluded by the assigned filters, WISARD will produce an error, since no sample remains that can serve as the baseline for the assigned covariate.
The covariates as encoded by WISARD can be written out in the same format with the --makecov option, as in the example below.
As shown in the output below, which is produced by the above command, the covariate REGION assigned with the --cname option was translated into a factor-type covariate since it does not contain any numeric value.
In addition, since the first value of REGION to appear was Busan, it became the baseline of REGION. However, the baseline can be changed to Daegu by adding the option and parameter --baseline REGION=Daegu to the above command.
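The baseline behaviour described above can be sketched in Python. The sketch assumes treatment (dummy) coding as the contrast, which the text does not confirm, and the output column naming (`REGION_Daegu` and so on), the level `Seoul`, and the function name `encode_factor` are all hypothetical.

```python
def encode_factor(name, values, baseline=None):
    """Dummy-code a factor: one 0/1 column per non-baseline level.

    The default baseline is the level that appears first. A baseline
    that no sample carries raises ValueError.
    """
    levels = list(dict.fromkeys(values))  # unique levels in first-seen order
    if baseline is None:
        baseline = levels[0]
    elif baseline not in levels:
        raise ValueError(f"no sample has {name}={baseline}")
    others = [level for level in levels if level != baseline]
    columns = [f"{name}_{level}" for level in others]
    rows = [[1 if value == level else 0 for level in others] for value in values]
    return columns, rows


regions = ["Busan", "Daegu", "Busan", "Seoul"]

# Default: Busan appears first, so it is the baseline (all-zero rows).
default_columns, default_rows = encode_factor("REGION", regions)

# Explicit baseline: Daegu rows become all zero instead.
daegu_columns, daegu_rows = encode_factor("REGION", regions, baseline="Daegu")

print(default_columns, default_rows)
print(daegu_columns, daegu_rows)
```

The `ValueError` branch mirrors the situation in the note above: if filtering removes every sample with the requested baseline value, there is nothing left to encode against.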
WISARD provides several ways to generate covariates automatically. This section describes how to use those functions.
In particular cases, a subset of variants can be incorporated as covariates, as a form of conditional analysis. The code below shows an example.
The above code incorporates the variant 'SNP30' as a covariate and performs regression analysis. If the variants are listed in a file, the file can also be passed to WISARD, using this code:
In this case, test_variant_list.txt should contain one variant name per line.
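As an illustration of what conditioning on variants amounts to, the Python sketch below reads a one-name-per-line list and appends each listed variant's genotype to the covariate rows. It rests on assumptions not stated in the text: genotypes are coded additively as allele counts (0, 1, 2), blank lines in the list are skipped, and all names other than SNP30 and the data values are hypothetical.

```python
def read_variant_list(text):
    """Return variant names from text holding one name per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def add_variant_covariates(covariates, genotypes, variant_names):
    """Append each listed variant's per-sample genotype to the covariate rows.

    covariates: one list of covariate values per sample
    genotypes:  variant name -> per-sample allele counts (assumed 0/1/2)
    """
    return [
        row + [genotypes[name][i] for name in variant_names]
        for i, row in enumerate(covariates)
    ]


# Stand-in for the contents of a variant list file.
variant_list_text = "SNP30\nSNP41\n"
variant_names = read_variant_list(variant_list_text)

covariates = [[31.0], [45.0], [52.0]]          # e.g. age for three samples
genotypes = {"SNP30": [0, 1, 2], "SNP41": [1, 1, 0]}

conditioned = add_variant_covariates(covariates, genotypes, variant_names)
print(variant_names)
print(conditioned)
```

The regression for any other variant is then fitted with these extra columns, so its effect is estimated conditional on the listed variants.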
WISARD performs Principal Component Analysis (PCA) with the --pca option, but by itself this only runs the PCA and writes its output. The computed PCs can also be incorporated directly into subsequent analyses via --pc2cov. For example, the command below adds 5 PCs as covariates to the regression analyses that follow the PCA.
For variant-level analyses that accept covariates, gene-environment interaction (GxE) terms can be added as covariates. The --gxe option enables this functionality, and all variant-level tests with covariates are then performed with GxE terms. Because this option incorporates GxE interactions for all assigned covariates, the --gxecovs option can be used to choose which covariates enter the GxE interactions.
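A GxE term is commonly formed as the product of a variant's genotype and a covariate. The Python sketch below builds such product columns, either for every covariate or for a chosen subset; it assumes additive genotype coding and a simple product form, neither of which the text confirms for WISARD, and the function name, column naming and data are hypothetical.

```python
def gxe_terms(genotype, covariates, selected=None):
    """Build genotype-by-covariate product columns.

    genotype:   per-sample allele counts for one variant (assumed 0/1/2)
    covariates: covariate name -> per-sample numeric values
    selected:   covariate names to interact with; None means all of them
    """
    names = list(covariates) if selected is None else list(selected)
    return {
        f"Gx{name}": [g * e for g, e in zip(genotype, covariates[name])]
        for name in names
    }


genotype = [0, 1, 2]
covariates = {"age": [31.0, 45.0, 52.0], "bmi": [22.5, 27.0, 24.0]}

all_terms = gxe_terms(genotype, covariates)               # every covariate
age_only = gxe_terms(genotype, covariates, ["age"])       # a chosen subset

print(all_terms)
print(age_only)
```

Restricting the interaction to a subset keeps the model smaller, which matters when many covariates are assigned but only a few are of interest as environmental factors.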