Samples or variants in a dataset can be reordered automatically with the options below,
or arranged in a user-specified order. For automatic reordering,
each option takes an argument of either asc (ascending order) or desc (descending order).
Because these ordering schemes compare names as plain text, they can sometimes produce an unexpected result, as shown below.
If a user sorts the above dataset with --sortsample asc and expects the order F1, F2, F10, the result is instead F1, F10, F2.
This happens because the default sort compares names character by character, so F10 sorts before F2 ("1" precedes "2").
Adding --natural produces the intended output.
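The difference between the two orderings can be illustrated with a short Python sketch. This is not the tool's own implementation; the helper name natural_key is hypothetical and only shows the general idea of comparing digit runs as numbers.

```python
import re

def natural_key(name):
    """Split a name into text and digit chunks; digit chunks compare as integers.

    re.split with a capturing group always alternates text, digits, text, ...,
    so chunks at odd positions are the digit runs.
    """
    chunks = re.split(r"(\d+)", name)
    return [int(chunk) if index % 2 else chunk
            for index, chunk in enumerate(chunks)]

samples = ["F10", "F2", "F1"]

# Plain text comparison: "F10" sorts before "F2" because "1" < "2".
plain = sorted(samples)

# Natural comparison: 2 < 10, so "F2" sorts before "F10".
natural = sorted(samples, key=natural_key)

print(plain)    # ['F1', 'F10', 'F2']
print(natural)  # ['F1', 'F2', 'F10']
```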
To reorder samples or variants in a user-specified order,
the --sampleorder and --variantorder options can be used.
Each option requires the path to a file that lists one FID and IID pair (--sampleorder) or one variant name (--variantorder) per line.
The file must list every sample or variant in the dataset;
otherwise an error is raised.
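The completeness requirement can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name apply_order is hypothetical, samples are represented as (FID, IID) tuples, and entries in the order list that are absent from the dataset are silently skipped, which the text above does not specify.

```python
def apply_order(dataset_ids, order_ids):
    """Rearrange dataset_ids to follow order_ids.

    Raises ValueError when the order list does not cover every item in the
    dataset, mirroring the rule that the order file must be complete.
    """
    missing = set(dataset_ids) - set(order_ids)
    if missing:
        raise ValueError(f"order list is missing: {sorted(missing)}")
    known = set(dataset_ids)
    # Assumption: entries not present in the dataset are ignored.
    return [item for item in order_ids if item in known]

dataset = [("F1", "a"), ("F1", "b"), ("F2", "c")]
wanted = [("F2", "c"), ("F1", "a"), ("F1", "b")]
reordered = apply_order(dataset, wanted)
```

The same function works for variants if plain variant names are passed instead of tuples.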
A specified number or proportion of genotypes can be marked as missing with the --nageno option.
If the argument of --nageno is between 0 and 1, it is treated as a proportion.
Any other integer argument is treated as the number of genotypes to mark as missing.
If the argument exceeds the number of existing genotypes, it is clamped to that number.
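The interpretation of the argument can be sketched in Python as below. The function name resolve_missing_count is hypothetical, and two details are assumptions rather than documented behavior: values strictly between 0 and 1 are treated as proportions (so 0 and 1 count as integers), and a proportion is truncated to a whole number of genotypes.

```python
def resolve_missing_count(arg, n_genotypes):
    """Translate a --nageno style argument into a number of genotypes."""
    if arg < 0:
        raise ValueError("argument must not be negative")
    if 0 < arg < 1:
        # Proportion of all genotypes (assumption: truncated, not rounded).
        count = int(arg * n_genotypes)
    elif float(arg).is_integer():
        # Exact number of genotypes.
        count = int(arg)
    else:
        raise ValueError("argument must be a proportion or an integer")
    # Clamp to the number of genotypes that actually exist.
    return min(count, n_genotypes)

print(resolve_missing_count(0.25, 200))  # 50
print(resolve_missing_count(30, 200))    # 30
print(resolve_missing_count(500, 200))   # 200
```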
Nullify selected samples
Another way to generate missing genotypes is --nasamp, an option that nullifies the listed samples.
Here, nullifying means that all genotypes of a sample are set to NA.
Nullify random samples
Random samples can also be nullified with the --randnasamp option.
As with --nageno, the meaning of its argument depends on its value.
When the value is a real number between 0 and 1, it is the proportion of the total number of samples.
When the value is a positive integer lower than the number of samples, it is the exact number of samples.
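A minimal Python sketch of random sample nullification is given below. It is an illustration only: the function name nullify_random_samples and the seed parameter are hypothetical, genotypes are held in a dict from sample ID to a list of calls, None stands in for NA, and a proportion is truncated to a whole number of samples.

```python
import random

def nullify_random_samples(genotypes, value, seed=None):
    """Return a copy of genotypes with randomly chosen samples set to None.

    value: a proportion strictly between 0 and 1, or an exact sample count.
    """
    total = len(genotypes)
    count = int(value * total) if 0 < value < 1 else int(value)
    if count > total:
        raise ValueError("cannot nullify more samples than exist")
    rng = random.Random(seed)
    # Sort the IDs first so a fixed seed always picks the same samples.
    chosen = set(rng.sample(sorted(genotypes), count))
    return {
        sample: [None] * len(calls) if sample in chosen else list(calls)
        for sample, calls in genotypes.items()
    }

data = {"S1": [0, 1, 2], "S2": [2, 2, 0], "S3": [1, 1, 1], "S4": [0, 0, 2]}
half = nullify_random_samples(data, 0.5, seed=1)  # 2 of 4 samples nullified
```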
When using --updvariant, variant names are not matched; the update is based only on the sequence of variants!
Updating individual fields with variant name matching
NOTE!
--updpos/--updgdist/--updname/--updchr can be used at the same time!
Updating allele information
With --updallele, the alleles of variant(s) in the dataset can be updated.
This requires a file that lists the variants to be updated,
their original alleles, and the alleles to replace them with, as below.
To ensure the consistency and integrity of the update, --updallele checks the conditions below.
Whether the file contains exactly the number of columns shown above
Whether the original alleles in the dataset are identical to the original alleles in the --updallele file
Variants that do not appear in the --updallele file keep their original alleles.
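The two checks can be sketched in Python as follows. The row layout used here (variant name, two original alleles, two new alleles) is a hypothetical stand-in chosen for illustration, not the documented file format, and the function name update_alleles is likewise invented. Skipping rows whose variant is absent from the dataset and comparing alleles in order are also assumptions.

```python
EXPECTED_COLUMNS = 5  # hypothetical layout: name, old1, old2, new1, new2

def update_alleles(dataset, rows):
    """Return a copy of dataset (variant -> allele pair) with updates applied."""
    updated = dict(dataset)
    for row in rows:
        # Check 1: every row has exactly the expected number of columns.
        if len(row) != EXPECTED_COLUMNS:
            raise ValueError(f"expected {EXPECTED_COLUMNS} columns, got {len(row)}")
        name, old1, old2, new1, new2 = row
        if name not in dataset:
            continue  # assumption: unknown variants are skipped
        # Check 2: original alleles in the file match those in the dataset.
        if dataset[name] != (old1, old2):
            raise ValueError(f"original alleles do not match for {name}")
        updated[name] = (new1, new2)
    # Variants never mentioned in rows keep their original alleles.
    return updated

variants = {"rs1": ("A", "G"), "rs2": ("C", "T")}
result = update_alleles(variants, [["rs1", "A", "G", "T", "C"]])
```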