Extract fields from sample-specific/genotype columns of a VCF with SnpSift Extract Fields


We are trying to use SnpSift Extract Fields to convert a VCF file to a tabular file. However, after multiple attempts on this and another account, we have not managed to extract the information in the last column of our file, named “I2G_129”. This column contains the genotype information described by the preceding “FORMAT” column, but the output always shows the value 0 instead of the values from the VCF file.

We really don’t understand what’s causing the issue and whether it’s a problem on our side or caused by Galaxy, and would deeply appreciate your kind help on this matter.

Best regards,


Hi @Jorge

If you want to share a small example with 1) the header, 2) a few data lines, and 3) your query, we might be able to help get that corrected, or to report a problem so we can fix it.

See the banner at this site for how to share your work for feedback, or review here directly:

The usage examples in the tool’s help section could be useful for you, @Jorge. Since they are a bit complicated, here’s some additional explanation:
Any columns following the FORMAT column are called “GENOTYPE” columns in VCF jargon. SnpSift lets you reference these columns by their zero-based index, so, for example, GEN[0] refers to the first GENOTYPE column. To refer to one of the keys listed in the FORMAT column (which serves as a legend for all GENOTYPE columns), you’d append the key with a dot (.), for example, GEN[0].GT to refer to the genotype field of the first GENOTYPE column.
More syntax examples are provided in the second usage example of the tool help.
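
To make the indexing concrete, here is a minimal Python sketch of what an expression like GEN[0].GT resolves to. It is only an illustration of the lookup, not how SnpSift is implemented. The function names and the data line are made up for this example, and the GEN[*] form (all GENOTYPE columns at once) is described from my understanding of the SnpSift syntax, so please check it against the tool help.

```python
def genotype_field(vcf_line, sample_index, key):
    """Mimic GEN[sample_index].key for one tab-separated VCF data line."""
    columns = vcf_line.rstrip("\n").split("\t")
    format_keys = columns[8].split(":")   # FORMAT is the 9th VCF column
    genotype_columns = columns[9:]        # every column after FORMAT
    values = genotype_columns[sample_index].split(":")
    # Pair each FORMAT key with the value at the same position.
    return dict(zip(format_keys, values)).get(key)


def genotype_field_all(vcf_line, key):
    """Mimic GEN[*].key: collect the key from every GENOTYPE column."""
    n_samples = len(vcf_line.rstrip("\n").split("\t")) - 9
    return [genotype_field(vcf_line, i, key) for i in range(n_samples)]


# Hypothetical data line with a single GENOTYPE column (e.g. I2G_129).
line = "chr1\t12345\t.\tA\tG\t50\tPASS\tDP=30\tGT:DP:GQ\t0/1:30:99"

print(genotype_field(line, 0, "GT"))    # 0/1
print(genotype_field_all(line, "DP"))   # ['30']
```

The point is that the sample name itself (I2G_129 here) is never used in the expression; the column is addressed by position, and the FORMAT column tells you which keys are available.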


Thank you so much for your valuable comment. Using GEN[*] worked perfectly for our intended use.
I might have overlooked the usage examples in the tool’s help section, but I agree that they are a bit complicated, and I only understood them after reading your comment.
Thanks again for your help!