merge multiple VCF files - variant analysis and sample organization

wm75 · January 13, 2020, 3:52pm

While you may be able to use the tool VCFcombine for that, please note that this is normally not what you would do because:

Starting with separate VCFs for each patient, you typically have no information about a variant site’s status in “unaffected” patients other than that the variant wasn’t called. If a site was judged homozygous reference in a patient, it does not appear in that patient’s output, so there are no stats about it.
Because that information is missing, you generally cannot rely on the INFO column after combining the files. GEMINI, however, relies on INFO column fields for many of the queries you can perform with it.
So unless you know exactly what you’re doing you may get very wrong answers from such queries.
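
To make the problem concrete, here is a minimal Python sketch with made-up toy data. It is not how VCFcombine or GEMINI work internally; the sample names, sites, and the helper functions `naive_merge` and `allele_stats` are all hypothetical. It only illustrates that a sample without a record at a site ends up with a missing genotype after merging, which is not the same as a homozygous reference call.

```python
# Toy per-sample call sets (hypothetical data): site -> genotype.
# Sites judged homozygous reference are simply absent, as they would be
# in a typical single-sample VCF.
per_sample_calls = {
    "patient_A": {("chr1", 1000, "A", "G"): "0/1"},
    "patient_B": {("chr1", 2000, "C", "T"): "1/1"},
    "patient_C": {("chr1", 1000, "A", "G"): "1/1"},
}


def naive_merge(calls):
    """Combine per-sample calls; a sample without a record gets './.'."""
    sites = sorted({site for sample in calls.values() for site in sample})
    samples = sorted(calls)
    return {
        site: {s: calls[s].get(site, "./.") for s in samples}
        for site in sites
    }


def allele_stats(genotypes):
    """Return (AC, AN), counting only alleles that were actually called."""
    alleles = [a for gt in genotypes.values() for a in gt.split("/") if a != "."]
    return alleles.count("1"), len(alleles)


merged = naive_merge(per_sample_calls)
for site, genotypes in merged.items():
    ac, an = allele_stats(genotypes)
    print(site, genotypes, f"AC={ac} AN={an}")
```

At the chr1:1000 site, patient_B comes out as `./.`, so the counts are AC=3 and AN=4. Had patient_B been recorded as homozygous reference (`0/0`), AN would be 6 and the allele frequency 0.5 instead of 0.75. From the merged file alone you cannot tell which of the two is true.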

Assuming that you expect your 40 patients (or subsets of them) to have something in common, joint variant calling, which assesses the data of all samples simultaneously, can increase sensitivity, in particular at low-coverage sites.

For all of these reasons, I would recommend calling variants for all samples (or, at least, the ones that logically should be grouped) together with a tool like freebayes. You can then directly use the resulting multi-sample VCF dataset with GEMINI.
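
In Galaxy you would simply select all the BAM datasets as input to the freebayes tool. For readers who work on the command line, the sketch below assembles the equivalent call in Python. The file names are placeholders I made up, and the helper `build_freebayes_command` is hypothetical; the only freebayes features assumed are the `-f` reference option and the ability to pass several BAM files in one invocation, with the VCF written to standard output.

```python
import shlex


def build_freebayes_command(reference_fasta, bam_files):
    """Build a freebayes call that jointly genotypes all given BAM files."""
    if not bam_files:
        raise ValueError("at least one BAM file is required")
    # One invocation with all BAMs yields one multi-sample VCF on stdout.
    return ["freebayes", "-f", reference_fasta, *bam_files]


# Placeholder file names, purely for illustration.
bams = ["patient_01.bam", "patient_02.bam", "patient_03.bam"]
command = build_freebayes_command("reference.fa", bams)

# Print the shell form; redirect stdout to a file when actually running it.
print(shlex.join(command) + " > joint_calls.vcf")
```

Each BAM file should carry read group information with a distinct sample name, since that is how the samples are told apart in the resulting multi-sample VCF.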

If all this sounds confusing, you may want to have a look at this tutorial, which illustrates joint variant analysis for a family trio.
