download gnomad vcf.bgz.tbi dataset into galaxy

Gemini can currently only build one database per VCF, so to analyze several samples together you need to produce a multisample VCF first.

As I tried to explain in "merge multiple VCF files - variant analysis and sample organization", whether that makes sense depends strongly on what you’re analyzing.

As a rule of thumb:

  • if you’re trying to answer one question using several samples, you should probably use a variant caller like freebayes to produce a single VCF from the multiple BAM files representing your samples.
    An example would be trying to find causative variants in families with a common genetic disease.
  • if samples don’t relate to a common question, keep them separate throughout your analysis and build separate gemini databases for them. If, e.g., you’re analyzing one family trio to study one genetic disease, and two other family trios for a different disease, do multisample variant calling for the first trio, do a separate multisample variant calling run for the other two trios, and keep the analysis of the two resulting VCFs separate.

In other words, if you have many samples representing, e.g., different patients with lots of different diseases, tumors, etc., analyze these cases individually using a workflow.
If you have one big group of patients in which you study the same disease, use joint variant calling to produce one big VCF of all of their mutations, then feed that to gemini.
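
To make the grouping concrete, here is a minimal Python sketch of the rule of thumb above: one joint freebayes call and one gemini database per study question. It only builds the command lines and does not run anything. The function name plan_commands, the group names and all file names are made up for illustration. The sketch also assumes you run the tools on the command line rather than through Galaxy, and it skips the normalization and annotation steps you would normally run on the VCF before loading it.

```python
from typing import Dict, List


def plan_commands(groups: Dict[str, List[str]], reference: str) -> Dict[str, List[List[str]]]:
    """For each study question, plan one joint freebayes call and one gemini load.

    `groups` maps a (hypothetical) group name to the BAM files of all samples
    that belong to that question. Nothing is executed here.
    """
    plan = {}
    for name, bams in groups.items():
        vcf = f"{name}.vcf"
        db = f"{name}.db"

        # One freebayes run over all BAMs of the group -> one multisample VCF.
        call = ["freebayes", "-f", reference]
        for bam in bams:
            call += ["-b", bam]
        call += ["-v", vcf]

        # One gemini database per VCF, i.e. per study question.
        load = ["gemini", "load", "-v", vcf, db]

        plan[name] = [call, load]
    return plan


# Hypothetical example: one trio for disease A, two trios for disease B.
groups = {
    "disease_a": ["a_father.bam", "a_mother.bam", "a_child.bam"],
    "disease_b": [
        "b1_father.bam", "b1_mother.bam", "b1_child.bam",
        "b2_father.bam", "b2_mother.bam", "b2_child.bam",
    ],
}

for name, commands in plan_commands(groups, "reference.fa").items():
    for cmd in commands:
        print(" ".join(cmd))
```

This gives you two VCFs and two gemini databases, one per disease, which you then analyze separately. For family-based analyses you would typically also pass a PED file to gemini load; that is left out here to keep the sketch short.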