Importing 1000 Genomes Project VCF files into Galaxy to concatenate the per-chromosome VCFs into one VCF per sample

Hello everyone,

I’m looking to import the 30X WGS 1000 Genomes VCF files directly into Galaxy, without having to download them individually. My goal is to create a control group of 200 VCF files that I can compare against my samples. Currently, 1000 Genomes only offers the samples as individual per-chromosome VCF files, not concatenated together.

If anyone knows how to do this, or can suggest a better alternative, please let me know. Thank you.

Hi @screadore, are those the files that you need? Phase 3 VCF files

Regards

Yes they are. Thank you.


Hi @screadore,
I have created some tabular datasets containing the information required to download the VCF files with the rule-based uploader tool.

The datasets have been distributed in six files, according to the filenames.
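For anyone who wants to build a similar table themselves, here is a minimal Python sketch. It is not the exact layout of the files above: `BASE_URL`, `FILENAME_TEMPLATE`, and the three-column layout (URL, chromosome, file name) are hypothetical placeholders that you would replace with the real download location and naming pattern.

```python
import csv
import io

# Hypothetical placeholders: replace with the real location and naming pattern.
BASE_URL = "https://example.org/1000g"
FILENAME_TEMPLATE = "chr{chrom}.example.vcf.gz"


def build_rows(chromosomes, base_url=BASE_URL, template=FILENAME_TEMPLATE):
    """Return one (url, chromosome, file name) tuple per chromosome."""
    rows = []
    for chrom in chromosomes:
        name = template.format(chrom=chrom)
        rows.append((f"{base_url}/{name}", f"chr{chrom}", name))
    return rows


def to_tsv(rows):
    """Serialize the rows as tab-separated text, one row per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


# Autosomes 1-22 plus X, as an example selection.
chromosomes = [str(n) for n in range(1, 23)] + ["X"]
table = to_tsv(build_rows(chromosomes))
print(table)
```

The resulting text can be pasted into (or uploaded for) the rule-based uploader, where the columns can then be assigned as the URL and the identifiers.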

Using the different fields you can easily organize the datasets into collections, and later merge the elements of the collections into a single VCF file. Let me know if you have any additional questions.
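To illustrate what the merge step does conceptually, here is a rough Python sketch, not the Galaxy tool itself. It assumes uncompressed, plain-text VCF files that all share the same sample columns and are passed in the desired chromosome order; `concat_vcfs` is a hypothetical helper name. It keeps only the first file's header, so header lines that exist only in later files (for example contig definitions) would be lost. For real data a dedicated tool such as `bcftools concat` is the safer choice.

```python
import os
import tempfile


def concat_vcfs(paths, out_path):
    """Write the header of the first VCF, then the records of every VCF in order."""
    with open(out_path, "w") as out:
        for index, path in enumerate(paths):
            with open(path) as handle:
                for line in handle:
                    if line.startswith("#"):
                        # Header lines are copied from the first file only.
                        if index == 0:
                            out.write(line)
                    else:
                        out.write(line)


# Tiny made-up demo files, just to show the behaviour.
header = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
with tempfile.TemporaryDirectory() as tmp:
    chr1_path = os.path.join(tmp, "chr1.vcf")
    chr2_path = os.path.join(tmp, "chr2.vcf")
    with open(chr1_path, "w") as handle:
        handle.write(header + "chr1\t100\t.\tA\tG\t.\tPASS\t.\n")
    with open(chr2_path, "w") as handle:
        handle.write(header + "chr2\t200\t.\tC\tT\t.\tPASS\t.\n")

    merged_path = os.path.join(tmp, "merged.vcf")
    concat_vcfs([chr1_path, chr2_path], merged_path)
    with open(merged_path) as handle:
        result = handle.read()

print(result)
```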

Regards