Upload data size limits and VCF datasets in a Docker Galaxy

Hi @jennaj,

I have a very similar question. If I am running Galaxy in Docker on my own server, can I upload files larger than 50 GB? I would like to use SnpSift to annotate data with the gnomAD database (Exomes: 58.81 GiB, Genomes: 460.93 GiB). Which input format should I pick when I have a vcf.bgz or vcf.gz file? Would I use vcf_bgzip?

Thanks for the information.


For “choose local file” or “paste URL”: probably, but you’ll need to test it out.

Let the Upload tool decide by using “autodetect”. Those are two completely different datatypes – the first is a bundle of files, the second a single file.
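
If you are not sure which of the two you actually have, the gzip header will tell you: bgzip-compressed files are regular gzip files that carry an extra “BC” subfield in their header. A quick local check, just a sketch in Python (the file name is a placeholder for your download):

```python
# Sketch: tell apart bgzip (BGZF) from plain gzip by inspecting the gzip header.
# The path below is a placeholder -- point it at your downloaded gnomAD file.

def compression_kind(path):
    with open(path, "rb") as fh:
        header = fh.read(18)
    if header[:2] != b"\x1f\x8b":
        return "not gzip-compressed (possibly already a plain-text .vcf)"
    # BGZF sets the FEXTRA flag and stores a "BC" extra subfield (SAM/BGZF spec).
    if len(header) >= 14 and header[3] & 0x04 and header[12:14] == b"BC":
        return "bgzip (BGZF) -- typical for .vcf.bgz files"
    return "plain gzip -- typical for .vcf.gz files"

print(compression_kind("gnomad.exomes.sites.vcf.bgz"))
```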

Hi @jennaj,

Thanks for your fast answer. I will test it out. How can I be sure that the vcf.bgz or vcf.gz file will be unpacked when I upload it with autodetect? In order to use the VCF files with SnpSift, they need to be unpacked (.vcf). Can I just change the datatype to vcf afterwards so that I can use them for further analysis?

Thanks!

Hi @jennaj,
I just read in another topic that @wm75 recommends setting the datatype to vcf when loading the files, so that they get unpacked during upload (How/ where can I download Annotation exac03(hg19) database and import it to Galaxy? - #2 by wm75). So I guess this is better than my idea of using the pencil icon after the upload to change the datatype to vcf? Can I somehow check whether my data actually got decompressed?
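
One idea I had for checking this myself: use BioBlend to do the upload with the datatype set to vcf, and then look at the datatype and file size that Galaxy registered for the resulting dataset. A rough, untested sketch (the URL, API key, history id and file name are placeholders, and the exact keys in the API response may differ between Galaxy versions):

```python
# Rough sketch with BioBlend (pip install bioblend).
# URL, API key and history id below are placeholders for my own Docker Galaxy.
from bioblend.galaxy import GalaxyInstance

gi = GalaxyInstance(url="http://localhost:8080", key="MY_API_KEY")

# Upload by URL and ask Galaxy to treat it as 'vcf', which should trigger
# decompression during upload (as recommended above).
upload = gi.tools.put_url(
    "https://example.org/gnomad.exomes.sites.vcf.bgz",  # placeholder URL
    history_id="HISTORY_ID",
    file_type="vcf",
)
dataset_id = upload["outputs"][0]["id"]

# Later, inspect what Galaxy actually registered for the dataset.
info = gi.datasets.show_dataset(dataset_id)
print(info.get("extension"), info.get("file_size"), info.get("state"))
```

If the registered extension is vcf and the file size is much larger than the compressed download, the upload must have decompressed it. But maybe there is an easier way to see this directly in the Galaxy interface?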
Thank you ever so much!
All the best,
Rose