Bam input file is not available for the training session 'Calling variants in diploid systems'

Hello.
It seems the BAM file (GIAB-Ashkenazim-Trio-hg19.gz) in the training session ‘Calling variants in diploid systems’ cannot be processed for variant calling.

I also used Samtools to convert this bam file to a sam file, and got the error:
[W::bam_hdr_read] EOF marker is absent. The input is probably truncated

Also, the BAM file is extremely small, only 52 KB.

Hi @wcq848677

The BAM is downsampled so it will run quickly when running through the tutorial.

Here is a history with that particular BAM loaded (size 51.2 KB), and the content is as expected: Galaxy | Accessible History | tutorial variants. I also ran BAM-to-SAM in that history; it worked and produced the right result.

When you run FreeBayes, input the BAM directly; there is no need to convert to SAM. If you want to see the content of the BAM, it is small enough that Galaxy can display the uncompressed “SAM” view (use the “eye” icon for the dataset).

Maybe you got a corrupted version? Try pasting just this link into the Upload tool in Galaxy. Leave all the other settings at default. https://zenodo.org/record/60520/files/GIAB-Ashkenazim-Trio-hg19.gz
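If you want to check a local copy before uploading, here is a minimal Python sketch that tests for the marker mentioned in the samtools warning quoted above. It assumes the file is a normal BGZF-compressed BAM, which per the SAM/BAM specification ends with a fixed 28-byte empty block. The function name `has_bgzf_eof` is just an illustrative helper and is not part of samtools or Galaxy.

```python
import os

# 28-byte empty BGZF block that terminates a complete BAM file (SAM/BAM specification).
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00 "
    "1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file marker.

    Hypothetical helper for illustration; it only inspects the last 28 bytes
    and does not validate the rest of the file.
    """
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Jump to 28 bytes before the end and compare the tail with the marker.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

Calling `has_bgzf_eof("GIAB-Ashkenazim-Trio-hg19.gz")` on your download should return `True` for an intact copy. A `False` result suggests the file was truncated or altered somewhere between the source and your disk.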

You can also make a copy of that history to work with it. I’ll leave it shared for this week.

Tutorial: Calling variants in diploid systems

Note: You marked this as being “galaxy-local” and as if working at UseGalaxy.org.

  • If you are actually working at UseGalaxy.org, UseGalaxy.eu or UseGalaxy.org.au, this will work.
  • If you are trying to run the tutorial in a Local or Docker Galaxy on your own computer, other things may be going wrong, even if the downloaded BAM is the right size. Please explain a bit more about where you sourced Galaxy and what you have done to set up your server. Include details – is this the first dataset you have tried to load to that server? The first BAM dataset?

Hope that helps!

Hi Jennifer

Thank you so much for your reply. You are right: it is running now that I have pasted the link you provided.
I guess the problem I encountered was that the .gz file was automatically decompressed when I downloaded it on my MacBook. Do you know the reason behind this?

Anyway, I appreciate your help! It was quite useful.

Charley


Galaxy will automatically uncompress some datatypes. It is usually best to allow Galaxy to detect the datatype for any uploaded data. If the guess is incorrect, then you can explore potential format problems and fix them (common) or directly assign the datatype after upload if needed (rare). FAQs.
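To see whether a local copy was changed during download, you can look at its first few bytes. The sketch below is a heuristic, not a full validator. It assumes that gzip data starts with the bytes `1f 8b`, that BGZF blocks (the compression used by BAM) carry a "BC" extra subfield right after the gzip header, and that a decompressed BAM payload starts with `BAM\1`. The function name `sniff_compression` and its return labels are made up for illustration. That a browser unpacked the file on download is only a guess about the cause here and is not confirmed in this thread.

```python
def sniff_compression(path):
    """Classify a file by its leading bytes.

    Returns one of "bgzf", "gzip", "uncompressed-bam", or "unknown".
    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as handle:
        head = handle.read(16)
    if head[:2] == b"\x1f\x8b":
        # BGZF sets the FEXTRA flag (0x04) and normally places a "BC"
        # subfield at bytes 12-13 of the header.
        if head[3:4] == b"\x04" and head[12:14] == b"BC":
            return "bgzf"
        return "gzip"
    if head[:4] == b"BAM\x01":
        # Magic bytes of BAM data after the BGZF layer has been removed.
        return "uncompressed-bam"
    return "unknown"
```

For the tutorial file, a result of "bgzf" would mean the local copy is still a compressed BAM, while "uncompressed-bam" would suggest something unpacked it before it reached Galaxy.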