Comments on displaying VCF and GTF with IGV from Galaxy


Hi there. I have mapped some samples to the M. tuberculosis H37Rv genome (ASM19595v2, downloaded from Ensembl Bacteria). I would now like to visualise those in IGV (v. 2.5.2 on Ubuntu 19.04, using the Java 11 that comes with the IGV download). All of my samples are associated with the mycoTube_H37RV dbkey, and I have made an all_fasta table entry for the H37Rv genome using the Create DBKey and Reference Genome tool (using the existing mycoTube_H37RV dbkey). I also locally created a corresponding genome in IGV.

If I load the VCF without creating a local version of the genome, IGV informs me that there is no genome matching mycoTube_H37RV. So I conclude that, despite my adding the genome to the all_fasta table, it is not provided to IGV? I then added a locally created genome file with the key mycoTube_H37RV and a chromosome name matching the one used in my variant calling, which resolved that error.

My variants in Galaxy are part of a collection labelled by sample ID e.g. ERR2099775. However, when displayed in IGV they have names like snippy on data 99 etc. Is there a way to get these files to use the element labels from the collection they came from instead?

Secondly, when I try and load the GTF (also from Ensembl Bacteria, i.e. the annotation corresponding to this genome) I get the error “Error loading http://galaxy.sanbi.ac.za/display_application/a9321980c8cfd5ba/igv_gff/local_default/eb5abf90ef0d6e00/data/galaxy_a9321980c8cfd5ba.gtf: Unable to read index file, for input source: http://galaxy.sanbi.ac.za/display_application/a9321980c8cfd5ba/igv_gff/local_default/eb5abf90ef0d6e00/data/galaxy_a9321980c8cfd5ba.gtf” - it seems IGV requires an index for the track and Galaxy is not creating this? Is there some preparation that I can do on the Galaxy side that will create an index? When I load GTF files locally using IGV no index is required.

Thanks,
Peter


Hi Peter,

Sounds like IGV is expecting some data that Galaxy doesn’t provide. You might just need to try loading the data in IGV again: there is likely some background process in IGV that indexes the data upon upload.

BUT, that said, if you think Galaxy could be improved to address this (create whatever “index” IGV needs), I’d suggest opening a ticket against the galaxyproject/galaxy repository and seeing what feedback it gets.

Update: this is with respect to the GTF data.
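
One thing that might be worth checking before re-loading: as far as I know, igvtools expects a feature file to be sorted by sequence name and start position before it can build an index for it. Below is a minimal sketch, not anything that Galaxy or IGV ships, of sorting GTF records that way. The function name `sort_gtf_lines` and the two example records are made up for illustration, and whether sorting alone resolves the "Unable to read index file" error is an assumption I have not verified.

```python
def sort_gtf_lines(lines):
    """Return GTF lines with comment lines first, then features
    sorted by (sequence name, start position)."""
    comments = []
    features = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            comments.append(line)
        else:
            features.append(line)

    def sort_key(line):
        fields = line.split("\t")
        # GTF column 1 is the sequence name, column 4 is the 1-based start
        return (fields[0], int(fields[3]))

    return comments + sorted(features, key=sort_key)


# Hypothetical records, for illustration only
example = [
    "#!genome-build ASM19595v2",
    'Chromosome\texample\tgene\t5000\t6000\t.\t+\t.\tgene_id "geneB";',
    'Chromosome\texample\tgene\t100\t900\t.\t+\t.\tgene_id "geneA";',
]

for row in sort_gtf_lines(example):
    print(row)
```

The sorted file could then be uploaded to Galaxy in place of the original GTF and sent to IGV again.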

On the dataset names shown in IGV: I don’t think that is something Galaxy can control as-is. Maybe open another enhancement ticket? Perhaps the output from Galaxy could be renamed when sent to IGV based on collection metadata.

On the genome not being provided to IGV: I don’t think there is a workaround on the Galaxy side. We’ve had feedback from the IGV developers before that creating a custom genome in IGV is how to get a genome fully set up (the actual genomic bases in the view, et cetera).
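
When setting up the custom genome, the chromosome names in the VCF need to match the sequence names in the genome FASTA, as Peter found. Here is a rough sketch of how that could be checked before loading anything into IGV. The function names and the example lines are hypothetical, and it only looks at plain-text (uncompressed) FASTA and VCF content.

```python
def fasta_sequence_names(lines):
    """Collect sequence names: the first word after '>' on each FASTA header."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def vcf_chromosome_names(lines):
    """Collect the CHROM column (column 1) of every VCF data line."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def unmatched_chromosomes(fasta_lines, vcf_lines):
    """Return VCF chromosome names that have no matching FASTA sequence."""
    return sorted(vcf_chromosome_names(vcf_lines) - fasta_sequence_names(fasta_lines))


# Hypothetical content, for illustration only
fasta = [">Chromosome example description", "ACGTACGT"]
vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "Chromosome\t3\t.\tG\tA\t.\t.\t.",
    "other_name\t5\t.\tA\tT\t.\t.\t.",
]

print(unmatched_chromosomes(fasta, vcf))
```

An empty list would suggest the names line up; anything listed would need renaming on one side or the other.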


Thanks @jennaj. I see @mvdbeek started working on this in https://github.com/galaxyproject/galaxy/pull/4027 which was subsequently closed. This was discussed in their comment https://github.com/galaxyproject/galaxy/issues/4029. I’ll comment on that issue.

Super, thanks Peter