Opening Unicycler assemblies with IGV local

vanina.guernier · December 12, 2019, 10:42pm

Hi
I have been using Unicycler for assemblies and have been following this tutorial:

However, when I try to open my assembly using “display with IGV local” (figure 14 of the tutorial) I get an error message:

It seems to say that the fasta file is available but the fai file is missing.
I do not have any jobs still running and all my assemblies are showing as green, so I don’t see why it says “additional datasets require to be generated”.
Did I miss a step when trying to use IGV local?

Thanks for help
Vanina

jennaj · December 13, 2019, 6:03pm

Hi @vanina.guernier

An index for IGV is being created at this preparatory step. This is another “job”, distinct from the assembly or other jobs run in the History. Your Unicycler job created a fasta output (without an index). Display in IGV requires an index, and this step is where it gets built.

When “Dataset Status” states “new”, then that input (a fa.fai index in your case) is still being generated.

When “Ready” is true for both, the data can be visualized.

How long this prep step takes depends on the size/content of the original dataset and on how busy the server is.

vanina.guernier · December 13, 2019, 6:56pm

Hi
So if I understand properly, the only way for me to know if both files are available is to check regularly using the IGV local button?
I thought that an analysis shown in green meant that it was all done, but that’s not true for index files?
I’ll just wait then, Thanks

jennaj · December 14, 2019, 2:46am

The assembly is done, but moving the data into any 3rd party external application can require some reformatting or the creation of additional files that the other application is expecting – or they can require none – it depends on the external application.

In this case (IGV), a fasta index is required for new genomes. This would be true even if you loaded the fasta directly into IGV as a new genome (instead of loading it from Galaxy). The web-based “Galaxy-to-Local IGV” loading for fasta data is a convenience feature in Galaxy: it saves you from creating the fasta index as a distinct step, downloading the fasta and fasta index, and then loading both into IGV yourself.

To avoid this step, you could pre-load your assembly as a new genome in IGV. Then promote the assembly in Galaxy to a “Custom Build”. Use the same exact “database” aka “dbkey” name in both places, and assign that “database” to datasets you wish to visualize in IGV against that new genome/assembly.

Any fasta file can be used directly as a “Custom Genome” with tools wrapped in Galaxy natively, without any fasta or tool-specific indexing (that is done at job runtime, when needed). But, sometimes promoting a fasta/Custom genome to a “Custom Build” is helpful for other reasons – there are tools in Galaxy that also require an assigned “database” aka “dbkey”.

How to create/use a Custom Genome/Build in Galaxy:

Preparing and using a Custom Reference Genome or Build

IGV is a different application. The methods to create a “New Genome” in your own IGV are below. The preparation steps will require that you index the fasta with samtools as part of that process. Or you can create the fasta index in Galaxy and download that along with the fasta.

IGV: https://software.broadinstitute.org/software/igv/LoadGenome
Creating a fasta index in Galaxy that can be downloaded as a distinct dataset: click into the pencil-icon for a fasta dataset to reach the Edit Attributes forms. On the tab for “Convert”, pick the option to create an index:

Note: Certain other datatypes already have an attached index. An example is BAM data. The download icon for BAM data will offer two files: the .bam (mapping results) and the .bam.bai (index).
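
In case it helps to see what the fasta index actually contains: a .fai file is a small tab-separated text file with one line per sequence (name, length, byte offset of the first base, bases per line, bytes per line). The usual way to make one is `samtools faidx` on the fasta. The Python below is only a minimal sketch of the same idea, not what Galaxy or IGV run internally. The function names `build_fai` and `write_fai` are made up for illustration, and it assumes every sequence uses a consistent line length and that the file ends with a newline.

```python
def build_fai(fasta_path):
    """Return one (name, length, offset, linebases, linewidth) tuple per sequence."""
    records = []
    name = None
    length = offset = linebases = linewidth = 0
    pos = 0  # byte position of the start of the current line

    with open(fasta_path, "rb") as handle:
        for raw in handle:
            if raw.startswith(b">"):
                # A new header closes the previous record, if there was one.
                if name is not None:
                    records.append((name, length, offset, linebases, linewidth))
                fields = raw[1:].split()
                # The index keeps only the header text up to the first whitespace.
                name = fields[0].decode() if fields else ""
                # The first base starts right after the header line.
                offset = pos + len(raw)
                length = linebases = linewidth = 0
            elif name is not None:
                bases = raw.rstrip(b"\r\n")
                if bases:
                    if linebases == 0:
                        linebases = len(bases)  # bases per line
                        linewidth = len(raw)    # bytes per line, newline included
                    length += len(bases)
            pos += len(raw)

    if name is not None:
        records.append((name, length, offset, linebases, linewidth))
    return records


def write_fai(fasta_path):
    """Write <fasta_path>.fai next to the fasta and return the index path."""
    fai_path = fasta_path + ".fai"
    with open(fai_path, "w") as out:
        for record in build_fai(fasta_path):
            out.write("\t".join(str(field) for field in record) + "\n")
    return fai_path
```

Calling `write_fai("assembly.fasta")` (a placeholder filename) would leave `assembly.fasta.fai` beside the fasta, which is the pair of files IGV expects when you load a new genome. For real data I would still use samtools or the Galaxy “Convert” option, since those also validate the file.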

Whether or not to spend the extra effort to directly index your fasta in IGV as a new genome and create a Custom Build in Galaxy so that you can assign that custom “database” to other datasets is up to you. That will somewhat depend on how often you plan to use it as a reference genome for other analyses.

Hope that helps to explain the options
