Opening Unicycler assemblies in a local IGV

The assembly is done, but moving the data into a third-party application can require some reformatting, or additional files that the application expects. Some applications need nothing extra; it depends on the application.

In this case (IGV), a fasta index is required for new genomes. That is true even if you load the fasta into IGV yourself as a new genome (instead of loading it from Galaxy). The web-based “Galaxy-to-Local IGV” loading of fasta data is a convenience feature in Galaxy: it spares you from creating the fasta index as a separate step, downloading the fasta and the fasta index, and then loading both into IGV yourself.

To avoid this step, you could pre-load your assembly as a new genome in IGV, then promote the assembly in Galaxy to a “Custom Build”. Use exactly the same “database” (aka “dbkey”) name in both places, and assign that “database” to the datasets you wish to visualize in IGV against that new genome/assembly.

Any fasta file can be used directly as a “Custom Genome” with tools wrapped natively in Galaxy, with no separate fasta or tool-specific indexing step (indexing is done at job runtime, when needed). However, promoting a fasta/Custom Genome to a “Custom Build” is sometimes helpful for other reasons: some tools in Galaxy require an assigned “database” (aka “dbkey”).

How to create/use a Custom Genome/Build in Galaxy:

IGV is a different application. The methods to create a “New Genome” in your own IGV are below. The preparation steps require that you index the fasta with samtools as part of that process. Alternatively, you can create the fasta index in Galaxy and download it along with the fasta.
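For anyone curious what that fasta index actually contains, here is a minimal Python sketch that writes a `.fai` file in the five-column layout that `samtools faidx` produces (name, length, byte offset, bases per line, bytes per line). It is only an illustration, not a replacement for samtools: the function names `build_fai` and `write_fai` are made up for this example, and it assumes a well-formed fasta (every sequence wrapped at a consistent line width, file ending with a newline) without validating any of that.

```python
from pathlib import Path


def build_fai(fasta_path):
    """Return (name, length, offset, linebases, linewidth) for each sequence.

    Hypothetical helper: assumes consistent line wrapping, does no validation.
    """
    records = []
    name = None
    length = offset = linebases = linewidth = 0
    pos = 0  # byte position where the current line starts
    with open(fasta_path, "rb") as handle:
        for raw in handle:
            if raw.startswith(b">"):
                if name is not None:
                    records.append((name, length, offset, linebases, linewidth))
                # The sequence name is the header text up to the first whitespace.
                name = raw[1:].split()[0].decode()
                length = linebases = linewidth = 0
                offset = pos + len(raw)  # first base starts right after the header
            elif name is not None and raw.strip():
                bases = len(raw.rstrip(b"\r\n"))
                if linebases == 0:
                    # The first sequence line defines the wrapping for this record.
                    linebases, linewidth = bases, len(raw)
                length += bases
            pos += len(raw)
    if name is not None:
        records.append((name, length, offset, linebases, linewidth))
    return records


def write_fai(fasta_path):
    """Write '<fasta>.fai' next to the fasta and return its path."""
    fai_path = Path(str(fasta_path) + ".fai")
    lines = ["\t".join(str(field) for field in rec) for rec in build_fai(fasta_path)]
    fai_path.write_text("\n".join(lines) + "\n")
    return fai_path
```

Calling `write_fai("assembly.fasta")` (a placeholder filename) would leave `assembly.fasta.fai` beside the fasta, which is the naming that tools reading the index generally expect. For real data, running samtools or creating the index in Galaxy remains the safer route.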

  • Note: Certain other datatypes already have an attached index. BAM is one example: the download icon for a BAM dataset offers two files, the .bam (mapping results) and the .bam.bai (index).
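If you download BAM data for local use, a quick check that the index ended up next to the BAM can save some confusion. The sketch below is a hypothetical helper (`find_bam_index` is not part of any tool). It assumes the index sits in the same folder and is named either `sample.bam.bai` or `sample.bai`, which to my understanding are the two names IGV looks for.

```python
from pathlib import Path


def find_bam_index(bam_path):
    """Return the path of the index beside a BAM file, or None if it is missing.

    Assumption: the index is named '<name>.bam.bai' or '<name>.bai'.
    """
    bam = Path(bam_path)
    candidates = [Path(str(bam) + ".bai"), bam.with_suffix(".bai")]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

A result of `None` means the `.bam.bai` still needs to be downloaded (or regenerated) and placed alongside the `.bam`.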

Whether to spend the extra effort to index your fasta and load it into IGV as a new genome, and to create a Custom Build in Galaxy so that you can assign that custom “database” to other datasets, is up to you. It depends partly on how often you plan to use the assembly as a reference genome for other analyses.

Hope that helps to explain the options :slight_smile: