BAM index, fasta indexes, display applications, custom genome builds, IGV

Prabhsimran_Singh · September 22, 2025, 7:10am

I sequenced my plasmid, which is approximately 15.2 kb, and obtained the .bed file. I would like to create a .bam.bai file to visualize the alignment in IGV. Could you please help me obtain this file? Please give me a step-by-step guide; I don’t know how to do this.

igor · September 22, 2025, 7:15am

Hi @Prabhsimran_Singh,

If you map reads (create a bam file) in Galaxy, the bam.bai index file will be created automatically. However, the plasmid size, 15.2 kb, might be too small for standard alignment tools.

Check this tutorial: Hands-on: Mapping / Mapping / Sequence analysis

It uses a built-in genome for read mapping, but all mapping tools can take a custom genome (a fasta file in the user’s history). When setting up the mapping job, change Source of genome from built-in to From history.

Kind regards,

Igor

Prabhsimran_Singh · September 22, 2025, 5:50pm

Okay, I have the following files after sequencing: .bed, .gbk, .maf, .maf.index, .tsv, .fasta, and .fastq. I want to align the whole-plasmid sequencing result against my designed plasmid, which is around 15.23 kb.

I tried making a .bam.bai file with the Bowtie tool. It gives me .bam and .bai files (two separate files). After this, I do not know how to create the .bam.bai file. Please give me a step-by-step guide.

jennaj · September 22, 2025, 7:13pm

Hi @Prabhsimran_Singh

If you already mapped with Bowtie in Galaxy, then @igor’s advice applies.

The output.bam will be in your history as a dataset. The output.bam.bai index is not displayed as a separate dataset, but it is part of this compound dataset. Both files already exist, and you can download them with the disc icon if you want to use this data outside of Galaxy. Click on that disc icon and you’ll see the choice for both files. Download them separately to get both.

Screenshot of a bam dataset with the disc icon activated
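If you go the download route, a small sketch like the one below may help with the ".bam.bai" naming. To my knowledge, IGV looks for the index in the same folder as the BAM, named either `<name>.bam.bai` or `<name>.bai`, so the two downloaded files only need matching names. The function name `pair_bam_index` and the example file names are hypothetical, not something Galaxy or IGV provides.

```python
from pathlib import Path
import shutil


def pair_bam_index(bam_path, bai_path):
    """Copy the index next to the BAM as '<name>.bam.bai' and return the new path."""
    bam = Path(bam_path)
    bai = Path(bai_path)
    if not bam.is_file() or not bai.is_file():
        raise FileNotFoundError("both the BAM and its index must exist")
    # e.g. plasmid.bam -> plasmid.bam.bai, in the same folder as the BAM
    target = bam.with_name(bam.name + ".bai")
    if bai.resolve() != target.resolve():
        shutil.copyfile(bai, target)
    return target


# Hypothetical usage with files downloaded from a Galaxy history:
# pair_bam_index("Downloads/plasmid.bam", "Downloads/plasmid_index.bai")
```

The index is copied rather than moved, so the original download stays untouched if something goes wrong.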

Now, you also have more choices!

Single file hosting for display applications WITHOUT genomic fasta indexes

To view just that single bam output in IGV, you can click on the visualize icon to access display applications. This still uses Galaxy as the data host, but when there is no database assignment, a single on-demand generic fasta index will be created for transfer over to display applications.

WARNINGS:

There will be no genomic DNA sequence reference included because there is no fasta/fasta.fai index attached to the dataset!
Additional files cannot be loaded into the same display.

Screenshot of a bam dataset with the visualize icon activated, WITHOUT a database assigned
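For context on what a fasta/fasta.fai index holds, here is a minimal sketch that builds the five tab-separated columns used by the faidx format (name, length, byte offset of the first base, bases per line, bytes per line). It is for illustration only: Galaxy and IGV create this index for you, and the sketch assumes every sequence line within a record has the same width except the last. The function name `build_fai_lines` is hypothetical.

```python
def build_fai_lines(fasta_path):
    """Return faidx-style records for a FASTA file, one tab-separated line per sequence."""
    records = []
    name = None
    length = offset = linebases = linewidth = 0
    position = 0  # byte position where the current line starts
    with open(fasta_path, "rb") as handle:
        for raw in handle:
            if raw.startswith(b">"):
                if name is not None:
                    records.append((name, length, offset, linebases, linewidth))
                # The sequence name is the header text up to the first whitespace
                name = raw[1:].split()[0].decode()
                length = linebases = linewidth = 0
                offset = position + len(raw)  # first base follows the header line
            elif name is not None and raw.strip():
                bases = len(raw.rstrip(b"\r\n"))
                if linebases == 0:
                    # The first sequence line defines the line layout
                    linebases, linewidth = bases, len(raw)
                length += bases
            position += len(raw)
    if name is not None:
        records.append((name, length, offset, linebases, linewidth))
    return ["\t".join(str(field) for field in record) for record in records]
```

The first two columns (name and length) are the part that matters most here: they tell a viewer which sequence the alignments refer to and how long it is.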

Multiple file hosting for display applications WITH genomic fasta indexes

To view multiple datasets all together in a display application, you can also click on the visualize icon to access display applications. This uses Galaxy as the data host, and when there is a database assignment, this attaches your genome’s specific fasta index for transfer over to display applications.

BENEFITS:

Genome DNA backbone included in the display.
Any dataset sharing the same database assignment can be loaded into the same display.
Native database keys can be assigned and Custom database dbkeys can be created and assigned. Both will work the same!

Not sure how to create and assign a Custom database?

FAQ: How to use Custom Reference Genomes?
Many examples are at this forum under the tags custom-genome and custom-build

Screenshot of a bam dataset with the visualize icon activated, WITH a database assigned

What to do?

Since you have a genome fasta file already, creating a custom database build key, then assigning it to your datasets seems like a good choice! This will allow you to load all of the data into a local IGV application and view everything together.

Now, with IGV, you will also need to set up your custom genome! If you already have IGV set up with your custom genome, just make sure to label your custom genome in Galaxy the same way. The database “dbkey” label must be the same everywhere to instruct the applications to use the same fasta index. Avoid mixing up assemblies here or expect problems with the data coordinates.
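One way to check for a mixed-up assembly before loading anything is to compare the sequence names and lengths recorded in the BAM header with those in the fasta. The sketch below does this with the Python standard library only, relying on BAM being a series of gzip blocks with a binary header as laid out in the SAM/BAM specification. The function names are hypothetical, and this is a rough check under those assumptions, not a replacement for samtools.

```python
import gzip
import struct


def bam_references(bam_path):
    """Read {name: length} for the reference sequences listed in a BAM header."""
    with gzip.open(bam_path, "rb") as handle:
        if handle.read(4) != b"BAM\x01":
            raise ValueError("not a BAM file")
        (l_text,) = struct.unpack("<I", handle.read(4))
        handle.read(l_text)  # skip the plain-text SAM header
        (n_ref,) = struct.unpack("<I", handle.read(4))
        refs = {}
        for _ in range(n_ref):
            (l_name,) = struct.unpack("<I", handle.read(4))
            name = handle.read(l_name).rstrip(b"\x00").decode()
            (l_ref,) = struct.unpack("<I", handle.read(4))
            refs[name] = l_ref
        return refs


def fasta_lengths(fasta_path):
    """Read {name: length} from a FASTA file; names end at the first whitespace."""
    lengths = {}
    name = None
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:].split()[0]
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def reference_mismatches(bam_path, fasta_path):
    """Return {name: (bam_length, fasta_length)} for references that disagree."""
    fasta = fasta_lengths(fasta_path)
    return {
        name: (length, fasta.get(name))
        for name, length in bam_references(bam_path).items()
        if fasta.get(name) != length
    }
```

An empty result means every reference named in the BAM is present in the fasta with the same length. A fasta length of `None` means that name is missing from the fasta entirely.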

This topic has more details, some that overlap with what is already above, but maybe it provides some more context?

Connecting it all together

IGV configured with your custom genome
Galaxy configured with your custom genome
the custom database dbkey (fasta index) assigned to datasets in Galaxy
then when the database dbkey in IGV is the same term as used in Galaxy, and you select local IGV display from Galaxy, your datasets can be loaded up all together into IGV!

You can also just download all your files and not use Galaxy to host the data, but you’ll still need to configure IGV with your custom genome for the display if you need the genomic DNA sequence as the reference and plan to view all the files together.

Please give that a try and let us know if you need more help!
