How database assignments are used to access a fasta index: custom genomes, custom builds, server indexes

Choute · April 16, 2024, 8:23pm

I’m currently encountering some issues while running a dataset through the Qualimap BamQC tool. Specifically, I’m receiving the following errors:

Fatal error: Exit code 1
Input mapping file not found

Has anyone experienced similar issues or knows why this might be happening? Why would the tool not have access to the mapping file? I’ve double-checked the file paths. Any insights or suggestions would be greatly appreciated. I also want to mention that we are hosting the site on AWS if that is important.

Thank you in advance for your help!

jennaj · April 17, 2024, 6:35pm

Welcome, @Choute

Thanks for including the part about AWS! Knowing that the error is occurring on your own server is important.

Some tools use the metadata assigned to input datasets to access server-indexed reference files.

For this tool specifically, the metadata to pay attention to is the database assignment. The database key informs the tool about which fasta index (genome.fa.fai) to use during processing.

That fasta index can be a global reference accessible to all users of the instance or a custom reference specific to an account.

For a global reference, that could be an index that you created with Data Managers, or an index included in a mounted CVMFS resource.

GTN Materials Search (query=data+manager)
GTN Materials Search (query=cvmfs)
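
To make the global-reference case concrete, here is a minimal Python sketch of checking whether a database key resolves to an index on disk. It assumes a Galaxy-style tab-separated .loc file with the columns value, dbkey, name, path, and that the .fai sits next to the fasta named in the path column. Both are assumptions about your setup, and the function names are hypothetical, so check the tool's wrapper for the data table it actually reads.

```python
import os


def find_index_paths(loc_text, dbkey):
    """Return the path column of every .loc row whose dbkey column matches.

    Assumed column order (tab-separated): value, dbkey, name, path.
    """
    paths = []
    for line in loc_text.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue
        key, path = fields[1], fields[3]
        if key == dbkey:
            paths.append(path)
    return paths


def check_index(loc_text, dbkey):
    """Classify a dbkey as 'not listed', 'missing file', 'empty index', or 'ok'."""
    paths = find_index_paths(loc_text, dbkey)
    if not paths:
        return "not listed"
    fasta = paths[0]
    fai = fasta + ".fai"  # assumption: the index sits beside the fasta
    if not (os.path.isfile(fasta) and os.path.isfile(fai)):
        return "missing file"
    if os.path.getsize(fai) == 0:
        return "empty index"
    return "ok"
```

A result of "not listed" would point at the data table, while "missing file" would point at the mount or the indexing step (for example, a CVMFS path that is listed but not reachable from the job node).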

For a custom reference, the steps to create one are described at

https://training.galaxyproject.org/training-material/faqs/galaxy/reference_genomes_custom_genomes.html
Remember that a custom reference is account-specific. This means if one person creates it, the index will only be available to them and not other users.
For reference genomes that will be used by multiple people, you as the administrator will need to decide whether to index the genome locally, or to put the fasta into a common location like a Data Library and share how to create the database key. The first option will be much simpler for users who are collaborating, and easier for you to support.

Since the tool might not report exactly where this broke, you’ll need to check the entire chain:

Confirm that the input BAM has a database assignment (required)
If a global reference, check to see if that index is mounted correctly (or possibly, indexed correctly?).
If a custom reference, check to see if that database is defined in the account the tool was run in.
And, I’m not sure exactly how this tool traps potential mismatches between the index of the assigned database reference and the reference the input data was actually mapped against. So I would look at that if the other steps do not resolve the problem. As an example, this guide explains the technical variations in common human genome builds → Reference genomes at public Galaxy servers: GRCh38/hg38 example
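
One way to look for that kind of mismatch outside of Galaxy is to compare the sequence names and lengths in the BAM header against the fasta index. The sketch below is only an illustration, not how Qualimap itself checks anything. It assumes you have the header as text (for example from `samtools view -H input.bam`) and the contents of the genome.fa.fai file, and it relies on the first two tab-separated .fai columns being the sequence name and length. The function names are hypothetical.

```python
def parse_fai(fai_text):
    """Map sequence name -> length using the first two .fai columns."""
    lengths = {}
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        lengths[fields[0]] = int(fields[1])
    return lengths


def parse_sam_header(header_text):
    """Map sequence name -> length from the SN and LN tags of @SQ lines."""
    lengths = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:value.
        tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "SN" in tags and "LN" in tags:
            lengths[tags["SN"]] = int(tags["LN"])
    return lengths


def compare(header_text, fai_text):
    """Return (names missing from the index, names whose lengths differ)."""
    bam = parse_sam_header(header_text)
    ref = parse_fai(fai_text)
    missing = sorted(n for n in bam if n not in ref)
    wrong_length = sorted(n for n in bam if n in ref and bam[n] != ref[n])
    return missing, wrong_length


# Toy data, not a real genome build.
header = "@HD\tVN:1.6\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chr2\tLN:800\n@SQ\tSN:chrM\tLN:16\n"
fai = "chr1\t1000\t6\t60\t61\nchr2\t900\t1030\t60\t61\n"
missing, wrong_length = compare(header, fai)
print(missing)       # ['chrM']
print(wrong_length)  # ['chr2']
```

Names that are missing (for example `chr1` versus `1`) or lengths that differ usually mean the data was mapped against a different build or a different flavor of the same build than the one the database key points to.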

Let’s start there! There are other items to check depending on the parameters used.
