Samtools merge error after HISAT2

I ran Samtools merge on a couple of HISAT2 BAM files. The job completed and the datasets are green, but I get the following error:

[E::idx_find_and_load] Could not retrieve index file for ‘/galaxy-repl/main/files/043/098/dataset_43098901.dat’
[E::idx_find_and_load] Could not retrieve index file for ‘/galaxy-repl/main/files/043/098/dataset_43098901.dat’

The Samtools merge BAM file can be opened in IGV, but when I try to convert it to a .tdf file for normalisation, I get the message “error: null”, so I think something is wrong with the Samtools merge step.

I have used the following settings:

Samtools merge

Dataset Information

Number: 54
Name: Samtools merge on data 50 and data 49
Created: Sun Jul 26 09:21:13 2020 (UTC)
Filesize: 3.4 GB
Dbkey: hg_g1k_v37
Format: bam

Job Information

Galaxy Tool ID:
Galaxy Tool Version: 1.9
Tool Version: Version: 1.9 (using htslib 1.9)
Tool Standard Output: stdout
Tool Standard Error: stderr
Tool Exit Code: 0
History Content API ID: bbd44e69cb8906b59ff9cf8439d59652
Job API ID: bbd44e69cb8906b5cdcb8eff35777ca2
History API ID: d8a0c9490972cc7d
UUID: edacc3d6-bcae-454e-919a-ba6e487dff6a

Tool Parameters

Input Parameter: Value
Alignments in BAM format:
* 49: HISAT2 on data 14 and data 13: aligned reads (BAM)
* 50: HISAT2 on data 22 and data 21: aligned reads (BAM)
Merge files in a region: Empty.
File to take @ headers from:
Make @RG headers unique: False
Make @PG headers unique: False
random seed: 1

Inheritance Chain

Samtools merge on data 50 and data 49

Job Dependencies

Dependency: samtools
Dependency Type: conda
Version: 1.9

Dataset peek

Binary bam alignments file



Are you able to run IGV and perform the conversion on the original files before the merge?

Yes, I was able to convert the original BAM files to TDF in IGV before the merge.
So does that mean the problem lies somewhere in the Samtools merge step?

I don’t know how to assist with the IGV conversion issue itself, but in case it helps to rule out error sources:

You are currently getting this stderr message for every BAM dataset because of what looks like a bug in the latest release of Galaxy. AFAIK, the BAM dataset itself is unaffected by this and will work just fine in downstream analyses.

Are you able to run any other tools successfully with the new set?

But I can’t use it for my tpm analyses because of the conversion.
Should I wait for a new release and redo this when the bug isn’t in it anymore?

@astrov I can only load the sam set in IGV, nothing else.

I guess you misunderstood what I was trying to say.
My point was that the error message you’re seeing in Galaxy is most likely unrelated to your issue. Don’t consider it in your debugging efforts, that’s all I wanted to say.
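
To make that advice concrete, here is a minimal sketch of how one might set the index notice aside when reading a job's stderr, so that any other message stands out. It assumes the stderr text has been copied out of the job information page into a string. The helper name `split_stderr` and the `sample` text are made up for illustration and are not part of Galaxy or samtools.

```python
# Hypothetical helper (not a Galaxy or samtools API): separate the index
# notice discussed in this thread from any other stderr lines.
KNOWN_NOTICE = "[E::idx_find_and_load]"


def split_stderr(stderr_text):
    """Return (known_notices, other_lines) from a tool's stderr text."""
    known, other = [], []
    for line in stderr_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(KNOWN_NOTICE):
            known.append(line)
        else:
            other.append(line)
    return known, other


# Sample stderr, pasted from the first post in this thread.
sample = """\
[E::idx_find_and_load] Could not retrieve index file for '/galaxy-repl/main/files/043/098/dataset_43098901.dat'
[E::idx_find_and_load] Could not retrieve index file for '/galaxy-repl/main/files/043/098/dataset_43098901.dat'
"""

known, other = split_stderr(sample)
print(len(known), "index notices;", len(other), "other stderr lines")
```

If `other` comes back empty, as it does for the stderr posted above, the job reported nothing besides the index notice, and the cause of the “error: null” message has to be looked for elsewhere.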