Cannot download bam csi index with HISAT2 or RNA STAR

Hi there,

I am having trouble using the ‘download bam_csi_index’ option after running alignments with both HISAT2 and RNA STAR on usegalaxy.org. The link gives me an HTTP error 500.

I am using a genome from my history and the alignment itself seems to work fine. The stderr (copied below) appears to suggest an error with retrieving the index.

Any suggestions would be useful,

Many thanks!
Amanda

Stderr:

Settings:
Output files: "genome.*.ht2"
Line rate: 6 (line is 64 bytes)
Lines per side: 1 (side is 64 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Local offset rate: 3 (one in 8)
Local fTable chars: 6
Local sequence length: 57344
Local sequence overlap between two consecutive indexes: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 0
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
genome.fa
Reading reference sizes
Time reading reference sizes: 00:03:48
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:00:44
Time to read SNPs and splice sites: 00:00:02
Using parameters --bmax 578742688 --dcv 1024
Doing ahead-of-time memory usage test
Passed! Constructing with these parameters: --bmax 578742688 --dcv 1024
Constructing suffix-array element generator
Converting suffix-array elements to index image
Allocating ftab, absorbFtab
Entering GFM loop
Exited GFM loop
fchr[A]: 0
fchr[C]: 985157101
fchr[G]: 1543302884
fchr[T]: 2101138684
fchr[$]: 3086627671
Exiting GFM::buildToDisk()
Returning from initFromVector
Wrote 1033087383 bytes to primary GFM file: genome.1.ht2
Wrote 771656924 bytes to secondary GFM file: genome.2.ht2
Re-opening _in1 and _in2 as input streams
Returning from GFM constructor
Returning from initFromVector
Wrote 1354838193 bytes to primary GFM file: genome.5.ht2
Wrote 785784892 bytes to secondary GFM file: genome.6.ht2
Re-opening _in5 and _in5 as input streams
Returning from HierEbwt constructor
Headers:
len: 3086627671
gbwtLen: 3086627672
nodes: 3086627672
sz: 771656918
gbwtSz: 771656919
lineRate: 6
offRate: 4
offMask: 0xfffffff0
ftabChars: 10
eftabLen: 0
eftabSz: 0
ftabLen: 1048577
ftabSz: 4194308
offsLen: 192914230
offsSz: 771656920
lineSz: 64
sideSz: 64
sideGbwtSz: 48
sideGbwtLen: 192
numSides: 16076186
numLines: 16076186
gbwtTotLen: 1028875904
gbwtTotSz: 1028875904
reverse: 0
linearFM: Yes
Total time for call to driver() for forward index: 01:14:58
24981828 reads; of these:
24981828 (100.00%) were unpaired; of these:
862146 (3.45%) aligned 0 times
22885467 (91.61%) aligned exactly 1 time
1234215 (4.94%) aligned >1 times
96.55% overall alignment rate
[bam_sort_core] merging from 10 files and 1 in-memory blocks…
[E::idx_find_and_load] Could not retrieve index file for ‘/jetstream/scratch0/main/jobs/29626487/outputs/dataset_43130633.dat’
[E::idx_find_and_load] Could not retrieve index file for ‘/jetstream/scratch0/main/jobs/29626487/outputs/dataset_43130633.dat’

Hi,

It seems that the option to download the csi index is available even when the csi is not created. This is a bug and we are looking into it.

The csi indexes are a new feature and are created when the largest contig is longer than 2^29-1 bases. Are you expecting your contigs to be that long? Also, the download may not be available for jobs that ran before the feature was added. When were your jobs run?

If these jobs are supposed to create a csi index, that is a separate problem and we will look into it.
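
To check whether a genome crosses the 2^29-1 threshold mentioned above, something like the following rough sketch could help. It assumes you already have the contig lengths as a name-to-length mapping (for example, read from a FASTA index). The function name and the sample lengths are hypothetical, and this is not Galaxy's actual code.

```python
# Threshold quoted in the reply above: contigs longer than 2^29 - 1 bases
# trigger creation of a csi index.
BAI_MAX_CONTIG_LENGTH = 2**29 - 1  # 536,870,911 bases


def needs_csi_index(contig_lengths):
    """Return True if any contig is longer than the 2^29 - 1 threshold.

    contig_lengths: dict mapping contig name -> length in bases.
    (Hypothetical helper, for illustration only.)
    """
    return any(length > BAI_MAX_CONTIG_LENGTH
               for length in contig_lengths.values())


# Hypothetical contig lengths, not taken from any real assembly.
example_lengths = {"contig_1": 600_000_000, "contig_2": 150_000_000}
print(needs_csi_index(example_lengths))
```

If this prints True for your reference, the alignment job would be expected to produce a csi index under the rule described above.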

Yes, I ran the jobs yesterday using mSarHar1.11 (https://www.ncbi.nlm.nih.gov/genome/?term=tasmanian+devil) - the largest contig is about 716Mb.

Thanks for looking into this.