Hello,
I am getting an error when running HISAT2 with hg19 full as the reference genome. When HISAT2 is run with hg19 canonical or hg38, the job completes without any issues. I also re-ran HISAT2 with hg19 full on a job that had previously completed, and it now throws an error. The error seems to stem from the hg19 index on CVMFS. The error and the commands used are posted below:
Error reading _ebwt[] array: 14118, 15360
Error: Encountered internal HISAT2 exception (#1)
Command: /home/galaxyp_user/galaxyp/galaxy/database/dependencies/_conda/envs/mulled-v1-ad7c0e574419219598c842c5e534a388c10d33e19c5718205a00928715733608/bin/hisat2-align-s --wrapper basic-0 -p 1 -x /cvmfs/data.galaxyproject.org/managed/hisat2_index/hg19/hg19 --pen-cansplice 0 --pen-noncansplice 12 --pen-canintronlen G,-8.0,1.0 --pen-noncanintronlen G,-8.0,1.0 --min-intronlen 20 --max-intronlen 500000 --dta --read-lengths 151 -1 /tmp/3452387.inpipe1 -2 /tmp/3452387.inpipe2
(ERR): hisat2-align exited with value 1
samtools sort: failed to read header from "-"
[main_samview] fail to read the header from "-".
The full Galaxy job command line is as follows:
set -o pipefail; ln -f -s '/home/galaxyp_user/galaxyp/galaxy/database/objects/d/2/b/dataset_d2b3484d-ef9f-4e56-9735-2d45a1493ed3.dat' input_f.fastq.gz && ln -f -s '/home/galaxyp_user/galaxyp/galaxy/database/objects/d/a/5/dataset_da5930b7-36d4-467e-afe4-76062554ab50.dat' input_r.fastq.gz && hisat2 -p ${GALAXY_SLOTS:-1} -x '/cvmfs/data.galaxyproject.org/managed/hisat2_index/hg19/hg19' -1 'input_f.fastq.gz' -2 'input_r.fastq.gz' --pen-cansplice 0 --pen-noncansplice 12 --pen-canintronlen G,-8.0,1.0 --pen-noncanintronlen G,-8.0,1.0 --min-intronlen 20 --max-intronlen 500000 --dta | samtools sort --no-PG -l 0 -T "${TMPDIR:-.}" -O bam | samtools view --no-PG -O bam -@ ${GALAXY_SLOTS:-1} -o '/home/galaxyp_user/galaxyp/galaxy/database/objects/e/a/c/dataset_eacad32c-e139-4a60-a286-a364d28d6d70.dat'
My research keeps pointing to a corrupt hg19 index file. When I test from the command line and point Galaxy and HISAT2 to local hg19 index files (downloaded from Index of /managed/hisat2_index/hg19/), no error is thrown, which suggests that the hg19 index on CVMFS is corrupt.
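To narrow down which index file is affected, the downloaded copy could be compared byte for byte against the copy read through CVMFS. Below is a minimal sketch of that check, not something I have run against the real index yet. The function name compare_index_dirs and the local directory in the usage comment are hypothetical; only the CVMFS path is taken from the command above.

```shell
# Compare every regular file in a reference directory against the file of the
# same name in a second directory. Prints one status line per file and
# returns non-zero if any file is missing or differs.
compare_index_dirs() {
    ref_dir=$1
    test_dir=$2
    status=0
    for ref_file in "$ref_dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$ref_file" ] || continue
        name=${ref_file##*/}
        if [ ! -f "$test_dir/$name" ]; then
            echo "MISSING $name"
            status=1
        elif cmp -s "$ref_file" "$test_dir/$name"; then
            echo "OK $name"
        else
            # cmp also exits non-zero on a read error, so an unreadable
            # file is reported here as well.
            echo "DIFFERS $name"
            status=1
        fi
    done
    return "$status"
}

# Hypothetical usage (the first path is a placeholder for the local download):
# compare_index_dirs /path/to/local/hg19 /cvmfs/data.galaxyproject.org/managed/hisat2_index/hg19
```

Any file reported as DIFFERS or MISSING would be the one to look at first. If every file is reported OK, the files themselves match and the problem is more likely somewhere else.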
Since we run a local Galaxy instance that uses CVMFS, is there a way to unmount and remount CVMFS to correct this issue?
Do you have any other suggestions for solving this?
Thanks