HISAT2 - Could not display BAM file

Hello!

I have a local Galaxy installed, and I used a Data Manager to create the indexes.

  • Galaxy - 19.05
  • HISAT2 - 2.1.0+galaxy5

I just tried running HISAT2 on usegalaxy.org with one of my files, and it ran successfully.
This is the file I uploaded to Galaxy:

http://212.150.245.226/~mali/Galaxy/RNA_seq/a1.fastq

Then I ran HISAT2 with this file as the single-end FASTQ input and hg38 as the index. That’s all, very simple.
On usegalaxy.org it works; on my local Galaxy it runs forever…

I don’t get any error, just a never-ending job. The only thing I could find is this message:

Could not display BAM file, error was:
file has no sequences defined (mode='rb') - is it SAM/BAM format? Consider opening with check_sq=False

I also tried the same run from the console and noticed the same behavior.
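
For context, the "no sequences defined" message is the kind of error a BAM reader such as pysam gives when the file's header declares no reference sequences, which is what an empty or not-yet-written output looks like. A minimal Python sketch, assuming a readable BGZF stream and using only the standard library, reads that count straight from the header. The function names and the path in the comment are illustrative, not part of Galaxy or HISAT2.

```python
import gzip
import struct


def _read_exact(handle, size):
    """Read exactly `size` bytes or raise if the file ends early."""
    data = handle.read(size)
    if len(data) != size:
        raise ValueError("file ends before the BAM header is complete")
    return data


def count_bam_references(path):
    """Return the number of reference sequences declared in a BAM header."""
    # BAM is BGZF-compressed; the gzip module can read it as a
    # multi-member gzip stream.
    with gzip.open(path, "rb") as handle:
        if _read_exact(handle, 4) != b"BAM\x01":
            raise ValueError("not a BAM file (bad magic bytes)")
        # l_text: length of the plain-text SAM header that follows.
        (l_text,) = struct.unpack("<i", _read_exact(handle, 4))
        _read_exact(handle, l_text)
        # n_ref: number of reference sequences in the binary header.
        (n_ref,) = struct.unpack("<i", _read_exact(handle, 4))
    return n_ref


# Example with a hypothetical path:
# print(count_bam_references("hisat2_output.bam"))
```

A result of 0, or a `ValueError` on an empty file, would suggest the mapper never wrote a usable header, so the display error is a symptom rather than the cause.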

Welcome @artem

It seems that your computer is running out of resources when running the mapping job locally.

Tools use about the same resources whether they are run on the command line or in Galaxy. The hg38 genome is very large, and your input fastq data may be large, too.

Make sure that your local Galaxy is set up properly to work with large data. Instructions/tutorials for configuration can be found at:

I’ll also ping the developers at Gitter to see if they can tell what else might be going wrong. Feel free to join in the conversation. They may reply here or there: https://gitter.im/galaxyproject/Lobby?at=5daa1be5e3646f24c74d65ea

Thanks!

@jennaj

Thanks for your answer!

Right now I am using an AWS t3.2xlarge instance (8 CPUs and 32 GB RAM). The HISAT2 job was started more than 4 hours ago, but there is still no result. I also don’t see any errors.

Maybe I missed something, I’m a technical specialist, and unfortunately I do not understand the intricacies of Galaxy.

Thanks!

I have modified the file “job_conf.xml” and added the “local_slots” option. I noticed that hisat2 is now running with the “-p 8” option, but only one processor is being utilized and RAM is not being used.

I have waited 24 hours, but the HISAT2 job has not finished.
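
To confirm what the `local_slots` setting in `job_conf.xml` resolves to, the file can be parsed directly. The sketch below is only an illustration in Python: it assumes the XML form of the job configuration with `<param id="local_slots">` inside a `<destination>`, and the destination id `local_8` and the embedded example document are made up for the demonstration.

```python
import xml.etree.ElementTree as ET


def local_slots_by_destination(job_conf_xml):
    """Map each destination id to its local_slots value (None if unset)."""
    root = ET.fromstring(job_conf_xml.strip())
    slots = {}
    for dest in root.iter("destination"):
        value = None
        for param in dest.findall("param"):
            if param.get("id") == "local_slots":
                value = int(param.text.strip())
        slots[dest.get("id")] = value
    return slots


# Hypothetical job_conf.xml content, for illustration only.
EXAMPLE = """
<job_conf>
    <plugins>
        <plugin id="local" type="runner"
                load="galaxy.jobs.runners.local:LocalJobRunner" workers="4"/>
    </plugins>
    <destinations default="local_8">
        <destination id="local_8" runner="local">
            <param id="local_slots">8</param>
        </destination>
    </destinations>
</job_conf>
"""

print(local_slots_by_destination(EXAMPLE))
```

A value of 8 here would be consistent with the `-p 8` seen on the hisat2 command line, which suggests the thread setting is being passed through and the stall lies elsewhere.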

Thanks for the clarification about your Galaxy, @artem

So you have tested in both a GVL cloud Galaxy instance and a 19.05 local version of Galaxy, same problem (job errors)? But the same job works at the public Galaxy server https://usegalaxy.org?

One of these could be going wrong:

  1. The genome was not fully indexed. For any genome, the Data Managers need to be run in a specific order; see “Indexing reference genomes with Data Managers: Resources, tutorials, troubleshooting”. The cloud version of Galaxy would have this genome already indexed. In a local Galaxy, you’ll need to do the indexing.

  2. Your data is in fastqsanger format and the first read has an exact human genome match (hg19, hg38), so the problem doesn’t seem to be with either of those – unless you manually assigned the datatype to be fastqsanger.gz (compressed fastq). When the assigned datatype is a mismatch for the actual data content, errors are produced.

    • Try running FastQC to see if it was fully uploaded. If that tool fails, the dataset was likely truncated upon Upload. Loading again is usually how to fix that type of problem. It doesn’t seem like that original dataset is truncated, if it worked at usegalaxy.org. Be sure to use “autodetect” for the datatype assignment.
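
Outside of Galaxy, a quick truncation check on the uploaded file can be sketched in Python as below. It assumes an uncompressed FASTQ with exactly four lines per record (no wrapped sequences), and the function name is illustrative; it is not a substitute for FastQC.

```python
def check_fastq(path):
    """Return (record_count, problem); problem is None if the file looks intact."""
    count = 0
    with open(path, "r") as handle:
        while True:
            header = handle.readline()
            if not header:
                return count, None  # clean end of file
            if not header.strip():
                continue  # tolerate blank lines between or after records
            seq = handle.readline().rstrip("\r\n")
            plus = handle.readline()
            qual = handle.readline().rstrip("\r\n")
            record = count + 1
            if not header.startswith("@"):
                return count, f"record {record}: header does not start with '@'"
            if not plus.startswith("+"):
                return count, f"record {record}: missing '+' separator line"
            if len(seq) != len(qual):
                return count, f"record {record}: sequence and quality lengths differ"
            count += 1


# Example with a hypothetical path:
# print(check_fastq("a1.fastq"))
```

A file cut off mid-record typically fails on the last record, either with a missing `+` line or with a quality string shorter than the sequence.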

FAQs: https://galaxyproject.org/support/

Thanks!

Thanks for the answer!

Indeed, I checked the HISAT2 index files and noticed that some of them were missing, even though the HISAT2 indexing job (via Data Manager) had reported success. I recreated the index and it worked, thanks!!!
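
For anyone hitting the same symptom, this check can be scripted. A HISAT2 index built by `hisat2-build` consists of eight files named `<basename>.1.ht2` through `<basename>.8.ht2` (or `.ht2l` for a large index). The Python sketch below lists whichever of those are missing or empty; the function name and the example basename are assumptions for illustration, and the real path depends on where the Data Manager wrote the index.

```python
from pathlib import Path


def missing_hisat2_index_files(basename):
    """List expected HISAT2 index files that are absent or empty for this basename."""
    missing_by_ext = {}
    for ext in ("ht2", "ht2l"):  # small index, then large index
        expected = [Path(f"{basename}.{i}.{ext}") for i in range(1, 9)]
        missing_by_ext[ext] = [
            str(p) for p in expected
            if not (p.is_file() and p.stat().st_size > 0)
        ]
    # Either a complete small or a complete large index is enough, so
    # report the variant that is closer to complete.
    return min(missing_by_ext.values(), key=len)


# Example with a hypothetical basename:
# print(missing_hisat2_index_files("/data/hisat2_index/hg38/hg38"))
```

An empty list means all eight files are present and non-empty; it does not prove the contents are valid, only that the set is complete.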

Super, glad this is working for you now, and I appreciate the feedback!