Another STAR error

While processing some rat RNA-seq data, I received the following error:

slurmstepd: error: couldn’t chdir to `/srv/pulsar/main/venv/lib/python3.6/site-packages’: No such file or directory: going to /tmp instead

EXITING because of FATAL ERROR: could not open genome file /cvmfs/
SOLUTION: check that the path to genome files, specified in --genomeDir is correct and the files are present, and have user read permsissions

Sep 11 20:53:44 … FATAL ERROR, exiting
[E::hts_open_format] Failed to open file “/jetstream/scratch0/main/jobs/30550923/outputs/dataset_45038405.dat” : No such file or directory
[E::hts_open_format] Failed to open file “/jetstream/scratch0/main/jobs/30550923/outputs/dataset_45038405.dat” : No such file or directory
[E::hts_open_format] Failed to open file “/jetstream/scratch0/main/jobs/30550923/outputs/dataset_45038406.dat” : No such file or directory

Is this a problem with the GTF file, and if so, can someone point me towards a correctly formatted file?

The “slurmstepd: error: couldn’t chdir to …” part of the error can be ignored – it is just a log message produced by the remote cluster.

The “EXITING because of FATAL ERROR: could not open genome file …” part describes a tool/server/resource path problem – specifically, the double slash //.

We’d like to take a closer look. Would you please send in a bug report from the red error dataset? Leave the inputs/outputs undeleted, include a link to this topic in the comments, and post back here once sent in so we know when to look for it.

The indexes were recently reorganized for RNA-Star (older versions), and new indexes were created for the latest version. Other tests passed, so we are curious about what exact inputs/settings/parameters led to this type of error.

It doesn’t appear to be a problem with a reference annotation (gtf) input, but we can check that at the same time and provide feedback if there is some secondary issue. If the tool itself ran correctly, problems with gtf content would create a different type of error, or possibly just odd yet putatively successful results.

There is much prior Q&A about troubleshooting issues with reference annotation at this forum. Search with a keyword like “gtf” to find those. Help is also in this FAQ:


cc @nate @dave

Thanks for sending the bug report in – reviewing.

Update 1: Other tests are still running to address the path problem, but the GTF is also a problem. The chromosome identifiers do not match UCSC’s rn6 genome chromosomes. The FAQ I linked before will help to get that corrected.

UCSC has a version (linked in FAQ), as does Gencode for GRCm38/rn6. If you get the Gencode version, you’ll need to remove the header lines. Many topics cover it; a good one refers to human, but the same instructions apply to mouse.

Wrong! Update 9/15/20: For rn6, UCSC does not have a GTF in the right format (why is explained in a linked topic), and Gencode does not have it at all (only human/mouse are supported).
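The identifier mismatch described in Update 1 can be checked before running a tool. The following is a minimal sketch, not the check Galaxy itself performs: it assumes the GTF is tab-separated with the chromosome name in the first column, and the function names and example records are made up for illustration.

```python
def gtf_seqnames(gtf_lines):
    """Collect the distinct values of column 1 (chromosome) from GTF lines."""
    names = set()
    for line in gtf_lines:
        # Skip blank lines and header/comment lines that start with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def unmatched_seqnames(gtf_lines, genome_names):
    """Return GTF chromosome names that are absent from the reference genome."""
    return sorted(gtf_seqnames(gtf_lines) - set(genome_names))


# Made-up records: Ensembl-style names in the GTF, UCSC-style names in the genome.
example_gtf = [
    "#!genome-build Rnor_6.0\n",
    "1\tensembl\tgene\t100\t200\t.\t+\t.\tgene_id \"g1\";\n",
    "MT\tensembl\tgene\t5\t50\t.\t+\t.\tgene_id \"g2\";\n",
]
example_genome = ["chr1", "chrM"]

print(unmatched_seqnames(example_gtf, example_genome))  # ['1', 'MT']
```

An empty list means every chromosome name in the GTF also exists in the genome. Any names that are printed need to be converted or replaced before the annotation is used.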

But iGenomes does – scroll down in the topic below for instructions about how to get the data into Galaxy from that data provider.

Or, you may be able to convert the identifiers in the current GTF with the tool Replace column by values which are defined in a convert file (Galaxy Version 0.2). One source for “convert files” is linked on that tool form; scroll down into the help. You probably want the file Rnor_6.0_ensembl2UCSC.txt, and the “raw” URL for that data should be good to paste into the Upload tool without any manipulations re format/datatype.

You can test either method out and use the resulting GTF with HISAT2 to see if it is complete/correct.
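For anyone who wants to see what the identifier conversion amounts to outside of Galaxy, here is a rough sketch. It is not the Galaxy tool’s implementation. It assumes the convert file is tab-separated with the Ensembl name in column 1 and the UCSC name in column 2, that rows with an empty second column have no UCSC equivalent, and that header lines and unmapped records should be dropped. The function names and example records are hypothetical.

```python
def load_mapping(mapping_lines):
    """Build an Ensembl-to-UCSC name lookup from two-column, tab-separated lines."""
    mapping = {}
    for line in mapping_lines:
        fields = line.rstrip("\n").split("\t")
        # Keep only rows where both the source and the target name are present.
        if len(fields) >= 2 and fields[0] and fields[1]:
            mapping[fields[0]] = fields[1]
    return mapping


def convert_gtf(gtf_lines, mapping):
    """Rewrite column 1 of each GTF record; return (converted_lines, skipped_count)."""
    converted = []
    skipped = 0
    for line in gtf_lines:
        # Drop header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        seqname, sep, rest = line.partition("\t")
        new_name = mapping.get(seqname)
        if not sep or new_name is None:
            skipped += 1  # malformed record, or no UCSC equivalent in the mapping
            continue
        converted.append(new_name + sep + rest)
    return converted, skipped


# Made-up inputs for illustration only.
mapping_lines = ["1\tchr1\n", "MT\tchrM\n", "unplaced_1\t\n"]
gtf_lines = [
    "#!genome-build Rnor_6.0\n",
    "1\tensembl\tgene\t100\t200\t.\t+\t.\tgene_id \"g1\";\n",
    "unplaced_1\tensembl\tgene\t1\t9\t.\t+\t.\tgene_id \"g3\";\n",
]

converted, skipped = convert_gtf(gtf_lines, load_mapping(mapping_lines))
print("".join(converted), end="")
print("skipped records:", skipped)
```

Skipped records are counted so that a large number stands out, which would suggest the convert file does not match the GTF’s naming scheme.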

Update 2: Execution issues with RNA-Star are confirmed. An alternative tool is HISAT2. Issue ticket (will close out once fixed):

My data is not of the greatest quality, but I wanted to create a workflow for other data I have coming soon.

Thanks for working on this. Just to clarify, I’m working with the rat genome, not mouse.

The test cases included in the ticket were chosen for specific purposes related to the technical nature of the presenting problems (there is more than one factor involved).

So – another way of stating the goal is to get the most commonly used model genomes indexed, functional at the target clusters, and in the correct format for use with the latest version of the RNA-Star wrapper. Rat rn6 will be part of the priority genome set – I’ll add a note to the ticket to make that clearer.

Update 9/15/20 – Sorry, I missed what you were referring to re mouse/rat. I updated the above help with better options: 1) the iGenomes source for a GTF based on rn6 UCSC identifiers and 2) how to possibly convert your existing GTF from Ensembl to UCSC chromosome identifiers. You’ll need to try those and see whether they work (they should). Maybe do both and compare; you may simply prefer the annotation content from one versus the other.
