StringTie Generating Empty Transcript and Gene Counts

I am not receiving any errors, but the transcript counts and gene counts files are completely empty. The annotation file I am using has been used in previous studies, and these are the exact same sample types. I generated the coverage file and it also contains data, so I am not sure why the counts are empty. It isn’t even generating a 0 for counts; the files are just completely empty. The assembled transcripts file also contains the normal data. In addition, I used the exact same files with featureCounts and it worked completely fine. Is there an issue with StringTie on the server, or what else could cause the counts files to be completely empty?

This is the message that is with it, but again doesn’t flag it as an error:
/usr/local/bin/prepDE.py:72: SyntaxWarning: invalid escape sequence '\-'
RE_COVERAGE=re.compile('cov "([\-\+\d\.]+)"')
/usr/local/bin/prepDE.py:75: SyntaxWarning: invalid escape sequence '\-'
RE_GFILE=re.compile('\-G\s*(\S+)') #assume filepath withou
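For context on the warning itself: Python emits "invalid escape sequence" when a regular string literal contains a backslash followed by a character that is not a recognized escape, such as `\-`. The backslash is still kept in the string, so the compiled pattern is unchanged, and writing the pattern as a raw string removes the warning. The sketch below assumes the two patterns are as quoted in the message (with the backslashes that forum formatting tends to drop); the helper names and sample lines are hypothetical and are not taken from prepDE.py or from the shared history.

```python
import re

# Raw-string versions of the two patterns quoted in the warning.
# Assumption: the original patterns contain the backslashes shown here.
RE_COVERAGE = re.compile(r'cov "([\-\+\d\.]+)"')
RE_GFILE = re.compile(r'\-G\s*(\S+)')

# Hypothetical sample lines, made up purely for illustration.
sample_attrs = 'gene_id "G1"; transcript_id "G1.1"; cov "12.5"; FPKM "3.2";'
sample_header = '# stringtie -e -B -G annotation.gtf -o out.gtf sample.bam'


def extract_coverage(attrs):
    """Return the coverage value from a GTF attribute string, or None."""
    match = RE_COVERAGE.search(attrs)
    return float(match.group(1)) if match else None


def extract_annotation(header):
    """Return the path given after -G in a command line, or None."""
    match = RE_GFILE.search(header)
    return match.group(1) if match else None


print(extract_coverage(sample_attrs))
print(extract_annotation(sample_header))
```

Because the pattern text is the same with or without the raw-string prefix, the warning alone would not be expected to change what these expressions match.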

Hi @KMKLOHONATZ
Search the forum for StringTie. This issue was reported several times.
Hope that helps.
Kind regards,
Igor

Hi Igor,

I tried searching, but the issues posted are about getting counts equal to zero, not completely blank files. I have tried setting HISAT to give output directly for StringTie, and I am lost as to what to do next. I would really prefer not to use featureCounts.

If I am missing a thread please direct me to it, but their issues seem to be different.
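The difference between a blank counts file and a file full of zeros can be checked on a downloaded copy of the output. The following is only a minimal sketch: the function name `classify_counts` is hypothetical, and it assumes a delimited table whose first row is a header and whose first column holds the gene or transcript ID.

```python
import csv
import io


def classify_counts(text, delimiter=","):
    """Classify the text of a count matrix as empty, header only,
    all zeros, or has counts.

    Assumes the first row is a header and the first column is an ID.
    """
    # Skip blank lines so a file of only newlines counts as empty.
    rows = [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]
    if not rows:
        return "empty"
    body = rows[1:]
    if not body:
        return "header only"
    # Collect every count value, skipping the ID column.
    values = [value for row in body for value in row[1:]]
    if all(float(value) == 0 for value in values):
        return "all zeros"
    return "has counts"


# Small made-up inputs covering each case.
print(classify_counts(""))
print(classify_counts("gene_id,sample1\n"))
print(classify_counts("gene_id,sample1\nG1,0\nG2,0\n"))
print(classify_counts("gene_id,sample1\nG1,7\nG2,0\n"))
```

For a tab-separated file, pass `delimiter="\t"`.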

Hi @KMKLOHONATZ
Maybe share the history via URL, so I can check what is going on. History options (three bars icon at the top right corner of the history panel) > Share or publish > In the middle window, Make history accessible > Copy the URL and paste it into your reply.
Kind regards,
Igor

Hi @igor ,

Thank you so much for taking a look at this, as I am at a complete loss. Here is the URL. I had deleted all of the other StringTie attempts, but the most recent one is at the top, along with the featureCounts run using the exact same files, which works.

Thank you!

Hi @KMKLOHONATZ
I searched the forum for the error message and found this post:

Does it help?
Consider using featureCounts. It works with your data.
Kind regards,
Igor

Hi @igor ,

I found this previously, and the most recent HISAT2 file you see in my history was run with that specific advanced setting for StringTie files, as they suggested. It is still yielding the same result. That’s why I am so confused, because I know it is not the GTF file, as that has been used for a previous experiment.

Hi @KMKLOHONATZ
Can you use featureCounts? I tested it on one of your samples and got the results - see datasets #347 and #348 here.
It seems only a relatively small proportion of reads were assigned to genes. Many reads are not mapped, or are mapped outside of the annotated features.

It looks like an issue with read counting mode on the ORG server. My test StringTie job also produced no counts (datasets #369-#371), while I got counts on Galaxy Australia.

You may also consider using other servers, for example, Galaxy Europe, but I have not tested StringTie on the European server. Personally, I prefer featureCounts.

Hi @jennaj, I am sorry to trouble you, but it seems StringTie on the ORG server has an issue in read counting mode. It produces empty count files with SyntaxWarning: invalid escape sequence '\-' (see the shared history above). I got the expected results, including count files, on Galaxy Australia with the test files. Maybe you can ping someone to have a look.

Thank you!
Igor

Hi @igor ,

Thank you so much for looking into this. I am not sure what portion of reads we are anticipating, and this genome is not well annotated at all, which doesn’t help, but that is also why we were hoping to use StringTie to identify other transcripts.

I hate to ask you another question after all of the help you have given, but how do I use a different server?

Thank you so much!

Hi @KMKLOHONATZ

I have not checked StringTie's gene model building mode on the ORG server. I hope it works. You already have a file with StringTie predictions. I thought those were the final (merged) gene models.

As I said, you can count reads with featureCounts.

You can register on any other public Galaxy server, such as usegalaxy.eu, in the same way as on the ORG server. It is OK to have accounts on different servers. Upload the data, or copy the files from ORG to Europe. Maybe try it on one sample first, to make sure it works. Originally I tested the tool on the Australian server.

Kind regards,
Igor

Yes, I’ve seen this before, and it has always been some data problem, but I am willing to look again, especially if the same exact data works on other usegalaxy servers. :slight_smile: Maybe there is some corner case issue with a dependency package.

I grabbed a copy of the history – reviewing this week, maybe even today. Thanks!

Hi @jennaj , Just a follow-up on this: it is not a data problem, as I can run it on the Australia server with no issues using the exact same files.

Hi @KMKLOHONATZ

Yes, this is still on my list. What is going wrong isn’t clear yet but we are working on it. Great that the tool is working fine at the UseGalaxy.eu and UseGalaxy.org.au servers, so that is good advice for anyone else having the same problem: try to run your work there for this step instead for now. Thanks! :slight_smile:

@jennaj

Unfortunately, I am now running into different issues with StringTie on the .au server.

Yes, I saw that. Keep following up with Igor about that please. Thanks!