Error in RNA STAR - wrong read ID line format

When running samples on RNA STAR, 15/16 samples ran perfectly with no errors. One sample returned the following fatal error:

I can see that it’s an issue with the input file, but I’m lost as to how to address it. The input files should all be in identical formats, and they look the same to me in Galaxy. Here are screenshots of the first few lines of two FASTQ file uploads from this set: the top screenshot is the file that is returning an error, and the other is a file that ran successfully.

Both appear to be formatted identically to me, so I’m not sure why Galaxy is reading one sample as HTML. All samples are correctly identified as fastqsanger files at all earlier steps in the workflow.

Could anyone help me understand where this error might be coming from, and at what point in the process I could address the issue?

Hi @AEJ,

my first guess would be that the dataset you are inspecting and showing in your screenshot is not the one that the tool receives as input. Please make sure there is not an actual html file that you are passing to the tool.
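If you can download the dataset that the job actually received, a quick local check along these lines would tell you whether it is really HTML. This is only a rough sketch: the function names are made up for illustration, and it assumes the file is either plain text or gzip-compressed.

```python
import gzip


def open_maybe_gzip(path):
    """Open a file in binary mode, transparently handling gzip (assumed by magic bytes)."""
    with open(path, "rb") as fh:
        magic = fh.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rb")
    return open(path, "rb")


def looks_like_html(path, nbytes=512):
    """Return True if the start of the file looks like an HTML page rather than FASTQ."""
    with open_maybe_gzip(path) as fh:
        head = fh.read(nbytes).lstrip().lower()
    return head.startswith((b"<!doctype html", b"<html"))
```

A FASTQ file should start with an `@` read ID line, so `looks_like_html("sample.fastq.gz")` returning `True` would point to a failed upload or a wrong input (the file name here is just a placeholder).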

For any further debugging I would need to see your workflow or your history.

You may get helpful advice faster if you just submit a bug report from the failed dataset directly. This way people from the .org team will get all relevant information about your job and won’t have to guess like I’m doing here.

Cheers,

Wolfgang

Hi @AEJ

Thanks for sending in the bug report as @wm75 suggested!

I replied to your report by email. In short: when a tool reports an issue with fastq data content, checking the content itself is the place to start.
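As an illustration of what a basic content check looks for, here is a minimal sketch in Python. It is not what STAR or Galaxy runs internally; it assumes plain four-line FASTQ records (no wrapped sequence lines), and the function name `check_fastq` is hypothetical.

```python
def check_fastq(lines):
    """Return a list of (line_number, message) problems found in FASTQ lines.

    Assumes unwrapped four-line records: read ID, sequence, '+', qualities.
    """
    problems = []
    rows = [line.rstrip("\r\n") for line in lines]
    leftover = len(rows) % 4
    if leftover:
        problems.append((len(rows), "line count is not a multiple of 4"))
    # Only walk over complete four-line records.
    for start in range(0, len(rows) - leftover, 4):
        header, seq, plus, qual = rows[start:start + 4]
        if not header.startswith("@"):
            problems.append((start + 1, "read ID line does not start with '@'"))
        if not plus.startswith("+"):
            problems.append((start + 3, "separator line does not start with '+'"))
        if len(seq) != len(qual):
            problems.append((start + 4, "sequence and quality lengths differ"))
    return problems
```

For a downloaded, uncompressed file you could run it as `check_fastq(open("sample.fastq"))` (placeholder file name); an empty list means none of these basic problems were found. Within Galaxy, QA tools such as FastQC serve a similar purpose.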

However, after taking a closer look at your most recent reruns (and QA), the reads themselves seem fine now, and I can see that different samples may fail with your exact reruns on the full collection. This was an important clue!

The jobs are likely failing because of a technical issue on the server side, not because of your data. We had some cluster hiccups over the last week or so, but this can happen at any time (thousands of jobs are processing, constantly!). Some jobs will fail by chance and produce an unreliable error message.

What to do

Try a rerun! You won’t want to rerun the entire collection. Instead, only rerun the failed job(s).

Full details with screenshots are here (including the Planemo option for bulk reruns!). → Rerunning only failed jobs in a workflow: Replace and Resume functions - #2 by jennaj
