Error in RNA STAR - wrong read ID line format

AEJ · March 23, 2026, 7:57pm

When running samples on RNA STAR, 15/16 samples ran perfectly with no errors. One sample returned the following fatal error:

I can see that it’s an issue with the input file, but I’m lost as to how to address it. The input files should all be in the same format, and they look the same to me in Galaxy. Here are screenshots of the first few lines of two FASTQ file uploads from this set: the top screenshot is the file that returns the error, and the other is a file that ran successfully.

Both appear to be formatted identically to me, so I’m not sure why Galaxy is reading one sample as HTML. All samples are correctly identified as fastqsanger files at all earlier steps in the workflow.

Could anyone help me understand where this error might be coming from, and at what point in the process I could address the issue?

wm75 · March 23, 2026, 9:30pm

Hi @AEJ,

my first guess would be that the dataset you are inspecting and showing in your screenshot is not the one that the tool receives as input. Please make sure there is not an actual html file that you are passing to the tool.

For any further debugging I would need to see your workflow or your history.

You may get helpful advice faster if you just submit a bug report from the failed dataset directly. This way people from the .org team will get all relevant information about your job and won’t have to guess like I’m doing here.

Cheers,

Wolfgang

jennaj · April 1, 2026, 6:11pm

Hi @AEJ

Thanks for sending in the bug report as @wm75 suggested!

I replied to your report with the message below. In short, when a tool reports a problem with FASTQ data content, content checks are the place to start.
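For anyone who wants to run a content check like this outside of Galaxy, here is a minimal Python sketch, not the check Galaxy or STAR performs. It assumes plain four-line FASTQ records (no wrapped sequences, no FASTA input) and only looks at the first records of a file. The function names `open_text` and `check_fastq_records` are made up for this example.

```python
import gzip
from itertools import islice


def open_text(path):
    """Open a plain or gzip-compressed file as text (gzip detected by magic bytes)."""
    with open(path, "rb") as probe:
        magic = probe.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt", errors="replace")
    return open(path, "rt", errors="replace")


def check_fastq_records(lines, max_records=1000):
    """Return a list of problems found in the first max_records records.

    Assumes strict four-line FASTQ records: ID, sequence, '+', quality.
    """
    problems = []
    stripped = (line.rstrip("\r\n") for line in lines)
    for n in range(1, max_records + 1):
        record = list(islice(stripped, 4))
        if not record:
            break  # clean end of input
        if len(record) < 4:
            problems.append(f"record {n}: truncated ({len(record)} of 4 lines)")
            break
        read_id, seq, plus, qual = record
        if not read_id.startswith("@"):
            # A leading '<' suggests an HTML page was saved instead of reads.
            hint = " (looks like HTML)" if read_id.lstrip().startswith("<") else ""
            problems.append(f"record {n}: ID line does not start with '@'{hint}")
            break  # record framing is lost, so stop here
        if not plus.startswith("+"):
            problems.append(f"record {n}: separator line does not start with '+'")
            break
        if len(seq) != len(qual):
            problems.append(
                f"record {n}: sequence length {len(seq)} != quality length {len(qual)}"
            )
    return problems
```

To use it on a downloaded dataset, open the file with `open_text("sample.fastq.gz")` (the filename is a placeholder) and pass the handle to `check_fastq_records`. An empty list means the first records look structurally fine, which points away from the data and toward the job itself.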

However, after taking a closer look at your most recent reruns (and QA), the reads themselves seem fine now, and I can see that different samples fail across otherwise identical reruns of the full collection. This was an important clue!

The jobs are likely failing because of a technical issue. We had some cluster hiccups over the last week or so, but this can happen at any time (thousands of jobs are processing, constantly!). Some jobs will fail by chance and produce an unreliable error message.

What to do

Try a rerun! You won’t want to rerun the entire collection. Instead, only rerun the failed job(s).

Workflows: job queues and managing partial failures with Resume and Replace!

Then, my advice for larger batches of work on the public servers:

1. Start the processing for the entire batch.

2. Later, if some jobs fail, click into the collection and use the rerun icon (FAQ: Different dataset icons and their usage) on each failed dataset. This brings up the original tool form, and you can run it again for just that single sample.

Bonus: If the failed job was run as part of a workflow, two extra options will appear right above the Run Tool button on the tool form. They are useful for longer, complicated workflows, but also for shorter ones. The result is indistinguishable from a run where everything worked perfectly from the start.

Replace elements – this sorts the new output back into the original collection output nesting (with the already successful inputs).

Resume dependencies – this starts up any downstream tools that were paused because of these upstream failures (assuming the new job works!)

Full details with screenshots are here (including the Planemo option for bulk reruns!). → Rerunning only failed jobs in a workflow: Replace and Resume functions - #2 by jennaj