Troubleshooting: fastq read content and shape. Collections, interlaced/interleaved reads, quality assurance

Pedro_Fernandes_de_S · November 1, 2025, 12:42pm

Hello, I’ve been having some problems. I’m doing a de novo assembly in rnaSPAdes and I’m supposed to use 4 archives at once. The total amounts to 20 gigabytes, and when I try to run it, it returns the information that the job can not be done because there are no destinations able to fulfill. What can I do?

jennaj · November 3, 2025, 10:38pm

Hi @Pedro_Fernandes_de_S

This type of error “usually” indicates some type of server-side issue when connecting to the clusters.

These events are transient and a rerun can solve the problem. But you can also check with us here! Have you tried a rerun yet? If not, please try that first.

Your job seems to be in the range we can process .. but assembly is a bit more complicated than the overall data size. The read content matters just as much. If you haven’t run any QA on the reads yet, backing up and doing that would be a good place to proceed after the rerun. You’ll be checking to eliminate technical issues, then deciding to apply items like read quality trimming.

Quality Control Start Here! multQC issue and guidance?

Then, for a persistent error, you are welcome to share back your history for more specific feedback.

How to get faster help with your question

There are not any known issues at the UseGalaxy.org server (now, or from the last few days). But if you can share an example, we can look into the administrative details closer to confirm, then help to solve any outstanding usage/data issues.

Please let us know what happens!

Pedro_Fernandes_de_S · November 3, 2025, 10:58pm

I retried a few times but it never worked, and the data is high quality

jennaj · November 5, 2025, 4:57pm

Hi @Pedro_Fernandes_de_S

Thanks for trying again! Do you want to share the error back here?

Pedro_Fernandes_de_S · November 5, 2025, 5:04pm

It was the same error

jennaj · November 5, 2025, 5:20pm

Thanks @Pedro_Fernandes_de_S

Would you be able to share the details so I can locate your job? If you are concerned about privacy, we can move into a direct chat. I’ll start that up and you decide.

Pedro_Fernandes_de_S · November 5, 2025, 5:25pm

Can you send the first message? I’m still trying to figure out the site

jennaj · November 5, 2025, 7:04pm

Hi @Pedro_Fernandes_de_S

I see your issue. Interlaced reads were specified, but individual forward and reverse reads were input. This was detected by rnaSPAdes and the tool failed.

This is a good question, so I’m going to explain with more details both for you and for anyone else reading this later on! Please jump down to the last section, What to do, to skip over the parts of this you don’t need!

Data content versus shape

When using a tool like rnaSPAdes, it is important to set the tool form options to reflect the content and shape of your data.

For the context of fastq reads from an RNA-seq sequencing project, this would include:

Content: the type of sequence (rna-seq, dna, other) and the actual read itself (nucleotide bases, protein amino acids, other) then the quality score scaling scheme (fastqsanger in Galaxy refers to Sanger Phred +33 offset scaling).
Shape: how the data is arranged. Fastq data can have both ends of a read combined into a single file (interlaced paired end) or the forward and reverse reads can be in separate files (individual paired end).

Quality assurance is performed to address the first. Sample organization is performed to address the second.
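
As a small illustration of the content side: fastqsanger means each quality character encodes a Phred score as its ASCII code minus 33. Here is a minimal Python sketch of that decoding. The example record and the function name phred33_scores are made up for illustration and are not part of Galaxy or rnaSPAdes.

```python
def phred33_scores(quality_line):
    """Convert each quality character to a Phred score (ASCII code minus 33)."""
    return [ord(ch) - 33 for ch in quality_line]


# Hypothetical 4-line fastq record: name, bases, separator, qualities.
record = ["@read1/1", "ACGT", "+", "II5!"]

scores = phred33_scores(record[3])
print(scores)  # [40, 40, 20, 0]
```

If a file used a different offset, the decoded scores would be shifted, which is one reason the datatype assignment matters.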

How to use rnaSPAdes

This screenshot is showing one way to organize individual paired end reads on the rnaSPAdes tool form.

The data content are individual fastq files – forward and reverse.
The shape is a list of pairs inside a collection folder.

The collection’s nested content can be navigated!

If I had instead made the choice for interlaced (aka interleaved) reads, then my data wouldn’t be an available input option.

With the individual file option

Or the collection option

Why?

The data shape in my history is not a match for this parameter.

The tool is expecting individual fastq files or a List type of collection containing those files – one per sample.

If I did have individual fastq files in my history, unhidden, then the shape would be a match for the first. This is how your history and job were configured. But the content would then fail next.
The data content are currently individual fastq files, not interlaced files.

The original tool also has a parameter to specify interlaced or not, and that flows down to the version hosted in Galaxy. So, even if I unhide my collection datasets…

..or decide to not use a collection at all, this input would either produce an error, or my results may not be correct. Again, this is similar to your usage case.
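
To make the "flows down" part concrete: on the command line, SPAdes takes interlaced paired reads with --12 and separate forward/reverse files with -1 and -2. The Python sketch below only shows how the shape choice maps to different arguments. The helper name and file names are hypothetical, and this is not the actual command that the Galaxy wrapper builds.

```python
def rnaspades_read_args(shape, files):
    """Return the read-input arguments for one paired-end library.

    shape is "interlaced" (one file holding both mates) or
    "separate" (one forward file and one reverse file).
    """
    if shape == "interlaced":
        (interlaced,) = files  # exactly one file expected
        return ["--12", interlaced]
    if shape == "separate":
        forward, reverse = files  # exactly two files expected
        return ["-1", forward, "-2", reverse]
    raise ValueError(f"unknown shape: {shape}")


# Hypothetical file names, for illustration only.
print(rnaspades_read_args("separate", ["sample_R1.fastq", "sample_R2.fastq"]))
print(rnaspades_read_args("interlaced", ["sample_interlaced.fastq"]))
```

Declaring one shape while supplying files in the other is the kind of mismatch described above.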

Quality control

The rnaSPAdes tool happens to perform some built-in content checks! But you cannot rely on this. As the scientist running a tool, you’ll need to double check. Discovering problems with content is complicated, and often doesn’t show up until much later in an analysis when reviewing the data reduction statistics. Instead, getting the data organized and understanding the tool will help to avoid problems like this.

A tool failure in Galaxy only means that the tool couldn’t run! It is not by itself a test to review if the results are scientifically correct or not. This is how tools work when you use them directly on the command-line too. Tools will produce logs if they can (sometimes they abort, and the logs cannot be captured, if there were any at all!).

How to review job Details: input summary, command line, and logs. → FAQ: Troubleshooting errors
When a tool does not accept an input at all, the data usually has a shape problem, but content issues are still possible!
When a tool accepts the inputs and then fails quickly, that usually indicates a content problem.
A tool like Fastq Info is one way to review technical details of fastq data.
Then Falco or FastQC can inspect the scientific content (plus a few technical details).
These reports, along with trimming tool reports and many others, can be combined into a MultiQC summary report. This is useful for tracking the changes made by tools throughout an analysis project.
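
For a quick sanity check of shape outside of Galaxy, the idea behind an "is this file interlaced?" check can be sketched in Python. This is a rough sketch under assumptions, not what rnaSPAdes or Fastq Info actually does: it assumes unwrapped 4-line records, and read names where the two mates share a name apart from an optional /1 or /2 suffix. All function names are hypothetical.

```python
import io


def read_names(handle):
    """Yield the read name of every 4-line fastq record (blank lines skipped)."""
    lines = (line for line in handle if line.strip())
    for i, line in enumerate(lines):
        if i % 4 == 0:
            # Drop the leading '@' and any comment after whitespace.
            yield line.strip()[1:].split()[0]


def base_name(name):
    """Strip a trailing /1 or /2 mate suffix, if present."""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def looks_interlaced(handle, max_pairs=1000):
    """True if consecutive records pair up by name in the first max_pairs pairs."""
    names = []
    for name in read_names(handle):
        names.append(base_name(name))
        if len(names) >= 2 * max_pairs:
            break
    if len(names) < 2 or len(names) % 2:
        return False
    return all(names[i] == names[i + 1] for i in range(0, len(names), 2))


# Tiny made-up examples: one interlaced file, one forward-only file.
interlaced = io.StringIO("@r1/1\nACGT\n+\nIIII\n@r1/2\nTTGA\n+\nIIII\n")
forward_only = io.StringIO("@r1/1\nACGT\n+\nIIII\n@r2/1\nGGCA\n+\nIIII\n")
print(looks_interlaced(interlaced), looks_interlaced(forward_only))
```

A check like this only looks at read names, so it says nothing about read quality. That is what the QA reports are for.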

Resources → Quality Control Start Here! multQC issue and guidance?.

Be sure to see the workflow! Most people using paired-end read data can use this to run an entire QA pipeline with just two clicks – 1) select your List of Pairs collection on the workflow form, then click on 2) Run.

If you don’t like the results, go in and make adjustments to your copy. Instructions are annotated inside of it. If the workflow fails, that is an early warning sign that something is wrong with your data content or shape, and the workflow and tool logs will be informative!

What to do

Your sample reads are individual forward and reverse fastq files. You can either combine these reads into an interlaced format, or you can organize the forward/reverse reads into a List of Pairs collection folder type.

Option 1: Convert to Interlaced

seqtk_mergepe interleave two unpaired FASTA/Q files for a paired-end file
or, FASTQ interlacer on paired end reads
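
In Galaxy, use one of the tools above. To show what the conversion actually does to the data, here is a minimal Python sketch of interleaving. It assumes unwrapped 4-line records and that both files list the reads in the same order. The function names are hypothetical and this is not the code inside either tool.

```python
import io
from itertools import islice


def records(handle):
    """Yield fastq records as lists of 4 lines (unwrapped records assumed)."""
    while True:
        rec = list(islice(handle, 4))
        if not rec:
            return
        if len(rec) < 4:
            raise ValueError("truncated fastq record")
        yield rec


def write_record(rec, out):
    """Write one record, making sure every line ends with a newline."""
    for line in rec:
        out.write(line.rstrip("\n") + "\n")


def interleave(forward, reverse, out):
    """Write forward/reverse records alternately; return the number of pairs."""
    rev = records(reverse)
    count = 0
    for f_rec in records(forward):
        r_rec = next(rev, None)
        if r_rec is None:
            raise ValueError("reverse file has fewer reads than forward file")
        write_record(f_rec, out)
        write_record(r_rec, out)
        count += 1
    if next(rev, None) is not None:
        raise ValueError("reverse file has more reads than forward file")
    return count


# Made-up reads for illustration.
fwd = io.StringIO("@r1/1\nACGT\n+\nIIII\n@r2/1\nGGCA\n+\nIIII\n")
rev = io.StringIO("@r1/2\nTTGA\n+\nIIII\n@r2/2\nCCAT\n+\nIIII\n")
merged = io.StringIO()
pairs = interleave(fwd, rev, merged)
print(pairs, merged.getvalue().splitlines()[0::4])
```

The result has each forward read immediately followed by its mate, which is the shape the interlaced option expects.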

Option 2: Create a List of Pairs

An example for doing this with data already loaded is here → Why collections?
Or, you can Upload data in a collection. → Upload data directly into a collection! Solves: No compatible list of pairs dataset collections available - #2 by jennaj

List of Pairs is usually preferred since the collection can be created at the very start of a project and the shape is understood by more tools. You’ll also have data grouped by sample! The sample flows down through tools, making it easier to navigate complex work. But you could also create your interlaced fastq data and put those interlaced files into a flat List collection to keep things tidy another way.
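
Conceptually, a List of Pairs is just your files grouped by sample, with a forward and a reverse member for each. Galaxy builds this for you in the collection builder, so the Python sketch below is only a way to picture the shape. The file naming pattern (sample_R1 / sample_R2, or _1 / _2) and the sample names are assumptions for illustration, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming: <sample>_R1.fastq.gz and <sample>_R2.fastq.gz (also _1/_2, .fq).
MATE = re.compile(r"^(?P<sample>.+)_R?(?P<mate>[12])\.(fastq|fq)(\.gz)?$")


def list_of_pairs(filenames):
    """Group file names into {sample: {"forward": name, "reverse": name}}."""
    pairs = defaultdict(dict)
    for name in filenames:
        match = MATE.match(name)
        if match is None:
            raise ValueError(f"unrecognized file name: {name}")
        key = "forward" if match.group("mate") == "1" else "reverse"
        pairs[match.group("sample")][key] = name
    incomplete = [s for s, p in pairs.items() if set(p) != {"forward", "reverse"}]
    if incomplete:
        raise ValueError(f"samples missing a mate: {incomplete}")
    return dict(pairs)


# Four hypothetical files, i.e. two paired-end samples.
files = [
    "control_R1.fastq.gz",
    "control_R2.fastq.gz",
    "treated_R1.fastq.gz",
    "treated_R2.fastq.gz",
]
for sample, pair in list_of_pairs(files).items():
    print(sample, pair["forward"], pair["reverse"])
```

Four files arranged this way become two elements in the collection, one per sample, and that sample name is what follows the data through downstream tools.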

Resources

Description of common data content types for the kind of analysis you are doing. Includes a closer inspection of what “interlaced” reads look like. → Hands-on: NGS data logistics / NGS data logistics / Introduction to Galaxy Analyses
GTN Materials Search (query=collection), with this as a good place to start. → FAQ: Datasets versus collections

I hope this gives you some useful choices for resolving the error! Please let us know if this actually solves the problem or if you need more feedback about any of this.