Maximum number of reads for SPAdes input

Hi everyone. What is the maximum number of paired-end reads that can be used as an input for a SPAdes assembly to avoid getting a “dataset too large” error? Thanks!

Welcome, @David_Read

If an assembly runs out of processing memory during execution, there are a few things you can try. These apply to any of the read assembly tools on a public Galaxy server, and usually also when the tools are run directly outside of Galaxy.

Use Case
The job is running out of processing memory during execution. This is a separate limit from the data storage available in your account.

What to try

  • Run the reads through quality control first, so the tool has cleaner content to assemble.
  • Adjust the tool's input parameters to change how the assembly is performed.
  • Reduce the number of reads submitted, for example with a tool like Sub-sample sequences. Some protocols also involve pre-clustering, along with other strategies.
  • Try a different public server. As far as I know, UseGalaxy.eu currently offers the most processing memory for SPAdes.
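
If it helps to see what sub-sampling does to a paired-end dataset, here is a minimal Python sketch. It is only an illustration, not how the Galaxy tool is implemented. The function names (`read_fastq`, `subsample_pairs`) are made up for this example, and it assumes uncompressed 4-line FASTQ records with both mate files listing the reads in the same order.

```python
import random
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as 4-line tuples (assumes unwrapped 4-line records)."""
    while True:
        record = tuple(islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def subsample_pairs(r1_handle, r2_handle, out1, out2, fraction, seed=42):
    """Keep each read pair with probability `fraction`.

    Both mates of a pair are kept or dropped together, so the two output
    files stay in sync. Assumes both inputs hold the same number of records.
    Returns (pairs_kept, pairs_seen).
    """
    rng = random.Random(seed)  # fixed seed makes the subsample reproducible
    kept = total = 0
    for rec1, rec2 in zip(read_fastq(r1_handle), read_fastq(r2_handle)):
        total += 1
        if rng.random() < fraction:
            out1.writelines(rec1)
            out2.writelines(rec2)
            kept += 1
    return kept, total
```

You would call it with four open file handles, for example `subsample_pairs(r1, r2, out1, out2, fraction=0.25)` to keep roughly a quarter of the pairs. The important point is that forward and reverse reads are dropped together; sub-sampling each file independently would break the pairing.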

For the part of your question about a fixed maximum number of reads:

Let’s ask the EU administrators if there are any fixed limits (total file size or similar). ping @wm75 :slight_smile: