More memory than allocated error - DiffBind

Hi @anushree

Yes, it can be very difficult to get this tool to run on larger datasets due to the memory usage. Trying at UseGalaxy.eu might help. I would also suggest organizing all of the data into collections, since the pairing between the peak and BAM files needs to be exact. You can see how the files are paired on the job details view, or check topics like this one → Diffbind memory issues - #2 by jennaj

Technically, it is best to start off with collections, since that will apply the same element identifier to both parts of a particular peak-BAM pair by default, but identifiers can also be fixed up afterward if needed. See the Collection Operation tools; there are guides for each, with tutorials linked at the bottom. Ask if you get stuck please! :slight_smile:

To clarify for this part:

The 1 TB storage space is where output files are written. This is distinct from the runtime computational memory a tool might need: storage versus compute. The compute available on the public servers is genuinely large but sometimes still not enough, and it differs across servers for technical reasons. The EU server will execute some tools with a bit more memory, and this is one of them, which is why I am suggesting you give it a try there before deciding that a public Galaxy cannot process the work. More details about this type of error (any tool) → This job was terminated because it used more memory than it was allocated. - #2 by jennaj

As a last suggestion: I’ve seen a lot of errors like yours that were due to very minor usage issues. You could also try going into the RStudio environment at any Galaxy server and running the tool directly in R, following Bioconductor’s exact procedures. This would use the same server clusters, but it is one way to test whether the issue is actually your data size/complexity or how the job is set up on the form. Running a smaller subset, such as a single chromosome from each sample, would be a good way to test the job logic in either environment. A sketch of what that direct run could look like is below.
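For reference, here is a minimal sketch of what a direct DiffBind run in RStudio might look like, loosely following the Bioconductor vignette. All file names, the sample sheet columns, and the chromosome coordinates are placeholders for your own data, and the single-chromosome subset step is optional:

```r
## Minimal sketch of a direct DiffBind run, loosely following the
## Bioconductor vignette. All file names below are placeholders.
library(DiffBind)

## Optional: subset each BAM to one chromosome first, to test the job
## logic on a small input. Requires an indexed BAM; the chr1 length
## shown is for GRCh38 -- adjust for your genome.
library(Rsamtools)
keep <- GRanges("chr1", IRanges(1, 248956422))
filterBam("sample1.bam", "sample1.chr1.bam",
          param = ScanBamParam(which = keep))

## Sample sheet: one row per sample, pairing each peaks file with its
## BAM. This is the same peak<->BAM pairing the Galaxy form needs.
## Expected columns include: SampleID, Condition, Replicate,
## bamReads, Peaks, PeakCaller.
samples <- read.csv("samples.csv")

dbObj <- dba(sampleSheet = samples)                      # load peaksets
dbObj <- dba.count(dbObj)                                # the memory-heavy step
dbObj <- dba.normalize(dbObj)
dbObj <- dba.contrast(dbObj, categories = DBA_CONDITION) # contrast on Condition
dbObj <- dba.analyze(dbObj)                              # DESeq2 by default
dba.report(dbObj)                                        # differentially bound sites
```

If the single-chromosome subset runs cleanly but the full data still fails, that points at data size rather than how the job was set up.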

Hope this helps!