Hi @jradfor8
A “gray” job is waiting for resources to become available, meaning it is already in the job queue.
A “yellow” job is actively executing. It will end in either a “green” (successful) or “red” (error) state at UseGalaxy.org. Some other public Galaxy servers do this a bit differently, and will automatically rerun a job one or more times if it originally failed. An example public Galaxy server that automatically reruns all failed jobs one time is UseGalaxy.eu.
That said, the cluster that runs Trinity jobs at UseGalaxy.org did undergo some configuration changes over the last month. Some jobs were impacted and administratively rerun. My guess is that is what happened in your case as you explain it, and from what I can see in your account.
The Trinity job was started on Feb 12. It is batched with 8 total paired-end inputs, pooled = yes, and no QA/QC was done on the reads, or rather, no QA was included in the history where this work is included. The data appears to have been downloaded from SRA and directly input to Trinity (?).
“Pooled” can be set to either yes or no.
- For “pooled = yes”, all inputs will be combined into one assembly. Your datasets are quite large to be pooled, between ~10-20 GB for each end of each pair. An assembly at any public Galaxy server is unlikely to be successful with that much data run as “pooled = yes”. It will run out of resources and fail, for either runtime or memory reasons.
- For “pooled = no”, there will be one assembly job/result per paired-end read input. Some of your pairs would probably assemble, but it is unlikely that all would, especially the first pair in the collection as it is now (the largest, with no QA).
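To make the difference between the two settings concrete, here is a minimal sketch of how many jobs each setting produces and how much data each job receives. The function name `plan_jobs` and the sizes are made up for illustration (8 pairs at 20-40 GB per pair, matching ~10-20 GB per end); this is not how Galaxy or Trinity schedule work internally.

```python
# Hypothetical sizes in GB per paired-end input (both ends combined).
# These are illustrative values, not the actual datasets in the history.
pair_sizes_gb = [40, 36, 32, 30, 28, 26, 24, 20]


def plan_jobs(sizes_gb, pooled):
    """Return the input size (GB) of each assembly job that would be created.

    pooled=True  -> one job that receives every pair
    pooled=False -> one job per pair
    """
    if pooled:
        return [sum(sizes_gb)]
    return list(sizes_gb)


pooled_jobs = plan_jobs(pair_sizes_gb, pooled=True)
separate_jobs = plan_jobs(pair_sizes_gb, pooled=False)

print(f"pooled = yes: {len(pooled_jobs)} job, {pooled_jobs[0]} GB of input")
print(f"pooled = no:  {len(separate_jobs)} jobs, largest is {max(separate_jobs)} GB")
```

With these example numbers, the pooled job would receive all of the data at once, while the unpooled run splits it into 8 much smaller jobs, which is why some of those may succeed where the pooled job would not.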
We already granted you some extra quota space and sent you links about the ways to use Galaxy, but I also added them below again for others reading. Public servers are usually not a good choice for any single job that involves a very large amount of data. Many smaller individual jobs are usually fine. Quota space is storage space, and is unrelated to the amount of memory/resources a tool uses when executing. Public Galaxy servers have significant computational resource allocations, but not as much as what you can set up yourself in your own Galaxy (for practical reasons).
What to do from here:
If you didn’t intend to pool the data, then you should rerun it as “pooled = no”. You can still use a collection input. Definitely do some QA first. If any of the jobs fail, then you’ll either need to adjust the inputs (downsample the reads with a tool like Seqtk) or run the job at a Galaxy server with more resources allocated (your own).
If you did intend to pool the data, expect this job to fail for resource reasons. QA won’t make a difference by itself at UseGalaxy.org, but you should do some QA if/when you decide to run the job at your own Galaxy server. Downsampling + QA might help at UseGalaxy.org, but you would need to test that to see how much is needed to get a successful job, and then decide if an assembly using that much downsampling is still of value to you. The maximum total size of all inputs sent to a Trinity job is roughly 20-35 GB when working at public servers. This is not exact since the size of the data is just one factor. The read content is another, and is why a QA step is strongly recommended.
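As a rough starting point for testing, the size guideline above can be turned into an estimate of how much downsampling would be needed. This is only a sketch: the helper `downsample_fraction` is hypothetical, the 236 GB total is an invented example, and the 20 GB budget is simply the conservative end of the 20-35 GB range. Since read content also matters, treat the result as a first guess to test, not a guarantee.

```python
def downsample_fraction(total_input_gb, budget_gb):
    """Estimate the fraction of reads to keep so the inputs fit a size budget.

    Assumes file size scales roughly linearly with the number of reads,
    which is only an approximation.
    """
    if total_input_gb <= 0 or budget_gb <= 0:
        raise ValueError("sizes must be positive")
    # Never return more than 1.0: data already under budget needs no sampling.
    return min(1.0, budget_gb / total_input_gb)


# Hypothetical example: 236 GB of pooled input, 20 GB conservative budget.
fraction = downsample_fraction(total_input_gb=236, budget_gb=20)
print(f"Keep about {fraction:.1%} of the reads")
```

A fraction like this could then be used as the sampling fraction in a tool such as Seqtk. When sampling paired-end data, both ends of a pair need the same random seed so that the mates stay in sync.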
Ways to use Galaxy:
Hope that helps you to make some decisions about how to proceed.
Jen