During a standard RNA-seq alignment of files downloaded from SRA, the queue has been stuck in TrimGalore, which has been running continuously since Saturday. After downloading, the paired FASTQ files were processed with FastqGroomer without problems. The read counts match in the forward and reverse files, and I can’t find anything wrong with them. I’m losing a lot of time on a process that should be routine and hope there is something I can fix.
Hi @jlocker
You don’t need to run the FastqGroomer tool for any short reads that are from an Illumina 1.7 or later sequencing pipeline. Meaning, you can skip this tool entirely and save processing time.
→ Instead, just load up the files and let Galaxy autodetect the datatype (this includes quality score scaling detection for fastq datasets). The guess should be fastqsanger or fastqsangergz. If that isn’t what gets detected, it means something more is going on: a truncated or corrupt file is the usual reason. See: Getting Data into Galaxy
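If you want to sanity-check a file locally before uploading, here is a minimal sketch in Python. To be clear, this is not how Galaxy's own datatype detection works; it is just a rough heuristic. It assumes plain 4-line FASTQ records (no wrapped sequence lines), and the function names `open_fastq` and `inspect_fastq` are made up for this example. It samples the quality lines to guess the score offset and flags a record that is cut off or malformed.

```python
import gzip
from pathlib import Path


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def inspect_fastq(path, max_records=10000):
    """Return (records_checked, encoding_guess, looks_intact).

    Heuristic only: assumes 4-line records and samples at most
    max_records records from the start of the file.
    """
    lowest = None   # smallest quality character code seen so far
    records = 0
    intact = True
    with open_fastq(path) as handle:
        while records < max_records:
            block = [handle.readline().rstrip("\n") for _ in range(4)]
            if not block[0]:
                break  # end of file (or a trailing blank line)
            header, seq, plus, qual = block
            # A truncated or corrupt record usually fails one of these checks.
            if (not header.startswith("@") or not plus.startswith("+")
                    or not seq or len(seq) != len(qual)):
                intact = False
                break
            records += 1
            low = min(ord(c) for c in qual)
            lowest = low if lowest is None else min(lowest, low)
    if lowest is None:
        guess = "unknown"
    elif lowest < 59:
        # Characters below ';' only occur with a +33 offset (Sanger / Illumina 1.8+).
        guess = "phred33"
    else:
        # Could be an older +64 offset, or simply a sample of very high-quality +33 reads.
        guess = "possibly-phred64"
    return records, guess, intact
```

Running `inspect_fastq` on both the forward and the reverse file (with a large enough `max_records`) and comparing the record counts is also a quick way to confirm that a pair is still in sync. A result of `phred33` with `looks_intact` true is consistent with what Galaxy would label fastqsanger, but the server's own detection is what counts.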
For the TrimGalore step, do you mean that the datasets are queued (gray in color)? That is normal when working at a public Galaxy server. Some of your jobs run, some of other people’s jobs run, more of yours, repeat.
Next time, a good way to get SRA data into Galaxy is with the tool “Faster Download and Extract Reads in FASTQ format from NCBI SRA”. Input the accessions and use all defaults. You can use a list of accessions for batches.
And, you could consider using a workflow. Even if this is just for those two tools – put both into a workflow, start it up, and set a notification. Everything will stream and process while you are away, including the download step.
You can extract that workflow from a history you already have, or use a template from the GTN, or create from scratch. The workflow editor looks much like the regular analysis view and all of the same options are available. Find the tool in the workflow tool panel, add it to the canvas, set parameters, and connect the input.
The quality control workflow here is a good template. You can swap out tools and fully customize it. See: Hands-on: Quality Control / Sequence analysis
Hope this helps! Please explain more if I am misunderstanding.
I’m running old datasets from an SRA archive, so my routine is to use FastqGroomer, but your tip could save time…
The problem I’m having is that files are stuck in FastqGroomer. They have been running continuously (in orange) since Saturday afternoon. I’ve been waiting for one to crash so I could contact you through the error message, but they are still running. Right now, my entire queue seems stuck. I noticed that it was not counting the files accurately, so I have been pruning out files, but that doesn’t help.
This morning, I downloaded a couple of FastqGroomer output files and sent them to usegalaxy.eu. On that site, TrimGalore processed them in a few minutes, and RNA STAR gave perfect alignments, also very quickly.
Thanks for responding,
Joe
Hi @jlocker
Ok, the jobs at the EU server started up because you didn’t already have a bunch of jobs running.
From here, if you have more fastq files ready to use, you can copy them into a new history, then purge the original history. That will cancel all of the groomer jobs that you don’t need, and prevent them from using up your “current jobs” quota.
Using the servers together is also fine to speed things up.