Hi, I’m trying to use the Vsearch search tool to find and retain reads that have 90% query coverage to a primer sequence. Every time I run this job, I get the error “This job was terminated because it used more memory than it was allocated.” I read the job and tool error help, which says that this error is often due to the user inputs or tool parameters. I also tried to read the Vsearch manual, which honestly felt like reading a foreign language. Unfortunately, I have extremely limited knowledge of this subject and I don’t know what I am doing wrong.
Expand the input datasets and if you do not see any warnings about a mismatched datatype, then the data is probably OK. You could also “redetect” the datatype to double-check (pencil icon > Edit Attributes > Datatype > Redetect datatype).
That said, it sounds like the job may be running out of resources due to the number of reads (probably memory resources during indexing). Downsampling/subsampling the reads is one option. Setting up a Galaxy server with more dedicated resources is another.
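If you want to try the downsampling route outside of Galaxy, here is a minimal sketch of one way to do it. It assumes an uncompressed FASTQ file with exactly four lines per read, and the function name `subsample_fastq` and its parameters are made up for illustration. It uses reservoir sampling, so the full file never has to fit in memory, only the reads that are kept.

```python
import random


def subsample_fastq(in_path, out_path, n_reads, seed=0):
    """Keep a uniform random sample of n_reads records from a FASTQ file.

    Assumption: the file is uncompressed and every record is exactly
    four lines (header, sequence, '+', qualities).
    """
    rng = random.Random(seed)
    reservoir = []  # holds at most n_reads records at any time
    index = 0       # number of records seen so far

    with open(in_path) as handle:
        while True:
            record = [handle.readline() for _ in range(4)]
            if not record[0]:
                break  # end of file
            if index < n_reads:
                # Fill the reservoir with the first n_reads records.
                reservoir.append(record)
            else:
                # Replace an existing record with probability n_reads / (index + 1).
                j = rng.randint(0, index)
                if j < n_reads:
                    reservoir[j] = record
            index += 1

    with open(out_path, "w") as out:
        for record in reservoir:
            out.writelines(record)

    return len(reservoir)
```

For example, `subsample_fastq("reads.fastq", "reads_subset.fastq", 1_000_000)` would write one million randomly chosen reads to a new file (both file names are placeholders). Note that the kept reads come out in reservoir order, not necessarily in their original order.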
The GVL (Genomics Virtual Lab) version of CloudMan is the recommended option for scientists who want to set up their own server.
Using the latest version of the tool is also strongly recommended (check with the “Versions” pull-down menu at the top of the tool form).
For parameters: start by removing all filters, review the results (hits), then tighten the settings from there.
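For anyone running vsearch on the command line instead of through the tool form, the same "loose first, then tune" idea might look roughly like the sketch below. Treat it as an illustration, not a recipe: the helper `build_search_command` is hypothetical, the file names are placeholders, the thread does not say which file is the query and which is the database, and the identity value of 0.5 is an arbitrary permissive starting point. As far as I know `--usearch_global` needs an `--id` threshold, so it cannot be dropped entirely. Please check each option against the vsearch manual for your version. The function only builds the argument list; it does not run anything.

```python
def build_search_command(query, db, hits_table, identity=0.5, query_cov=None):
    """Build a vsearch global-search command as a list of arguments.

    Assumptions: `query` and `db` are sequence files, and `hits_table`
    is where the tab-separated hit table should be written for review.
    """
    cmd = [
        "vsearch",
        "--usearch_global", query,
        "--db", db,
        "--id", str(identity),      # minimum identity; kept permissive at first
        "--strand", "both",         # search both orientations
        "--blast6out", hits_table,  # hit table to inspect before tuning
    ]
    if query_cov is not None:
        # Added only in the tuning pass, e.g. 0.9 for 90% query coverage.
        cmd += ["--query_cov", str(query_cov)]
    return cmd


# First pass: no coverage filter, just look at what comes back.
first_pass = build_search_command("query.fasta", "db.fasta", "hits_loose.tsv")

# Second pass: add the coverage filter once the loose hits look sensible.
tuned_pass = build_search_command(
    "query.fasta", "db.fasta", "hits_tuned.tsv", identity=0.9, query_cov=0.9
)
```

Either list can be passed to `subprocess.run(...)` once vsearch is installed, or joined with spaces and pasted into a terminal.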
The Vsearch publication itself might be easier to interpret. Then the manual can be used as a specific reference. There is also a Google forum for this tool suite: https://groups.google.com/forum/#!forum/vsearch-forum
Thank you @jennaj for your detailed reply, it was very helpful!
I have around 13 million reads, so you are probably right that this is why I keep getting this error. I’d like to keep all my reads, so I am now trying to run vsearch directly on my computer.
Thanks for all the additional resources. I will check out the Galaxy tutorials and refer to the Google forum if I need more help with vsearch.