Silva reference database

Funnyme186 · September 1, 2021, 12:40pm

Hi, I am trying to align my sequences against SILVA version 138.1, but the downloaded file is empty (same for version 132).
Can anyone advise, please?
I am looking at the bacterial and archaeal communities in my samples, amplified with 341F and 806R for the 16S V3-V4 region.
Thank you

jennaj · September 1, 2021, 5:13pm

Hi @Funnyme186

Try this data provider as a source: https://www.arb-silva.de/

You’ll want the fasta version of the data. Example link: Archive

Tutorials: Galaxy Training!

Funnyme186 · September 2, 2021, 6:21am

Hi @jennaj, thank you for your reply!
I downloaded the uncompressed version of 132 (9.9 GB) and used it to align my sequences, because I didn’t really know how to extract the fasta from arb.

The problem now is that when I get to cluster.split, it takes too long and ends with an error.

Do you have an idea of how to resolve this problem?

If yes, could you please advise?

Regards

Stephanie

jennaj · September 2, 2021, 3:28pm

Hmm, is the datatype fasta or fasta.gz? Expand the dataset to check. If it is fasta.gz, you might just need to uncompress it. Click on the pencil icon for the dataset to bring up the Edit Attributes forms. The Convert tab has a drop-down menu with an “uncompress” function. Some tools do not interpret compressed fasta well. If that works, please let me know.
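For anyone who prefers to check the file before uploading it, here is a minimal Python sketch of the same idea, run outside Galaxy. It reports an empty download, detects gzip compression from the first two bytes of the file rather than from the file name, and writes an uncompressed copy. The function names and file paths are hypothetical, chosen only for illustration.

```python
import gzip
import shutil
from pathlib import Path

# Every gzip file starts with these two bytes, whatever its extension.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def ensure_plain_fasta(src, dst):
    """Write an uncompressed copy of src to dst; return its size in bytes."""
    src, dst = Path(src), Path(dst)
    if src.stat().st_size == 0:
        # A zero-byte file usually means the download itself failed.
        raise ValueError(f"{src} is empty; the download probably failed")
    if is_gzipped(src):
        with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
            shutil.copyfileobj(fin, fout)  # streams, so large files are fine
    else:
        shutil.copyfile(src, dst)
    return dst.stat().st_size
```

Usage would look like `ensure_plain_fasta("silva.fasta.gz", "silva.fasta")`, where both file names are placeholders for your own paths.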

Or, maybe the full fasta is too large for the server to process. There are several posts at Biostars discussing how others are sub-setting the fasta to just the regions of interest.

EMBOSS Fuzznuc is a tool in Galaxy already (referenced in the second post above).
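To make the sub-setting idea concrete, here is a rough Python sketch of an in-silico PCR: it searches each reference sequence for the forward primer and for the reverse complement of the reverse primer, and keeps only the stretch in between. This is not what Fuzznuc or any Galaxy tool does internally, just an illustration. It assumes exact primer matches (apart from IUPAC degenerate bases), strips alignment gap characters, and converts U to T. All function names are hypothetical, and the primer sequences are parameters because they were not given in this thread; pass the published 341F and 806R sequences yourself.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_regex(primer):
    """Translate a (possibly degenerate) primer into a regex fragment."""
    return "".join(IUPAC[base] for base in primer.upper())


def reverse_complement(seq):
    """Reverse complement that also handles IUPAC ambiguity codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def extract_amplicon(seq, forward, reverse):
    """Return the region between the primers, or None if either is missing."""
    # Normalise: upper case, RNA to DNA, drop alignment gap characters.
    seq = seq.upper().replace("U", "T").replace("-", "").replace(".", "")
    pattern = (primer_regex(forward) + "(.*?)"
               + primer_regex(reverse_complement(reverse)))
    match = re.search(pattern, seq)
    return match.group(1) if match else None


def subset_fasta(lines, forward, reverse):
    """Yield (header, amplicon) for every sequence both primers match."""
    for header, seq in read_fasta(lines):
        amplicon = extract_amplicon(seq, forward, reverse)
        if amplicon is not None:
            yield header, amplicon
```

Note that the output is unaligned, because gaps are removed. Whether that is suitable depends on the downstream tool: an alignment step that expects an aligned reference would need a coordinate-based trim of the alignment instead.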

The https://www.arb-silva.de/ site has a search/filter function. I haven’t used it much but might be another way to subset and output fasta. The site has documentation and examples. Many functions involve using the ARB tool package for data manipulation. That said, most functions can probably be translated to alternative tools in Galaxy.

If that is not enough, where are you working now (server URL)?

If usegalaxy.org, would you please send in a bug report for the new error? Include a link to this topic in the comments so I can find it. I would like to review it.
If working at a different public Galaxy server, you can share your history with me in a direct message here. You won’t be able to do that yet – so write back if needed and I’ll start one up. That keeps your shared history link private. If you don’t care about privacy, the history share link can just be posted back in this thread. How to: Galaxy Training!. Also post back the dataset number with the failure to make sure we are looking at the same thing.

Let’s start there. Can bring in some domain specialists if needed, but better to rule out technical issues first.

Funnyme186 · September 6, 2021, 7:05am

Hi @jennaj
Thank you again for your prompt reply.

My datatype is fastqsanger.

Yes I tried to create a custom reference file with my regions but didn’t use it.

Yes the full silva 132 fasta is too large (9.9 GB) but the readme to compress it was too complicated, I will try to extract the fasta again.

I am working on the usegalaxy.org

I shared my history with your email address.

I sent in a bug report today for the new cluster.split error.

Thank you a lot for your help regarding this issue, I really appreciate it.

Best regards

Stephanie

jennaj · September 20, 2021, 7:08pm

Hi @Funnyme186

Well, there were some issues at the server last week. Those may have impacted your work. It looks like you have had a successful rerun by now with the same inputs that originally timed out and failed.

Apologies for the delayed reply – I also needed to wait for things to get back to (mostly) normal before reviewing/replying. All should be Ok now. Everyone should expect slightly longer job queue timing until the banner on the server clears, but everything else is fine.

Funnyme186 · September 22, 2021, 7:07am

Hi @jennaj,

No worries, thank you for your reply!

Yes, I waited for the server’s maintenance to finish and ran another batch of samples, but I am getting stuck again at the cluster.split step with opticlust as the clustering method.

When I change the clustering method parameter to vsearch, it works.

Is there a way to make it work with opticlust? It is failing after 10 hours.

I am sharing this new history with you as well.

Thank you in advance for any help provided!!

Regards

Stephanie
