Error with make.contigs tool in 16S microbial analysis with mothur tutorial

Hello, I am going through this tutorial on the training website. After trying with the dataset from the training (the data did not upload; I made another post about that), I decided to try with my own data, but now I get an error when I use the make.contigs tool.
See the attached screenshot; please help me.
Thank you

Lorena

Hi @lore77

Please post a question just one time, we’ll see it. I’m consolidating your other question into this topic. Xref → Getting data from tutorial 16s Microbial analysis with Mothur

For your question, please explain a bit more and we can help.

  • Tutorial link. Which tutorial are you following, and at what step? You can copy/paste the link to the tutorial step back here. There are two tutorials associated with this tool, so being specific will help us clarify where things are going wrong (and let us fix the tutorial on our side if something is not clear).
  • Shared History link. What did your error look like exactly? Have you tried a rerun? You can generate and post back a share link to your history for troubleshooting help. See the banner on this forum for how to do that, or go here directly. → How to get faster help with your question

Loading data from a tutorial should definitely work, so it might be a problem with how that was done with the Upload tool. We can help to clarify, and even adjust the tutorial instructions to make them clearer for everyone.

Running a tool on your own data that later errors can be due to a much larger range of reasons. It could be a scientific problem in the data, how that data was organized, or even a parameter-choice issue. Solving that is what most questions at this forum are about. Bioinformatics can be quite complicated, even when doing your research through a web interface! :hammer_and_wrench:

Let’s start there! Once you share everything, we can address the problems exactly. Creating your “answer key” history is a great way to learn how the processing works before attempting this with your own data, so let’s begin with the tutorial part first. Thanks! :slight_smile:

Update

The test run through the tutorial ran without any problems. Hopefully you are able to do the same. Any problems, share back your work and we can try to help more! :slight_smile:


Hi again @lore77

You can read my answer on the other question, and maybe that solves the Upload problem for you?

I’ve also started up a test run through the full tutorial here, just as an example. I’m loading the data, and the workflow associated with the tutorial. I’m still at the loading step. After that, I will organize the data following the tutorial, then I’ll launch the workflow. This creates the reference history that you can compare against when working through the tutorial directly (hands-on) or when running new data through the steps.

I’ll let that run today and we can check back. If a tool fails, I’ll just rerun it and check the box on the form to re-run the dependent jobs (any tools that run after the failed tool). Some small fraction of jobs will fail by chance, so I may get no failures at all, but I won’t worry too much if something does fail. I’m also going to set a notification – this emails me about the status, plus alerts me in the application if I happen to still be logged in, so I know when to check back. The whole thing will probably take a few hours to run.

I would suggest that you do the same thing, even if you plan on doing the hands-on. This analysis is a bit complicated, and the files are named the same at a few steps, which can be confusing the first time through. Plus, if you plan to run your own data through a similar analysis, working through the hands-on will explain the steps where you might need to make parameter adjustments to better fit your data, and explain how the reference data choices matter when you come to make your own.

Hope this helps! :rocket:

Hi @jennaj , good morning,

Thank you very much for your messages and your help. I’m sorry that I posted two questions; since there were two different stories and different problems with the same tutorial, I did not want to mix them up.
I am new to the world of bioinformatics and metagenomics, so I wanted to do the extended tutorial to know exactly what I’m doing at each step, but I will first follow your advice to be sure that it works.
About the data, I left it loading, but I do not see any changes since yesterday; just one more sequence was loaded.
I am sharing the histories with you:
1- The one that follows the extended tutorial with the library data
(tutorial link) Redirecting…
(History link) Galaxy
(library link) Galaxy

2- The one that follows the same tutorial but with my data (error in make.contigs)
(history link) Galaxy

I will go through your shared history and I will update you.

Thanks again!
Best wishes
Lorena

Hi again, @jennaj ,
About the links I sent in the last post, I don’t know why they look like this, but they work for me; please let me know if they don’t work for you.
About the shared workflow, the link doesn’t work for me; the following message appears:
StoredWorkflow is not accessible to the current user
I will try to get it from the tutorial.
The link with your shared history works perfectly.
Thanks again!
Best
Lorena


Hi @lore77

Thank you for sharing the details, these really help!

You are getting the data via the Data Library path, and that seems to matter; it looks like a server issue. I’ll confirm and reach out to the EU administrators so they can get that fixed. Great that you reported this!

What you can do now is get the data from the Zenodo links instead (the tutorial includes two ways to get the data – so I am suggesting that you try the other way). That will let you move forward.

For the rest, I’m wondering if sharing is working correctly!

  1. I’m not able to access the second history here. You can try toggling the share link on/off again.
  2. And I can reproduce that this workflow share link (from me) is not working! I’m not sure why; I already toggled it on my end. Hmm… It was just an import from the GTN, so yes, please get it from there instead.

It is great to flush out minor hiccups, so thank you for following up! Hopefully you have enough to keep going while we sort out these other items. If you find anything else, please do report back here (with details again) so we can squash whatever is going wrong.

More soon and thanks! :slight_smile:

Update: The workflow probably doesn’t share because it was unchanged from the source (meaning, I am not the “author” of anything novel). This is desirable; otherwise millions :face_with_spiral_eyes: of copies of identical workflows would proliferate. It is better to get it from the source. If I make any changes and save, then it shares fine. I’ll confirm that this was the intended behavior.

The best advice I have is to either try again, or to use the Zenodo links instead.

Hi @wm75 – Do you want to add more, or do you think there is something that can be done now to make this smoother? Thanks!

Hi @jennaj , good morning!
Thank you for your reply and your efforts to solve the problems. I was off yesterday, sorry for the late reply.
1- I tried again to share my history that follows the full tutorial of 16S microbial analysis with mothur:

I hope this works. I tried copying the full workflow and running it, but the work has been queued since Tuesday, I suppose because of the data that is still not uploaded.
I will try creating a new history and uploading the links directly from Zenodo, and I will let you know if it works. Thank you!
2- About the tutorial with my own data, I also launched the full workflow from the tutorial on Tuesday, but some errors came up, this time with the tool unique.seqs (data 64 and 65), and the jobs after the error are paused. I share the history here:

I hope the links work now.
Thank you again and have a nice day!
Lorena

Hi again, I don’t know why the links to the history look like this in the previous message; I’ll try again here (sorry for the multiple messages).
1- Link to the history with the full tuto and data from the library:

History A tutorial data: https://usegalaxy.eu/u/lore77/h/16s-microbial-analysis-with-mothur-tuto

2- Link to the history with my own data:

History B1 custom data: https://usegalaxy.eu/u/lore77/h/16s-microbial-analysis-with-mothur-pha-data

Hope it works, thanks!

admin minor format edits

Hi @jennaj , it’s me again :slightly_smiling_face:
Just to let you know that I’m doing the tutorial with the data from Zenodo, and until now everything is going well.
About the make.contigs error, I created a new history with another dataset from my own data and it also finishes with an error. I share the new history with you here:

History B2 custom data: https://usegalaxy.eu/u/lore77/h/16s-microbial-analysis-with-mothur-phab-data

Thank you!

Lorena

admin minor format edits

Thanks @lore77

I’m reviewing. I made some minor label changes to help us discuss these better. :slight_smile: More feedback about the content soon.

Hi again… First, thanks for posting everything back. The sharing worked great this time!! :slight_smile:

Now, let’s start with the training tutorial.

The links you are using for the data loading are odd. These were copied from the names of the data library links, yes? Instead, I would like you to use the Zenodo links from the tutorial.

You should start over in a new history. Give that history a unique name. Then I’d like you to follow the instructions in the tutorial exactly, without any custom choices (later on you can experiment!).

Link to the step → Hands-on: 16S Microbial Analysis with mothur (extended) / 16S Microbial Analysis with mothur (extended) / Microbiome (#hands-on-obtaining-our-data)

Screenshots

Get the links to the fastq reads from here

And to the reference files from here

All you need to do is to copy the links, and paste them into the Upload tool. Use all default settings. The tutorial includes help for renaming, but that won’t be needed when loading data at the EU server for this tutorial when using the Zenodo links.

TIP: You can and should click on the files to review the metadata – it should be correct at this point, but learning where those metadata are displayed is part of following the tutorial. If anything looks wrong, that is an important clue that something didn’t work right, and you can ask about it here :slight_smile:

Once all the files complete the uploading process and are green in color, you can create the paired-end collection. This involves just the fastq reads and has tutorial instructions. It is great that you learned how to do this already!! Using collection folders is an important part of batch work.

Later on you can experiment with custom settings and fixing up files that have problems. But for this, let’s start at the very beginning without anything extra added in. This is so you can create a reference history to use. When processing your own data, you can refer to the scientific choices for parameter options by comparing to this reference. Custom data – from anyone – will probably need slightly different choices, and the tutorial explains those choices at each step.

Please give that a try, and let us know if you can get this to work! :rocket:

Hi @jennaj ,
I did the full tutorial and it finally works until the end! Thank you very much!

Now… trying to do it with my own data, but I still cannot manage to solve the make.contigs error.
I could identify the cause, which is probably that my data comes from the V3-V4 region of the 16S, but I cannot find any information about how to change the parameters to make it work.
Could you please send me any tips, or point me to where I can find something?
Thank you!
Best wishes,
Lorena

Hi @lore77

First, congratulations on completing the Mothur protocol! Great news.

Then, when using your own data, you had advice about this from the scientists helping out with the Galaxy Training Academy last week, yes? But even if that wasn’t your question, I think the advice is a fit.

That advice included:

  • (from Paul Zierep) I would recommend the dada2 tutorial or using Lotus2; both should allow for fewer problems when executing and provide state-of-the-art ASV/OTU results

And someone else had a similar question about V3-V4 protocol adjustments. That person had a data issue introduced upstream (during QA steps). Data preparation steps can make or break later scientific steps of course, so seems relevant.

  • (@bernt-matthias) FastQC could not find any of the standard adapters, so this might not be the problem.
    For the trimming you have chosen 240 + 160 nt, which is 400 nt. If I’m not wrong, the V3+V4 region has a length of 400 to 500 nt, doesn’t it?
    This would mean that after trimming with the chosen parameters it’s very unlikely to get any overlap. So my suggestion would be to rerun with larger sequence lengths: cut only a few bases from the forward reads (since the average quality is over 20 for nearly the whole read length) – maybe use 290? For the reverse reads you could also try 200.
    To sum this up: the sum of these two numbers should definitely be larger than 400 – ideally something close to 500 (plus the overlap needed by dada2, 12 nt).
    Note that the data used in the tutorial was only V3.
    See also Hands-on: Building an amplicon sequence variant (ASV) table from 16S data using DADA2 / Building an amplicon sequence variant (ASV) table from 16S data using DADA2 / Microbiome

Since both converged on that other tutorial, I would also suggest starting there. Especially since it sounds like you are not getting overlaps, or enough overlaps (?), with Make.contigs.

Making contigs is a type of assembly, and all assembly is super sensitive to the quality and content of the reads being assembled. That means reviewing your scientific choices upstream as a potential place to make adjustments.
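The trim-length arithmetic in the quoted advice can be sketched as a quick back-of-the-envelope check. This is a minimal illustration, not a tool: the function name is made up, the 12 nt minimum is the dada2 merge overlap mentioned in the quote, and 400–500 nt is the rough V3-V4 amplicon range given there.

```python
def expected_overlap(trunc_fwd, trunc_rev, amplicon_len):
    """Bases of overlap left between a truncated read pair.

    A negative value means the truncated reads can no longer span the
    amplicon, so merging (make.contigs, or dada2's pair merging) fails.
    """
    return trunc_fwd + trunc_rev - amplicon_len

MIN_OVERLAP = 12  # minimum overlap dada2 needs to merge a pair (from the advice above)

# V3-V4 amplicons are roughly 400-500 nt long
for amplicon in (400, 450, 500):
    old = expected_overlap(240, 160, amplicon)  # the original trim settings
    new = expected_overlap(290, 200, amplicon)  # the suggested trim settings
    verdict = "ok" if new >= MIN_OVERLAP else "too short"
    print(f"amplicon {amplicon} nt: 240+160 -> {old} nt overlap, "
          f"290+200 -> {new} nt overlap ({verdict})")
```

With 240 + 160 = 400 nt, any amplicon longer than 388 nt leaves no usable overlap, which matches the observation that V3-V4 data fails where the tutorial's V3-only data succeeds; even 290 + 200 falls short for the longest amplicons, which is why reviewing the upstream trimming choices matters.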

Hope this helps again! :slight_smile:

Hi @jennaj! Good morning, and thank you again for your suggestions!
Yes, I asked the same question in the training last week. I forgot to mention that I tried the dada2 tutorial but couldn’t adjust it to my data; that’s why I came back here. I will try again, and also Lotus2, and I will let you know. Have a nice day! Lorena