Make.contigs (Mothur error)

Hi everyone,
I am facing an issue with the Mothur analysis. I read the tutorials and tried them, but when I start with my own data it shows an error, and some files become converted items. I am sharing my history; please help me out.

Thanks,
Shweta


Hi @Shweta.203

This is what your error message is reporting (the stderr).

File size limit exceeded (core dumped)

This means that the data is “too large” for the tool to process. This is somewhat unusual for the UseGalaxy.eu server, so the limit is probably from Mothur itself.

At a minimum, you skipped a few data preparation steps, so that is probably what is going on, but it is impossible to guess more than that. There could be scientific reasons too, but you won’t know until the technical parts are addressed.

What to do:

  1. Don’t skip the data preparation steps, including read QA. Making contigs is a form of assembly, and all assembly is sensitive to the content inside the reads – the base pairs. You want to remove as much sequencing artifact and low-quality data as possible before attempting to assemble anything, with any tool. The tutorials have examples of what to try.

  2. Consider the parameters you are using with this tool. I would start with just a few pairs while testing out my steps and tuning parameters to fit my data. Once that is working, I would extract my workflow, polish and test it on the smaller subset of data, and only then run the whole batch. This lets you rerun many times until you get the scientific parts the way you want them, without all that tedious clicking per tool. You can also use the tutorial’s workflow as a template and tune that. Your choice, but don’t try to troubleshoot with the whole large set of dataset pairs – use just parts, then scale up once it is working.
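If you prefer to prepare the small test subset outside Galaxy, taking the first few read pairs from each FASTQ file is enough. Here is a minimal Python sketch of that idea (the file names are hypothetical; cut R1 and R2 to the same count so the pairs stay in sync):

```python
# Take the first N read pairs from paired FASTQ files to build a small
# test subset. Assumes the standard 4-line-per-record FASTQ layout.
from itertools import islice

def subsample_fastq(in_path, out_path, n_reads):
    """Copy the first n_reads records (4 lines each) of a FASTQ file."""
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in islice(fin, n_reads * 4):
            fout.write(line)

# Apply the same cut to both mates so pairing is preserved
# (file names below are made-up examples):
# subsample_fastq("sample_R1.fastq", "test_R1.fastq", 1000)
# subsample_fastq("sample_R2.fastq", "test_R2.fastq", 1000)
```

This is just a sketch of the concept – inside Galaxy you would use a sub-sampling tool on the collection instead of writing code.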

Hope this helps! :slight_smile:

ps: Thank you for sharing your history! It made helping super fast on our side!


Hi Jenna,
Thanks a lot for the suggestion.
But I would like to ask if it’s possible to have a Zoom meeting, as I have many doubts and I am struggling to analyze my data. I missed the training programme organized by Galaxy last month.

Thanks,
Shweta

Hi @Shweta.203

There isn’t a 1:1 option for help, but if you post to the GTN training chat, you will reach the scientists who wrapped this tool (made it available in Galaxy). These are the same people who created the training materials and who help out during events. I don’t think your problem was a tool problem but a data problem: specifically, the data preparation steps were missed, so the reads were not ready for this mini assembly/clustering step. But you can get more opinions of course. :slight_smile:

Find a link to the chat under the Help menu here → https://training.galaxyproject.org/

And I think you know where the event information is already, but for anyone else who stumbles across this topic, find it here. All the materials are online “forever” and accessible just as they would be if you attended (minus the special Slack, but you can use the GTN chat, or ask questions here for support). → Galaxy Training Academy 2024

I know you are overwhelmed, but really, start slow. Pick a few pieces of representative data, and start from there. You can come back if you have trouble with those, too!


Hi Jenna,

Thanks for the suggestions. I will start with subsets of data, including the data preparation steps.
Additionally, I would like to ask if there is any option for paid training, as my PI is willing to pay for my training. I have many doubts and queries regarding data analysis; the tutorials are helpful but cannot completely answer my questions. Analysing data looks like a never-ending loop to me. Please help me out.

Thanks,
Shweta

Hi @Shweta.203 You could ask at the GTN chat. Maybe someone is available for freelance tutoring. The community is huge. :slight_smile:

Hi Jenna,
Thank you for this suggestion, I will go for it.
As you suggested, I have tried with a small subset of only 12 samples from two groups, and it displays the same issue with the make.contigs tool (following the Mothur extended tutorial). I have shared my history; please suggest a solution. Additionally, when I prepare the dataset pair in Galaxy EU and then go to use any tool, it shows an implicit datatype conversion. What does that mean? I used Galaxy Main but it never showed this.

Thanks,
Shweta

Hi @Shweta.203

The problem appears to be with the read quality.

Try this to see the same thing:

  1. Run Flatten Collection on your samples. Use defaults. Each file needs to have a unique name.
  2. Then run FastQC on that output collection.
  3. Then run MultiQC on the Raw FastQC results.
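For intuition about what those reports summarize: the per-base quality view in FastQC/MultiQC is essentially the mean Phred score at each read position. A minimal Python sketch of that calculation (assuming standard Phred+33 FASTQ encoding; the function name is mine):

```python
# Sketch of FastQC's per-base quality idea: mean Phred score at each
# position across all reads in a FASTQ file (Phred+33 encoding assumed).
def per_position_mean_quality(fastq_path):
    totals, counts = [], []
    with open(fastq_path) as fh:
        for i, line in enumerate(fh):
            if i % 4 == 3:  # every 4th line holds the quality string
                for pos, ch in enumerate(line.rstrip("\n")):
                    if pos == len(totals):
                        totals.append(0)
                        counts.append(0)
                    totals[pos] += ord(ch) - 33  # decode Phred+33
                    counts[pos] += 1
    return [t / c for t, c in zip(totals, counts)]
```

A flat, high curve is what you want; a dip at the start of the reads is the kind of 5′ artifact discussed below, and a sagging tail means 3′ quality trimming is needed.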

How to interpret the report is covered on those tools’ pages, and both also link to tutorials with more detail. You can also review prior topics at this forum under quality-control

You should evaluate all of the samples. I can’t tell if trimming is enough or if more is needed, or if something went wrong in the lab upstream.

Not great news but this is at least a start, and explains why you had trouble making contigs. :slight_smile:

Hi Jenna,
Thanks a lot for your prompt response to my queries. I have tried doing quality control by following the tutorial and I am sharing my history with you.

I have checked the quality control parameters and they seem fine. It would be good if you could check any one file. I have also used the Cutadapt tool.
Can I use these Cutadapt output files and rename them to proceed with make.contigs, since they are trimmed and adapter-free?
I know I am bothering you a lot, but I want to learn data analysis.
Thank you,
Shweta

Hi @Shweta.203

Just try this and see what happens. Trim, and run QA both before and after – you want the artifact at the start of the sequences gone. Keep trimming until that happens. That might be enough to do the next step with the Mothur tools.
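For reference, removing a fixed number of bases from the 5′ end of every read is what Cutadapt’s `-u N` option does. A minimal sketch of the same idea, assuming the standard 4-line FASTQ layout (use the real tool on your data):

```python
# Sketch of fixed-length 5' trimming: drop the first n bases from every
# read, keeping sequence and quality strings the same length.
def trim_5prime(in_path, out_path, n):
    with open(in_path) as fin, open(out_path, "w") as fout:
        for i, line in enumerate(fin):
            if i % 4 in (1, 3):  # sequence and quality lines
                fout.write(line.rstrip("\n")[n:] + "\n")
            else:                # header and separator lines untouched
                fout.write(line)
```

The key point either way: whatever you cut from the sequence line must also be cut from the quality line, which trimming tools handle for you.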

Hi Jenna,

Thanks for your suggestion. I trimmed and ran QA on my samples, and I was able to run the further Mothur tools.


Great @Shweta.203 so glad that worked, and I replied to the issue you had with the downstream tools. Let’s continue there after you try my suggestions. :slight_smile: