Metagenomics analysis: Tutorials

I have metagenome data as fastq.gz files. I have uploaded them but am unable to form contigs (make.contigs). What could be the reason? What type of data is required to build the contigs?

Hi @Ghazal_Aziz

The tool expects reads in fastqsanger format. Try Upload without specifying the datatype. Galaxy will detect and assign the datatype that matches the best. If the upload fails or the datatype is not what you expect, this indicates a format or content problem. You may need to reload the data or possibly correct the data if the source is your own computer. QA or data manipulation tools can be used to validate most uploaded data.

  • GTN tutorials that cover Mothur and related Metagenomics tools/workflows are here:
  • If you are not sure how to Upload read data in a way that tools can interpret, please review the NGS data logistics tutorial here
  • and the Quality Control tutorial here
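If the reads come from your own computer, a quick local check before uploading can help rule out a format or content problem. The following is only a minimal sketch, not part of Galaxy or Mothur: the function name `check_fastq` is hypothetical, and it assumes plain four-line FASTQ records (no wrapped sequence lines). It counts records, confirms each sequence and quality string have the same length, and reports the range of quality characters, which for fastqsanger (Phred+33) should start at ASCII 33 or above and usually stays well below 64 at the low end.

```python
import gzip
import io


def check_fastq(handle):
    """Scan four-line FASTQ records from a text handle.

    Returns (record_count, lowest_quality_ord, highest_quality_ord).
    Raises ValueError on the first malformed record.
    """
    count = 0
    lo, hi = None, None
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate a trailing blank line
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline()
        qual = handle.readline().rstrip("\r\n")
        if not header.startswith("@"):
            raise ValueError(f"record {count + 1}: header does not start with '@'")
        if not plus.startswith("+"):
            raise ValueError(f"record {count + 1}: third line does not start with '+'")
        if len(seq) != len(qual):
            raise ValueError(f"record {count + 1}: sequence/quality length mismatch")
        for ch in qual:
            code = ord(ch)
            lo = code if lo is None else min(lo, code)
            hi = code if hi is None else max(hi, code)
        count += 1
    return count, lo, hi


if __name__ == "__main__":
    # Tiny in-memory example; for a real file use gzip.open(path, "rt").
    demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nGG\n+\n!#\n")
    n, lo, hi = check_fastq(demo)
    print(n, lo, hi)
    if lo is not None and lo < 33:
        print("quality characters below '!' are not valid fastqsanger")
    elif lo is not None and lo >= 64:
        print("quality range may indicate an older Phred+64 encoding")
```

For a compressed file, pass `gzip.open("reads_R1.fastq.gz", "rt")` as the handle (the file name here is a placeholder). The Phred+64 hint is a heuristic only; the QA tools mentioned above remain the more thorough check.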

Yes, I did upload them that way. My metagenome sequencing data has reads of 151 bp and 150 bp; there are 29385616 sequences, a 16% loss from the original.
Is this length OK for forming contigs, and what should the kmer value be?

Hi @Ghazal_Aziz, 16S amplicon reads of that length will have overlap. Kmer values are not user-specified with Mothur make.contigs. If you think there might be a data quality problem, either review the QA/QC results already done or consider doing those steps.

No, the quality all looks good, but the contigs are not being created. For some reason the job always ends with errors.

@Ghazal_Aziz – To be clear, the make.contigs job is ending with a failed red error dataset?

Three things to help troubleshoot:

  1. If you post back the content from the bug report message and the stdout plus stderr we can try to help that way. How to find that information: Galaxy Training!
  2. When you post back, please include the URL of the public Galaxy server you are working at. If not at a public site, please describe. This is to confirm – the server may matter in some cases (see note below).
  3. Sending in a bug report would also be helpful. Include a link to this topic in the comments so we can associate the two.

Note: If you are working at usegalaxy.org, a rerun might solve the problem. We are migrating data to a new storage system, and a very small fraction of jobs may fail for technical reasons. A rerun usually addresses that particular problem. Also, for certain tools, inputting uncompressed fastqsanger data will work when compressed fastqsanger.gz originally fails. The banner on the server has more details, as does this topic: Long queue times at UseGalaxy.org 9/27/2021: Please allow jobs to stay queued for fastest processing

  1. This job was terminated because it used more memory than it was allocated. Please click the bug icon to report this problem if you need help

  2. https://galaxy.msi.umn.edu/datasets/error

tool error
An error occurred with this dataset:
This job was terminated because it used more memory than it was allocated. Please click the bug icon to report this pro

When making contigs, is it necessary that R1 and R2 have the same number of sequences?
For example, when I trim the sequences to 50-150 bp, both files have the same number of sequences,
but when I trimmed them to 151 bp, the number of sequences changed.
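Paired-end merging generally assumes that the forward and reverse files list the same reads in the same order, so one way to check this locally is to compare the two files record by record. The sketch below is an illustration only and not something Mothur or Galaxy provides: the names `read_ids` and `compare_pairs` are hypothetical, and it assumes four-line FASTQ records whose read IDs match between R1 and R2 once any trailing `/1` or `/2` is removed.

```python
import gzip
import io
from itertools import zip_longest


def read_ids(handle):
    """Yield the read ID from every four-line FASTQ record in a text handle."""
    for i, line in enumerate(handle):
        if i % 4 != 0:
            continue  # only header lines carry the read ID
        fields = line.strip().split()
        if not fields:
            continue  # tolerate a trailing blank line
        name = fields[0][1:]  # drop the leading '@'
        if name.endswith(("/1", "/2")):
            name = name[:-2]  # drop old-style mate suffix
        yield name


def compare_pairs(r1_handle, r2_handle):
    """Return (n_r1, n_r2, index_of_first_mismatch_or_None)."""
    n1 = n2 = 0
    first_mismatch = None
    pairs = zip_longest(read_ids(r1_handle), read_ids(r2_handle))
    for idx, (a, b) in enumerate(pairs):
        if a is not None:
            n1 += 1
        if b is not None:
            n2 += 1
        if first_mismatch is None and a != b:
            first_mismatch = idx
    return n1, n2, first_mismatch


if __name__ == "__main__":
    # In-memory example; for real data use gzip.open(path, "rt") for each file.
    r1 = io.StringIO("@a/1\nAC\n+\nII\n@b/1\nGG\n+\nII\n")
    r2 = io.StringIO("@a/2\nGT\n+\nII\n@b/2\nCC\n+\nII\n")
    print(compare_pairs(r1, r2))
```

If the two counts differ, or a mismatch index is reported, the trimming step has probably dropped reads from one file only. Trimming tools that process both files together as a pair normally keep them in sync.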

I am getting no response to the reports I sent to help.