Hello, I am trying to run the make.contigs command of mothur on a list:paired collection, and no matter how big or small my collection is, it crashes due to file size load. Have you had the same problem? Do you know what I can do?
Welcome @Ioanna_Gkoni
We can probably help with the usage here. Would you like to share back your history with the errors as a starting place?
We’ll also share some of the troubleshooting help below in case you would like to try on your own first. And maybe the context about what we would be reviewing helps anyway.
My initial guess is that the collection folder content is a mismatch for the expected collection shape. So, let’s start there.
Collection Folder Shapes
Why use Collections? A collection organizes data by sample throughout an analysis project. This can make it easier to track data from tool to tool. A collection also lets you use tools on large batches of data all at once, even without a workflow! Fewer clicks!
Video → Why collections?
Some prior background about “shape” and “format” is in topics like these:
- Troubleshooting: fastq read content and shape. Collections, interlaced/interleaved reads, quality assurance - #8 by jennaj
- rnaSPADES --nanopore: Confirming data format and shape for an input area - #2 by jennaj
In short: a collection folder can have a few shapes! The basic shapes most people will use are a List and a List of Pairs.
List
A simple listing of files that are all of the same datatype format.
The data is grouped by a common sample identifier (element identifier).
These could be single-end fastq reads, or fasta files, or any other type of file, including assembled contigs and BAM files.
List of Pairs
These are files that are also all of the same datatype format.
The data is grouped by a common sample identifier (element identifier).
These are usually paired-end fastq reads! Forward and reverse are nested under each sample.
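If it helps to picture the two shapes, here is a rough sketch in plain Python. This is only an illustration and not Galaxy's API: the sample names, file names, and the `describe_shape` helper are all made up for the example.

```python
# Plain-Python illustration only; this is NOT Galaxy's API.
# Sample names and file names are hypothetical.

# List: one file per element identifier.
list_collection = {
    "sampleA": "sampleA.fastq",
    "sampleB": "sampleB.fastq",
}

# List of Pairs (list:paired): forward and reverse nested under each sample.
list_of_pairs = {
    "sampleA": {"forward": "sampleA_R1.fastq", "reverse": "sampleA_R2.fastq"},
    "sampleB": {"forward": "sampleB_R1.fastq", "reverse": "sampleB_R2.fastq"},
}


def describe_shape(collection):
    """Hypothetical helper: name the shape of one of the dicts above."""
    values = list(collection.values())
    if not values:
        return "empty"
    # Every element is a forward/reverse pair.
    if all(isinstance(v, dict) and set(v) == {"forward", "reverse"} for v in values):
        return "list:paired"
    # Every element is a single file.
    if all(isinstance(v, str) for v in values):
        return "list"
    # Anything else does not match either basic shape.
    return "mixed"


print(describe_shape(list_collection))
print(describe_shape(list_of_pairs))
```

A "mixed" result in this toy model is the kind of mismatch worth looking for: the content of the collection does not match the shape the tool was told to expect.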
Using Mothur make.contigs
The tool form for this tool can accept many different data shapes, including individual files (one or many). Choosing the correct shape for your collection input will be important since this informs the tool about how to read in your data for processing. Why? Context such as which files are the forward reads and which are the reverse reads is needed to process your samples!
Screenshot of the choices on this tool’s form
Two simple fastq files (forward and reverse)
This can be individual files, or you can have two List collections: one with forward reads and one with reverse reads.
One pair
This is a legacy Paired format and your data is unlikely to use it! We retain it for reproducibility reasons. Getting a single pair into and out of this collection folder format is difficult, and we no longer recommend using it. If you think you have this, try building your collection again (from the original unhidden individual files) into a List of Pairs instead! Ask if you need help!
Multiple pairs - Combo mode (list:paired collection)
This is the List of Pairs collection folder shape. If you have more than one sample, this is probably what you will be using.
Note: You could also use Collection Operations → Unzip to split your List of Pairs collection into forward and reverse simple List collections, and use those instead (the first choice above).
Either of the two choices, two Lists or one List of Pairs, will work the same! The tool just needs to know what to expect, and you tell it with the initial toggle parameter.
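To show why both choices carry the same information, here is a small plain-Python sketch of what an unzip does conceptually. It is not Galaxy's implementation: the `unzip_pairs` function and the sample and file names are hypothetical, used only to illustrate the idea.

```python
# Conceptual sketch only; NOT how Galaxy implements Unzip.
# Sample names and file names are hypothetical.

list_of_pairs = {
    "sampleA": {"forward": "sampleA_R1.fastq", "reverse": "sampleA_R2.fastq"},
    "sampleB": {"forward": "sampleB_R1.fastq", "reverse": "sampleB_R2.fastq"},
}


def unzip_pairs(pairs):
    """Split a list-of-pairs mapping into forward and reverse list mappings."""
    forward = {name: pair["forward"] for name, pair in pairs.items()}
    reverse = {name: pair["reverse"] for name, pair in pairs.items()}
    return forward, reverse


forward_list, reverse_list = unzip_pairs(list_of_pairs)

# Both lists keep the same element identifiers in the same order,
# so each forward file can still be matched to its reverse file.
print(list(forward_list) == list(reverse_list))
```

The point of the sketch is that the element identifiers survive the split, so the forward and reverse lists still line up sample by sample.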
Please give this a review and test to see if it helps! If you are still stuck, we can help! Seeing what you have now will allow us to offer the best advice about what to try next. And if you are able to solve the problem, please let us know!
