Splitting interleaved/interlaced fastq data and Extracting fastq data from an sra archive

jennaj · April 9, 2020, 12:13am

For your Trinity questions:

Trinity requires that paired-end inputs are “matched pairs”, meaning that both ends of the same read pair are supplied as input.
If one of the ends fails QA/QC (Trimmomatic, Fastp, others), then the other end of that pair cannot be used either, even if it happens to pass QA.
When extracting from an SRR accession, the original data will be paired.
After running through QA tools, the data can become unpaired. That said, QA tools such as Trimmomatic sort their output into four datasets:
1. Paired forward + Paired reverse = use these as assembly inputs.
2. Single forward + Single reverse = do not use these as assembly inputs. One end of the original pair did not pass QA, and the assembly will fail if these are input.
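
If you want to confirm that two FASTQ datasets really are matched pairs before assembling, the idea can be sketched in a few lines of Python. This is only an illustration, not what Trinity or Trimmomatic do internally. It assumes plain 4-line FASTQ records, and that mates share the same read name apart from an optional trailing /1 or /2. The function names `read_ids` and `first_mismatch` are made up for this example.

```python
from itertools import zip_longest


def read_ids(lines):
    """Yield a normalized read ID for each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        # Every 4th line (0, 4, 8, ...) is a record header such as "@SRR000001.1 ..."
        if i % 4 == 0 and line.strip():
            name = line.strip().split()[0].lstrip("@")
            # Assumption: mates may carry a /1 or /2 suffix; drop it before comparing
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def first_mismatch(forward_lines, reverse_lines):
    """Return (record_index, forward_id, reverse_id) for the first unmatched pair.

    Returns None when every record lines up. A missing mate shows up as None
    in place of its ID.
    """
    pairs = zip_longest(read_ids(forward_lines), read_ids(reverse_lines))
    for n, (fwd_id, rev_id) in enumerate(pairs):
        if fwd_id != rev_id:
            return (n, fwd_id, rev_id)
    return None
```

To try it on real files, open both in text mode and pass the file objects, for example `first_mismatch(open("forward.fastq"), open("reverse.fastq"))` (the file names are placeholders). A result of `None` means the two files are in the same order with the same read names.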

Please be aware of a few current factors that can impact assembly success or failure when using the public Galaxy Main https://usegalaxy.org server right now. There is a banner on the server explaining this. More details:

Trinity and Unicycler are running with reduced memory allocation at this time.
Make sure to use the most current version of all tools, or unexpected problems can occur. The most current version of any tool’s form will load from the Tool Panel.
If your job fails, confirm that you are using the most current tool version:
1. If not, rerun using the updated version.
2. If yes, then the failure may be due to the reduced memory resources. Try one rerun. If that fails again, there may be some other problem with your inputs. How to check for common input problems is discussed in the topic below.

If your inputs are OK, there are two ways to work around the reduced memory allocation: a) consider using an alternative public Galaxy server, or b) decide if down/sub-sampling your inputs will meet your goals (see Seqtk tools).
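
One thing to watch when sub-sampling paired-end data is that the same records must be kept from both files, or the pairs will no longer match. Below is a rough Python sketch of that idea. It is not how Seqtk is implemented (with Seqtk, using the same random seed on both files has the same effect), and it assumes plain 4-line FASTQ records in the same order in both inputs. The names `fastq_records` and `subsample_pairs` are made up for this example.

```python
import random


def fastq_records(lines):
    """Group lines into 4-line FASTQ records (header, sequence, '+', quality)."""
    buf = []
    for line in lines:
        buf.append(line.rstrip("\n"))
        if len(buf) == 4:
            yield tuple(buf)
            buf = []


def subsample_pairs(forward_lines, reverse_lines, fraction, seed=0):
    """Keep roughly `fraction` of the read pairs, always keeping both mates together."""
    rng = random.Random(seed)  # fixed seed so the same pairs are picked on a rerun
    kept_forward, kept_reverse = [], []
    paired = zip(fastq_records(forward_lines), fastq_records(reverse_lines))
    for fwd, rev in paired:
        # One random draw per pair decides the fate of both mates
        if rng.random() < fraction:
            kept_forward.append(fwd)
            kept_reverse.append(rev)
    return kept_forward, kept_reverse
```

Because a single draw is made per pair, the two outputs always have the same number of records in the same order, which is what a paired-end assembler expects.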

Hope that helps!

