Trinity - assembly error -- Input problems

Greetings! I tried to run a paired-end (PE) de novo transcriptome assembly from seven samples. The raw files were trimmed and then quality-checked, and the results were satisfactory. Trinity ran into errors, and when I reran the job there were only 4 assembled transcripts. All parameters for Trimmomatic and Trinity were left at their defaults, with in silico normalization enabled. All datasets were combined into two files (R1 and R2) before analysis. The error is as follows.

Left read files: $VAR1 = [ '/pylon5/mc48nsp/xcgalaxy/main/staging//29093926/inputs/dataset_42051971.dat' ];
Right read files: $VAR1 = [ '/pylon5/mc48nsp/xcgalaxy/main/staging//29093926/inputs/dataset_42051973.dat' ];
Saturday, June 27, 2020: 11:16:39 CMD: java -Xmx64m -XX:ParallelGCThreads=2 -jar /cvmfs/main.galaxyproject.org/deps/_conda/envs/__trinity@2.9.1/opt/trinity-2.9.1/util/support_scripts/ExitTester.jar 0
Picked up _JAVA_OPTIONS: -Dsun.zip.disableMemoryMapping=true
Saturday, June 27, 2020: 11:16:41 CMD: java -Xmx4g -XX:ParallelGCThreads=2 -jar /cvmfs/main.galaxyproject.org/deps/_conda/envs/__trinity@2.9.1/opt/trinity-2.9.1/util/support_scripts/ExitTester.jar 1
Picked up _JAVA_OPTIONS: -Dsun.zip.disableMemoryMapping=true
Saturday, June 27, 2020: 11:16:42 CMD: mkdir -p /pylon5/mc48nsp/xcgalaxy/main/staging/29093926/working/trinity_out_dir
Saturday, June 27, 2020: 11:16:42 CMD: mkdir -p /pylon5/mc48nsp/xcgalaxy/main/staging/29093926/working/trinity_out_dir/chrysalis
----------------------------------------------------------------------------------
-------------- Trinity Phase 1: Clustering of RNA-Seq Reads ---------------------
----------------------------------------------------------------------------------
---------------------------------------------------------------
------------ In silico Read Normalization ---------------------
-- (Removing Excess Reads Beyond 200 Coverage --
---------------------------------------------------------------
# running normalization on reads: $VAR1 = [ [ '/pylon5/mc48nsp/xcgalaxy/main/staging//29093926/inputs/dataset_42051971.dat' ], [ '/pylon5/mc48nsp/xcgalaxy/main/staging//29093926/inputs/dataset_42051973.dat' ] ];
Saturday, June 27, 2020: 11:16:43 CMD: /cvmfs/main.galaxyproject.org/deps/_conda/envs/__trinity@2.9.1/opt/trinity-2.9.1/util/insilico_read_normalization.pl --seqType fq --JM 720G --max_cov 200 --min_cov 1 --CPU 16 --output /pylon5/mc48nsp/xcgalaxy/main/staging/29093926/working/trinity_out_dir/insilico_read_normalization --max_CV 10000 --SS_lib_type FR --left /pylon5/mc48nsp/xcgalaxy/main/staging//29093926/inputs/dataset_42051971.dat --right /pylon5/mc48nsp/xcgalaxy/main/staging//29093926/inputs/dataset_42051973.dat --pairs_together --PARALLEL_STATS
-prepping seqs
Converting input files. (both directions in parallel)
CMD: seqtk-trinity seq -A -R 1 /pylon5/mc48nsp/xcgalaxy/main/staging//29093926/inputs/dataset_42051971.dat >> left.fa
CMD: seqtk-trinity seq -r -A -R 2 /pylon5/mc48nsp/xcgalaxy/main/staging//29093926/inputs/dataset_42051973.dat >> right.fa
CMD finished (1164 seconds)
CMD finished (1188 seconds)
CMD: touch left.fa.ok
CMD finished (0 seconds)
CMD: touch right.fa.ok
CMD finished (0 seconds)
Done converting input files.
CMD: cat left.fa right.fa > both.fa
CMD finished (169 seconds)
CMD: touch both.fa.ok
CMD finished (0 seconds)
-kmer counting.
-------------------------------------------
----------- Jellyfish --------------------
-- (building a k-mer catalog from reads) --
-------------------------------------------
CMD: jellyfish count -t 16 -m 25 -s 103245940669 both.fa
CMD finished (2327 seconds)
CMD: jellyfish histo -t 16 -o jellyfish.K25.min2.kmers.fa.histo mer_counts.jf
CMD finished (231 seconds)
CMD: jellyfish dump -L 2 mer_counts.jf > jellyfish.K25.min2.kmers.fa
CMD finished (668 seconds)
CMD: touch jellyfish.K25.min2.kmers.fa.success
CMD finished (0 seconds)
-generating stats files
CMD: /cvmfs/main.galaxyproject.org/deps/_conda/envs/__trinity@2.9.1/opt/trinity-2.9.1/util/…//Inchworm/bin/fastaToKmerCoverageStats --reads left.fa --kmers jellyfish.K25.min2.kmers.fa --kmer_size 25 --num_threads 8 > left.fa.K25.stats
CMD: /cvmfs/main.galaxyproject.org/deps/_conda/envs/__trinity@2.9.1/opt/trinity-2.9.1/util/…//Inchworm/bin/fastaToKmerCoverageStats --reads right.fa --kmers jellyfish.K25.min2.kmers.fa --kmer_size 25 --num_threads 8 > right.fa.K25.stats
-reading Kmer occurrences…
-reading Kmer occurrences… done parsing 429493222 Kmers, 332051587 added, taking 19712 seconds. Sequence: ATTCTTCCNis smaller than 25 base pairs, skipping Sequence: TTTGTGSequence: is smaller than ATCGCCACGCGCACCGCCGis smaller than 25Sequence: base pairs, skippingCCGCAGTCGAGCGCCGis smaller than 25Sequence: CGGATACAGAGAAGGTCTGCG is smaller than 25Sequence: GCAAGCTTTGGTTCCCCTGGCC25is smaller than 25 base pairs, skipping base pairs, skippingSequence: Sequence: base pairs, skippingGACCGCCGis smaller than 25 base pairs, skipping base pairs, skipping Sequence: AGAAAGCGTCGGASequence: is smaller than 25GGGGGTTGGAGCCAAGCAGGAGTTTACGCGis smaller than is smaller than 2525 base pairs, skipping Sequence: ATGAGACAATTTCATCGCAG base pairs, skippingSequence: base pairs, skipping is smaller than GTACCGCTCTCGC25 base pairs, skippingSequence: Sequence: CACCACGGGCGGCGGAis smaller than is smaller than 25is smaller than base pairs, skippingSequence: 25TGGG is smaller than 2525 base pairs, skipping base pairs, skipping base pairs, skipping Sequence: Sequence: GCTCCATCACGGAGGGCGGCGAGGAGGGACGCGis smaller than is smaller than 25 base pairs, skipping25 Sequence: base pairs, skippingAGGAAGG is smaller than Sequence: Sequence: Sequence: 25GCTTTCCTGAAGGTGGSequence: Sequence: CGTCCATCATCCCATGCCis smaller than is smaller than 25GATAGAGAGis smaller than 25 base pairs, skipping base pairs, skippingSequence: AGAGGGACACATGGTGAGAGis smaller than 25 base pairs, skipping TACCCCAACGGTTCACGGis smaller than Sequence: ATCCTCCAGGTGGAGCCCA25 base pairs, skipping base pairs, skippingCCTCGTGCGATATCCAAAG25 base pairs, skipping is smaller than 25 base pairs, skipping is smaller than 25 base pairs, skipping Sequence: Sequence: CCTCTTTCGAATCCTCGSequence: AAATTGGGis smaller than is smaller than 2525Sequence: is smaller than 25 base pairs, skippingGCGG base pairs, skipping base pairs, skippingSequence: is smaller than 25 base pairs, skippingTTTC is smaller than 25 base pairs, skipping Sequence: TGATCCCCTCCGATCTCGis 
smaller than 25 base pairs, skipping Sequence: GATGCGis smaller than 25Sequence: CGTCTCCGCGGAGGTATGCASequence: base pairs, skippingis smaller than Sequence: CGGATTTACATTTAGGAAAGSequence: is smaller than 25CTTTCGCCGCC25Sequence: Sequence: CGGGCGAAGAAGAGGAAGA base pairs, skipping GTGTTCAAGGCGGATCTGCCCGis smaller than base pairs, skipping is smaller than 25is smaller than TTAGAGAACACCGAG25 base pairs, skipping25 Sequence: AGCGis smaller than is smaller than base pairs, skipping25 base pairs, skipping base pairs, skipping Sequence: 25AAATGCTGAAT is smaller than Sequence: GAGGTG25is smaller than 25 base pairs, skipping base pairs, skipping base pairs, skipping Sequence: Sequence: AACCTCGTCAGGTCAAGGSequence: is smaller than TGGACAAGACCAAGis smaller than Sequence: 2525is smaller than GTACATATTAAC base pairs, skipping25 base pairs, skipping base pairs, skippingis smaller than Sequence: GGACGAAGGATGAGGGACCTTATis smaller than 2525 base pairs, skipping base pairs, skipping Sequence: Sequence: ACAGTCATCACGCGTGGGCGTTAGAGCATTGAGis smaller than Sequence: 25Sequence: base pairs, skipping GTGCGCCTGATis smaller than is smaller than CGATTGTAT25is smaller than base pairs, skipping25 Sequence: CGCCTGATTCTTCGCG base pairs, skipping25 base pairs, skippingis smaller than 25 base pairs, skipping Sequence: Sequence: GAAATGGTGTTGCAAGGATGTCTTCAACCGCGis smaller than is smaller than 2525 base pairs, skipping base pairs, skippingSequence: Sequence: CTCCAGGCGCTTGCSequence: Sequence: GTGACAGGTGCACTATACCACGTAAATGACCGGGis smaller than is smaller than 2525 base pairs, skippingis smaller than Sequence: base pairs, skippingSequence: is smaller than AGTAGCATTTTCAAGCTCGG25 base pairs, skippingis smaller than Sequence: AATAAAGAAGGG25is smaller than 25 base pairs, skippingSequence: 25GTGGGCG base pairs, skippingSequence: is smaller than ATCATCCCCAAGTGAAGAC25CTTGCACCG base pairs, skipping base pairs, skipping is smaller than is smaller than 2525 base pairs, skipping base pairs, skipping Sequence: 
AGGTCAACCCCGAGSequence: GCGTCGCCCGCCGAAGAAGCis smaller than is smaller than 2525 base pairs, skipping base pairs, skippingSequence: Sequence: CACTCGCCGCTGATCTGGAGCAAGis smaller than is smaller than 25Sequence: TGTAAG25Sequence: base pairs, skippingAGCACCCC base pairs, skipping Sequence: is smaller than is smaller than ACG25 base pairs, skippingis smaller than 2525 Sequence: base pairs, skippingATCCCTACCCAC base pairs, skippingis smaller than 25 base pairs, skipping

and so on…

Any help is appreciated, and thanks in advance…


Hi @Senthilkumar_Shanmug

I found your bug report about the same error.

The job is most likely failing because it exceeds the available memory. A specific group of tools has been problematic over the last few weeks, and a banner at usegalaxy.org carries a notice about reduced memory. Trinity is part of that group.

There are short reads in your inputs that will never assemble (that is what all the warnings are about), but that isn’t the root cause of the failure. Even smaller test runs failed over the weekend. We may need to update the banner again to state that Trinity is non-functional.
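If you would still like to drop those short reads before resubmitting, one option is a length filter that keeps R1 and R2 in sync. Below is a minimal Python sketch, assuming plain (uncompressed) four-line FASTQ with both files in the same read order. The function names and the example reads are illustrative only and are not part of Trinity or Galaxy. The 25 bp cutoff simply mirrors the k-mer size shown in the log. This would only silence the "smaller than 25 base pairs" warnings; it is not expected to fix the memory failure.

```python
import io
from itertools import islice

MIN_LEN = 25  # assumption: match the k-mer size reported in the log (--kmer_size 25)


def read_fastq(handle):
    """Yield (header, sequence, separator, quality) tuples from a text handle."""
    while True:
        lines = [line.rstrip("\n") for line in islice(handle, 4)]
        if not lines:
            return
        if len(lines) < 4:
            raise ValueError("truncated FASTQ record")
        yield tuple(lines)


def filter_pairs(left_records, right_records, min_len=MIN_LEN):
    """Keep a pair only if BOTH mates are at least min_len bases long.

    Assumes the two inputs list mates in the same order; zip() stops at the
    shorter input, so files of unequal length are not detected here.
    """
    for left, right in zip(left_records, right_records):
        if len(left[1]) >= min_len and len(right[1]) >= min_len:
            yield left, right


def to_fastq(record):
    """Format one record back into four-line FASTQ text."""
    return "\n".join(record) + "\n"


if __name__ == "__main__":
    # Made-up reads for illustration: the second pair has a 4 bp left mate.
    r1 = io.StringIO(
        "@read1/1\n" + "A" * 30 + "\n+\n" + "I" * 30 + "\n"
        "@read2/1\nACGT\n+\nIIII\n"
    )
    r2 = io.StringIO(
        "@read1/2\n" + "C" * 30 + "\n+\n" + "I" * 30 + "\n"
        "@read2/2\n" + "G" * 30 + "\n+\n" + "I" * 30 + "\n"
    )
    for left, right in filter_pairs(read_fastq(r1), read_fastq(r2)):
        print(to_fastq(left), end="")
        print(to_fastq(right), end="")
```

Dropping the whole pair when either mate is too short keeps the two files the same length, which paired-end tools generally expect. Trimmomatic's `MINLEN` step (for example `MINLEN:25`) applies the same kind of cutoff during trimming, so adding it there may be simpler than a separate script.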

More details, including workarounds, are in this topic: SPADES - Remote job server indicated a problem running or monitoring this job.

Thanks for reporting the problem!