can't get trinity to assemble - "problematic input choices"

I am unable to get trinity to complete assembly of even simple data sets. I fetch an SRA file from NCBI, run trimmomatic, and wind up with R1-paired and R2-paired files (8.5 GB each), which I use as inputs to trinity, and I get errors every time. I assembled many transcriptomes with a similar trinity workflow last year with no problems. I started to work with some new species just last month, and nothing completes. I went back to files that worked last year, and they all crash too. I tried every permutation I could think of for the inputs, including concatenating the files and making them a collection, and they all crash. I cannot get it to complete on anything.

I’m sure there’s been a new version of trinity since last year, so I am hopeful that there is some parameter or input format that is new, a simple fix, or perhaps I need to somehow reformat my data. But I am completely stumped - would appreciate any help, or any test files that I could try.


Hi @grakster. I have the same problem with two different data sets. One of these is from an SRA file from NCBI with R1-paired and R2-paired files, and the other one is an example dataset from a tutorial. In both cases Trimmomatic runs without problems. But when I run Trinity (Galaxy Version 2.9.1 or 2.8.5) on the Trimmomatic paired files, the jobs stop with the error: ‘This tool could not be run because of a misconfiguration in the Galaxy job running system, please report this error’.

Did you resolve the problem? Any help would be appreciated.


Sorry, I have no solution for this. I’ve even tried the fastq groomer on the trimmomatic files to try and catch any format errors that might cause problems, but no luck. I have also had increasing problems with other tools not working - even something as simple as a small FASTA file that I try makeblastdb on, and it crashes red. Even old files that have worked before. Your error message definitely sounds like an internal issue, as does mine, but I don’t know how it can be solved from our end.


@Carlos_Aguirre @grakster

Would you each please send in a bug report from your Trinity error jobs? Include a link to this post in the bug report comments, then write back here so we know when to look for it.

Note: Any job that failed within the window of 8-14 hours ago was likely affected by a server-side network issue, so try a rerun first. If it fails again, then send in the bug report. If you leave the inputs/outputs undeleted, we can provide more feedback.

Assembly jobs with paired-end inputs over about 6GB each could be failing for resource reasons (the job is too large to execute on public servers) but we can help to determine if that is a factor or not. Tutorial data should not fail, but we can sort that out too.

Thanks for reporting the issues!

The problem persists with the test dataset. I created an error report.
In the other case (my real samples), the problem could be related to the data set size (paired-end inputs of 8.5 GB each, R1 and R2). In that case, is there any possibility of analyzing it on a different server? Any recommendation?

Thanks @jennaj


@grakster For your assembly, the issue is related to inputting a collection to Trinity. This is a known issue and we are working to resolve that again today. When this ticket closes, that particular issue will be resolved: https://github.com/galaxyproject/usegalaxy-playbook/issues/308

@Carlos_Aguirre Your data inputs are single F/R inputs that appear to be running out of resources during job execution. Try downsampling your data if you want to continue working at usegalaxy.org. One tool choice is Seqtk seq. Update: Data quality issues found with FastQC that need to be addressed.
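
For anyone who wants to see what downsampling paired reads involves outside of Galaxy, here is a minimal Python sketch. It is not how Seqtk or any Galaxy tool is implemented; it only illustrates the key point that the same reads must be kept from both the forward and reverse files so the pairs stay in sync. The function names, the default seed, and the file names in the usage note are all hypothetical, and it assumes plain four-line FASTQ records with R1 and R2 in the same order.

```python
import gzip
import random
from itertools import islice, zip_longest


def read_fastq(handle):
    """Yield FASTQ records as 4-line tuples (header, sequence, plus, quality)."""
    while True:
        record = tuple(islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def open_text(path, mode="rt"):
    """Open a plain or gzip-compressed file in text mode (chosen by file extension)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)


def downsample_pairs(r1_in, r2_in, r1_out, r2_out, fraction, seed=42):
    """Keep roughly `fraction` of the read pairs, making one decision per pair.

    Returns the number of pairs written. Assumes R1 and R2 list the
    same reads in the same order (hypothetical helper, not a Galaxy API).
    """
    rng = random.Random(seed)
    kept = 0
    with open_text(r1_in) as f1, open_text(r2_in) as f2, \
            open_text(r1_out, "wt") as o1, open_text(r2_out, "wt") as o2:
        for rec1, rec2 in zip_longest(read_fastq(f1), read_fastq(f2)):
            if rec1 is None or rec2 is None:
                raise ValueError("R1 and R2 contain different numbers of reads")
            # A single random draw decides the fate of both mates.
            if rng.random() < fraction:
                o1.writelines(rec1)
                o2.writelines(rec2)
                kept += 1
    return kept
```

A call such as `downsample_pairs("R1_paired.fastq", "R2_paired.fastq", "R1_sub.fastq", "R2_sub.fastq", fraction=0.25)` would write about a quarter of the pairs. Which fraction is small enough for a job to fit on a public server is something to find by trial, not something this sketch can tell you.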


For full large assemblies and more resources, setting up a cloud Galaxy is the alternative. If a job runs out of resources at usegalaxy.org, it would also likely run out of resources at other usegalaxy.* servers. Note: If the data has quality problems, those need to be addressed whether working within Galaxy or not.

Choices matrix: https://galaxyproject.org/choices/
Ways to use Galaxy: https://galaxyproject.org/use/

The GVL version of Cloudman is a common choice for large work/simple administration/preconfigured yet customizable personal Galaxy servers. Non-technical researchers choose that option every day. AWS has always offered grants, but the options were expanded recently. https://aws.amazon.com/grants/