"these should be of equal length" when inputting collections to Unicycler

I am running Unicycler and get the message below when I try to run a job with 9 paired files for my paired-end reads and 3 files with PacBio data.

“The server could not complete the request. Please contact the Galaxy Team if this error persists. Received 9 inputs for ‘paired_unpaired|fastq_input1’ and 3 inputs for ‘long’, these should be of equal length”

The answer was available in the SARS-CoV-2 genome assembly tutorial. The Unicycler tool has no way to merge datasets, so that needs to be done beforehand with the “Collapse Collection” tool. Otherwise Unicycler will try to run a batch job in which the first dataset of the forward, reverse and long-read collections produces assembly 1, the second dataset of each produces assembly 2, and so on, which is why the collections must be of equal length. I should have realized this from the text below the collection selection box in the Unicycler tool interface, which clearly states that “This is a batch mode input field”, but for some reason I did not think of it when looking at the problem. I am posting this because I am sure other people will make the same mistake from time to time.
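
For anyone who wants to see why the lengths have to match, here is a minimal Python sketch of how a batch-mode input pairs datasets by position. It is not Galaxy's actual code; the function name `plan_batch_jobs` and the dataset labels are made up for illustration.

```python
def plan_batch_jobs(forward, reverse, long_reads):
    """Pair the i-th dataset of each collection into job i (hypothetical helper).

    Mimics batch mode: one assembly per position, so every collection
    must contain the same number of datasets.
    """
    sizes = (len(forward), len(reverse), len(long_reads))
    if len(set(sizes)) != 1:
        raise ValueError(
            f"Received {sizes[0]} forward, {sizes[1]} reverse and "
            f"{sizes[2]} long inputs, these should be of equal length"
        )
    return list(zip(forward, reverse, long_reads))


# 9 short-read pairs and 3 long-read files: positions cannot be matched up.
forward = [f"fwd_{i}" for i in range(1, 10)]
reverse = [f"rev_{i}" for i in range(1, 10)]
long_reads = [f"pacbio_{i}" for i in range(1, 4)]
try:
    plan_batch_jobs(forward, reverse, long_reads)
except ValueError as err:
    print(err)

# After collapsing each collection to a single dataset there is exactly one job.
print(plan_batch_jobs(["fwd_all"], ["rev_all"], ["pacbio_all"]))
```

With each collection collapsed to one dataset, the three inputs line up again and a single combined assembly is run.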

To merge reads from several samples into a combined final assembly, we need to pass the data to Unicycler tool in partially merged form. The forward and reverse reads of paired-end data should be kept separate, and so should short and long reads. However, the tool has no option to combine data from individual samples, so we need to merge the forward, reverse, and the long reads data, respectively, across samples. Conveniently for us, the outputs of the earlier Samtools fastx tool runs have already returned the data structured into three corresponding collections for us.
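
Outside Galaxy, the merge described above amounts to concatenating the files of each collection. The sketch below assumes uncompressed FASTQ files that end with a newline; `collapse` and the file names are hypothetical and this is not the implementation of the “Collapse Collection” tool. The forward and reverse files have to be concatenated in the same sample order so that read pairs stay in sync.

```python
import shutil
import tempfile
from pathlib import Path


def collapse(paths, out_path):
    """Concatenate the given files, in the given order, into out_path."""
    with open(out_path, "wb") as out:
        for path in paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(out_path)


# Tiny demonstration with two made-up single-read FASTQ files.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    sample_a = tmp / "sampleA_forward.fastq"
    sample_b = tmp / "sampleB_forward.fastq"
    sample_a.write_text("@readA\nACGT\n+\nIIII\n")
    sample_b.write_text("@readB\nTTGA\n+\nIIII\n")

    merged = collapse([sample_a, sample_b], tmp / "all_forward.fastq")
    print(merged.read_text())
```

The same call would be repeated once for the reverse reads and once for the long reads, giving the three single datasets that Unicycler expects.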

Thanks @TKlingstrom for posting back your solution.

For others reading, Galaxy’s SARS-CoV-2 projects and related resources can be found here:

https://covid19.galaxyproject.org/