Funannotate failing in various ways

dairon · June 21, 2023, 9:57am

Hello! While I was working with Funannotate, it failed in several different ways without completing its task:

The Arthropoda 2023 database does not work:

[Jun 21 10:21 AM]: OS: Rocky Linux 8.6, 125 cores, ~ 438 GB RAM. Python: 3.8.10
[Jun 21 10:21 AM]: Running funannotate v1.8.9
[Jun 21 10:21 AM]: ERROR: arthropoda busco database is not found, install with funannotate setup -b arthropoda

While using the 2022 database:

Names are too long:

[Jun 21 11:23 AM]: OS: Rocky Linux 8.6, 125 cores, ~ 438 GB RAM. Python: 3.8.10
[Jun 21 11:23 AM]: Running funannotate v1.8.9
[Jun 21 11:23 AM]: GeneMark not found and $GENEMARK_PATH environmental variable missing. Will skip GeneMark ab-initio prediction.
[Jun 21 11:23 AM]: Skipping CodingQuarry as no --rna_bam passed
[Jun 21 11:23 AM]: Parsed training data, run ab-initio gene predictors as follows:
Program Training-Method
augustus busco
glimmerhmm busco
snap busco
[Jun 21 11:24 AM]: Genome assembly error: headers contain more characters than the max (16), reformat headers to continue.
[Jun 21 11:24 AM]: First 5 headers that failed names:
JAAFCF010006596.1
JAAFCF010006680.1
JAAFCF010005931.1
JAAFCF010005748.1
JAAFCF010003739.1

When names are just fine:

[Jun 21 11:25 AM]: OS: Rocky Linux 8.6, 125 cores, ~ 438 GB RAM. Python: 3.8.10
[Jun 21 11:25 AM]: Running funannotate v1.8.9
[Jun 21 11:25 AM]: Can’t find Repeat Database at /data/db/data_managers/funannotate/2022-01-17-193541/repeats.dmnd, you may need to re-run funannotate setup

In conclusion, the tool seems to fail no matter which configuration is selected. Thanks for your attention, and sorry for the inconvenience!

jennaj · June 21, 2023, 6:51pm

Hi @dairon

Yes, the 2023 database is known to be a problem. Ref: problem with funannotate - #3 by jennaj

The 2022 database had been working for others … but maybe there is still some problem!

We’ll need to wait for the indexes to be fixed for your use case. All indexes are undergoing a reorganization this summer, so in a few months this will be sorted out. Thanks for letting us know about the problems!

wm75 · June 21, 2023, 7:43pm

@dairon Jen’s remark about the 2023 database is, unfortunately, accurate.
Regarding your second issue, though, the error message itself is accurate.
In funannotate FASTA sequence names cannot be longer than 16 characters (see Preparing your Assembly — Funannotate 1.8.14 documentation) and if I’m counting right, the mentioned ones have 17 characters.
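To see which names exceed the limit before submitting a job, a check along these lines could be used. This is only a minimal sketch: the function name `too_long_headers` is made up for illustration, and it assumes the limit applies to the first whitespace-delimited token of each header line, which may not match exactly how funannotate counts.

```python
MAX_HEADER_LEN = 16  # limit quoted in the funannotate error message above


def too_long_headers(fasta_lines, max_len=MAX_HEADER_LEN):
    """Return the sequence names that are longer than max_len characters.

    Assumption: the "name" is the first whitespace-delimited token after '>'.
    """
    names = []
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            if len(name) > max_len:
                names.append(name)
    return names


# Small in-memory example using one of the names from the log.
example = [
    ">JAAFCF010006596.1 some description",
    "ACGT",
    ">contig_1",
    "ACGT",
]
print(too_long_headers(example))
```

For a real assembly, the same function can be fed an open file handle instead of the example list.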

dairon · June 22, 2023, 7:46am

Hi there! Thanks for the heads up.

I understand the first two errors, regarding the 2023 database and the name issue, but is there a workaround for the last error message? As it stands, I am not able to use the tool no matter the configuration. Thanks again!

wm75 · June 22, 2023, 4:31pm

Sure, you can shorten the sequence identifiers with one of the Replace Text tools Galaxy has. You could, for example, drop the first few characters and add them back later in the result.
How exactly you should shorten them depends on your data, since you want to keep the IDs unique.
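Outside of Galaxy, the same idea could be sketched in a few lines of Python. This is only an illustration under assumptions: the helper name `shorten_headers` and the prefix `JAAFCF01` are hypothetical choices based on the names in the log above, and the sketch drops any description text after the name. It keeps a mapping so the original names can be restored in the results, and it stops if the shortened names would collide.

```python
def shorten_headers(fasta_lines, prefix, max_len=16):
    """Strip `prefix` from each FASTA header name.

    Returns (new_lines, mapping), where mapping goes from the shortened
    name back to the original one. Header descriptions are dropped.
    """
    mapping = {}
    out = []
    for line in fasta_lines:
        if not line.startswith(">"):
            out.append(line)  # sequence lines pass through unchanged
            continue
        parts = line[1:].split()
        original = parts[0] if parts else ""
        if original.startswith(prefix):
            short = original[len(prefix):]
        else:
            short = original
        if not short or len(short) > max_len:
            raise ValueError(f"cannot shorten {original!r} to 1-{max_len} characters")
        if short in mapping:
            raise ValueError(f"shortened name {short!r} is not unique")
        mapping[short] = original
        out.append(">" + short)
    return out, mapping


# Example with two of the names from the log; the prefix is an assumption.
records = [">JAAFCF010006596.1", "ACGT", ">JAAFCF010006680.1", "GGCC"]
new_records, name_map = shorten_headers(records, prefix="JAAFCF01")
print(new_records)
print(name_map)
```

Whether a shared prefix exists, and how long it is, depends on the assembly, so it is worth checking all headers first rather than only the five shown in the error message.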
