Hello,
I have started a GTF2GFF conversion job using GTF data created by Braker3 on Galaxy.
This job has now been running for ca. 24 hrs but has not completed.
Before cancelling the job, is there any way to find out why it has not completed?
UUID 3aec77c2-c6d0-43d2-a203-a6580bc33481
Standard output and standard error are empty.
@jennaj
Correct! But a GTF to GFF3 conversion job should complete within seconds and not run for 4 days. Even if my job is somewhere in the queue, I would expect it to complete within a reasonable time.
Would you mind looking at the job itself or forwarding this issue to a sysadmin?
Sure. You can share the history with the job that is currently running and I'll check. See the banner at the top of the forum for the how-to, or see How to get faster help with your question. This will need a history share link, not just screenshots.
If you don't mind sharing your braker3 file (via a GitHub issue at Issues · NBISweden/AGAT · GitHub, or by mail), I can take a look at how AGAT performs on its own (outside of Galaxy).
It looks like standardizing the fasta file was enough. Or did I not understand the data?
Some tools assume that everything on the > lines is the "identifier". Removing everything after the first whitespace can be done with NormalizeFasta to adjust the data directly in Galaxy next time. It is always worth trying that when an error comes up: different tools, different authors, different data assumptions. Sometimes the Galaxy wrapper around a tool can do the adjustment, but that is not always possible.
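For anyone who wants to do the same header cleanup outside of Galaxy, here is a minimal Python sketch of the idea: keep only the first "word" on each > line and leave the sequence lines alone. The function name `strip_fasta_descriptions` and the example records are made up for illustration. This is not how NormalizeFasta is implemented, just the same kind of adjustment done by hand.

```python
import io


def strip_fasta_descriptions(lines):
    """Yield FASTA lines with each header cut at the first whitespace.

    Hypothetical helper for illustration only: header lines (starting
    with '>') keep just the identifier; all other lines pass through.
    """
    for line in lines:
        if line.startswith(">"):
            # Split once on any whitespace (space or tab) after the '>'.
            parts = line[1:].split(None, 1)
            identifier = parts[0] if parts else ""
            yield ">" + identifier + "\n"
        else:
            # Sequence line: keep the content, normalize the line ending.
            yield line.rstrip("\r\n") + "\n"


# Made-up example records, not data from this thread.
example = io.StringIO(
    ">seq1 Homo sapiens chromosome 1\n"
    "ACGT\n"
    ">seq2\tsome description\n"
    "GGCC\n"
)
cleaned = "".join(strip_fasta_descriptions(example))
print(cleaned)
```

To clean a real file, the same function can be fed an open file handle and the result written to a new file. Keep the original around in case a downstream tool does need the descriptions.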
I guess you could report this to the author at that same repository as an enhancement request for some future release. Considering just the first "word" on the fasta title line as the sequence identifier is a pretty universal standard, and data providers tend to put descriptions there (on purpose). And if it turns out that it is not the underlying tool's requirement, but related to how the tool was wrapped for Galaxy, they could help us to adjust it. Would make the tool friendlier.
XRef for others that run into something similar: this tool is not unique in its format assumption. Some tools can parse the identifiers out of a > line fine, some cannot. Standardizing doesn't hurt unless a tool is actually using the description content, and you'll probably recognize the few cases when that is needed. See the FAQ: Working with very large fasta datasets
It was the AGAT gtf2gff job that did not complete; the Braker3 job was OK after removing the descriptions from the header lines.
I could confirm the AGAT developers' results: the gtf2gff conversion works on a standalone AGAT.
Best,