Which server did you run `Convert GTF to BED12` on? That is a server-side dependency problem.
Update: I just ran the `Convert GTF to BED12` tool successfully at Galaxy EU https://usegalaxy.eu. It isn’t available at Galaxy Main https://usegalaxy.org (but probably should be; I’ll make a request to add it). Converting to BED6 is possible at both, but you need a full BED12 for this particular operation.
If your error was at usegalaxy.eu, try a rerun. There may have been a transient cluster issue that kept the dependency from being found at runtime.
Update2: There are many ways to compare coordinates between files, find overlaps, then report, reformat, summarize, etc.
For example, the first genome you mentioned has annotation at NCBI. One of the formats available is “Tabular” (here: https://www.ncbi.nlm.nih.gov/genome/proteins/17?genome_assembly_id=360969). That data could be loaded into Galaxy and converted into an interval format (or the more stringent BED format).
This would involve a few steps: reformatting the chromosome names (probably; it depends on what these are in your peaks file), subtracting “1” from the start coordinate (start coordinates are 0-based in interval/BED files but 1-based in the NCBI file), rearranging/restricting columns of data (needed for BED format; for interval it wouldn’t matter), then assigning the proper datatype at the end.
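If it helps to see those steps outside of Galaxy, here is a rough Python sketch of the same manipulation. It is only an illustration under stated assumptions: the column names (`accession`, `start`, `stop`, `strand`, `locus_tag`), the example row, and the chromosome name mapping are all made up, so check the header of the tabular file you actually download and the names used in your peaks file.

```python
import csv
import io


def tabular_to_bed6(rows, chrom_map=None):
    """Turn 1-based, inclusive annotation rows into 0-based, half-open BED6 records.

    `rows` is an iterable of dicts. The keys used here are assumed column
    names for illustration, not a guaranteed NCBI header.
    """
    chrom_map = chrom_map or {}
    records = []
    for row in rows:
        # Rename the sequence so it matches the names in the peaks file.
        chrom = chrom_map.get(row["accession"], row["accession"])
        # Only the start shifts: 1-based inclusive -> 0-based half-open.
        start = int(row["start"]) - 1
        end = int(row["stop"])
        # BED6 column order: chrom, start, end, name, score, strand.
        records.append((chrom, start, end, row["locus_tag"], 0, row["strand"]))
    return records


# Hypothetical one-gene example, not real annotation data.
example = (
    "accession\tstart\tstop\tstrand\tlocus_tag\n"
    "NC_000001.1\t101\t200\t+\tgeneA\n"
)
reader = csv.DictReader(io.StringIO(example), delimiter="\t")
bed = tabular_to_bed6(reader, chrom_map={"NC_000001.1": "chr1"})
for record in bed:
    print("\t".join(str(field) for field in record))
```

In Galaxy itself, each of these would be a separate text manipulation tool step, with the datatype assigned on the final dataset.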
Much of what this tutorial describes is how to put data into compatible formats so that genomic coordinates can be compared accurately. It uses tools under the top-level tool grouping “GENERAL TEXT TOOLS”. The manipulations in the tutorial are specific to those particular files/datatypes, but the reason it is in the “Introduction” topic section, and contains so many manipulations, is to help people get familiar with some of those tools and with manipulating data in general. Many of these tools are analogs of command-line utilities.
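To show why the formatting matters, here is a minimal sketch of the comparison itself, assuming both files already use 0-based, half-open coordinates. The function names and the peak/gene values are hypothetical. This is not how any particular Galaxy tool is implemented, just the underlying idea.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two 0-based, half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def genes_overlapping_peaks(peaks, genes):
    """Report (chrom, peak_start, peak_end, gene_name) for every overlap.

    peaks: iterable of (chrom, start, end)
    genes: iterable of (chrom, start, end, name)
    A simple nested loop, fine for a small example but slow for large files.
    """
    hits = []
    for p_chrom, p_start, p_end in peaks:
        for g_chrom, g_start, g_end, g_name in genes:
            # Chromosome names must match exactly, or nothing is reported.
            if p_chrom == g_chrom and overlaps(p_start, p_end, g_start, g_end):
                hits.append((p_chrom, p_start, p_end, g_name))
    return hits


# Made-up data for illustration.
genes = [("chr1", 100, 200, "geneA")]
peaks = [
    ("chr1", 150, 180),  # inside geneA
    ("chr1", 200, 250),  # starts where geneA ends: adjacent, no overlap
    ("1", 150, 180),     # same coordinates, but the name does not match
]
hits = genes_overlapping_peaks(peaks, genes)
print(hits)
```

Only the first peak is reported. The third one is the common failure case: the coordinates are right, but “1” and “chr1” are different identifiers, so nothing matches.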
Some of this is explained in “Part2” of the tutorial. Biomart doesn’t have annotation for your two particular genomes, but NCBI does. If you are confused about what the dataset (file) formats should look like, how to change metadata, or why primary keys like “chromosome” names need to match up, these FAQs should help:
Please try to reformat the annotation yourself. It is important to learn how to do this, and that will take some trial and error. But if you get completely stuck, write back and we can help more. I might ask for a history share link (can be sent privately). Keep the history as small as possible (just this analysis) and make sure it contains your peak file and the gene annotation files you have been working with (should include the original GFF3 and the NCBI tabular annotation plus your attempts to manipulate those). It can just be for one of the genomes. I’m assuming that both have the same format for the peak data, so whatever solution works for one will work for the other.
If your peak data is from a public source, or if you don’t mind making it public (at least in part/some subset), we could work out a clean solution then post all of that back here, so others can learn from the example. Or, we could just post back the steps to manipulate the NCBI tabular annotation into an interval dataset (simple history + workflow).