No Reference GFF file available in ClosestBed tool

Hi,

I was trying to match my ChIP-seq peaks with their nearest genes, for which I thought I’d use the ClosestBed tool. However, there are no built-in GFF files available. Could you please add one for mouse?

Many thanks!

Hi,

At least for the near term, there will be no built-in reference annotation added to this tool. Instead, locate and upload an annotation source and use it “from the history”.

Mouse GTFs can be obtained from Gencode and iGenomes. These versions of the annotation will have the most utility across tools.

This prior Q&A is about human (hg38 + hg19) but the same sources/formatting advice applies for mouse (mm10 + mm9):

Thanks for your reply. I have downloaded one from iGenomes (ftp://igenome:G3nom3s4u@ussd-ftp.illumina.com/Mus_musculus/Ensembl/GRCm38/Mus_musculus_Ensembl_GRCm38.tar.gz) and tried to launch the tool. I’m getting an error:
Fatal error: Exit code 1 ()
Error: Unable to determine type for file /galaxy-repl/main/files/030/235/dataset_30235478.dat

(The file name doesn’t correspond to the name of the genome file, but it must be that one, since my peak BED file is fine.)

Could you please suggest what to do with that?
Many thanks

iGenomes: Do not upload the entire “tar.gz” archive to Galaxy; upload just the genes.gtf dataset included in it. Also, it looks like you picked the Ensembl version of the mouse genome. The annotation’s genome/build must match the genome/build that you originally mapped against. At Galaxy Main, this would be UCSC’s mm9 (GRCm37) or mm10 (GRCm38).
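
If you prefer to script the extraction locally instead of unpacking the whole archive, here is a minimal Python sketch. It assumes the archive is a gzip-compressed tar and that the annotation is stored under a path ending in genes.gtf; the function name `extract_gtf` and its parameters are made up for illustration, and the directory layout inside the archive is not confirmed here.

```python
import shutil
import tarfile
from pathlib import Path


def extract_gtf(archive_path, out_dir, member_suffix="genes.gtf"):
    """Extract the first regular file whose name ends with member_suffix.

    Returns the path of the extracted copy, or None if nothing matches.
    The suffix "genes.gtf" is an assumption about the archive layout.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:
            if member.isfile() and member.name.endswith(member_suffix):
                target = out_dir / Path(member.name).name
                # Copy only this member; the rest of the archive stays packed.
                with archive.extractfile(member) as source, open(target, "wb") as handle:
                    shutil.copyfileobj(source, handle)
                return target
    return None
```

Calling `extract_gtf("your_archive.tar.gz", "annotation")` would leave a single genes.gtf file in the annotation folder, which is the only file that needs to go to Galaxy.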

If the correct annotation is not picked, other problems can come up: FAQ
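
One quick way to spot a build mismatch before running the tool is to compare the chromosome names in the two files. Ensembl annotation typically names chromosomes "1", "2", ... while UCSC builds use "chr1", "chr2", ..., so a peak file mapped against mm10 and an Ensembl GTF will usually share no names at all. The sketch below is only a rough check under that assumption; the function names are hypothetical and matching names alone do not prove that the builds match.

```python
def chromosome_names(lines):
    """Collect first-column values from tab-separated lines, skipping headers."""
    names = set()
    for line in lines:
        # Skip blank lines, GTF comments, and BED track/browser lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def compare_chromosomes(peak_lines, annotation_lines):
    """Report which chromosome names are shared or unique to each input."""
    peaks = chromosome_names(peak_lines)
    annotation = chromosome_names(annotation_lines)
    return {
        "shared": peaks & annotation,
        "peaks_only": peaks - annotation,
        "annotation_only": annotation - peaks,
    }
```

An empty "shared" set is a strong hint that the annotation uses a different naming scheme than the peaks.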

Gencode: The GTF data can be directly imported with Upload but will need some formatting standardization afterwards. The data here is formatted with UCSC genome build identifiers. The most current release will be mm10 (GRCm38) and the prior release is mm9 (GRCm37).
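
Part of that standardization is removing the header lines at the top of the file. This can be done with tools inside Galaxy; for anyone preparing the file locally instead, here is a small Python sketch of the same idea. It assumes the header lines start with "#" and that every other non-blank line is a data line; `strip_gtf_headers` and `clean_gtf_file` are illustrative names, not part of any Galaxy or Gencode tooling.

```python
def strip_gtf_headers(lines):
    """Yield data lines only, dropping '#' comment/header lines and blanks."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        yield line


def clean_gtf_file(source_path, target_path):
    """Write a header-free copy of source_path; return the number of lines kept."""
    kept = 0
    with open(source_path, encoding="utf-8") as source, \
            open(target_path, "w", encoding="utf-8") as target:
        for line in strip_gtf_headers(source):
            target.write(line)
            kept += 1
    return kept
```

After uploading the cleaned file, the datatype still needs to be set to gtf in Galaxy.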

Both will work, with Gencode sometimes easier for people to get (it does not involve downloading, uncompressing, etc. locally on your own computer).

Hi Jen,
It’s been a while but I eventually came back to the same problem… I started from scratch and downloaded the mm10 annotation from Gencode, successfully removed the headers and changed attributes to GTF. Then I launched the ClosestBed tool with my MACS2 peaks and received the following error (see attached). Could you please suggest what could be wrong? Thank you.

Hi @Sheryss, sorry to hear that you ran into more problems.

The error message suggests that duplicated input datasets were entered on the tool form. This is usually not intentional, just a simple mistake that can be fixed.

Click on the job details button for the error dataset. On that report, review the input selections chosen for that specific job execution and find out if there was a duplicated dataset entered, then fix it when rerunning.

The sort order portion of the error … I’m not sure about. Tools can error in all sorts of odd ways when there is a problem with inputs. Common causes are dataset content, duplications, or a selected input that is mismatched with the intended input entry area on the tool form. I suspect the latter in your case: maybe the reference GTF was entered twice? Or in the wrong place?
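
Separately, it may be worth ruling out the sort order itself. The bedtools documentation for closest asks for inputs sorted by chromosome and then by start position, so an unsorted peak or GTF file could plausibly produce a message like this. That is not confirmed as the cause here. The Python sketch below is one way to check a file locally; `first_unsorted_line` is a hypothetical helper, it compares chromosome names as plain strings (similar to `sort -k1,1 -k2,2n` in the C locale), and the column numbers assume BED (start in column 2) or GTF (start in column 4).

```python
def first_unsorted_line(lines, chrom_col=0, start_col=1):
    """Return the 1-based number of the first out-of-order line, or None.

    Defaults fit BED (chrom in column 1, start in column 2).
    For GTF, pass start_col=3 (start is the fourth column).
    """
    previous = None
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Chromosomes compare as strings, start positions as integers.
        key = (fields[chrom_col], int(fields[start_col]))
        if previous is not None and key < previous:
            return number
        previous = key
    return None
```

If the function reports a line number, sorting that dataset by chromosome and start before rerunning is a reasonable next step.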

This is how to input your data:

If that doesn’t solve the problem, please write back and we can troubleshoot from there. Include a screenshot of the job details page. To protect your data privacy, include just the portion of the report about the inputs/parameters; other parts of the report are not needed yet and can be shared privately if needed later (direct message to me here, or sent in as a bug report with a link to this post).

If you have already sent in the bug report: is the email address you use here at Galaxy Help the same as your registered account email address at Galaxy Main https://usegalaxy.org? If it is different, you don’t need to post your email publicly (and never share your password with anyone!). Instead, send it to me in a direct message so I can find it and associate/consolidate the two issue reports. I’m an admin at both places, so the Galaxy Main account email address is enough for me to review problems at the server in detail, and privately.

All data/solutions will be anonymized for public posts, to help others that may run into the same usage problem in the future. Private portions of data/solutions (as needed) will be discussed in a direct message and/or email.

How to direct message others at Galaxy Help. My user name is @jennaj.

Hi @jennaj
Thank you so much for your quick reply. I double checked that I’m not submitting duplicated inputs, and unfortunately the tool is still not working. Just submitted an error report from the account linked to the same email address as I use here – would appreciate any suggestions! Thanks again.

Great, I’ll review