Tools to get precise annotated genomic regions for bedgraph input data

Please help me find tools to get precise annotated genomic regions (intron, exon, 5’ or 3’ UTR, promoter) for the mouse genome. I have bedGraph files with genomic coordinates, and I would like to use those coordinates to find which annotated region of a gene each interval overlaps. I have tried the Data Integrator, but it returns far more information about each gene than I need. I appreciate your help.

Welcome, @yaklich1904!

Try these two tools:

  • Gene BED To Exon/Intron/Codon BED expander
  • bedtools AnnotateBed

For the first tool, you’ll need to provide a BED12 reference annotation dataset based on the same mouse genome build that your bedGraph data is based on. UCSC might be a good source for this use case. The idea is to create one BED dataset for each feature type of interest.
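To illustrate what this expansion step does, here is a minimal Python sketch. It assumes standard 12-column BED input (tab-separated, with blockSizes and blockStarts in columns 11 and 12). The function name `bed12_to_exons_introns` is made up for this example; it is not the Galaxy tool's code, and it does not split out UTRs or codons.

```python
def bed12_to_exons_introns(line):
    """Split one BED12 record into exon and intron intervals.

    Hypothetical helper for illustration only. Returns two lists of
    (chrom, start, end, name, strand) tuples in 0-based, half-open
    BED coordinates.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, chrom_start, name, strand = fields[0], int(fields[1]), fields[3], fields[5]

    # Columns 11 and 12 are comma-separated lists, often with a trailing comma.
    block_sizes = [int(x) for x in fields[10].split(",") if x]
    block_starts = [int(x) for x in fields[11].split(",") if x]

    # Each block (exon) start is an offset relative to chromStart.
    exons = [
        (chrom, chrom_start + offset, chrom_start + offset + size, name, strand)
        for offset, size in zip(block_starts, block_sizes)
    ]

    # Introns are the gaps between consecutive exons of the same transcript.
    introns = [
        (chrom, left[2], right[1], name, strand)
        for left, right in zip(exons, exons[1:])
        if left[2] < right[1]
    ]
    return exons, introns


# Example record with three exons (the gene name and coordinates are invented).
record = "chr1\t1000\t5000\tGeneA\t0\t+\t1200\t4800\t0\t3\t500,400,1000,\t0,1500,3000,"
exons, introns = bed12_to_exons_introns(record)
```

Writing the exon list and the intron list to separate files gives the per-feature BED datasets described above.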

For the second tool, enter your bedGraph data first (the query), then input the per-feature BED datasets created by the first tool (the targets). This tool calculates the coverage of each query region by each target dataset.
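As a rough picture of the coverage calculation, the Python sketch below computes the fraction of one query interval covered by a set of feature intervals. It is an illustration under assumptions, not the bedtools implementation: the function name `covered_fraction` is hypothetical, intervals are plain (chrom, start, end) tuples in BED coordinates, and strand is ignored.

```python
def covered_fraction(query, features):
    """Fraction of the query interval covered by at least one feature.

    Hypothetical helper for illustration only. `query` is a
    (chrom, start, end) tuple and `features` is an iterable of the same.
    """
    chrom, q_start, q_end = query
    if q_end <= q_start:
        return 0.0

    # Keep only overlapping features on the same chromosome,
    # clipped to the query boundaries and sorted by start.
    clipped = sorted(
        (max(start, q_start), min(end, q_end))
        for f_chrom, start, end in features
        if f_chrom == chrom and start < q_end and end > q_start
    )

    # Sweep left to right so overlapping features are counted once.
    covered = 0
    current_end = q_start
    for start, end in clipped:
        if end > current_end:
            covered += end - max(start, current_end)
            current_end = end
    return covered / (q_end - q_start)


# Invented example: one bedGraph region and a few exon intervals.
region = ("chr1", 100, 200)
exon_features = [("chr1", 50, 120), ("chr1", 110, 150), ("chr1", 300, 400), ("chr2", 100, 200)]
fraction = covered_fraction(region, exon_features)
```

Running this once per feature dataset (exons, introns, UTRs, and so on) gives one coverage column per feature type for each query region.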

If you need more detailed annotation per query bedGraph region (not just a summary of feature coverage), review the tools in this group:

  • Operate on Genomic Intervals

This group contains many tools that intersect and compare BED datasets in different ways.

Hope that helps!

After thinking about this some more, I think a tool that does specifically what you are asking for, in one step, is something we should consider creating. I made the request here, if you are interested. Please feel free to comment in the issue ticket to show interest, or to clarify or add other functions you think would be helpful to include on the tool form. Others reading are also welcome to comment.

Update:

Please try the tool ChIPseeker, hosted at Galaxy EU https://usegalaxy.eu. It appears to produce the kind of output you want. There is discussion at the ticket linked above, including a test history with example usage. I also linked a ticket asking whether installing the tool at Galaxy Main https://usegalaxy.org would be appropriate (that is under review and wouldn’t happen immediately).

Thank you so much. I will try it.

Hi @jennaj,
Can we fully annotate a genome (promoter, UTR, intron, exon, etc.) using the following data: raw genome sequence data, RNA-seq, and a transcriptome? No reference sequence is available. Does Galaxy have any tool that works with this data? Even if it takes a combination of tools, I would like to know the detailed process.

Can you please help me with this?

Thank you
YKV