Convert Galaxy output HiC file to .hic file for use on UCSC Genome Browser

I have been working through the HiCExplorer hands-on tutorial in anticipation of developing a file that could be used on the UCSC Genome Browser. (I was given to understand that this would be possible.) However, if I understand correctly, the end file from HiCExplorer will be .h5. I can’t find any tool to convert a file from .h5 to .hic, which is the format usable on the UCSC Genome Browser. Hoping for some clarification. Thanks!

Hi @acta1

I haven’t done this before, so let’s start at the beginning and figure it out. We can pool in more help as needed.

As a baseline, I just started a fresh run of the hands-on tutorial "Hi-C analysis of Drosophila melanogaster cells using HiCExplorer" (Epigenetics) in this shared history: https://usegalaxy.eu/u/jenj/h/galaxy-workflow-galaxy-hi-c-1.

That will take some time to process :slight_smile:

Hi, Jennifer,
Thank you for your response. I have now found tools to convert from cool to hic, and from h5 to cool. I am going to see how far I get with that. I plan to try it with the .h5 file that I got as an endpoint in the tutorial. This may take me a while, but I will get back to you.

Update: the dm3 index for BWA-MEM has a technical problem on the server. Using dm6 instead for now is one alternative. Using a custom reference genome is another. Ticket: BWA-MEM corrupted indexes · Issue #53 · galaxyproject/idc · GitHub

Hi @acta1

I have some feedback from our developers… it turns out that the tool usage in the GTN workflow needs an update!

I’ve ticketed that here, along with instructions about how to do this immediately, so you don’t need to wait. Ticket: Bug fix: update version of BWA-MEM in Hi-C workflow to avoid fatal index error · Issue #5278 · galaxyproject/training-material · GitHub

You could also grab a copy of my testing workflow. :slight_smile:

Hi, Jenna,
Thank you for getting back to me. I need some help understanding your messages, as I am strictly a user, and am learning terminology by the seat of my pants. In your messages above from 7 days ago, is the tool usage to which you are referring the tools for changing file formats? I finished the first part of the Hi-C tutorial with dm3 and stopped at the .h5 file, which could be used to plot a matrix. My intention was to change the format of that file to .hic so I could test it on the UCSC Genome Browser, where at that time I was hoping to do my analysis. I found Galaxy tools for changing file format, but none directly from .h5 to .hic. I could theoretically do it in a 2-step process: .h5 to cool to .hic. However, when I tried that, the final file type extension was not just .hic, but .JUICEBOX_HIC. This didn’t seem right. Were these the tools you mentioned as needing an update? When I checked the list of Galaxy tools for format change a couple of days ago, there was no longer anything there for getting to either cool or hic. I may have missed something.

Is GTN Galaxy Training N??

Sadly, I don’t know what BWA-MEM is, other than alignment software. I am not clear on where that bug would fit into my problem of changing formats. Was the .h5 file from the tutorial incomprehensible to the format-change tool?

Whatever help you can give is much appreciated. I am sensing that you work for Galaxy as a developer? Since I last visited the homepage for HiCExplorer, it has added all sorts of new workflows, including one for ChIA-PET. I have been getting set up to use ChIA-PIPE, but will definitely look at the version on Galaxy as well.

Finally, where do I grab a copy of your testing workflow? I am not sure it would be helpful to someone at my level of understanding, but perhaps it WILL help me to understand what is going on.

Thank you,
Sandy Sharp
I am actually a professor emerita from Cal State LA, attempting to add relevant analysis of publicly available HiC info to a manuscript based on wet-bench data from my lab.

Hi @acta1

I was just posting back an update related to a Galaxy Training Network (GTN) tutorial. I wasn’t sure which workflow you were using, and other students might see the posts here later on. The point was to update to the most current version of the BWA-MEM mapping tool if the current training tutorial is used (before we are able to update it).

It sounds like you are using workflows directly from the Hi-C wider community, so this change probably doesn’t impact you?

But – if you are using the workflow from the training site, then definitely update the mapping tool to the most current version. You can get my copy of an updated workflow from that issue ticket I posted. Copy the workflow link into the Workflow → Import workflow form. You can compare to my shared history for what the expected output should be.

For your original issue about the format conversions, I don’t have a good solution for you. I also tried to do the conversions, in Galaxy with the available tools, and the tools were outputting format variations with different format datatypes! Frustrating. There seems to be a gap in the available conversion tools, and a few different groups of tools from different development teams … This is something the developers working on those tools will need to resolve first, then they will need to push those tools out to Galaxy so people can use them here.

Now, once you have the correct format (maybe generated outside of Galaxy), I think you should be able to at least upload that to Galaxy. We can host that sort of file by URL for use in display applications like UCSC.

Not a great solution, and hopefully this resolves itself soon. :slight_smile:
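
For anyone who wants to try the two-step route (.h5 to cool to .hic) outside of Galaxy, here is a minimal Python sketch that only builds and prints the commands, so nothing is run until you have checked them. The first command uses HiCExplorer's `hicConvertFormat`. The second step is an assumption on my part: it supposes the separate `hictk` tool is installed and that `hictk convert` picks the formats from the file extensions, so please check its documentation first. The file names, the track name, and the example URL are all placeholders, and the UCSC custom track line follows the `type=hic` / `bigDataUrl` pattern, which is also worth verifying against the UCSC help pages.

```python
import shlex


def h5_to_cool_cmd(h5_path, cool_path):
    """Step 1: HiCExplorer conversion of an .h5 matrix to .cool."""
    return [
        "hicConvertFormat",
        "--matrices", h5_path,
        "--inputFormat", "h5",
        "--outputFormat", "cool",
        "--outFileName", cool_path,
    ]


def cool_to_hic_cmd(cool_path, hic_path):
    """Step 2 (assumption): hictk is installed and infers formats from extensions."""
    return ["hictk", "convert", cool_path, hic_path]


def ucsc_track_line(name, url):
    """Custom track line pointing UCSC at a .hic file hosted at a public URL."""
    return f'track type=hic name="{name}" bigDataUrl={url}'


if __name__ == "__main__":
    # Placeholder file names; replace with your own paths.
    for cmd in (
        h5_to_cool_cmd("matrix.h5", "matrix.cool"),
        cool_to_hic_cmd("matrix.cool", "matrix.hic"),
    ):
        print(shlex.join(cmd))
    # Placeholder URL; use wherever the .hic file ends up being hosted.
    print(ucsc_track_line("My Hi-C", "https://example.org/matrix.hic"))
```

Printing the commands instead of running them is deliberate: you can paste them into a terminal one at a time and confirm that each intermediate file looks right before moving on.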

Hi, Jenna,
Thanks. I understand the purpose of your reply now.

I was using the Galaxy Training Network tutorial that uses Hi-C data from Drosophila melanogaster hosted on Zenodo. And you are right, my problem was changing formats.

I appreciate your feedback!

Sandy
