Converting bedgraph to bigwig, product files 0 bytes

Dear Galaxy Team,

Thank you for taking the time to read this post and for your ongoing support to the community.

I’m encountering an issue while processing my ChIP-seq data and would greatly appreciate your help:

After running MACS2 bdgcmp, I generated bedGraph files and am now attempting to convert them to BigWig files for downstream analysis. However, when I use the wigToBigWig tool, the resulting files are 0 bytes in size. I also tried the Convert BedGraph to BigWig tool, but the process failed with a red alert.

For context, I am using a custom genome file rather than a standard or general genome file, which may be causing formatting issues, but currently I have no idea how to solve this problem.

Could you please let me know what additional information I should provide to assist with troubleshooting? For example, would you need access to my history for debugging?

Thank you so much for your time and guidance—I truly appreciate your help.

Best regards,
Yang

Hi @Yang_Ye

You can use a custom reference genome with these tools, including the parts where you may need to create a custom database key to assign to datasets (a “custom build”).

How the data should be formatted and how to use the functions are here. → FAQ: How to use Custom Reference Genomes?

More examples can be found under custom-genome and custom-build

Please give that a try, since it tends to solve the troubles people have with these tools. But should you get stuck, you can generate a share link to your history and post that back here in a reply. How to generate the link is in the banner at this forum, or please see here directly → How to get faster help with your question

Please let us know if you solve this, too!

Let’s start there, thanks! :slight_smile:

Hi Jenna,

Thank you so much for your kind response and willingness to assist! I truly appreciate your help.

I reviewed the FAQ page you suggested and followed the steps there, but I’m still uncertain if my custom genome is configured correctly. We’ve successfully used this same custom genome in previous analyses through Galaxy, so I suspect there might be something else at play.

To make things clearer, I’ve shared my history for your reference. Here’s the link: Galaxy.

Here’s a quick summary of the relevant datasets in the history:

  • Data 2: The custom genome file. It is based on hg38 with an additional sequence, “FRT Tir1 v3,” treated as an extra “chromosome.”
  • Data 173: paired ChIP-seq data.
  • Data 319: Outputs from MACS2 bdgcmp, which fail conversion in the wigtobigwig tool, resulting in 0-byte output files.

Please let me know if you need more details or specific actions from my side to help troubleshoot the issue. I’m happy to try any recommendations you have.

Thanks again for your support!

Best regards,
Eddy

Great, thanks for sharing the history @Yang_Ye

Looking at your custom reference genome, the novel chromosome name is good. Notice how you used a term that is all one_Word1.

FRT_Tir1

Then this is your custom build identifier. Notice how it includes a + character and a space. Tools are having trouble with that (the underlying tool – this is not just a “Galaxy” requirement).

hg38+FRT Tri1

Try using something like this instead. Notice how the only special character is an underscore, and there are no spaces inside the name.

hg38_FRT_Tri1

Try to use terms that are all one_Word1: letters, numbers, and underscores, with no other special characters, no spaces, and (usually) not starting with a number. Some tools will interpret the underscores but not the ones you are using. When in doubt, using a very simple oneWord1 type of format is usually “safe” for any computational tool.
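
As a rough illustration of that naming rule, here is a minimal Python sketch. It is not part of Galaxy or of any tool mentioned here; the function names `is_safe_build_name` and `suggest_safe_name` are hypothetical, and the pattern only encodes the conservative rule described above (individual tools may accept more or less).

```python
import re

# Assumed rule, taken from the description above: letters, digits, and
# underscores only, with no leading digit. Real tools may differ.
SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_safe_build_name(name: str) -> bool:
    """Return True if `name` follows the conservative one_Word1 pattern."""
    return SAFE_NAME.fullmatch(name) is not None


def suggest_safe_name(name: str) -> str:
    """Replace each run of disallowed characters with a single underscore."""
    cleaned = re.sub(r"[^A-Za-z0-9_]+", "_", name).strip("_")
    # Prefix an underscore if the result is empty or starts with a digit.
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned
```

With this sketch, `is_safe_build_name("hg38+FRT Tri1")` is false because of the `+` and the space, while `suggest_safe_name("hg38+FRT Tri1")` gives `hg38_FRT_Tri1`.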

What to do:

  1. Recreate the custom build using a simplified name
  2. Change the database assignment on your datasets to that new build
  3. Rerun any tools that failed, and they will likely find the new fasta index

You can delete the original custom build, or ignore it.

If that is not enough, you can share back your history with those changes and we can follow up more. Thanks! :slight_smile:

Hi Jenna,

Thank you for pointing out the potential naming issue! I followed your instructions carefully:

  1. I imported the custom genome file from my previous working history, renamed it as “onewordFRT”, and assigned it as my custom build in “User - Preferences - Manage Custom Builds”.
  2. I updated the database assignment for my MACS2 bdgcmp data collection to onewordFRT.
  3. I reran the wigtobigwig conversion using the updated dataset.

Unfortunately, the output files are still 0 bytes.

The history link is the same as before, since I made all the changes in that history.

  • Data 387: The newly renamed genome FASTA file (identical to Data 2, just renamed).
  • Data 396: The collection of MACS2 bdgcmp outputs with the database updated to onewordFRT.
  • Data 406: The resulting files from wigtobigwig, all of which are 0 bytes.

Should I change the database assignment of all the earlier datasets in this history to onewordFRT? Or should I regenerate the custom genome file by concatenating the FASTA files again, instead of just renaming it?

Please let me know if you need additional details or specific actions on my side to troubleshoot further. I’m happy to try any other suggestions you might have.

Thank you for your patience and generous support!

Best regards,
Eddy

Hi @Yang_Ye

I was able to get this working. All of the data is in the screenshot. I’ve numbered the items to make this clearer.

  1. I clicked the rerun icon for one of the results in data 406 to bring the form back up. I just picked out one of them from the collection to test with.
  2. This is the input area for the “lengths file”. For this tool, it needs to be a two column file. The tip under the input area has one tool you can use for that. You have a fasta file input here – but this tool doesn’t expect a fasta file, and that is why your job didn’t work.
  3. This is your fasta file.
  4. This is that fasta file converted to a length file.
  5. This is the new output from a rerun of the job I brought up in 1 above but with data 415 input instead of data 387. It seems to work.
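
To show what a "length file" contains, here is a minimal Python sketch that derives one from a FASTA file. It is only an illustration of the two-column format (sequence name, then length in bases, separated by a tab), not the code behind the Galaxy converter; the function names and file paths are hypothetical, and it assumes the sequence name is the first word on each `>` header line.

```python
from typing import Dict, Iterable


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each sequence name (first word after '>') to its length in bases."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Assumption: only the first whitespace-delimited token is the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def write_len_file(fasta_path: str, len_path: str) -> None:
    """Write 'name<TAB>length' lines, one per sequence, in FASTA order."""
    with open(fasta_path) as fasta, open(len_path, "w") as out:
        for name, length in fasta_lengths(fasta).items():
            out.write(f"{name}\t{length}\n")
```

For example, `write_len_file("custom_genome.fa", "custom_genome.len")` (both paths are placeholders) would write one line per chromosome, including the extra FRT sequence.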

I’m sharing this history back so that you can look at it closer in case something wasn’t clear. → https://usegalaxy.org/u/jen-galaxyproject/h/copy-of-tdp1-2-results-httpshelpgalaxyprojectorgtconverting-bedgraph-to-bigwig-product-files-0-bytes139912-updated

Important points:

  1. You can create a custom build to assign to datasets. Some tools will interpret that. You can give this any label you want as long as it follows the format rules: letters, numbers, optional underscores, no spaces.
  2. Always review tool forms to see what kind of files that particular tool is expecting to process. Toggle open the “accepted formats” under each input section to learn what datatype format that input file should be.
  3. Check the tool form itself where the input area is: there might be extra help, plus down further on the form as usual.
  4. Check the job details view: expanded datasets and job logs for clues if you get unexpected results or an error. These jobs had a message about the missing two column tabular file in the expanded output dataset and the job logs.
  5. Once you have things working, consider extracting a workflow for reuse. Then you won’t have to remember all the tedious steps each time.
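
If you want to sanity-check inputs outside of Galaxy before a large batch, something like the following Python sketch could help. It assumes the lengths file is tab-separated with exactly two columns (name, integer length) and that the bedGraph has the chromosome name in its first column; the function names are hypothetical and this is not how the Galaxy tools validate their inputs.

```python
from typing import Iterable, List, Set


def read_len_names(lines: Iterable[str]) -> Set[str]:
    """Parse a two-column lengths file; raise ValueError on any other layout."""
    names: Set[str] = set()
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t")
        # A FASTA file fails here: its lines are not 'name<TAB>length'.
        if len(fields) != 2 or not fields[1].isdigit():
            raise ValueError(f"line {number} is not 'name<TAB>length': {line[:40]!r}")
        names.add(fields[0])
    return names


def missing_chromosomes(bedgraph_lines: Iterable[str], len_names: Set[str]) -> List[str]:
    """Return bedGraph chromosome names that are absent from the lengths file."""
    missing: Set[str] = set()
    for line in bedgraph_lines:
        # Skip blank lines, comments, and track/browser header lines.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        chrom = line.split()[0]
        if chrom not in len_names:
            missing.add(chrom)
    return sorted(missing)
```

Passing a FASTA file to `read_len_names` raises a `ValueError` on its first line, which mirrors the mistake in this thread, and an empty list from `missing_chromosomes` suggests the chromosome names in the two files agree.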

Hope this helps! :scientist:

Hi Jenna,

Thank you so much for your detailed response! I can’t believe I overlooked the file type requirement. Previously, we used the hg38 FASTA file without any issues, so the potential problem didn’t occur to us.

I used your converted length file for wigtobigwig, and it worked as in your example. I’m now continuing with the downstream processes, and I would appreciate your help if new questions arise.

I do have a small request: I tried to search for and favorite the tool you mentioned, CONVERTER_fasta_to_len, but I couldn’t find it on my Galaxy interface. Could you please provide a direct link or suggest an alternative method to access and favorite this tool? I’ll be working with a lot of data in this pipeline soon, and having this tool readily available would be necessary for those jobs. If there are any other tools that can perform the same function, I’d be happy to consider those as well.

Thank you again for your tremendous help and support!

Best regards,
Eddy

Hi @Yang_Ye

Great that the prior help actually helped!

Conversion tools do sometimes have multiple versions (technical reasons) but I did find the one you are looking for in the tool panel. I searched with the datatypes involved, so “fasta len”.

Please also give that a try.

Hope this also helps or that you found it already! :slight_smile: