Bedgraph to bigwig problem

Hi, I am having another issue with bedgraph to bigwig conversion. I am using a custom genome and processed the files through MACS2, then attempted to use the “Convert bedgraph to bigwig” tool as we discussed previously. Now I get a different error:

Couldn't open /cvmfs/data.galaxyproject.org/managed/len/ucsc/?.len

Not sure what this means or how to get around this.

I also tried doing this conversion with wigtobigwig. The job appears to work, but the output is 0 bytes.

Any help you could give would be appreciated. Thanks!

Hi @TTP

Have you tried a custom build? See the FAQ: Adding a custom database/build (dbkey)

Kind regards,

Igor

I used a custom build in this case.

Also, I discovered that the general failure of wigtobigwig in processing MACS2 bedgraph output has to do with the size of the files. I recently tried processing several bedgraph files with wigtobigwig, and 2 of them actually worked while the rest failed as they normally do. The two that worked were significantly smaller than the rest, so I think a size limitation is preventing our use of this tool, not a formatting problem with the MACS2 output.

Hi @TTP,

By any chance, do you see any Out of Memory messages when you click on the outputs of the failed job? Click on the name of any output from the failed job and check the mini-preview in the History panel.

If you share the history, people can check what is going on.

Kind regards,

Igor

I solved the previous specific problem another way, but there is still the general problem of tools failing to run on bedgraph files generated by MACS2 callpeak. I think this is mostly related to the size of the files, although there may be other factors that I am not recognizing. I put 4 bedgraph files of varying sizes in this example history. You can see that wigtobigwig appears to process 2 of them, although one of those that worked (7) actually has zero content. Convert bedgraph to bigwig fails on the largest file only. Lastly, MACS2 bdgcmp fails with two comparisons but, weirdly, does run successfully with one combination (13). If there are solutions to this apart from “you should have smaller files”, it would be great to know. Thanks.

[Galaxy history on usegalaxy.org](https://usegalaxy.org/u/tpaull/h/test-history)

Hi @TTP

The format conversion failure looks like a data-related issue. Click on job 11 (conversion of dataset 2). In the mini-preview in the history panel you’ll see: “Expecting 4 words line 44921181 of stdin got 2”.

I removed the first 44921180 lines from the input (dataset 2), so the intermediate file starts at the line in question. The mini-preview of the intermediate file shows only two columns, and the preview shows only one line. I then cut the first line from the intermediate file. It is supposed to have text and numbers in four columns, but its size is 880 KB, which is a lot. For comparison, I cut a single line with a normal interval from one of the files: its size is 32 bytes.

I don’t know what is wrong with the input file. It just does not look good.
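If you have the file locally, the same check can be sketched in a few lines of Python. This is only an illustration, not what the converter does internally: it assumes whitespace-separated fields and skips `track`, `browser`, and comment lines. The function name `find_malformed_lines` is made up for this example.

```python
import io


def find_malformed_lines(handle, expected_fields=4, max_report=10):
    """Return (line_number, field_count, line_length) for each data line
    that does not split into `expected_fields` whitespace-separated fields."""
    problems = []
    for line_number, line in enumerate(handle, start=1):
        stripped = line.rstrip("\r\n")
        # Blank lines and header/comment lines are not data intervals.
        if not stripped or stripped.startswith(("track", "browser", "#")):
            continue
        fields = stripped.split()
        if len(fields) != expected_fields:
            # An unusually large line_length hints at a runaway line.
            problems.append((line_number, len(fields), len(line)))
            if len(problems) >= max_report:
                break
    return problems


# Tiny in-memory example; line 3 has only two fields.
sample = io.StringIO(
    "chr1\t0\t100\t1.5\n"
    "chr1\t100\t200\t2.0\n"
    "chr1\t200\n"
    "chr1\t900\t1000\t0.5\n"
)
print(find_malformed_lines(sample))
```

For a real file, pass an open file handle instead of the in-memory sample.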

For the MACS2 error (job 14), click on the output from the failed job, then click on the Error icon, the one that looks like a ladybird beetle. You’ll see the error log in the middle window. It has messages like this one:

AssertionError: Lambda must > 0, however we got 0

Google returns several hits. Check this post: fold enrichment · Issue #259 · macs3-project/MACS · GitHub

The reply there points to another post. The error might be caused by the job setup (the method used). I completed the MACS2 job on the same data using the subtract method (in the method pull-down menu).
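One way to test whether zero values in the control track are behind the message is to count them. The sketch below is a guess at the cause, not a confirmed diagnosis: it assumes the fourth bedgraph column of the file used as control is what ends up as lambda. The helper name `count_zero_intervals` is hypothetical.

```python
import io


def count_zero_intervals(handle):
    """Return (zero_count, total_count) over well-formed 4-column lines."""
    zero = 0
    total = 0
    for line in handle:
        fields = line.split()
        if len(fields) != 4:
            continue  # header, blank, or malformed line
        try:
            value = float(fields[3])
        except ValueError:
            continue  # fourth column is not numeric
        total += 1
        if value == 0.0:
            zero += 1
    return zero, total


# Tiny in-memory example: one of three intervals has value 0.
control = io.StringIO(
    "track type=bedGraph\n"
    "chr1\t0\t100\t0\n"
    "chr1\t100\t200\t2.5\n"
    "chr1\t200\t300\t1.0\n"
)
print(count_zero_intervals(control))
```

If the count of zero intervals is large for the dataset chosen as control, that would be consistent with the assertion, but the linked issue is the better source for the actual cause.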

Hope that helps.

Kind regards,

Igor

Thanks Igor.

OK, but all of these files came out of MACS2 callpeak. Are you saying that file #2 was somehow corrupted and if I were to generate it again it would work? Why would I get a bedgraph output from MACS2 callpeak that does not have the correct format?

OK; I think the problems arise when an IP sample is used as a control. The actual comparison we would be making is 1 versus 2, since 2 is an input sample. That comparison worked even with a very large control file, so that is fine.

Hi @TTP,

Yes, it seems dataset #2 is somehow corrupted at line/interval 44921181. Check dataset #18 in Galaxy: I cut line 44921181 from dataset #2, and it does not look right. Also, check the file size via the Info (Dataset Details) icon. I don’t have command-line access on the ORG server and cannot check the file content directly. The Galaxy preview shows nothing.

If you generate the file again, it might work, or it might not. If the file was produced in Galaxy, data corruption should not happen. Check dataset #19 in the shared history. It has ten lines starting from the corrupt line. Note the big gap in the start coordinate between the corrupt line and the following line. It just does not look right.
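For anyone who wants to reproduce this kind of extract outside Galaxy, here is a minimal Python sketch that pulls a window of lines starting at a given line number. The function name `lines_from` and the file name in the comment are placeholders, not anything from the shared history.

```python
import io
from itertools import islice


def lines_from(handle, first_line, count=10):
    """Return `count` lines starting at 1-based line number `first_line`."""
    start = first_line - 1  # islice uses 0-based positions
    return [line.rstrip("\r\n") for line in islice(handle, start, start + count)]


# Tiny in-memory example: take 2 lines starting at line 3.
sample = io.StringIO("a\nb\nc\nd\ne\n")
print(lines_from(sample, first_line=3, count=2))

# With a real file (hypothetical name), the call would look like:
# with open("dataset2.bedgraph") as fh:
#     for row in lines_from(fh, first_line=44921181, count=10):
#         print(row[:200])  # truncate, since a corrupt line may be very long
```

Reading with `islice` streams through the file, so it does not need to load a large bedgraph into memory.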

What about this: process the sample for dataset #2 in a separate history from the very start and, if the conversion fails again, share that history here? The issue is not with the conversion but with the input data (from the previous steps), so we should check the outputs of the preceding steps.

By any chance, have you transferred the file (the precursor of dataset #2) from a Linux to a Windows system and back, or edited or previewed it in Notepad or similar Windows software?

It seems the MACS2 issue is resolved…
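If you have a local copy, a rough way to look for traces of a Windows round trip is to count line endings. This is a sketch only; it reads the file in binary mode so Python does not translate newlines, and the name `count_line_endings` is made up for this example.

```python
import io


def count_line_endings(binary_handle):
    """Count CRLF and LF line endings, plus lines with a stray carriage return."""
    counts = {"crlf": 0, "lf": 0, "stray_cr": 0}
    for raw in binary_handle:  # binary handles split on b"\n" only
        if raw.endswith(b"\r\n"):
            counts["crlf"] += 1
            body = raw[:-2]
        elif raw.endswith(b"\n"):
            counts["lf"] += 1
            body = raw[:-1]
        else:
            body = raw  # last line without a trailing newline
        if b"\r" in body:
            counts["stray_cr"] += 1
    return counts


# Tiny in-memory example: one Windows-style line, one Unix-style line.
sample = io.BytesIO(b"chr1\t0\t10\t1\r\nchr1\t10\t20\t2\n")
print(count_line_endings(sample))

# With a real file (hypothetical name): open("dataset2.bedgraph", "rb")
```

A nonzero `crlf` or `stray_cr` count would suggest the file passed through Windows software at some point, though it would not by itself explain a single oversized line.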

Kind regards,

Igor

PS. In the future, please create a separate post for each issue. It is easier to deal with, and easier for other users to find when they search for an answer.