ATAC Seq wig to bigwig transformation of MACS2 data

Hello everyone,
I followed the ATAC-Seq tutorial to get a first look at my ATAC-Seq data. However, when I try to convert my MACS2 bedGraph file into a bigWig file, I receive this error:

bedGraph error line 24443185 of trackless: chromosome chr17 has size 81195210 but item ends at 81195213

I also found the post in this forum where someone had a similar problem, so I checked the reference genome (I used hg19) and the whole workflow up to this step, but I couldn’t find my mistake. I also asked an AI for help, but that didn’t help either. So I am trying to figure out how to get rid of those three bases that block the rest of the analysis. In the mentioned post, they were able to solve the problem, but since they didn’t share the details publicly, I don’t understand which steps are required. Can someone explain where the problem is and what I have to do to continue?

Many thanks in advance!

Hi @ldawson,

Do you use only hg19 data? Any chance you have coordinate-based files from other assemblies, for example, hg38? Note that changing the database assignment in Galaxy does not change the data itself. liftOver remaps coordinates (intervals) from one genome assembly to another.

Could it be an overhang (clipped reads), or an interval that extends past the chromosome/contig end?

If you share the history, I can have a look, but it might take some time.

Kind regards,
Igor


Hey Igor! Thanks for answering. I am totally new to bioinformatics and Galaxy, I am sorry.
Since last week I started all over again, and I couldn’t find the mistake I made.
I will share the link with you. In the converter tool, the “clip” option under additional options is activated. I am pretty sure that I only used the hg19 canonical reference, because the CTCF file from UCSC is only available for hg19.
I am going to share my history now, maybe that will help:

Thanks in advance.
Lisa

Hi Lisa,

It is an interesting and complicated situation.

I don’t know what tutorial you use as a guide. I looked at this one.

The error indicates one of the intervals expands past the end of chromosome 17:
bedGraph error line 24443185 of trackless: chromosome chr17 has size 81195210 but item ends at 81195213
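
To see which rows trigger this kind of error, one could scan the bedGraph for intervals that end past the chromosome length. The following is only a minimal sketch: `find_overhangs` is a hypothetical helper, the two sample rows are made up, and the chr17 length is the one quoted in the error message.

```python
import io

def find_overhangs(bedgraph_lines, chrom_sizes):
    """Yield (line_number, chrom, end, size) for intervals ending past the chromosome end."""
    for line_number, line in enumerate(bedgraph_lines, start=1):
        fields = line.split()
        # Skip blank lines and track/browser/comment header lines.
        if len(fields) < 4 or fields[0] in ("track", "browser") or fields[0].startswith("#"):
            continue
        chrom, end = fields[0], int(fields[2])
        size = chrom_sizes.get(chrom)
        if size is not None and end > size:
            yield line_number, chrom, end, size

# hg19 chr17 length, taken from the error message above.
chrom_sizes = {"chr17": 81195210}

# Two made-up bedGraph rows (assumption); only the second one overhangs.
example = io.StringIO(
    "chr17\t81195000\t81195100\t1.5\n"
    "chr17\t81195100\t81195213\t0.7\n"
)
overhangs = list(find_overhangs(example, chrom_sizes))
print(overhangs)
```

For a real file, the same function can be fed an open file handle and a dictionary built from the chromosome sizes of the assembly in use.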

This may happen during peak predictions - see the MACS2 job setup parameters:
Set extension size: 200
Set shift size: -100
Check description of these settings on the job setup page.
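
As a rough illustration of how these two settings can push an interval past the chromosome end, here is a simplified model of the shift-then-extend arithmetic. It reflects my understanding of the MACS2 option descriptions, not the MACS2 code itself; `extended_interval` is a hypothetical helper, the cut position is made up so that it reproduces the 3-base overhang, and 0-based versus 1-based details are ignored.

```python
def extended_interval(five_prime, strand, shift, extsize):
    """Return (start, end) after shifting the 5' end and extending it by extsize.

    Simplified model (assumption): a positive shift moves the 5' end in the
    5'->3' direction, a negative one moves it 3'->5'; the fragment is then
    extended by extsize in the 5'->3' direction.
    """
    if strand == "+":
        start = five_prime + shift
        return start, start + extsize
    end = five_prime - shift
    return end - extsize, end

CHR17_SIZE = 81195210  # hg19 chr17 length from the error message

# Hypothetical 5' end located 97 bases before the end of chr17.
cut_site = 81195113

start, end = extended_interval(cut_site, "+", shift=-100, extsize=200)
overhang = max(0, end - CHR17_SIZE)
print(start, end, overhang)

# The smaller window (150 / -75) stays inside the chromosome for this position.
start_small, end_small = extended_interval(cut_site, "+", shift=-75, extsize=150)
overhang_small = max(0, end_small - CHR17_SIZE)
print(start_small, end_small, overhang_small)
```

With a shift of minus half the extension size, the window is centered on the 5' end for both strands, so any read starting closer to the chromosome end than half the extension size can overhang.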

I’ve re-run the MACS2 job with an extension size of 150 and a shift size of -75, but the wigtobigwig conversion failed with a different message:
slurmstepd: error: JOB 3362540 ON js2-large8 CANCELLED AT 2025-06-04T00:45:04 DUE TO TIME LIMIT
The message does not point to the data anymore. I moved the bedgraph file to Galaxy Australia and converted it with no issue.

As for the cause(s)… I spotted several potential issues. It seems you swapped the forward and reverse reads in the Cutadapt job, and this issue was propagated to bowtie2. I am not sure whether it is critical. You used a different version of MACS2 compared to the tutorial (again, I don’t know which one you use as a guide). It is possible that your data has a peak at the chromosome end, either real or an artifact of the repeats often present at chromosome ends, and that it was extended past the chromosome end because of the parameters used for the MACS2 job setup. The latter might be the best explanation.

We will notify people from the ORG server about the wigtobigwig error.

Hi @jennaj, I got an unusual error message for a failed wigtobigwig job on the ORG server:
slurmstepd: error: JOB 3362540 ON js2-large8 CANCELLED AT 2025-06-04T00:45:04 DUE TO TIME LIMIT
See datasets 133 and 134 in this history.
I copied the input file to Galaxy Australia and converted it without any issue. I am sorry, I don’t have much time at the moment for additional tests, and I will be away for the next couple of weeks.

Kind regards,
Igor


Thanks a lot Igor! I will try to work out the adapter issue in the meantime. I used the same guide you suggested, and I will run the peak calling again with the right version of MACS2.

So I looked at the MACS2 data in IGB and zoomed in to the end of chr17. I found that there might be no peak there, but the error is visible here. Maybe that will help?


Hi @ldawson and thanks for the ping @igor !!

I reviewed the history and noticed two problems, and can explain the cryptic error message:

  1. MACS2 callpeak

The parameters for calling peaks were set for single-end instead of paired-end data. This will have scientific implications, too, and should be corrected, since it is a mismatch for the input BAM/BED data. Sometimes this will trigger an error at this step, but not always; the issue may instead show up as errors in downstream data reduction steps, or even just as odd scientific “results” that are not so easy to detect. Later on you’ll be using workflows to avoid manual entry problems.

For now, you can review the Job Details for upstream jobs and maybe spot the problems (eye or pencil icon on a dataset, then toggle into the tab).
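
Outside of Galaxy, one quick sanity check for paired versus single-end input is to look at the FLAG column of the alignments: in the SAM format, bit 0x1 marks a read as part of a pair. The sketch below works on SAM text lines; `fraction_paired` is a hypothetical helper and the records are made up. If essentially all reads are paired, a paired-end input format (BAMPE in MACS2 terms) is the matching choice.

```python
def fraction_paired(sam_lines):
    """Return the fraction of alignment records whose FLAG has the 0x1 (paired) bit set."""
    total = paired = 0
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])
        total += 1
        paired += flag & 0x1
    return paired / total if total else 0.0

# Made-up records (assumption): flags 99 and 147 are a typical properly paired mate pair.
paired_example = [
    "@SQ\tSN:chr17\tLN:81195210",
    "read1\t99\tchr17\t100\t42\t50M\t=\t250\t200\t*\t*",
    "read1\t147\tchr17\t250\t42\t50M\t=\t100\t-200\t*\t*",
]
# Flags 0 and 16 are typical for single-end forward and reverse alignments.
single_example = [
    "read2\t0\tchr17\t100\t42\t50M\t*\t0\t0\t*\t*",
    "read3\t16\tchr17\t300\t42\t50M\t*\t0\t0\t*\t*",
]

print(fraction_paired(paired_example))
print(fraction_paired(single_example))
```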

  2. wigtobigwig

MACS2 will sometimes create peaks that “overhang” the chromosome ends (known behavior, anywhere it is used). Using wigtobigwig with an advanced option to clip these off is a good idea to avoid problems.

The default is to not clip, since an overhang can be an important warning that the reference genome assembly versions were mixed up in earlier steps (as discussed already). You have the correct assemblies from what I saw in a quick review, so clipping can be applied. More details → Reference genomes at public Galaxy servers: GRCh38/hg38 example and example of a mismatch here → ATAC-Seq Analysis Using MACS2 Callpeak
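
Conceptually, clipping just trims each interval back to the chromosome length. The following sketch shows the idea on parsed bedGraph rows; it is not the converter's actual implementation, `clip_bedgraph` is a hypothetical helper, and the rows are made up.

```python
def clip_bedgraph(rows, chrom_sizes):
    """Trim intervals to the chromosome length; drop intervals that start at or past the end."""
    clipped = []
    for chrom, start, end, value in rows:
        size = chrom_sizes[chrom]
        if start >= size:
            continue  # nothing of this interval lies on the chromosome
        clipped.append((chrom, start, min(end, size), value))
    return clipped

chrom_sizes = {"chr17": 81195210}  # hg19 chr17 length from the error message

# Made-up rows (assumption): the second overhangs by 3 bases, the third lies fully past the end.
rows = [
    ("chr17", 81195000, 81195100, 1.5),
    ("chr17", 81195100, 81195213, 0.7),
    ("chr17", 81195211, 81195213, 0.2),
]
clipped_rows = clip_bedgraph(rows, chrom_sizes)
print(clipped_rows)
```

Only the overhanging tail is lost, which is why clipping is safe once the assembly has been confirmed to be the right one.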

Toggle open the advanced parameters here

Prior discussion, as a reference → ATAC-seq data analysis tutorial: troubleshooting - #5 by jennaj (includes a link to the MACS2 forum where the same situation is discussed with the base tool) and → tools that require bigwig input cannot use bedgraph (as bigwig) file - #15 by jennaj (I think this one was already seen, so I’m linking it all together for anyone else who may run into this little wrinkle in the future!)

  3. DUE TO TIME LIMIT error

For this error, the reason can be some problem with the data/parameters that create a job that executes for longer than the maximum time limit on our clusters (varies by tool, but tends to be about 48 hours).

I was able to reproduce this when testing with the clipped chromosomes. Adjusting the paired versus single end status will likely correct this. Even if the result was putatively “successful” at the AU server, I would be cautious about using the result. As an additional exercise in understanding how MACS2 calls peaks, maybe compare the result with the parameters set correctly against the result with them set incorrectly, and see if you can notice the differences?

Large analyses can trigger this situation too, and trying a different server is certainly a good option since cluster resources can differ (and you might get interesting job logs if the tool can process far enough along to report them before crashing out), but still consider any error/log a warning. Confirming that you understand why the job is unexpectedly long-running might matter. For example: are the parameters optimal? Is there another way to organize the data for the same result? Are there scientific reasons for the error, not just technical ones? You’ll learn how to recognize these situations the more you work with specific tools and/or a particular class of data sample.

Explanation → FAQ: Understanding walltime error messages



Hope this helps!