Hi, I would like to ask for help with MACS2. I am following the tutorial "Epigenetics: CUT&RUN data analysis" and I have a problem with the MACS2 tool.
I tried to check: the files are not empty, the format of the files should be fine (I did single end), and the reference genome is correctly defined (???)
Thanks for posting the error along with the history! Very helpful.
The job grew into a process that exceeded the expected memory allocation. The EU server has significant resources, so there is likely a problem with the data content.
From a quick look, it seems your data (Rep1) was paired end, but the option on this tool form was set to single end. Paired end is handled slightly differently. Correcting this setting is what I would suggest starting with.
How to review settings: check the Job Details report
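If you also want to check the data itself rather than the tool form, here is a minimal sketch of one way to tell paired-end from single-end alignments. It relies on the SAM FLAG field, where bit 0x1 marks a read whose template has multiple segments (that is, a paired read). The example records are made up, and the function names are only for illustration.

```python
# Sketch: decide whether SAM records come from paired-end sequencing.
# Bit 0x1 of the SAM FLAG field means "template has multiple segments".

PAIRED_FLAG = 0x1


def is_paired_record(sam_line: str) -> bool:
    """Return True if a single SAM alignment line has the paired bit set."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # FLAG is the second column of a SAM record
    return bool(flag & PAIRED_FLAG)


def looks_paired(sam_lines, sample_size=1000):
    """Inspect up to `sample_size` alignment lines, skipping '@' header lines."""
    checked = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue
        if is_paired_record(line):
            return True
        checked += 1
        if checked >= sample_size:
            break
    return False


# Hypothetical records: read names and coordinates are invented.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t99\tchr1\t100\t42\t50M\t=\t300\t250\t*\t*",
    "read1\t147\tchr1\t300\t42\t50M\t=\t100\t-250\t*\t*",
]
print(looks_paired(example))
```

The text records can come from `samtools view` run on a BAM file. This only inspects the first reads, so treat it as a quick sanity check rather than a full validation.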
I also tried to run it with the paired-end setting, with the same result.
With the single-end settings: I tried to run it at different times and on different days. I deleted the files, but it behaved the same: grey, orange for a short time, then red.
I will follow the ATAC-seq tutorial, there is a recording, and then I will start cut and tag from scratch and see.
I will keep this history for some time; if you have any ideas, I will be happy to try and learn.
Ah, thank you for clarifying. I'm not sure why it is done that way. Maybe this is a specific choice to allow MACS2 to process this newer type of data. Let's ask one of the scientists working on this. Hi @pavanvidem, would you be able to confirm the usage for this option: using single-end settings with paired-end data for this type of data?
I also ran through the tutorial's workflow with the example data, and everything was successful at two of the UseGalaxy servers. The single-end choice with MACS2 was applied, so we know the pipeline is technically "working" and the issue is still centered around the data preparation. You will be able to access the shared histories and the workflow itself from the workflow invocation links, and these are fresh runs from today. Maybe you can notice what is different from these examples?
Hi @bezinka, I took another look at your original history, from the very start, and think I found the problem. It can be seen in the data tagging. It looks like the pairs were mixed up when building the collection, and that could certainly cause scientific algorithm problems later on, usually at the data reduction steps. Try correcting this first.
How the data is tagged now. Notice how the sample names are not consistent for both the forward and reverse reads. The tag is for the sampleID, not the R1/R2 notation that designates the read direction within a sample.
I would suggest starting completely over. Load the data fresh from the tutorial, and first make sure the tags are assigned for each sample pair correctly based on the file name (not R1/R2 strand) and build the collection again. Using the Auto Build List function will be the easiest and should make a correct guess (it uses the file name, not tags).
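To illustrate the idea of pairing by sample name rather than by R1/R2, here is a rough sketch. It assumes file names follow a pattern like `<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz` (or `_1` / `_2`); the names below are hypothetical, not the ones in your history, and this is not how Galaxy's Auto Build List is implemented.

```python
import re
from collections import defaultdict

# Assumed naming pattern: <sample>_R1 / <sample>_R2 (or _1 / _2),
# optionally followed by .fastq or .fq, and optionally .gz.
READ_SUFFIX = re.compile(
    r"^(?P<sample>.+?)_R?(?P<read>[12])(?:\.f(?:ast)?q)?(?:\.gz)?$"
)


def group_pairs(file_names):
    """Group file names by sample ID, keyed by read direction '1' or '2'."""
    pairs = defaultdict(dict)
    for name in file_names:
        match = READ_SUFFIX.match(name)
        if match is None:
            raise ValueError(f"Cannot parse sample and read from {name!r}")
        pairs[match["sample"]][match["read"]] = name
    return dict(pairs)


def incomplete_samples(pairs):
    """Return sample IDs that do not have both a forward and a reverse file."""
    return sorted(s for s, reads in pairs.items() if set(reads) != {"1", "2"})


# Hypothetical file names for two replicates.
files = [
    "Rep1_R1.fastq.gz", "Rep1_R2.fastq.gz",
    "Rep2_R1.fastq.gz", "Rep2_R2.fastq.gz",
]
pairs = group_pairs(files)
print(pairs)
print(incomplete_samples(pairs))
```

If a forward file from one sample ends up next to a reverse file from another, `incomplete_samples` lists both sample IDs, which is the kind of mix-up described above.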
If you have trouble, please capture a screenshot of those steps and post it back so I can clarify exactly what to do. I'll see it tomorrow. Or you can share back your new history with errors and I'll check it.
Hope this helps! This is a good example of how getting sample data prepared at the very start is so important. It can be an unpleasant lesson, but everyone doing bioinformatics has it happen, I promise! And we can still get feedback about the single/paired option with MACS2 because I am curious about it too.
Hi,
thank you very much for your help.
Such a (stupid) mistake, but I learned a lot.
I finished, and anyone can find the histories below (I will keep them as long as possible).
ATAC-seq:
mistake: I downloaded gene v36, but "bedtools Intersect intervals" doesn't work with it; it has to be v38 (I put the tag on the wrong data)
history: ATAC-seq history
Cut and run:
mistakes:
I found only the labeling of replicates: MACS2 did not run when they were mislabeled.
I misread BAM instead of BED the first time; I left it in the history.
history: Cut and run 2 history
Single-end BED:
From my understanding, that is because I converted a BAM file into BED, so I lost the paired-end information and then had to use single-end.
I went through the tutorials at the end of the MACS2 tool, and paired-end is specified for the BAM file, or detected from the treatment file.
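As a small sketch of that reasoning: the MACS2 `callpeak` command takes the input format through `-f`, and its documented values include `BAM`, `BAMPE`, `BED` and `BEDPE`. The helper below is only my illustration of how the choice could follow from the file type; the function names and the mapping from file extension are assumptions, and it builds the command without running it.

```python
from pathlib import Path


def macs2_format(path: str, paired: bool) -> str:
    """Pick a value for MACS2's -f option from the file extension (assumed mapping)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".bam":
        # A BAM file keeps mate information, so paired-end mode is possible.
        return "BAMPE" if paired else "BAM"
    if suffix == ".bed":
        # A per-read BED file has one interval per read, so each line is
        # treated as an independent single-end tag.
        return "BED"
    raise ValueError(f"Unsupported file type in this sketch: {path!r}")


def build_command(treatment: str, name: str, paired: bool) -> list:
    """Build (but do not run) a macs2 callpeak command line."""
    fmt = macs2_format(treatment, paired)
    return ["macs2", "callpeak", "-t", treatment, "-f", fmt, "-n", name]


# Hypothetical file names.
print(build_command("rep1.bam", "rep1", paired=True))
print(build_command("rep1.bed", "rep1", paired=True))
```

With a `.bed` input the `paired` argument has no effect here, which mirrors what I saw: once the reads are in per-read BED form, the single-end setting is the one that applies.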
Great! I'm glad this worked even if I am not exactly clear on why it does. BED can still be paired end. Maybe it doesn't matter for the chemistry behind these reads.
And a HUGE THANK YOU for posting back your completed histories! This should be super helpful for anyone else with a similar error later on. I am going to mark your answer as the solution, and add this topic into the ones that show up at the top of searches.