ATAC-seq Analysis Using MACS2 Callpeak

Hey! I am working through the Galaxy ATAC-Seq tutorial (Hands-on: ATAC-Seq data analysis / Epigenetics) and I am on the MACS2 callpeak step. The job failed with an error and I am unsure how to fix it. Any insight would be greatly appreciated!

Here is what I input:

Number 73
Name MACS2 callpeak on data 62 (Peaks in tabular format)
Created Tuesday Nov 12th 9:19:05 2024 GMT-5
Filesize -
Dbkey mm10
Format tabular
File contents contents
History Content API ID f9cad7b01a4721354ef60effa34ef599
History API ID bbd44e69cb8906b5392a7444d65eb960
UUID 858aa35b-c60e-49d3-bcf7-54024dba6a08
Full Path /corral4/main/objects/8/5/8/dataset_858aa35b-c60e-49d3-bcf7-54024dba6a08.dat

Tool Parameters

|Input Parameter|Value|
|---|---|
|Are you pooling Treatment Files?|0|
|ChIP-Seq Treatment File|62: MarkDuplicates on data 57: BAM (as BED) #SRR891268_R1 #SRR891268_R2|
|Do you have a Control File?|0|
|Format of Input Files|Single-end BAM|
|Effective genome size|1870000000|
|Build Model|nomodel|
|Set extension size|200|
|Set shift size|-100|
|Peak detection based on|qvalue|
|Minimum FDR (q-value) cutoff for peak detection|0.05|
|Additional Outputs|Peaks as tabular file (compatible wih MultiQC), Peak summits, Scores in bedGraph files (--bdg)|
|advanced_options||
|When set, scale the small sample up to the bigger sample|0|
|Use fixed background lambda as local lambda for every peak region|0|
|Save signal per million reads for fragment pileup profiles|0|
|When set, use a custom scaling ratio of ChIP/control (e.g. calculated using NCIS) for linear scaling|Not available.|
|The small nearby region in basepairs to calculate dynamic lambda|Not available.|
|The large nearby region in basepairs to calculate dynamic lambda|Not available.|
|Composite broad regions|nobroad|
|Use a more sophisticated signal processing approach to find subpeak summits in each enriched peak region|1|
|How many duplicate tags at the exact same location are allowed?|all|
|Minimum fragment size in basepair|20|
|Buffer size|100000|
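
For reference, the shift of -100 combined with an extension size of 200 in the parameters above is the usual way to center a 200 bp window on each read's 5' end (the Tn5 cut site). A minimal Python sketch of that arithmetic follows. `cut_site_window` is a hypothetical helper written only for illustration, not part of MACS2, and it assumes the documented MACS2 behavior of first shifting the 5' end along the read's own direction and then extending 5' to 3'.

```python
def cut_site_window(five_prime, strand, shift=-100, extsize=200):
    """Return the (start, end) interval covered after shift + extension.

    five_prime: coordinate of the read's 5' end (hypothetical example input)
    strand: '+' or '-'
    """
    # "Downstream" is +1 on the plus strand and -1 on the minus strand.
    direction = 1 if strand == "+" else -1
    shifted = five_prime + direction * shift    # step 1: shift the 5' end
    extended = shifted + direction * extsize    # step 2: extend 5' -> 3'
    return (min(shifted, extended), max(shifted, extended))


print(cut_site_window(1000, "+"))  # (900, 1100)
print(cut_site_window(1000, "-"))  # (900, 1100)
```

Under these assumptions both strands give the same window, centered on the 5' end, which is why the shift is set to minus half of the extension size.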

Here is the error I received:

Job State error
Command Line export PYTHON_EGG_CACHE=pwd && (macs2 callpeak -t '/corral4/main/objects/e/8/c/dataset_e8caa316-80eb-411d-8233-08819f152174.dat' --name MarkDuplicates_on_data_57__BAM__as_BED_ --format BAM --gsize '1870000000' --call-summits --keep-dup 'all' --d-min 20 --buffer-size 100000 --bdg --qvalue '0.05' --nomodel --extsize '200' --shift '-100' 2>&1 > macs2_stderr) && cp MarkDuplicates_on_data_57__BAM__as_BED__peaks.xls '/corral4/main/jobs/062/436/62436865/outputs/dataset_858aa35b-c60e-49d3-bcf7-54024dba6a08.dat' && exit_code_for_galaxy=$? && cat macs2_stderr 2>&1 && (exit $exit_code_for_galaxy)
Tool Standard Output
INFO @ Tue, 12 Nov 2024 14:19:09:
# Command line: callpeak -t /corral4/main/objects/e/8/c/dataset_e8caa316-80eb-411d-8233-08819f152174.dat --name MarkDuplicates_on_data_57__BAM__as_BED_ --format BAM --gsize 1870000000 --call-summits --keep-dup all --d-min 20 --buffer-size 100000 --bdg --qvalue 0.05 --nomodel --extsize 200 --shift -100
# ARGUMENTS LIST:
# name = MarkDuplicates_on_data_57__BAM__as_BED_
# format = BAM
# ChIP-seq file = ['/corral4/main/objects/e/8/c/dataset_e8caa316-80eb-411d-8233-08819f152174.dat']
# control file = None
# effective genome size = 1.87e+09
# band width = 300
# model fold = [5, 50]
# qvalue cutoff = 5.00e-02
# The maximum gap between significant sites is assigned as the read length/tag size.
# The minimum length of peaks is assigned as the predicted fragment length "d".
# Larger dataset will be scaled towards smaller dataset.
# Range for calculating regional lambda is: 10000 bps
# Broad region calling is off
# Paired-End mode is off
# Searching for subpeak summits is on
INFO @ Tue, 12 Nov 2024 14:19:09: #1 read tag files...
INFO @ Tue, 12 Nov 2024 14:19:09: #1 read treatment tags...
struct.error: unpack requires a buffer of 4 bytes
Exception ignored in: 'MACS2.IO.Parser.BAMParser.tsize'
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/MACS2/callpeak_cmd.py", line 389, in load_tag_files_options
    ttsize = tp.tsize()
struct.error: unpack requires a buffer of 4 bytes
Traceback (most recent call last):
  File "/usr/local/bin/macs2", line 653, in <module>
    main()
  File "/usr/local/bin/macs2", line 51, in main
    run( args )
  File "/usr/local/lib/python3.10/site-packages/MACS2/callpeak_cmd.py", line 65, in run
    else: (treat, control) = load_tag_files_options (options)
  File "/usr/local/lib/python3.10/site-packages/MACS2/callpeak_cmd.py", line 391, in load_tag_files_options
    treat = tp.build_fwtrack()
  File "MACS2/IO/Parser.pyx", line 1169, in MACS2.IO.Parser.BAMParser.build_fwtrack
  File "MACS2/IO/Parser.pyx", line 1181, in MACS2.IO.Parser.BAMParser.build_fwtrack
  File "MACS2/IO/Parser.pyx", line 1166, in MACS2.IO.Parser.BAMParser.get_references
struct.error: unpack requires a buffer of 4 bytes
Tool Standard Error empty
Tool Exit Code 1
Job Messages

Job Message 1:

  • desc: Fatal error: Exit code 1 ()
  • error_level: 3
  • exit_code: 1
  • type: exit_code

Job Message 2:

  • desc: Fatal error: Matched on error:
  • error_level: 3
  • match: error:
  • stream: stdout
  • type: regex
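
For context, the `struct.error: unpack requires a buffer of 4 bytes` message in the log above is what Python's `struct` module raises when asked to decode a 4-byte value from fewer than 4 bytes, which usually means a parser ran out of data (for example on an empty or truncated input). The sketch below reproduces that message and adds a rough check of whether a file starts like a BAM file. `looks_like_bam` is a hypothetical helper, not a MACS2 or Galaxy function, and it assumes a BAM file is gzip/BGZF-compressed with a decompressed stream that begins with the magic bytes `BAM\1`, as described in the SAM/BAM specification.

```python
import gzip
import struct


def looks_like_bam(path):
    """Rough check: does the decompressed file start with the BAM magic?"""
    try:
        with gzip.open(path, "rb") as handle:
            magic = handle.read(4)
    except (OSError, EOFError):
        # Not gzip/BGZF-compressed (for example a plain-text BED file),
        # or the compressed stream ended unexpectedly.
        return False
    # An empty file yields b"" here, so it also fails the check.
    return magic == b"BAM\x01"


# Reproduce the error text from the traceback: unpacking a 4-byte
# little-endian integer from an empty buffer.
try:
    struct.unpack("<i", b"")
except struct.error as err:
    print(err)  # unpack requires a buffer of 4 bytes
```

A `False` result for the dataset passed to MACS2 would be consistent with the parser running out of bytes, though it does not by itself identify the cause.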

Thank you so much!

Consolidated into here: ATAC-Seq Analysis Using MACS2 Callpeak - #2 by jennaj

For anyone else sharing data, a history share link is best! See: How to get faster help with your question