I am new to the Galaxy server and am trying to analyze ChIP-seq data files. I ran a FastQC analysis on the ChIP fastq file and got an error for per base sequence content and a warning for per base sequence GC content (two peaks in the graph). I have attached both graphs to this post.
I would like to understand what could be causing these errors and how to resolve them.
Your help is highly appreciated.
The FastQC tool form has a reference link at the bottom to the tool authors' website. They host a range of resources, including example graphs with descriptions of what they mean and considerations for different sequencing types.
The Galaxy tutorial "Quality Control" has some parts based on that same data, plus related content.
The GTN also hosts tutorials specific to ChIP-seq. Use the search at the very top to find those by keyword. Some include extra QA help.
Please give those a review.
If you need more help:
If this is raw data, try running the data through a trimming program, then rerun this tool.
Does that also report odd results in these or other modules?
Do any other samples from the same source have odd results?
Is there anything special about the data? You could just list out some features: species, source, any prior manipulations (including merging fastq files together).
You can post back more screenshots or, sometimes better, a shared history link. A double peak indicates there are two different "pools" of data in the sample. Recognized contamination would show up in other sections of the report. Your job didn't fail, but the FAQ "Troubleshooting errors" explains how to review the job details and share back a history link.
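To make the "two pools" idea concrete, here is a minimal Python sketch that bins reads by GC percentage and reports which bins stand out as separate peaks. It is not how FastQC computes its plot. The function names, the 5% bin width, and the 5% minimum-read threshold are all assumptions chosen for illustration, and the bin width is assumed to divide 100 evenly.

```python
from collections import Counter


def gc_percent(seq):
    """Return the GC percentage of one read, ignoring N and other ambiguous bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


def fastq_sequences(lines):
    """Yield the sequence line (second line) of each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def gc_histogram(seqs, bin_width=5):
    """Count reads per GC bin; each key is the lower edge of a bin, in percent."""
    hist = Counter()
    for seq in seqs:
        lower_edge = int(gc_percent(seq) // bin_width) * bin_width
        # Put reads with exactly 100% GC into the last bin.
        hist[min(lower_edge, 100 - bin_width)] += 1
    return hist


def local_peaks(hist, bin_width=5, min_fraction=0.05):
    """Return bins that exceed both neighbours and hold at least min_fraction of all reads."""
    total = sum(hist.values())
    peaks = []
    for b in range(0, 100, bin_width):
        n = hist.get(b, 0)
        if n == 0 or n < min_fraction * total:
            continue
        if n > hist.get(b - bin_width, 0) and n > hist.get(b + bin_width, 0):
            peaks.append(b)
    return peaks


if __name__ == "__main__":
    # Toy input: two artificial "pools" of reads, one at 30% GC and one at 70% GC.
    reads = ["GGGAAAAAAA"] * 10 + ["GGGGGGGAAA"] * 10
    print(local_peaks(gc_histogram(reads)))
```

For a real file, the reads could come from `fastq_sequences(open(path))`, where `path` is a placeholder for an uncompressed fastq file. If two well-separated bins are reported, that is consistent with a mixed sample, but it does not say what the second pool is.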