Hello,
I would like to produce a scatter plot showing the correlation between ChIP-seq data (peaks, i.e. TF-binding regions, falling in promoter regions) and RNA-seq data (differentially expressed genes/DEGs from comparing the deletion of the same TF as in the ChIP-seq versus WT). How can I do that in Galaxy?
- What kind of files should I use as input for my ChIP-seq data, and what for my RNA-seq data, in order to produce such a scatter plot?
- Which tool(s) should I use in Galaxy to produce such a plot, once I have the corresponding inputs?
FYI: I have done the whole analysis in Galaxy so far. I used MACS2 for the peak calling for the ChIP-seq, and DESeq2 for identifying the DEGs for my RNA-seq.
I would highly appreciate any help,
Best wishes,
P.S.: I have already searched for a relevant tutorial but I haven’t found any… if by any chance you are aware of such a tutorial, please let me know about it.
Hello @Manolis1
Create BED files from both datasets (with coordinates based on the same reference genome), then
- click on the little graph icon within one of those datasets to reach the data graphing options, then search with “plot”
- or search the tool panel with “scatterplot” to find more choices
You will already have a mostly BED formatted file from MACS2 peaks output. So, that probably won’t need any more manipulations.
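For reference, here is a minimal sketch of what "mostly BED formatted" means for a MACS2 narrowPeak file: it follows the BED6+4 layout, so the first six columns are standard BED and the last four carry the signal value, p-value, q-value and summit offset. The function names and the example line below are made up for illustration; inside Galaxy you would do the equivalent with a column-cutting tool rather than a script.

```python
# Column layout of a narrowPeak file (BED6+4).
NARROWPEAK_COLUMNS = [
    "chrom", "start", "end", "name", "score", "strand",
    "signal_value", "p_value", "q_value", "summit_offset",
]

def parse_narrowpeak_line(line):
    """Turn one tab-separated narrowPeak line into a dict with typed fields."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(NARROWPEAK_COLUMNS, fields))
    for key in ("start", "end", "score", "summit_offset"):
        record[key] = int(record[key])
    for key in ("signal_value", "p_value", "q_value"):
        record[key] = float(record[key])
    return record

def to_bed6(record):
    """Keep only the six standard BED columns."""
    return [record[column] for column in NARROWPEAK_COLUMNS[:6]]

# Made-up example line, not real data.
example = "chr1\t1000\t1500\tpeak_1\t250\t.\t12.5\t30.1\t27.4\t220"
peak = parse_narrowpeak_line(example)
bed6 = to_bed6(peak)
```

The signal value (column 7) is one candidate for the ChIP-seq axis of the scatter plot, so it is worth keeping that column alongside the coordinates.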
For the genes that are differentially expressed, you can pull out the lines from your reference annotation associated with those genes to get the coordinates. The tool gffread is one way to convert GTF data to BED format. Maybe convert first, then join on the gene identifiers in common between your files. Or you can use a filtering tool.
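To make the join concrete, here is a rough sketch of the logic in plain Python: keep only the genes that appear in the DEG table, derive a promoter window around each gene's start, find the peaks that overlap it, and emit one row per gene with a ChIP-seq value and a log2 fold change. Everything here is an assumption for illustration: the data are made up, the promoter window (1000 bp upstream, 100 bp downstream of the TSS) is just one common choice, and taking the strongest overlapping peak per gene is only one way to summarize. In Galaxy the same steps would be done with join, filter and interval-intersection tools.

```python
# All values below are made-up illustrations, not real data.
# Gene coordinates as BED6-like rows (e.g. after converting GTF to BED).
genes_bed = [
    ("chr1", 5000, 9000, "geneA", 0, "+"),
    ("chr1", 20000, 26000, "geneB", 0, "-"),
    ("chr2", 300, 4000, "geneC", 0, "+"),
]
# DESeq2-style results keyed by gene identifier: log2 fold change.
deg_log2fc = {"geneA": 2.1, "geneB": -1.4}
# Peaks as (chrom, start, end, signal value).
peaks = [
    ("chr1", 4600, 4900, 12.5),
    ("chr1", 26100, 26400, 8.0),
    ("chr2", 9000, 9300, 5.0),
]

# Assumed promoter definition; adjust to your own.
PROMOTER_UPSTREAM = 1000
PROMOTER_DOWNSTREAM = 100

def promoter_interval(start, end, strand):
    """Strand-aware promoter window around the TSS (0-based, half-open)."""
    if strand == "+":
        tss = start
        return max(0, tss - PROMOTER_UPSTREAM), tss + PROMOTER_DOWNSTREAM
    tss = end
    return max(0, tss - PROMOTER_DOWNSTREAM), tss + PROMOTER_UPSTREAM

def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end

rows = []
for chrom, start, end, gene_id, _score, strand in genes_bed:
    if gene_id not in deg_log2fc:
        continue  # join step: keep only differentially expressed genes
    p_start, p_end = promoter_interval(start, end, strand)
    signals = [
        signal
        for peak_chrom, peak_start, peak_end, signal in peaks
        if peak_chrom == chrom and overlaps(peak_start, peak_end, p_start, p_end)
    ]
    if signals:
        # Summarize by the strongest peak in the promoter (an assumption).
        rows.append((gene_id, max(signals), deg_log2fc[gene_id]))

# Tab-separated table: one point per gene for the scatter plot.
table = "\n".join(
    ["gene\tpeak_signal\tlog2FC"] + [f"{g}\t{s}\t{l}" for g, s, l in rows]
)
```

A two-column numeric table like this (peak signal versus log2 fold change) is the kind of input the scatter plot tools expect, with one column chosen for each axis.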
These and more data manipulations are explored here → Hands-on: Data Manipulation Olympics / Data Manipulation Olympics / Foundations of Data Science
Hope this helps!
Hi @Jen,
Many thx for your response!
Wishes for a great weekend,
Cheers,