DiffBind_ChIP-seq_Error

akiko · February 25, 2022, 5:39am

Hi,
I have ChIP-seq data from 4 samples (2 groups with 2 replicates each). I ran SICER for peak calling and then tried to run DiffBind, but I always get the same error: [1] “en_US.UTF-8”.
By the way, I tried the first condition with the name “Condition”, but got the same error.

Could you please help me with this?
Thanks

gallardoalba · February 25, 2022, 9:08am

Hi @akiko,
could you share your history with me? I’ll have a look at it.

Regards!

akiko · February 25, 2022, 11:17pm

Error Localization

Dataset	77495831 (bbd44e69cb8906b5b0c1f33cca20c0d9)
History	5490068 (259ea8208f94a88d)
Failed Job	694: DiffBind on data 198, data 97, and others: Differentially bound sites (bbd44e69cb8906b5c7f7671d99f250a8)

akiko · March 1, 2022, 5:19am

Hi Cristóbal,
did you find out anything about the cause of the error after that?

gallardoalba · March 1, 2022, 9:56am

Hi @akiko,
sorry, I cannot access your history. Could you follow these instructions: Sharing your History.

Regards.

akiko · March 1, 2022, 11:29am

Hi Cristóbal,
here is the URL for my history.
https://usegalaxy.org/u/akikosaito/h/cutrun

Thank you very much.

jennaj · March 2, 2022, 2:10am

Hi @akiko

The output from SICER is in a non-standardized bed format, so it is assigned the interval datatype. Even if the coordinates are adjusted to be “0-based, fully-closed start” (an option on the tool form, and you used that), the change applied only makes the first three columns of the data match strict bed format.

DiffBind as wrapped for Galaxy expects strict bed input for peaks. If a similar tabular datatype is supplied, Galaxy will try to convert it to bed format at runtime. For SICER peak outputs that conversion is not enough: it drops required information, which leads to the particular error you ran into.

If you want to use the SICER output with DiffBind, you’ll need to transform all peak input data to bed6 formatted datasets. Use tools in the “Data Manipulation” tool group. This will involve rearranging and removing existing data columns, plus filling in some new column data.

Once the 1) peak inputs are adjusted to have the content below, 2) the tool form is changed to designate the 5th column as the “score” value, and 3) the datasets are assigned to the bed datatype, the current tool error will be avoided.

The bed6 contents:

columns 1-3 should be the original data (chrom, start, stop)
column 4 can be a default but not empty value (name, see the first FAQ below for accepted values)
column 5 should be the peak calling statistic you want to be used (score)
column 6 should be filled in with + for all lines for your use case (strand, set to forward)
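
For anyone who prefers to do the rearrangement outside Galaxy and upload the result, the column layout above can be sketched in Python roughly as follows. This is only an illustration, not the Galaxy tool's method: the function name `to_bed6`, the `peak_N` naming scheme, and the assumption that the statistic sits in the 4th column of the input (0-based index 3) are all made up for the example, so check the real column order of your SICER file first. Coordinates and scores are copied through unchanged.

```python
import csv
import io


def to_bed6(rows, score_col=3, name_prefix="peak"):
    """Rearrange tab-split peak rows into the six bed6 columns.

    rows       : iterable of lists (one list of strings per input line)
    score_col  : 0-based index of the column holding the statistic to use
                 as the score (ASSUMPTION: 3; adjust to your file)
    name_prefix: hypothetical prefix for the generated, non-empty names
    """
    bed6 = []
    for row in rows:
        # Skip blank lines and comment/header lines.
        if not row or row[0].startswith("#"):
            continue
        chrom, start, end = row[0], row[1], row[2]
        name = f"{name_prefix}_{len(bed6) + 1}"  # column 4: default name
        score = row[score_col]                   # column 5: chosen statistic
        bed6.append([chrom, start, end, name, score, "+"])  # column 6: strand
    return bed6


# Made-up example input: chrom, start, end, score.
example = "chr1\t1000\t1800\t25.3\nchr2\t500\t1300\t40.1\n"
rows = csv.reader(io.StringIO(example), delimiter="\t")
bed6 = to_bed6(rows)
for record in bed6:
    print("\t".join(record))
```

After uploading a file produced this way, the dataset would still need to be assigned the bed datatype and the 5th column selected as the score on the tool form, as described above.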

FAQs:

Hope that helps!

akiko · March 2, 2022, 5:40am

Thanks for the reply.

One of the output files from SICER, “Test-W200-G600-FDR0.01-islandfiltered.bed”, was a strict bed6 file. So, I ran DiffBind using this file and the job was completed.

However, the number of regions in the result was unusually small, and when I checked the details, I found an error message in “Tool standard error” of “Job information”.
What should I do?

jennaj · March 2, 2022, 8:07pm

Hi @akiko

That output contains the mapped reads that contributed to called peaks, not the actual peaks.

akiko · March 3, 2022, 1:51pm

Hi Jennifer,
I uploaded the SICER (test-W200-G600.scoreisland) file to History after modifying it to bed6. I ran DiffBind using this file. I got an error message when using the sample named KO d14 (see dataset numbers 860 and 863).
