I have ChIP-seq data from 4 samples (2 groups with 2 replicates each). I ran SICER peak calling and then tried to run DiffBind, but I always get the same error. “en_US.UTF-8”.
By the way, I tried the first condition with the name “Condition”, but got the same error.
Can you please help me with this?
Could you share your history with me? I’ll have a look at it.
Did you find out anything about the cause of the error after that?
Sorry, I cannot access your history. Could you follow these instructions: Sharing your History.
Here is the URL for my history.
Thank you very much.
The output from SICER is in a non-standardized bed format, so it is assigned the interval datatype. Even if the coordinates are adjusted to be “0-based, fully-closed start” (an option on the tool form, and you used that), the change applied only makes the first three columns of the data match strict bed format.
DiffBind as wrapped for Galaxy expects a strict bed input for peaks. If a similar tabular datatype is input, Galaxy will try to convert it to bed format at runtime. For SICER peak outputs, that is not enough: the conversion loses required information, which leads to the particular error you ran into.
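To check outside Galaxy whether a peak file already has a six-column layout before using it, something like the rough sketch below may help. The function name `looks_like_bed6` and the exact rules it applies (tab-separated, integer start smaller than end, non-empty name, numeric score, strand one of `+`, `-`, `.`) are assumptions for illustration, not the validation that Galaxy or DiffBind performs.

```python
def looks_like_bed6(line: str) -> bool:
    """Return True if a tab-separated line has a plausible bed6 layout.

    Hypothetical helper: the rules below are illustrative assumptions,
    not the checks Galaxy or DiffBind actually run.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        return False
    chrom, start, end, name, score, strand = fields
    # chrom and name must not be empty
    if not chrom or not name:
        return False
    # start and end must be non-negative integers with start < end
    if not (start.isdigit() and end.isdigit()):
        return False
    if int(start) >= int(end):
        return False
    # score must parse as a number
    try:
        float(score)
    except ValueError:
        return False
    return strand in {"+", "-", "."}
```

A four-column line such as `chr1<TAB>100<TAB>500<TAB>42.5` fails this check because the name and strand columns are missing.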
If you want to use the SICER output with DiffBind, you’ll need to transform all peak input data to bed6 formatted datasets. Use tools in the “Data Manipulation” tool group. This will involve rearranging and removing existing data columns, plus filling in some new column data.
Once 1) the peak inputs are adjusted to have the content below, 2) the tool form is changed to designate the 5th column as the “score” value, and 3) the datasets are assigned to the bed datatype, the current tool error will be avoided.
- columns 1-3 should be the original data (chrom, start, stop)
- column 4 can be a default but not empty value (name, see the first FAQ below for accepted values)
- column 5 should be the peak calling statistic you want to be used (score)
- column 6 should be filled in with “+” for all lines for your use case (strand, set to forward)
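If you would rather prepare the files outside Galaxy and upload the result, the column layout above can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `to_bed6` and `convert` are made up, the `peak_N` values in column 4 are placeholders (check the FAQ for accepted name values), and `score_col=3` assumes the statistic you want sits in the 4th column of your SICER output, so adjust it to match your file.

```python
import csv
import io


def to_bed6(rows, score_col=3, name_prefix="peak_"):
    """Yield bed6 rows from rows of strings.

    score_col is the 0-based index of the statistic to use as the score
    (assumed to be the 4th column here; change it for your file).
    """
    for i, row in enumerate(rows, start=1):
        chrom, start, end = row[0], row[1], row[2]
        # column 4: placeholder name, column 5: score, column 6: forward strand
        yield [chrom, start, end, f"{name_prefix}{i}", row[score_col], "+"]


def convert(in_handle, out_handle, score_col=3):
    """Read tab-separated peaks and write them as tab-separated bed6."""
    reader = csv.reader(in_handle, delimiter="\t")
    writer = csv.writer(out_handle, delimiter="\t", lineterminator="\n")
    # skip blank lines and comment lines
    rows = (r for r in reader if r and not r[0].startswith("#"))
    writer.writerows(to_bed6(rows, score_col))


if __name__ == "__main__":
    # Tiny in-memory example with made-up coordinates and scores
    src = io.StringIO("chr1\t100\t700\t35.2\nchr2\t0\t200\t12.0\n")
    out = io.StringIO()
    convert(src, out)
    print(out.getvalue(), end="")
```

After uploading a file produced this way, the dataset would still need to be assigned the bed datatype and the 5th column selected as the score on the tool form, as described above.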
Hope that helps!
Thanks for the reply.
One of the output files from SICER, “Test-W200-G600-FDR0.01-islandfiltered.bed”, was a strict bed6 file. So, I ran DiffBind using this file and the job was completed.
However, the number of regions in the result was unusually small, and when I checked the details, I found an error message in “Tool standard error” of “Job information”.
What should I do?
That output contains the mapped reads that contributed to called peaks, not the actual peaks.
I modified the SICER file (test-W200-G600.scoreisland) to bed6, uploaded it to my history, and ran DiffBind with it. I got an error message when using the sample named KO d14 (see dataset numbers 860 and 863).