Diffbind: failed to create symbolic link; File exists

Hi, I am having issues with the input files for a DiffBind analysis. I am doing ChIP-seq analysis and want to see whether binding changes between conditions.

As I understand it, the issue is with the naming of the BAM files. However, the files that I am putting into this workflow are created by the previous one, and it would be hard to start everything anew, especially since I've got more samples and conditions to test. Is there any other way I can change the names of the files/samples? Or maybe there is another way around this issue?

The BAMs that I am putting into the pipeline are already in a collection of lists.

Invocation: Galaxy
Workflow: Galaxy

Hello @Orange_Pomeranian

We already discussed the collection ideas in a direct message, but I just took another look at your shared invocation and think I see the problem!

Using collections for the upstream steps is great! And yes, you can process all samples together for all the common steps.

Suggested change: instead of then using the datasets from the collection as individual files, which loses the element identifiers (sample labels), consider using Filter collection to create two collection groups, one per condition. This will allow DiffBind to see the common label between the peak and BAM collections. Which samples to sort into the different groups could be specified in a tabular file input at the start of the workflow.
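To make the tabular file idea a bit more concrete, here is a minimal Python sketch of the grouping logic only. It is not what Galaxy or DiffBind run internally. The two-column layout (element identifier, then condition) and the sample names are assumptions made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample sheet: element identifier <TAB> condition.
# The names below are invented; use your own collection element identifiers.
SAMPLE_SHEET = (
    "sample_A1\tcontrol\n"
    "sample_A2\tcontrol\n"
    "sample_B1\ttreated\n"
    "sample_B2\ttreated\n"
)


def group_by_condition(sheet_text):
    """Return {condition: [identifiers]} from a two-column tabular text."""
    groups = defaultdict(list)
    for row in csv.reader(io.StringIO(sheet_text), delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        identifier, condition = row[0].strip(), row[1].strip()
        groups[condition].append(identifier)
    return dict(groups)


groups = group_by_condition(SAMPLE_SHEET)
for condition, identifiers in groups.items():
    print(condition, identifiers)
```

Each list of identifiers corresponds to what one filtered collection would contain, so the peak and BAM collections for a condition end up sharing the same labels.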

However, please keep in mind that this Bioconductor tool and others like it (relative statistics) require replicates within sample groupings. This means you will need a minimum of two samples per group. Once the data frame issues are resolved, the underlying tool will start to report a different sort of message about the design formula. These messages can be searched, but maybe this link helps explain why this is, or at least gets you started → Bioconductor Forum (query=diffbind+replicates).
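If it helps, the replicate requirement can be checked before launching the workflow. This is a rough Python sketch, assuming you have a simple mapping of sample label to condition; the function name and the sample names are hypothetical and not part of DiffBind or Galaxy.

```python
from collections import Counter


def groups_lacking_replicates(sample_conditions, minimum=2):
    """Return the conditions that have fewer than `minimum` samples."""
    counts = Counter(sample_conditions.values())
    return sorted(cond for cond, n in counts.items() if n < minimum)


# Hypothetical design: "treated" has only one sample, so it gets flagged.
design = {
    "sample_A1": "control",
    "sample_A2": "control",
    "sample_B1": "treated",
}

missing = groups_lacking_replicates(design)
if missing:
    print("Groups needing more replicates:", ", ".join(missing))
```

An empty result means every group has at least two samples, which is the minimum mentioned above.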

Please let us know how this works out! :slight_smile: