Welcome, @Sarah_Perkins
I see your bug report sent in from Galaxy Main https://usegalaxy.org.
The problem is that your transcript fasta dataset identifiers do not match up with the transcript-to-gene mapping dataset.
The fasta (data 14) and TMP (data 356 + 357 + 358 + 362) transcript identifiers are formatted as:
TR100009|c0_g1_i1|m.500685
The tabular transcript-tab-gene dataset (data 347) has two problems: 1) the transcript and gene identifiers are truncated and do not match the format used in the TMP inputs, and 2) the column order looks as if it might be reversed (gene-tab-transcript). As long as the genes are named consistently, it doesn’t matter what they are, but the transcript names do need to match the other inputs.
TR1|c0_g1 TR1|c0_g1_i1
In short, primary IDs need to be in the exact same format across all inputs, or tools cannot match the data up. This is true for any tool, not just DESeq2.
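If it helps to check this outside of Galaxy before re-running, here is a minimal Python sketch that lists transcript IDs found in a fasta file but missing from the mapping file. The function names are made up for illustration, and it assumes the mapping file is tab-separated and that the fasta ID is the text after `>` up to the first whitespace; `transcript_column` is whichever column (0-based) holds the transcript names in your file.

```python
import io


def fasta_ids(handle):
    """Collect identifiers from FASTA header lines ('>' up to the first whitespace)."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def mapping_transcript_ids(handle, transcript_column=0):
    """Collect transcript identifiers from one column of a tab-separated mapping file."""
    ids = set()
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) > transcript_column:
            ids.add(fields[transcript_column])
    return ids


def unmatched(fasta_handle, mapping_handle, transcript_column=0):
    """Return the sorted FASTA identifiers that have no exact match in the mapping file."""
    return sorted(
        fasta_ids(fasta_handle)
        - mapping_transcript_ids(mapping_handle, transcript_column)
    )


# Small in-memory example using the identifier formats quoted above.
fasta = io.StringIO(">TR100009|c0_g1_i1|m.500685\nACGT\n")
mapping = io.StringIO("TR1|c0_g1\tTR1|c0_g1_i1\n")
print(unmatched(fasta, mapping, transcript_column=1))
```

An empty list would mean every fasta identifier has an exact match in the chosen column. With real files, pass `open("transcripts.fasta")` and `open("mapping.tabular")` (placeholder file names) instead of the `io.StringIO` objects.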
The gff3 dataset is missing a header line: ##gff-version 3. That is why it was given the more generic gff datatype during Upload, and why the tool form does not recognize it as a valid input.
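One way to add the missing header locally is sketched below in Python. The function name is hypothetical, and the sketch assumes the rest of the file is already valid GFF3 and only the first line is missing; it leaves the text untouched if a `##gff-version` line is already at the top.

```python
def ensure_gff3_header(text):
    """Prepend '##gff-version 3' unless the first line already declares a GFF version."""
    header = "##gff-version 3"
    first_line = text.split("\n", 1)[0].strip()
    if first_line.startswith("##gff-version"):
        return text  # a version line is already present; do not add a second one
    return header + "\n" + text


# In-memory example with a single made-up feature line.
example = "chr1\t.\tgene\t1\t10\t.\t+\t.\tID=g1\n"
print(ensure_gff3_header(example))
```

The corrected text would then be saved and uploaded again so that the datatype can be assigned as gff3.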
Also, replicates are required with DESeq2 in Galaxy.
FAQ: https://galaxyproject.org/support/
Hope that helps!