EdgeR doesn't run with count matrix in tabular format (obtained from MirDeep2 Quantifier)

Hello everyone,

This is my first time using Galaxy. I am currently trying to perform Differential Expression Analysis with small RNA-Seq data and I am having issues running EdgeR.
I have followed this pipeline:


Align with MirDeep2 Mapper
Quantify reads with MirDeep2 Quantifier
Cut columns belonging to miRNA ID and read counts and generate a new file.
Perform Differential Expression Analysis with DESeq2 with this new file.

After quantifying reads with MirDeep2 Quantifier, I obtained the following count table:


After that, I cut columns 1 and 2 with the 'Cut columns from a table' tool, and used this new table to perform DEA with DESeq2. After cutting, the table looked like this:

However, when I tried to run these cut tables with EdgeR, it reported this error:
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
duplicate ‘row.names’ are not allowed

I understand this is because more than one microRNA can have the same ID and come from different precursors. However, I don't know the best way to solve this problem. I have read other comments on this issue but haven't found a solution so far.

Thank you very much in advance for your help!

Alejandra

Hi @alepando018

In general: duplicated IDs in count files are rejected by all of the differential expression Bioconductor tools.
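To confirm that duplicated IDs are what triggers the error, you could list the values that occur more than once in the first column of the cut table. This is only a minimal sketch in Python, assuming a tab-separated file with a header line and the ID in the first column; the function name `duplicated_ids` and the example rows are made up for illustration.

```python
import csv
import io
from collections import Counter

def duplicated_ids(handle, delimiter="\t", has_header=True):
    """Return the first-column IDs that appear more than once, sorted."""
    reader = csv.reader(handle, delimiter=delimiter)
    if has_header:
        next(reader, None)  # skip the header line
    counts = Counter(row[0] for row in reader if row)
    return sorted(id_ for id_, n in counts.items() if n > 1)

# Made-up rows standing in for a cut count table (hypothetical IDs).
example = io.StringIO(
    "#miRNA\tread_count\n"
    "miR-A\t10\n"
    "miR-B\t5\n"
    "miR-A\t10\n"
)
print(duplicated_ids(example))
```

With a real file you would pass `open("your_table.tabular", newline="")` instead of the `io.StringIO` example (the file name here is a placeholder).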

What do the tutorial authors recommend for handling the non-unique IDs? The graphic only shows steps up through generating the count file. Is there a way to reach them for advice? Would merging the ID and precursor labels into a single value cause those IDs to stop matching the merged ID+precursor values in your other files? If not, that may be your solution. What you care about is having unique identifiers within each file, and a common set of identifiers across the files (with counts for that sample).
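The idea of merging ID and precursor into one value could be sketched like this in Python. Several things here are assumptions rather than facts about your data: that the table is tab-separated with a header line, that the miRNA ID, read count, and precursor sit in the first, second, and third columns (check your own Quantifier output), and that `_` is an acceptable joiner. The function name `merge_id_and_precursor` is hypothetical.

```python
import csv
import io

def merge_id_and_precursor(handle, out, id_col=0, count_col=1,
                           precursor_col=2, joiner="_"):
    """Write a two-column table whose ID is '<miRNA ID><joiner><precursor>'.

    Column positions are assumptions; adjust them to match your file.
    Raises ValueError if the merged IDs are still not unique.
    """
    reader = csv.reader(handle, delimiter="\t")
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    header = next(reader)
    writer.writerow(["id", header[count_col]])
    seen = set()
    for row in reader:
        if not row:
            continue  # ignore blank lines
        new_id = row[id_col] + joiner + row[precursor_col]
        if new_id in seen:
            raise ValueError("ID still duplicated after merging: " + new_id)
        seen.add(new_id)
        writer.writerow([new_id, row[count_col]])

# Made-up rows: one mature miRNA reported for two precursors.
src = io.StringIO(
    "#miRNA\tread_count\tprecursor\n"
    "miR-A\t10\tpre-1\n"
    "miR-A\t10\tpre-2\n"
    "miR-B\t5\tpre-3\n"
)
dst = io.StringIO()
merge_id_and_precursor(src, dst)
print(dst.getvalue())
```

Use the same column positions and joiner for every sample file, so the merged identifiers line up across files.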

This is the Galaxy GTN version of a similar tutorial. More are in that topic. Maybe worth reviewing?