Hi, I’m not sure where the problem is, but I can’t seem to run differential expression analysis for miRNA-seq. I have a list of miRNAs with read counts for all samples in CSV format. Once uploaded, I opted for RNA-seq (Tool) and set up the factor level. However, this error kept popping up:
The part of the message with duplicate row.names is the important part. This can result from actual duplicated sample names, but also from file formatting issues.
Searching this forum for the error message yields these hits:
Most of those refer back to file formats. Remember that these are R tools, so any extra whitespace in identifiers or headers can cause problems. Identifiers are best interpreted when they “only include alphanumeric characters and optionally underscores, and not starting with a number”. Galaxy will clean this up a bit for you, but it can’t do so perfectly, so it is best to start off with “clean” data, especially if the tool reports an error about format.
Right now, the first things that stick out to me in your screenshot are the space in the first column header and the use of dashes in your identifier names. I would remove the space and swap the dashes for underscores. Then double-check for unique sample names in the other header columns, and try the run again.
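If it helps to do that cleanup outside of Galaxy, here is a minimal Python sketch of the idea. It assumes a comma-separated count table with identifiers in the first column and sample names in the header row; the function names (`clean_identifier`, `clean_count_table`) and the example data are made up for illustration and are not part of any Galaxy tool. Note that it replaces spaces and dashes with underscores rather than deleting them, and it checks for duplicates after cleaning, since cleaning itself can make two names collide.

```python
import csv
import io
import re
from collections import Counter


def clean_identifier(name):
    """Trim whitespace and replace anything that is not a letter, digit, or underscore."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name.strip())
    # Assumption: prefix with "X" if the identifier would start with a digit.
    if cleaned and cleaned[0].isdigit():
        cleaned = "X" + cleaned
    return cleaned


def clean_count_table(text):
    """Return (cleaned_rows, duplicate_row_ids, duplicate_sample_names)."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    # Clean every header cell (first column label plus sample names).
    header = [clean_identifier(cell) for cell in rows[0]]
    # Clean only the identifier in the first column; leave counts as they are.
    body = [[clean_identifier(row[0])] + [cell.strip() for cell in row[1:]]
            for row in rows[1:]]
    # Report identifiers and sample names that occur more than once after cleaning.
    dup_rows = sorted(k for k, n in Counter(r[0] for r in body).items() if n > 1)
    dup_samples = sorted(k for k, n in Counter(header[1:]).items() if n > 1)
    return [header] + body, dup_rows, dup_samples


# Hypothetical example data, not taken from the original post.
example = (
    "miRNA ID,ctrl-1,ctrl-2,treat-1\n"
    "hsa-miR-21-5p,10,12,30\n"
    "hsa-miR-21-5p,3,4,5\n"
    "hsa-let-7a-5p,7,8,9\n"
)
table, dup_rows, dup_samples = clean_count_table(example)
print(table[0])      # cleaned header
print(dup_rows)      # row identifiers that appear more than once
print(dup_samples)   # sample names that appear more than once
```

Any identifier reported in `dup_rows` or `dup_samples` is a likely candidate for the duplicate row.names message and would need to be renamed or merged before rerunning the tool.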
This message can show up for other reasons, but this is where I would start. Remember to make the same changes in all files as needed, since these tools are “matching up” common identifiers across files.
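A quick way to confirm that identifiers still match after editing is to compare the two lists directly. This is only a sketch: it assumes you have already pulled the sample names out of the count table header and out of whatever second file your setup uses (for example a factor or sample information file), and `compare_identifiers` is a hypothetical helper, not a Galaxy function.

```python
def compare_identifiers(count_samples, factor_samples):
    """Return the names found only in the count table and only in the second file."""
    counts = set(count_samples)
    factors = set(factor_samples)
    return sorted(counts - factors), sorted(factors - counts)


# Hypothetical example: one file was cleaned, the other still uses a dash.
only_in_counts, only_in_factors = compare_identifiers(
    ["ctrl_1", "ctrl_2", "treat_1"],
    ["ctrl_1", "ctrl-2", "treat_1"],
)
print(only_in_counts)   # names missing from the second file
print(only_in_factors)  # names missing from the count table
```

If both lists come back empty, the identifiers agree across the two files.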
If you get stuck, you are welcome to share back your history for more feedback! Without clean files, other possible causes are hard to diagnose from what you have shared so far.