GOseq Error--Duplicate Row names?

Hi @battagliad1

This error can also result from labels or data structures that the tool cannot interpret. It is an R tool that is very picky about format. The same error would result from running it on the command line with the same inputs/parameters.

This is a good post about how to check the data, since it goes through a few tools, including this one, with common format troubleshooting: EdgeR row names error

Short list of things to check:

  1. No extra headers in either file. Remove if needed.
  2. Do the geneIDs have a .N (where N is a version number) attached at the end? Try removing the .version and confirm that all IDs are unique.
  3. Are the geneIDs in the same order between the two inputs?
  4. Some people have had success with filtering down both files to contain the same genes (only).
  • Technically, the only requirement is that each gene in the gene true/false input appears exactly once in the gene-lengths file. The gene-lengths file is allowed to contain extra genes. But filtering both files down to the same genes can sometimes expose problems.
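
If you would rather run these checks outside of Galaxy first, here is a rough Python sketch of the list above. It is not part of GOseq or Galaxy. It assumes both inputs are tab-separated with the geneID in the first column and no header line. The function names and the example IDs are made up for illustration.

```python
import re
from collections import Counter


def strip_version(gene_id):
    """Drop a trailing .N version suffix (e.g. GENE1.12 -> GENE1)."""
    return re.sub(r"\.\d+$", "", gene_id.strip())


def read_ids(lines):
    """Take the first tab-separated column of each non-empty line, version removed."""
    return [strip_version(line.split("\t")[0]) for line in lines if line.strip()]


def duplicates(ids):
    """Return the sorted IDs that occur more than once."""
    return sorted(g for g, n in Counter(ids).items() if n > 1)


def check_inputs(de_lines, length_lines):
    """Compare the gene true/false input against the gene-lengths input."""
    de_ids = read_ids(de_lines)
    len_ids = read_ids(length_lines)
    return {
        # IDs that collide once the version suffix is removed
        "duplicate_de_ids": duplicates(de_ids),
        "duplicate_length_ids": duplicates(len_ids),
        # genes in the true/false input with no entry in the lengths input
        "missing_from_lengths": sorted(set(de_ids) - set(len_ids)),
        # True only when both inputs list the same IDs in the same order
        "same_order": de_ids == len_ids,
    }


# Made-up example data; swap in open("your_file.tabular") to check real files.
de = ["GENE1.1\tTRUE", "GENE2.3\tFALSE", "GENE3.1\tTRUE"]
lengths = ["GENE1.2\t1200", "GENE2.3\t800", "GENE2.4\t805", "GENE4.1\t500"]

for check, result in check_inputs(de, lengths).items():
    print(check, result)
```

Any non-empty duplicate or missing list is worth fixing before rerunning the tool. The same-order check is stricter than what is technically required, so treat it as a hint and not as a hard failure.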

See Data Manipulation Olympics or that other topic above for how to do the data manipulations/comparisons.

Let’s start there, thanks! :slight_smile: