Error running limma "non-unique row names"

Hi - I was trying to re-run limma on multiple count files from different time points (3 replicates for each condition, 2 conditions for each time point), and I received the row.names error:

        Warning message:
        In Sys.setlocale("LC_MESSAGES", "en_US.UTF-8") :
          OS reports request to set locale to "en_US.UTF-8" cannot be honored
        Error in `.rowNamesDF<-`(x, value = value) :
          duplicate 'row.names' are not allowed
        Calls: rownames<- ... row.names<- -> row.names<-.data.frame -> .rowNamesDF<-
        Warning message:
        non-unique values when setting 'row.names':

I read through all the posts here regarding row.names error and did the following:

  • I double-checked the files and made sure they didn’t contain version numbers “.N” (which they shouldn’t; these files were already processed and went through one round of limma without problems)
  • I double-checked my file inputs and made sure no two files had the same name
  • I added numbers to my groups in each factor (ZnD to ZnD1, ZnR to ZnR3, etc.)

None of these worked. The files from each time point all worked fine in limma on their own earlier, so I’m not sure what the problem is here. Unlike in other posts, the error message didn’t give me the offending row name, either. There was a “Detected common problems” note in the bug report:

The tool was started with one or more duplicate input datasets. This frequently results in tool errors due to problematic input choices.

But I can’t find the duplicate input. I checked every file and they don’t even have the same headers. Here’s the history. Any help would be appreciated.

Thanks!

Hi @billy.l,

I don’t see any issue with the count files or the annotation. The error might be caused by the job setup, and the prime suspect is the use of multiple factors. As I understand it, every factor should contain all samples. For example, someone investigating the effect of a treatment might also want to include sex as a factor. I believe the job setup would look something like this:
factor1: drug, group1: control, samples: fc1, fc2, mc1, mc2
factor1: drug, group2: treatment, samples: ft1, ft2, mt1, mt2
factor2: sex, group1: females, samples: fc1, fc2, ft1, ft2
factor2: sex, group2: males, samples: mc1, mc2, mt1, mt2

In the sample names, m and f stand for male and female, and c and t for control and treatment. The experiment has eight samples in total, and each sample appears exactly once under each factor.
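A layout like this can be checked mechanically before submitting the job. The following is a minimal Python sketch, separate from limma and Galaxy; the `layout` dictionary and the `check_layout` helper are hypothetical names, and the rule it tests (each factor assigns every sample to exactly one group) is my assumption about what the tool expects, not something taken from its documentation.

```python
from collections import Counter

# Factor -> group -> samples, copied from the drug/sex example.
layout = {
    "drug": {
        "control": ["fc1", "fc2", "mc1", "mc2"],
        "treatment": ["ft1", "ft2", "mt1", "mt2"],
    },
    "sex": {
        "females": ["fc1", "fc2", "ft1", "ft2"],
        "males": ["mc1", "mc2", "mt1", "mt2"],
    },
}


def check_layout(layout):
    """Return a list of problems; an empty list means the layout is consistent."""
    problems = []
    # Every sample mentioned anywhere in the layout.
    all_samples = {
        s for groups in layout.values() for samples in groups.values() for s in samples
    }
    for factor, groups in layout.items():
        # How many times each sample is listed under this factor.
        counts = Counter(s for samples in groups.values() for s in samples)
        for sample in sorted(all_samples):
            n = counts.get(sample, 0)
            if n == 0:
                problems.append(f"{factor}: {sample} is not in any group")
            elif n > 1:
                problems.append(f"{factor}: {sample} appears {n} times")
    return problems


problems = check_layout(layout)
print(problems if problems else "layout is consistent")
```

A sample that is listed twice under one factor, or missing from a factor altogether, shows up in the returned list with the factor and sample name.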

Maybe try two factors: time, with multiple groups/levels (h2, h4, etc.), and genotype (or whatever the second variable is), with three levels: ZnR, ZnE, ZnD.
Maybe try it on a small subset first.

Hope that helps.

Kind regards,
Igor

PS I don’t know the current situation, but in the past R was very sensitive to text strings. In many situations a text string cannot start with a digit, hence h2 instead of 2h. I also recommend avoiding space characters in names.
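To illustrate the naming advice, here is a small Python sketch that flags and repairs risky group names. The rule it applies (start with a letter, then only letters, digits, and underscores) is a conservative assumption of mine rather than the exact rule R or the Galaxy tool uses, and `is_safe` and `make_safe` are hypothetical helper names.

```python
import re

# Assumed rule for this sketch: a safe name starts with a letter and
# contains only letters, digits, and underscores.
SAFE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_safe(name):
    """True if the whole name matches the assumed rule."""
    return SAFE.fullmatch(name) is not None


def make_safe(name, prefix="X"):
    """Replace unsafe characters with '_' and prefix names that do not start with a letter."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name.strip())
    if not cleaned or not cleaned[0].isalpha():
        cleaned = prefix + cleaned
    return cleaned


for name in ["h2", "2h", "Zn D", "ZnR3"]:
    print(name, is_safe(name), make_safe(name))
```

Renaming by hand (h2 instead of 2h) is usually clearer than an automatic prefix, so the repair step is mainly useful for spotting which names need attention.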

Thanks @igor. limma handled these files fine when I ran each time point individually, so maybe it is an issue with grouping. I am a bit confused about the difference between “factor” and “group”. I’ll try your suggestion.

Hi @billy.l
A “factor” is essentially the name of the variable being tested, for example treatment or age. Groups are, as the name suggests, groups of samples analyzed together. Sometimes groups are called “levels”. For example, for a drug treatment the groups might be control, dose1, dose2, or control, timepoint1, timepoint2. In simple cases we have a single factor and two or more groups. In complicated cases we analyze the effect of several factors, for example the effect of treatment in different age groups. In this case every sample should carry information on treatment (control, dose1, or dose2) and on age (age1, age2, or age3).
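One way to picture this is a sample sheet with one row per sample and one column per factor. The Python sketch below is only an illustration: the sample names s1 to s6 and their assignments are invented, and `levels` and `cell_counts` are hypothetical helpers, not limma functions.

```python
from collections import Counter

# Hypothetical sample sheet: one row per sample, one column per factor.
# Sample names and assignments are invented for illustration.
samples = [
    {"sample": "s1", "treatment": "control", "age": "age1"},
    {"sample": "s2", "treatment": "dose1", "age": "age1"},
    {"sample": "s3", "treatment": "control", "age": "age2"},
    {"sample": "s4", "treatment": "dose1", "age": "age2"},
    {"sample": "s5", "treatment": "control", "age": "age3"},
    {"sample": "s6", "treatment": "dose1", "age": "age3"},
]


def levels(rows, factor):
    """Groups (levels) of one factor, in order of first appearance."""
    seen = []
    for row in rows:
        if row[factor] not in seen:
            seen.append(row[factor])
    return seen


def cell_counts(rows, factors):
    """Number of samples in each combination of groups across the given factors."""
    return Counter(tuple(row[f] for f in factors) for row in rows)


print(levels(samples, "treatment"))
print(levels(samples, "age"))
for combo, n in sorted(cell_counts(samples, ["treatment", "age"]).items()):
    print(combo, n)
```

Each column is a factor, the distinct values in a column are its groups, and every sample has exactly one value in every column.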
Hope it makes sense.
Kind regards,
Igor