RUVSeq: Fatal error: An undefined error occurred, please check your input carefully and contact your administrator.

I’ve been trying to use RUVSeq (Remove Unwanted Variation from RNA-seq data) with Salmon files as input. I selected the corresponding files for each factor and made sure to select the Salmon option in the settings. I believe the error has something to do with the following:

Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, basename, cbind, colnames, dirname, do.call,
duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
tapply, union, unique, unsplit, which.max, which.min

Welcome to Bioconductor

Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: EDASeq
Loading required package: ShortRead
Loading required package: BiocParallel
Loading required package: Biostrings
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:base’:

I, expand.grid, unname

Loading required package: IRanges
Loading required package: XVector
Loading required package: GenomeInfoDb

Attaching package: ‘Biostrings’

The following object is masked from ‘package:base’:

strsplit

Loading required package: Rsamtools
Loading required package: GenomicRanges
Loading required package: GenomicAlignments
Loading required package: SummarizedExperiment
Loading required package: MatrixGenerics
Loading required package: matrixStats

Attaching package: ‘matrixStats’

The following objects are masked from ‘package:Biobase’:

anyMissing, rowMedians

Attaching package: ‘MatrixGenerics’

The following objects are masked from ‘package:matrixStats’:

colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
colWeightedMeans, colWeightedMedians, colWeightedSds,
colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
rowWeightedSds, rowWeightedVars

The following object is masked from ‘package:Biobase’:

rowMedians

Loading required package: edgeR
Loading required package: limma

Attaching package: ‘limma’

The following object is masked from ‘package:BiocGenerics’:

plotMA

Loading required package: ggplot2
Error in read.table(tx2gene, header = has_header) :
more columns than column names
Calls: get_deseq_dataset -> read.table

Thanks for the help.

Hi @giovanna_morello

Thanks for posting the full error message! Very helpful.

This is the part near the end to pay attention to first:

Error in read.table(tx2gene, header = has_header) :
more columns than column names

The tool is reporting a column-count mismatch in one of the input files: some rows have more columns than the header has names. The read.table call in the error points at the tx2gene input.
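If you want to check a file outside Galaxy, here is a minimal Python sketch that reports which lines have a different number of fields than the first line. It assumes a tab-delimited file; the function name `column_mismatches` and the sample rows are made up for illustration and are not part of RUVSeq or Galaxy.

```python
def column_mismatches(lines, sep="\t"):
    """Return (expected, mismatches), where expected is the field count of the
    first non-empty line and mismatches lists (line_number, field_count) for
    every later line whose count differs."""
    expected = None
    mismatches = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        count = len(line.split(sep))
        if expected is None:
            expected = count  # first non-empty line sets the expectation
        elif count != expected:
            mismatches.append((number, count))
    return expected, mismatches


# Hypothetical sample: the third line carries two extra trailing columns.
sample = [
    "transcript_id\tgene_id\n",
    "tx1\tgeneA\n",
    "tx2\tgeneB\t0\t0\n",
]
expected, bad = column_mismatches(sample)
print(expected, bad)
```

For a real file, pass an open file handle instead of the sample list, for example `column_mismatches(open("my_table.tsv"))`, where the file name is a placeholder.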

First, open each input file with the eye icon and look at the columns. Sometimes there are several columns full of zeros at the end (certain tools and spreadsheet programs add these). You can use a tool like Cut to clean up the file.

Other times there might be header values with spaces within the names. You can replace those with an underscore.

Finally, whenever including annotation in GTF format, removing any # header lines is a good idea. This brings the file back into a strict specification. The tool Select can be used (how-to).

Some tools will also accept GFF3 but not all (these do have at least one required header line). Whenever needed, you can convert from GFF3 to GTF with a tool like gffread.
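For reference, removing the # header lines from a GTF amounts to the filter below. This is only a sketch of what the Select step does, not the tool itself; the function name `strip_gtf_comments` and the sample lines are invented for illustration.

```python
def strip_gtf_comments(lines):
    """Keep only the lines that do not start with '#'."""
    return [line for line in lines if not line.startswith("#")]


# Hypothetical GTF content with two header lines followed by one feature line.
sample_gtf = [
    "#!genome-build example\n",
    "# generated by some tool\n",
    'chr1\texample\tgene\t1\t100\t.\t+\t.\tgene_id "geneA";\n',
]
cleaned = strip_gtf_comments(sample_gtf)
print(len(cleaned))
```

Only the feature line survives, so every remaining line has the nine tab-separated GTF fields.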

In summary, tools in R are very strict about whitespace and overall format because the inputs are parsed into data frames. Please give this a review and we can try to help more if this is not enough.
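The two table cleanups mentioned above (dropping extra trailing columns and replacing spaces in header names with underscores) could be sketched as follows. This assumes a tab-delimited file whose first line is the header; `clean_table`, `keep_columns`, and the sample rows are hypothetical names chosen for this example.

```python
def clean_table(lines, keep_columns, sep="\t"):
    """Keep the first `keep_columns` fields of every line and replace spaces
    in the header names (first line) with underscores."""
    cleaned = []
    for index, line in enumerate(lines):
        fields = line.rstrip("\r\n").split(sep)[:keep_columns]
        if index == 0:
            # Header row: R-based tools can trip over spaces inside names.
            fields = [name.strip().replace(" ", "_") for name in fields]
        cleaned.append(sep.join(fields))
    return cleaned


# Hypothetical sample: spaces in the header, two extra zero columns in the data.
sample_table = [
    "transcript id\tgene id\n",
    "tx1\tgeneA\t0\t0\n",
    "tx2\tgeneB\t0\t0\n",
]
for row in clean_table(sample_table, keep_columns=2):
    print(row)
```

Inside Galaxy, the Cut tool covers the column-trimming part without any scripting.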

Short summary of common reasons for issues with Bioconductor and related tools (but really any tool that is “picky” or tossing an error about not being able to interpret formats) → FAQ: Extended Help for Differential Expression Analysis Tools

Please let us know if you are able to solve this! :slight_smile: