Error while running GoSeq after cutting the required columns and preparing the DE and Length files

Hello,

I have a problem running the GoSeq tool. I used the Advanced Cut tool to obtain the Geneid & True/False (expression data) file and the Geneid & Length file, which I used as inputs to GoSeq. The total number of rows and columns is identical, and there are no NA values or spaces in the files other than the tab separators.
I tried many approaches, both downloading the files and editing them manually and using the provided cut tools, to obtain the 2 files and get GoSeq to run, but it always shows the following error. What can I do, and where is the error?

Error: Warning message:
In Sys.setlocale(“LC_MESSAGES”, “en_US.UTF-8”) :
OS reports request to set locale to “en_US.UTF-8” cannot be honored
Warning messages:
1: In newton(lsp = lsp, X = G$X, y = G$y, Eb = G$Eb, UrS = G$UrS, L = G$L, :
gam.fit3 algorithm did not converge
2: In pcls(G) : initial point very close to some inequality constraints
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: BiocGenerics

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:dplyr’:

combine, intersect, setdiff, union

The following objects are masked from ‘package:stats’:

IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
as.data.frame, basename, cbind, colnames, dirname, do.call,
duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
tapply, union, unique, unsplit, which.max, which.min

Loading required package: Biobase
Welcome to Bioconductor

Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: IRanges
Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:dplyr’:

first, rename

The following object is masked from ‘package:geneLenDataBase’:

unfactor

The following objects are masked from ‘package:base’:

I, expand.grid, unname

Attaching package: ‘IRanges’

The following objects are masked from ‘package:dplyr’:

collapse, desc, slice

Attaching package: ‘AnnotationDbi’

The following object is masked from ‘package:dplyr’:

select

Using manually entered categories.
Error in `[.default`(summary(map), , 1) : incorrect number of dimensions
Calls: run_goseq … goseq → reversemapping → [ → [.table → NextMethod

Welcome @Haifa_Hammad :slight_smile:

Thanks for sharing the error details! Very helpful. This is the part to pay attention to:

Using manually entered categories
incorrect number of dimensions

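This usually means the issue is with the category mapping file, not the GeneID/True/False or GeneID/Length inputs. The traceback in your log also stops in reversemapping, which is the step that works on the gene-to-category mapping.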

For GOseq, the category file should be very simple: 2 tab-separated columns only

GeneID    CategoryID

gene1 GO:0008150
gene1 GO:0003674
gene2 GO:0009987

Common problems:

  • More than 2 columns
  • Missing tab separation
  • GeneIDs not matching your other two files exactly
  • File formatted as a matrix instead of a simple 2-column list
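
If you download the category file, a quick local check can help narrow down which of these problems applies. The following is only a rough sketch in Python, not part of GOseq or Galaxy: the function name `check_category_file` is made up for illustration, and it assumes the file has no header row and that you have already collected the gene IDs from your DE/Length files into a set.

```python
def check_category_file(lines, gene_ids):
    """Return a list of problems found in a 2-column gene/category file.

    lines    -- iterable of text lines (an open file works)
    gene_ids -- set of gene IDs taken from the DE / Length files
    (Hypothetical helper for illustration; not a GOseq or Galaxy function.)
    """
    problems = []
    seen_genes = set()
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            problems.append(f"line {lineno}: empty line")
            continue
        # Split on tabs only: spaces between columns count as a formatting error
        fields = line.split("\t")
        if len(fields) != 2:
            problems.append(
                f"line {lineno}: expected 2 tab-separated columns, found {len(fields)}"
            )
            continue
        gene, category = fields
        if gene != gene.strip() or category != category.strip():
            problems.append(f"line {lineno}: extra whitespace around a value")
        seen_genes.add(gene.strip())
    # Gene IDs must match the other two inputs exactly (case, version suffix, etc.)
    unmatched = seen_genes - set(gene_ids)
    if unmatched:
        problems.append(
            f"{len(unmatched)} gene ID(s) not in the DE/Length files, "
            f"e.g. {sorted(unmatched)[0]}"
        )
    return problems


# Small in-memory demonstration with made-up rows
example = ["gene1\tGO:0008150\n", "gene1 GO:0003674\n", "gene9\tGO:0009987\n"]
for problem in check_category_file(example, {"gene1", "gene2"}):
    print(problem)
```

To run it on real data, pass an open file instead of the example list, for instance `check_category_file(open("categories.tsv"), ids)`, where `categories.tsv` and `ids` are placeholders for your own file and gene ID set. An empty result only means the basic layout looks right; it does not prove the tool will accept the file.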

Some prior discussions are here:

If you want to try to resolve it yourself with more of the text manipulation tools, we have some resources:

Or, you can generate a share link to your history and post that back in a reply, then unshare after we are done.

Let’s start there and if you solve this, please let us know! I’ll watch for your reply. :rocket: