Problem with the annotatemyIDs tool

des_b · July 21, 2023, 1:22pm

Hi all,

I have a problem with the annotateMyIDs tool: a few genes come out misannotated.

In detail: in my original counts file the Ensembl annotation is correct; each row has a unique Ensembl ID. After I run the annotateMyIDs tool (to use the output in limma-voom), some of these genes with unique IDs end up with duplicate Ensembl IDs.
In some cases, different isoforms of the same gene are collapsed into one Ensembl ID, so the gene symbol is correct but the Ensembl IDs are wrong; in other cases, even different gene symbols acquire the same Ensembl ID.
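
One way to see exactly which IDs are affected is to group the annotation rows by ID. The following is a minimal sketch, not part of the original post: it assumes a tab-separated file with columns named `ENSEMBL` and `SYMBOL` (an assumption about the output header; adjust to your file), and the IDs and symbols in the example are made up.

```python
import csv
import io
from collections import defaultdict


def find_duplicate_ids(tsv_text, id_col="ENSEMBL", symbol_col="SYMBOL"):
    """Return {id: [symbols...]} for every ID that appears on more than one row.

    The column names are assumptions for illustration only.
    """
    symbols_by_id = defaultdict(list)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        symbols_by_id[row[id_col]].append(row[symbol_col])
    return {
        gene_id: symbols
        for gene_id, symbols in symbols_by_id.items()
        if len(symbols) > 1
    }


# Made-up table standing in for an annotation output file.
example = (
    "ENSEMBL\tSYMBOL\n"
    "ID_A\tGene1\n"
    "ID_B\tGene2\n"
    "ID_B\tGene2\n"
    "ID_C\tGene3\n"
    "ID_C\tGene4\n"
)

duplicates = find_duplicate_ids(example)
for gene_id, symbols in duplicates.items():
    # Separate "same symbol repeated" from "different symbols share one ID".
    kind = "same symbol" if len(set(symbols)) == 1 else "different symbols"
    print(gene_id, symbols, kind)
```

This separates the two cases described above: an ID repeated with one symbol, and an ID shared by different symbols.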

Any idea how this happened? I have the correct organism (mouse), and the tool version is: annotateMyIDs annotate a generic set of identifiers (Galaxy Version 3.16.0+galaxy1).
I also tried a previous version of the tool, with the same result.

To see if there is something wrong with my data, I also ran the Annotate DESeq2/DEXSeq output tables tool (on the DESeq2 output), and everything is normal: each gene has a unique ID and gene symbol.

I’d appreciate any comments or suggestions.

jennaj · July 21, 2023, 4:52pm

Hi @des_b

Different annotation sources will create slightly different gene/transcript “footprints” on the reference genome. This can create one-to-many, many-to-one, and many-to-many situations, resulting in non-unique IDs in files after converting IDs between sources. Everyone would get this result; it is not a problem with your data or the tool.
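
To make the one-to-many / many-to-one distinction concrete, here is a small sketch that labels each pair in an ID mapping by the kind of relationship it belongs to. It is illustrative only: the function name `classify_mapping` and the placeholder IDs are hypothetical, and it does not reproduce how any particular annotation tool does the conversion.

```python
from collections import defaultdict


def classify_mapping(pairs):
    """Label each (source_id, target_id) pair by the relationship it is part of."""
    targets_of = defaultdict(set)
    sources_of = defaultdict(set)
    for src, tgt in pairs:
        targets_of[src].add(tgt)
        sources_of[tgt].add(src)

    labels = {}
    for src, tgt in pairs:
        many_targets = len(targets_of[src]) > 1  # this source maps to several targets
        many_sources = len(sources_of[tgt]) > 1  # this target is shared by several sources
        if many_targets and many_sources:
            labels[(src, tgt)] = "many-to-many"
        elif many_targets:
            labels[(src, tgt)] = "one-to-many"
        elif many_sources:
            labels[(src, tgt)] = "many-to-one"
        else:
            labels[(src, tgt)] = "one-to-one"
    return labels


# Placeholder IDs, not real gene identifiers.
pairs = [
    ("src1", "tgtA"),  # unique in both directions
    ("src2", "tgtB"),  # src2 maps to two targets
    ("src2", "tgtC"),
    ("src3", "tgtD"),  # two sources share tgtD
    ("src4", "tgtD"),
]
labels = classify_mapping(pairs)
for pair, label in labels.items():
    print(pair, label)
```

Any pair that is not one-to-one will produce a repeated value in one of the two ID columns after conversion, which is where the non-unique IDs come from.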

Maybe try running the analysis with annotation from Ensembl instead? That would be the GTF incorporated during counting. UCSC has this for some genomes ready to use, but other sources could be adjusted to work with these tools. Let us know if you need help with that.

If I misunderstood (and probably did) please share more details, including why you are converting IDs and your broader technical goals. Applying different versions of Ensembl annotation could present with the same many/one problems too.

des_b · July 24, 2023, 6:39pm

Thank you for answering me.

My goal was to use the limma tool to check for differentially expressed genes between my treatment groups (based on the tutorial: 2: RNA-seq counts to genes).

I can upload the annotated version of the genes to have a result that I can analyse. So I used the annotateMyIDs tool to get a file with the Ensembl ID (ENSMUSG000000XXXXX) and the corresponding gene name (e.g. Gapdh).
In this specific tool, I can’t upload any external GTF files (or at least I couldn’t find a way).

My problem is that limma gave me errors about duplicated row names, and I believed the output annotation file was the problem.

However, based on your answer, having these duplicate Ensembl IDs, even with different gene names, is expected, so this should not create any problem with a downstream tool like limma.

I’m not sure if I can do anything else, or if there is another tool that can give me the same annotation information to use in the limma tool.

jennaj · July 25, 2023, 5:49pm

This is one reason why it is best to use the same exact annotation throughout a single analysis. If you started over and did the counting with an Ensembl GTF, then you wouldn’t need to do any gene transformations and these errors would go away.
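
Before rerunning anything, a quick comparison of the IDs in the count table against the IDs in the annotation table can show where the two disagree. This is a rough sketch under stated assumptions: the function name `compare_ids` is hypothetical, both inputs are plain lists of the ID column read from each file, and whether row order matters for the downstream tool is an assumption to verify.

```python
from collections import Counter


def compare_ids(count_ids, annotation_ids):
    """Report mismatches between count-table IDs and annotation-table IDs."""
    anno_counts = Counter(annotation_ids)
    count_set = set(count_ids)
    anno_set = set(anno_counts)
    return {
        # IDs that appear on more than one annotation row
        "duplicated_in_annotation": sorted(i for i, n in anno_counts.items() if n > 1),
        # IDs present in the counts but absent from the annotation
        "missing_from_annotation": sorted(count_set - anno_set),
        # IDs present in the annotation but absent from the counts
        "extra_in_annotation": sorted(anno_set - count_set),
        # True only if both lists hold the same IDs in the same order
        "same_order": list(count_ids) == list(annotation_ids),
    }


# Placeholder IDs, not real gene identifiers.
report = compare_ids(
    count_ids=["ID_A", "ID_B", "ID_C"],
    annotation_ids=["ID_A", "ID_B", "ID_B"],
)
for key, value in report.items():
    print(key, value)
```

If every list in the report is empty and the order matches, the two files describe the same set of features, which is the situation that using one annotation source throughout is meant to guarantee.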

I know that isn’t a nice answer since it means rerunning prior work, but it is really the only way forward unless you want to hand-edit the files to curate which genes are assigned to features (not recommended!).
