How to replace these IDs with official gene names?

amir · March 3, 2019, 9:50am

I'm doing an RNA-seq analysis, and after aligning with STAR and filtering the DESeq2 results, one question remains: how do I replace these IDs with official gene names? I found a tool named annotateMyIDs, but it isn't working.

jennaj · March 4, 2019, 4:24pm

The tool annotateMyIDs does have a small problem right now (all versions). The tool is not ignoring header lines even when that is set on the tool form.

Cross-reference post: AnnotateMyIDs Fatal error

If that does not solve the problem, check that the Organism and ID Type are a match for the transcripts or genes in your data. The tool UniProt ID mapping and retrieval is an alternative for many use cases.

Which tool to use depends on where the existing annotation was sourced (meaning: what ID content they currently represent). The format of the transcript/gene IDs can also cause problems. I’ve seen problems come up when a version is added to the end of the value. If present, that should be removed before using either tool. Example: NM_130786.2 should be modified to be NM_130786.

If this does not address your issue, please share a few of the IDs and the source/genome build of the reference GTF used with whatever counting tool you used as input to DESeq2. We can probably help with determining the right starting “ID” type.

amir · March 5, 2019, 8:02am

How should I remove the last part of my IDs?

I just want to try this first. If it doesn't work, I will share my IDs, my GTF, and other information. Maybe I'm doing something wrong.

amir · March 5, 2019, 8:08am

How do I remove the last number?
NM_130786.2 to NM_130786?

jennaj · March 5, 2019, 4:59pm

Hi -

Since the identifier is in the first column, Text transformation with sed will work without a complex expression. Try this:

s/\.[0-9]+//
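
For anyone who wants to check what that expression does before running it on real data, here is a minimal Python sketch of the same substitution. It assumes tab-separated lines with a versioned ID in the first column; the function name `strip_version` and the count value in the sample row are made up for illustration. Like the sed expression without a `g` flag, it removes only the first "dot followed by digits" on each line, so a line whose ID has no version but whose later columns hold decimals would be changed in the wrong place.

```python
import re

# Same pattern as the sed expression: a literal dot followed by one or more digits.
VERSION_SUFFIX = re.compile(r"\.[0-9]+")


def strip_version(line: str) -> str:
    # count=1 mirrors sed's default of replacing only the first match per line.
    return VERSION_SUFFIX.sub("", line, count=1)


# Hypothetical row: the ID from the example above plus a made-up count column.
print(strip_version("NM_130786.2\t153"))
```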

Other options include using regular expressions with tools like:

Text reformatting with awk
Replace parts of text
Replace Text in entire line
Replace Text in a specific column

Or, if you don’t want to use a regular expression, the column can be isolated, the “dot” replaced with a “tab”, then all the data rearranged back again into one file. This could be put into a workflow if you plan to run it again. Example tool order: Cut > Convert delimiters to TAB > Paste > Cut.
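
The same no-regex idea can be sketched outside Galaxy in a few lines of Python: isolate the first column, keep only what comes before the dot, and put the line back together. This is only an illustration under the assumption that the file is tab-separated with the ID in column one; `drop_version` and `clean_first_column` are hypothetical helper names, and the extra column values are invented.

```python
def drop_version(identifier: str) -> str:
    # Keep everything before the first dot; IDs without a dot are unchanged.
    return identifier.split(".", 1)[0]


def clean_first_column(line: str) -> str:
    # Split on tabs, fix only column one, then rejoin so other columns
    # (including any decimal values) are left untouched.
    columns = line.rstrip("\n").split("\t")
    columns[0] = drop_version(columns[0])
    return "\t".join(columns)


# Hypothetical row: versioned ID followed by made-up numeric columns.
print(clean_first_column("NM_130786.2\t2.5\t0.01"))
```

Because only the first column is edited, decimal values elsewhere on the line are safe, which is the main advantage over a whole-line substitution.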

Also, your IDs are Ensembl transcripts, so choose that as the input type with annotateMyIDs.

Hope that helps!

amir · March 5, 2019, 3:02pm

Finally it works.
Thank you so much for your wonderful support.
