How to replace these IDs with official gene names?

I'm doing an RNA-seq analysis. After aligning with STAR and filtering after DESeq2, one question remains: how do I replace these IDs with official gene names? I found a tool named annotateMyIDs, but it isn't working.

The tool annotateMyIDs does have a small problem right now (all versions). The tool is not ignoring header lines even when that is set on the tool form.

Cross-reference post: AnnotateMyIDs Fatal error

If that does not solve the problem, check that the Organism and ID Type are a match for the transcripts or genes in your data. The tool UniProt ID mapping and retrieval is an alternative for many use cases.

Which tool to use depends on where the existing annotation was sourced (meaning: what type of ID the values currently represent). The format of the transcript/gene IDs can also cause problems. I’ve seen problems come up when a version is added to the end of the value; if present, that should be removed before using either tool. Example: NM_130786.2 should be modified to be NM_130786.

If this does not address your issue, please share a few of the IDs and the source/genome build of the reference GTF used with whatever counting tool you used as input to DESeq2. We can probably help with determining the right starting “ID” type.

How should I remove the last part of my data?

I just want to try this, and if it doesn't work I will share my IDs, the other information, and my GTF. Maybe I'm doing something wrong.

How do I remove the last number?
NM_130786.2 to NM_130786.???

Hi -

Since the identifier is in the first column, Text transformation with sed will work without a complex expression. Try this:

s/\.[0-9]+//
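If you ever need to do the same thing outside Galaxy, here is a minimal Python sketch of the idea. It assumes a tab-separated line with the identifier in the first column; the function names are made up for illustration. Unlike the bare expression above, it only touches the first column, so a decimal value in a later column (for example 1.5) is never altered when an ID has no version suffix.

```python
import re

# Matches a trailing ".<digits>" version suffix, e.g. ".2" in "NM_130786.2".
VERSION_SUFFIX = re.compile(r"\.[0-9]+$")


def strip_version(identifier: str) -> str:
    """Return the identifier without a trailing version suffix."""
    return VERSION_SUFFIX.sub("", identifier)


def strip_first_column(line: str, sep: str = "\t") -> str:
    """Strip the version from the first column only; other columns are untouched."""
    first, delim, rest = line.partition(sep)
    return strip_version(first) + delim + rest


if __name__ == "__main__":
    # Hypothetical count-table row: ID, then two numeric columns.
    print(strip_first_column("NM_130786.2\t153.7\t0.004"))
```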

Other options include using regular expressions with tools like:

  • Text reformatting with awk
  • Replace parts of text
  • Replace Text in entire line
  • Replace Text in a specific column

Or, if you don’t want to use a regular expression, the column can be isolated, the “dot” replaced with a “tab”, then all the data rearranged back again into one file. This could be put into a workflow if you plan to run it again. Example tool order: Cut > Convert delimiters to TAB > Paste > Cut.
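For anyone who wants to see the logic of that no-regex route spelled out, here is a rough Python sketch that mirrors the Cut > Convert > Paste > Cut steps on a single line. It assumes tab-separated data with the ID in column 1 and that the first "dot" in the ID marks the start of the version; the function name is hypothetical.

```python
def drop_version_no_regex(line: str, sep: str = "\t") -> str:
    """Remove the ".<version>" part of the first column without a regex."""
    # "Cut": isolate the first column from the rest of the line.
    first, delim, rest = line.partition(sep)
    # "Convert delimiters": split the ID at the first dot into accession + version.
    accession, _dot, _version = first.partition(".")
    # "Paste" + "Cut": rejoin the accession with the other columns, dropping the version.
    return accession + delim + rest


if __name__ == "__main__":
    print(drop_version_no_regex("NM_130786.2\t153.7\t0.004"))
```

Note that this would also truncate an ID that contains a dot for some other reason, so check a few of your IDs first.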

Also, your IDs are Ensembl transcripts, so choose that as the input type with annotateMyIDs.

Hope that helps!

Finally it works.
Thank you so much for your wonderful support.
