I'm doing an RNA-seq analysis. After aligning with STAR and filtering the DESeq2 results, one question remains: how do I replace these IDs with official gene names? I found a tool named annotateMyIDs, but it isn't working.
The tool annotateMyIDs does have a small problem right now (all versions): it does not ignore header lines, even when that option is set on the tool form.
Cross-reference post: AnnotateMyIDs Fatal error
If that does not solve the problem, check that the Organism and ID Type are a match for the transcripts or genes in your data. The tool UniProt ID mapping and retrieval is an alternative for many use cases.
Which tool to use depends on where the existing annotation was sourced (meaning: what kind of ID the values currently represent). The format of the transcript/gene IDs can also cause problems. I've seen problems come up when a version is added to the end of the value; if present, that should be removed before using either tool. Example: NM_130786.2 should be modified to be NM_130786.
If this does not address your issue, please share a few of the IDs and the source/genome build of the reference GTF used with whatever counting tool you used as input to DESeq2. We can probably help with determining the right starting "ID" type.
How should I remove the last part of my data?
I just want to try this, and if it doesn't work I will share my IDs, the other information, and my GTF. Maybe I'm doing something wrong.
How do I remove the last number, for example NM_130786.2 to NM_130786?
Hi -
Since the identifier is in the first column, Text transformation with sed will work without a complex expression. Try this:
s/\.[0-9]+//
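For anyone doing this step outside Galaxy, here is a minimal Python sketch of the same idea. The function names and the sample Ensembl-style ID are made up for illustration, and the input is assumed to be tab-separated with the identifier in the first column. Unlike the sed expression, it only touches the first column and only a suffix at the very end of the ID, so a decimal value in a later column is left alone even when an ID has no version.

```python
import re

# Matches a trailing ".<digits>" version suffix, e.g. ".2" in "NM_130786.2".
VERSION_SUFFIX = re.compile(r"\.[0-9]+$")


def strip_version(identifier):
    """Return the identifier without a trailing version suffix, if any."""
    return VERSION_SUFFIX.sub("", identifier)


def strip_first_column(line, sep="\t"):
    """Strip the version suffix from the first column of one delimited line.

    Assumes the ID is in column 1; all other columns are passed through as-is.
    """
    fields = line.rstrip("\n").split(sep)
    fields[0] = strip_version(fields[0])
    return sep.join(fields)


if __name__ == "__main__":
    print(strip_version("NM_130786.2"))
    # Hypothetical sample row: ID, then two numeric columns.
    print(strip_first_column("ENST00000456328.2\t12.5\t0.01"))
```

Applied to every line of a counts or results table, this gives the unversioned IDs that annotateMyIDs expects.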
Other options include using regular expressions with tools like:
Text reformatting with awk
Replace parts of text
Replace Text in entire line
Replace Text in a specific column
Or, if you don't want to use a regular expression, the column can be isolated, the "dot" replaced with a "tab", then all the data rearranged back again into one file. This could be put into a workflow if you plan to run it again. Example tool order: Cut > Convert delimiters to TAB > Paste > Cut.
Also, your IDs are Ensembl transcripts, so choose that as the input type with annotateMyIDs.
Hope that helps!
Finally it works!
Thank you so much for your wonderful support.