Hi,
I have just started as a Galaxy user and ran into an issue concerning annotation. I have a number of featureCounts output files for DE analysis, and I obtained a number of significantly differentially expressed genes with a fold change (FC) of >2. However, when I try to annotate them using ‘Annotate DESeq2/DEXSeq output tables’, the columns with the annotation info contain only NAs. My impression is that this happens because the featureCounts files use Entrez IDs, while this information is not present in the .gtf file that I got from the Ensembl repository (GRCh38.102.gtf.gz; I removed the header lines starting with #). Any suggestions on how to change the Entrez gene IDs in my featureCounts output, or how to get ENS numbers in my .gtf file? Thanks for your help,
Henk
Hi @HUJI_stu,
You can use the annotateMyIDs tool to perform this ID conversion, using the featureCounts file as input. Let me know if it works.
Regards
Dear Cristobal,
I tried annotateMyIDs and it works fine. Unfortunately, it requires some additional steps of joining datasets, which are done automatically if you use Annotate DESeq2/DEXSeq output tables. I found out that the datasets with my own data use Entrez IDs as the transcript identifier, and that is where things went wrong: the gtf file I used does not contain this information in the attribute column.
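For anyone who ends up here with the same problem, the joining step can be sketched roughly as below. This is only a minimal illustration outside Galaxy, not what either tool does internally. It assumes two headerless tab-separated tables that both have the Entrez ID in the first column; the function names, column layout and example values are made up for the sketch.

```python
import csv
import io


def load_tsv(text):
    """Parse tab-separated text into a list of rows (lists of strings)."""
    return [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]


def join_annotation(de_rows, annot_rows, n_annot_cols):
    """Left-join DE rows with annotation rows on the first column (gene ID).

    Genes without an annotation entry get "NA" in every annotation column.
    """
    lookup = {row[0]: row[1:] for row in annot_rows}
    missing = ["NA"] * n_annot_cols
    return [row + lookup.get(row[0], missing) for row in de_rows]


# Illustrative data only: Entrez ID, log2 fold change, adjusted p-value.
de_table = "1017\t2.4\t0.001\n7157\t-3.1\t0.0004\n999999\t2.2\t0.01\n"
# Assumed annotation layout: Entrez ID, Ensembl gene ID, gene symbol.
annot_table = "1017\tENSG00000123374\tCDK2\n7157\tENSG00000141510\tTP53\n"

annotated = join_annotation(load_tsv(de_table), load_tsv(annot_table), n_annot_cols=2)
for row in annotated:
    print("\t".join(row))
```

The join only works when both tables use the same kind of identifier, which is exactly what was missing in my case: an ID that is absent from the annotation table ends up with NA in every annotation column.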
Kind regards,
Henk