Reference annotation GTF options for UCSC's mouse builds mm9, mm10

Hello!
I am performing an RNA-seq analysis with mouse samples.
Right now I am struggling to find and download the file

UCSC Main on Mouse: wgEncodeGencodeBasicVM25 (genome)
that contains all the transcripts, so that I can run Join two Datasets between the DESeq2 files and the UCSC Main on Mouse: wgEncodeGencodeBasicVM25 (genome) file.

Could you help me out, please?
Thanks a lot in advance!


Hello @Gaia_Gentile

Gencode annotation for the mouse builds mm9 and mm10 is available from the Gencode website.

Instructions for how to load and prepare that annotation for use with tools are covered in this prior Q&A. It is about a specific tool and the human genome, but the same advice applies:

You may also find this FAQ helpful:

I also added a few tags to your post that will lead to more related topics. Or, review the results of this search:

If that doesn’t answer your question, please explain more about the steps you have done so far (tools) and which data are involved (reference genome, reference annotation). If you are following a tutorial, please include a link to that too.

Thanks!

Good morning,

Thank you for your kind reply.

I am performing an RNA-seq data analysis using Galaxy.

My samples are mouse samples.

The first step was to upload the file UCSC Main on Mouse: wgEncodeGencodeBasicVM25 (genome).

Then I ran quality control on the raw reads with FastQC.

Next I performed genome-based read alignment with RNA STAR Gapped-read mapper for RNA-seq data (Galaxy Version 2.7.6a).

Then I ran quality control on the aligned reads with MultiQC aggregate results from bioinformatics analyses into a single report (Galaxy Version 1.9).

After that I did read quantification with htseq-count - Count aligned reads in a BAM file that overlap features in a GFF file (Galaxy Version 0.9.1+galaxy1). For Aligned SAM/BAM File I selected my file of interest, and for GFF File I selected UCSC Main on Mouse: wgEncodeGencodeBasicVM25 (genome).

Then I ran DESeq2 Determines differentially expressed features from count tables (Galaxy Version 2.11.40.6+galaxy1).

After this I would like to run Join two Datasets side by side on a specified field (Galaxy Version 2.1.3), and this is the part I have trouble with. For Join I selected my DESeq2 file of interest, and for with I selected the file UCSC Main on Mouse: wgEncodeGencodeBasicVM25 (genome).

But this is not right: after the join I do not get, for each ID (for example ENSMUST00000021332.9), the gene name it is associated with, which I need in order to know which genes are upregulated or downregulated.

What I am missing is a file for UCSC Main on Mouse: wgEncodeGencodeBasicVM25 (genome) that has the names for the transcripts.

I hope it is clear now.

Looking forward to hearing from you.

Best regards,

Gaia Gentile


This file does not contain the gene name, just the transcript name and the location on the genome.

Try the file wgEncodeGencodeAttrsVM25 instead.
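If it helps to see what the join should produce, here is a minimal Python sketch of the same idea outside Galaxy. It is only an illustration under assumptions: the attributes table is assumed to have columns named `transcriptId` and `geneName`, the DESeq2 table is assumed to be tab-separated with the ID in the first column and log2 fold change in the third, and every value below except the transcript ID from your post is a made-up placeholder (`GeneA` and `GeneB` are not real gene names).

```python
import csv
import io

# Placeholder DESeq2 rows (assumed layout: ID, base mean, log2FC, ...).
# Only the first transcript ID comes from the thread; all numbers are invented.
deseq2_tsv = (
    "ENSMUST00000021332.9\t1520.4\t2.31\t0.21\t11.0\t1e-27\t3e-25\n"
    "TRANSCRIPT_NOT_IN_ATTRS.1\t88.0\t-1.12\t0.30\t-3.7\t2e-4\t1e-3\n"
)

# Placeholder attributes rows. The header names are an assumption about
# the wgEncodeGencodeAttrsVM25 table; the gene names are not real.
attrs_tsv = (
    "geneId\tgeneName\ttranscriptId\n"
    "GENE_ID_A\tGeneA\tENSMUST00000021332.9\n"
    "GENE_ID_B\tGeneB\tPLACEHOLDER_TRANSCRIPT.1\n"
)


def load_gene_names(handle):
    """Build a transcript ID -> gene name lookup from the attributes table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["transcriptId"]: row["geneName"] for row in reader}


def annotate(deseq2_handle, names):
    """Append the gene name to each DESeq2 row; 'NA' when the ID is not found."""
    annotated_rows = []
    for row in csv.reader(deseq2_handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        annotated_rows.append(row + [names.get(row[0], "NA")])
    return annotated_rows


names = load_gene_names(io.StringIO(attrs_tsv))
annotated = annotate(io.StringIO(deseq2_tsv), names)

for row in annotated:
    # Print ID, log2 fold change, and the joined gene name.
    print(row[0], row[2], row[-1], sep="\t")
```

In Galaxy, the equivalent is to point Join two Datasets at the ID column of the DESeq2 file and at the transcript ID column of the attributes file. IDs only match when they are written identically in both files, including any version suffix such as `.9`.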

Thanks!