Basic Question - FPKM Count - What do I use as the reference gene model?

Hey all, I am new to Galaxy and to sequencing analysis in general, so I have a basic question…
For FPKM Count you need two inputs: 1) the BAM file from RNA-seq, which I have, and 2) a reference gene model in the form of a BED12 dataset.

My RNA-seq was done in a human cell line, so my questions are:
What do I need for the reference gene model? Where do I obtain it from? And how do I get it into BED12 format?

Thank you in advance for any help you can give :slight_smile:

Hi @Quantum_Bean

Some tools, such as featureCounts, have “built-in” gene models for popular species such as human. Check this tutorial as an example: “1: RNA-Seq reads to counts”.

A good source of annotation is the Downloads section of the UCSC Genome Browser. Galaxy uses UCSC-style genomes, so gene annotations from the UCSC Genome Browser are compatible with the built-in genomes in Galaxy.

You need gene annotation in GTF or GFF/GFF3 format. Some tools use BED, and Galaxy has a tool, Convert GTF to BED12, for converting GTF files into BED.
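
To show what such a conversion does, here is a minimal Python sketch that turns GTF exon and CDS records into BED12 lines, one per transcript. It is not how the Galaxy tool is implemented; the function name `gtf_to_bed12` and the example records are made up for illustration, and it assumes every record carries a `transcript_id` attribute. The key points are the coordinate shift (GTF is 1-based and inclusive, BED is 0-based and half-open) and that exons become the BED12 blocks.

```python
import re
from collections import defaultdict


def gtf_to_bed12(gtf_lines):
    """Hypothetical helper: build one BED12 line per transcript from GTF lines."""
    exons = defaultdict(list)  # transcript_id -> [(start0, end), ...]
    cds = defaultdict(list)    # transcript_id -> [(start0, end), ...]
    info = {}                  # transcript_id -> (chrom, strand)

    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] not in ("exon", "CDS"):
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if not match:
            continue
        tid = match.group(1)
        # GTF start is 1-based inclusive; BED start is 0-based, end is exclusive.
        start0, end = int(fields[3]) - 1, int(fields[4])
        info[tid] = (fields[0], fields[6])
        (exons if fields[2] == "exon" else cds)[tid].append((start0, end))

    bed_lines = []
    for tid, blocks in exons.items():
        blocks.sort()
        chrom, strand = info[tid]
        tx_start, tx_end = blocks[0][0], blocks[-1][1]
        coding = cds.get(tid)
        if coding:
            # Simplification: GTF CDS records usually exclude the stop codon.
            thick_start = min(s for s, _ in coding)
            thick_end = max(e for _, e in coding)
        else:
            # No coding region: thickStart == thickEnd (set to chromStart here).
            thick_start = thick_end = tx_start
        sizes = ",".join(str(e - s) for s, e in blocks)
        starts = ",".join(str(s - tx_start) for s, _ in blocks)
        bed_lines.append("\t".join([
            chrom, str(tx_start), str(tx_end), tid, "0", strand,
            str(thick_start), str(thick_end), "0",
            str(len(blocks)), sizes, starts,
        ]))
    return bed_lines


if __name__ == "__main__":
    # Made-up records, for illustration only.
    example = [
        'chr1\tdemo\texon\t101\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
        'chr1\tdemo\texon\t301\t400\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
        'chr1\tdemo\tCDS\t151\t200\t.\t+\t0\tgene_id "G1"; transcript_id "T1";',
        'chr1\tdemo\tCDS\t301\t350\t.\t+\t2\tgene_id "G1"; transcript_id "T1";',
    ]
    for row in gtf_to_bed12(example):
        print(row)
```

For real data the Galaxy tool is the safer route; the sketch is only meant to make the twelve BED12 columns less mysterious.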

Hope that helps.

Igor
