mm10/GRCm38 genome

Hi,
I used Bowtie2 with the built-in index for the mm10 reference genome at usegalaxy.eu.

q1: To retrieve the sequences for chromosome coordinates, can I use the UCSC Genome Browser?
q2: What is the source of mm10/GRCm38 at Galaxy Europe?
q3: For a particular chromosome location, such as chr1:1123455-1123789, would the sequence differ if mm10/GRCm38 is sourced from UCSC versus Ensembl?
Thanks

Hi @akanksha_bafna

The mm10/GRCm38 indexes at UseGalaxy servers were sourced from UCSC.

These are the same genome nucleotide sequences as other sources of mm10/GRCm38, including Ensembl. The difference is in how the sequences are named/labeled. That means any coordinate-based data can be used with either source, as long as the chromosome labels are adjusted.

Example

UCSC -> chr1
Ensembl -> 1

Adding in the "chr" prefix is not enough for some of the chromosomes. However, there are chromosome mapping files available in the wider bioinformatics community that you can use to do the data mappings. This is the tool we have in Galaxy to do that → Replace column by values which are defined in a convert file (tool link at EU).
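If you would rather do the relabeling outside of Galaxy, here is a minimal Python sketch of the same idea. The function name `relabel` and the three-entry table are assumptions for illustration only; a real table should be loaded from one of the community mapping files so that every sequence is covered. The mitochondrial chromosome (chrM at UCSC, MT at Ensembl) is an example of a label where simply stripping "chr" does not work.

```python
import csv
import io

# Illustrative subset of a UCSC -> Ensembl label table (an assumption for this
# sketch). A real table would be loaded from a community mapping file.
UCSC_TO_ENSEMBL = {"chr1": "1", "chrX": "X", "chrM": "MT"}


def relabel(rows, mapping, column=0):
    """Yield copies of rows with the label in `column` replaced via `mapping`.

    A label that is missing from the mapping raises KeyError, so unmapped
    sequences are noticed instead of being passed through silently.
    """
    for row in rows:
        new_row = list(row)
        new_row[column] = mapping[row[column]]
        yield new_row


# Tab-separated interval rows: sequence label, start, end.
intervals = "chr1\t1123455\t1123789\nchrM\t100\t200\n"
reader = csv.reader(io.StringIO(intervals), delimiter="\t")
converted = list(relabel(reader, UCSC_TO_ENSEMBL))

for row in converted:
    print("\t".join(row))
```

Only the label column changes; the coordinates are left untouched, which matches the point above that the underlying sequences are the same.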

Next, this guide covers your question in much more detail. It is focused on human, but most of it applies to the other natively indexed model genomes like mouse. Reference genomes at public Galaxy servers: GRCh38/hg38 example. Note the parts about patch releases: UCSC hosts genomes for the original base assembly, and that is also what Galaxy indexes.

Finally, this FAQ about reference genome indexes might be helpful. It explains what a server index represents (technically: always a fasta index and optionally per-tool indexes), plus how to create your own if wanted. → FAQ: How to use Custom Reference Genomes?

Hope this helps! :slight_smile: