Genome update Sus scrofa: how to use a custom genome!

Dear Galaxy Team,

I hope you are doing well.

I am currently working with swine genomic data in Galaxy and noticed that the available reference genomes for Sus scrofa are limited to susScr2 and susScr3. However, for my analysis, I would like to use the more recent assembly Sscrofa11.1 (RefSeq: GCF_000003025.6), also known as susScr11 in UCSC.

Could you please let me know if this genome assembly is available on the Galaxy instance, or if there are plans to include it? If not, I would appreciate any guidance on how to upload or configure a custom reference genome and annotation for use in Galaxy tools (e.g., alignment and quantification).

Thank you very much for your support.

Carolini

Welcome @Carolini

The pig genomes are already on our longer list of UCSC genomes to index for tools, but they are not quite ready yet. You don’t need to wait, though! Others are already using the UCSC data as a custom genome with tools at the UseGalaxy servers, so I think we have the resources to index on the fly this way, even for a larger genome.

Quick help if you already know how to do this and just need a refresher! :scientist:

With similar details here


UCSC is a good place to source the data! Using it lets you assign the same database key (dbkey) that UCSC uses, so the display applications can automatically cross-link your results later on when you want to review them.

Get the genome fasta and the reference annotation at the same time. The process will be similar to these topics (and others tagged custom-genome).

And these are the UCSC links specifically! Both can be pasted into the Upload tool and loaded with all defaults. You may want to run NormalizeFasta on the genome to remove the fasta counter/description from the title lines (some tools don’t care, others do!), or you can go back and do that later should you get an error with an intermediate tool. Tools will not automatically assign the database key, but you can assign it directly to allow the UCSC linkouts to work (and IGV too!).
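If it helps to see what "removing the description" means in practice, here is a minimal local sketch in Python. It is only an illustration of the effect, not how NormalizeFasta is implemented, and the function name `strip_fasta_descriptions` is made up for this example. It assumes the identifier is the first whitespace-delimited token on each `>` title line.

```python
def strip_fasta_descriptions(lines):
    """Yield FASTA lines with each title line cut down to its identifier.

    Assumption: the identifier is the first whitespace-delimited token
    after '>'; everything after it is treated as description text.
    """
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            # Keep only the identifier, drop the trailing description
            yield ">" + (parts[0] if parts else "") + "\n"
        else:
            # Sequence lines pass through unchanged
            yield line


# Small in-memory example (hypothetical records, not real genome data)
example = [">chr1 some description text\n", "ACGTACGT\n", ">chr2\n", "TTGGCCAA\n"]
cleaned = list(strip_fasta_descriptions(example))
```

Inside Galaxy you would normally just run the tool on the uploaded dataset. The sketch is mostly useful for checking a downloaded file before upload.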

Then, if you really want to use the NCBI version instead, that is still possible of course! Instructions are here. Please note that the GTF will need a bit of polish to remove the header lines once it is in Galaxy. IGV probably has the NCBI assembly indexed, but UCSC won’t. If IGV needs the custom genome instead, instructions are in the second FAQ below.
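For a sense of what that GTF polish amounts to, here is a small Python sketch of the same cleanup done locally. It assumes the headers are the comment lines that start with `#`. The function name `strip_gtf_headers` is hypothetical and is not a Galaxy tool.

```python
def strip_gtf_headers(lines):
    """Yield GTF lines, skipping comment/header lines.

    Assumption: header lines are the ones starting with '#'.
    """
    for line in lines:
        if line.startswith("#"):
            continue
        yield line


# Small in-memory example (made-up lines, not real annotation)
example = [
    "#gtf-version 2.2\n",
    "#!genome-build example\n",
    'chr1\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
]
body = list(strip_gtf_headers(example))
```

Once the file is in Galaxy, the linked instructions cover how to do the equivalent step with tools on the server.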

Please give this a try and let us know if it works or not! Follow up questions are welcome too. :slight_smile:


Thank you very much for your help! <3
