Chlamydomonas genome

I am learning to use Galaxy and have 16 Illumina WGS 150 bp single-read datasets from UV mutants to map to the model alga Chlamydomonas reinhardtii. There is no reference genome for it in Galaxy (I am using https://usegalaxy.org/), so I have uploaded the most recent version (v6) to a history. I need to map the reads using BWA-MEM2 and then identify and annotate mutant sites.
I have been through the relevant tutorials but am still unsure whether I have prepared the genome data correctly; the information available on this is not as clear as for other features. So far I have:
- Uploaded the CreinhardtiiCC_4532_707_v6.0.fa.gz file (and also uploaded all available annotation files into the same history) and decompressed it to produce a FASTA file
- Used NormalizeFasta with line length 80 and "Truncate names at whitespace"
- Used the pencil icon to access the dataset attributes, checked that the datatype is fasta, and changed the name to CreinhardtiiCC_4532_707_v6.0.genome
- In the top menu bar, went to User → Preferences → Manage Custom Builds and entered:
  Name: Chlamydomonas.reinhardtiiCC_4532_707_v6.0.genome
  Key: Creinhardtii_v6.
  Definition: FASTA-file from history
  Saved → new custom build
- Associated all annotation files with the custom genome created
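
For anyone wanting to sanity-check what the NormalizeFasta step above does to the file, here is a minimal Python sketch of the same two settings (truncate names at the first whitespace, rewrap sequence to a fixed line length). It is an illustration only, not the tool's actual implementation; the function name `normalize_fasta` and the toy records are made up for the example.

```python
import io


def normalize_fasta(src, dst, line_length=80):
    """Copy FASTA records from src to dst with short names and fixed-width lines.

    Hypothetical helper for illustration; it approximates the two
    NormalizeFasta settings used above and nothing more.
    """

    def flush(name, chunks):
        # Write one complete record, rewrapping the joined sequence.
        if name is None:
            return
        seq = "".join(chunks)
        dst.write(f">{name}\n")
        for i in range(0, len(seq), line_length):
            dst.write(seq[i:i + line_length] + "\n")

    name, chunks = None, []
    for line in src:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            flush(name, chunks)
            # Keep only the text before the first whitespace in the header.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        else:
            chunks.append(line)
    flush(name, chunks)


# Toy input (not real Chlamydomonas sequence), wrapped at 4 for readability.
raw = io.StringIO(">chr1 some description\nACGTAC\nGT\n>chr2\tother\nTTTT\n")
out = io.StringIO()
normalize_fasta(raw, out, line_length=4)
print(out.getvalue())
```

Short, whitespace-free sequence names matter here because the names in the annotation files need to match the names in the FASTA exactly for downstream tools to line them up.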
The genome was downloaded from JGI. The history is here:
Galaxy

Could you please let me know how best to proceed with the genome indexing and preparation for mapping, and ideally provide this as a reference genome? I'm sure it would be useful to many people working with this model organism. Thank you

Welcome, @cgreig

Wow! :star_struck: Your data preparation looks perfect, and it seems you were able to get that content into a custom database, a SnpEff index, and a few other custom files as well. You should now be able to use every tool that uses a reference genome!

It also looks like you were able to get started with your analysis project and things are working out well so far.

I’ve captured a copy of your history to save the reference data someplace stable (your data removed, just the public references remaining), and I created a ticket at the IDC requesting that this genome be indexed natively. We are still working out how to batch process more genomes, but this makes sure it is on the master “list” we’ve been tracking. We would probably pull it directly from JGI, but having a copy to compare against is helpful for context.

Thanks! :scientist: