Upload Genome Reference

Hi
I need to map my results to the reference genome of Candida albicans SC5314 but I couldn’t find it. I’m trying to use RNA STAR. How can I upload the genome?

Upload the genome in FASTA or FASTA.GZ format to a history in your Galaxy account. Then, when setting up the RNA STAR job, change the reference genome source from built-in to “from history”.
You might also consider HISAT2, which is less memory intensive than RNA STAR.
Make sure you get both the genome sequence and the gene annotation from the same source, to avoid mismatches in contig/chromosome names.
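
To check for such a mismatch before running the mapper, something like the following could be used outside Galaxy. It is only a rough sketch: the function names are made up for illustration, and it assumes the annotation is a tab-separated GTF/GFF file with the sequence name in the first column.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def fasta_names(path):
    """Collect sequence identifiers: the first word after '>' on each header line."""
    names = set()
    with open_text(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def annotation_names(path):
    """Collect sequence names from column 1 of a GTF/GFF file."""
    names = set()
    with open_text(path) as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # GFF3 files may embed sequences after this marker
            if line.startswith("#") or not line.strip():
                continue  # skip comments and blank lines
            names.add(line.rstrip("\n").split("\t", 1)[0])
    return names


def missing_from_genome(fasta_path, annotation_path):
    """Names used in the annotation that do not exist in the genome FASTA."""
    return sorted(annotation_names(annotation_path) - fasta_names(fasta_path))
```

Calling `missing_from_genome("genome.fa", "genes.gtf")` (both file names are placeholders) should return an empty list when every annotated contig/chromosome is present in the genome under the same name.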
Kind regards,
Igor

With respect to the reply above, could you please explain which data format should be used for the following?

  1. Input/query sequence file [fastq, fastqsanger, or fastqsanger.gz]
  2. Reference genome [fasta, fasta.gz, faa.gz, etc.]
  3. Do we need to normalize the reference genome FASTA?

Please advise, as I am getting an error message with either HISAT2 or RNA STAR.

Hi @sagar_khulape

  1. fastqsanger or fastqsanger.gz for most tools. If you upload reads with the default settings, that is the datatype they will be assigned.
  2. fasta and fasta.gz hold the same content, and both become datatype fasta (uncompressed) in Galaxy when uploaded with the default settings. The file must contain nucleotide sequence: the .faa extension is normally used for protein FASTA, which will not work as a mapping reference.
  3. Usually. See the help below for how to tell. And yes, any target reference genome needs to be formatted in a way that the tools can understand.
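
To give an idea of what "normalizing" a reference FASTA typically involves, here is a minimal sketch. It is an illustration only, not the Galaxy tool itself: the function name `normalize_fasta` and the line width of 60 are assumptions. It keeps only the identifier on each title line, drops blank lines, and wraps the sequence to a fixed width.

```python
def normalize_fasta(text, width=60):
    """Return FASTA text with short title lines and evenly wrapped sequence."""
    out = []
    seq_parts = []

    def flush():
        # Emit the sequence collected so far, wrapped to `width` characters.
        if seq_parts:
            seq = "".join(seq_parts)
            out.extend(seq[i:i + width] for i in range(0, len(seq), width))
            seq_parts.clear()

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines can confuse some tools
        if line.startswith(">"):
            flush()
            fields = line[1:].split()
            if not fields:
                raise ValueError("FASTA title line without an identifier")
            out.append(">" + fields[0])  # drop the description after the ID
        else:
            seq_parts.append(line)
    flush()
    return "\n".join(out) + "\n"
```

For example, a title line such as `>chr1 some long description` becomes `>chr1`, which is the name the mapper and the annotation then need to agree on. The sketch reads the whole file into memory, which is fine for a small genome but not for a very large one.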

We have many resources. I would suggest starting with these.

Hope that helps!
