get fasta file for installed genomes in history

Hi,
I was wondering if there is a way to get the fasta file for one of the installed genomes (available via dropdown; server indexed files).
I tried the following on usegalaxy.eu:

  • bedtools getfasta (but this needs a coordinate file and changes headers)
  • twoBitToFa (This does not provide a dropdown of installed genomes)

Thanks!

Hi @microfuge
You can generate a BED file using information from a BAM header, which usually lists each contig name and length. Sequence names can be changed later.
Many (perhaps most?) of the animal genomes were sourced from the UCSC Genome Browser, which provides data for download. This might be a convenient option for popular species, such as human, mouse, drosophila or nematode.
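
A rough sketch of the BED idea above, in Python: it reads SAM/BAM header text (for example, what `samtools view -H` prints) and emits one whole-contig BED record per `@SQ` line. The function name `header_to_bed` and the example header are made up for illustration, and this is only one way to do it, not an official Galaxy workflow.

```python
def header_to_bed(header_text):
    """Turn @SQ header lines into whole-contig BED records.

    BED is 0-based and half-open, so a contig of length LN spans 0..LN.
    """
    bed_lines = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue  # skip @HD, @PG, @RG and other header records
        # Each field after the record type looks like TAG:value
        fields = dict(
            item.split(":", 1) for item in line.split("\t")[1:] if ":" in item
        )
        if "SN" in fields and "LN" in fields:
            bed_lines.append(f"{fields['SN']}\t0\t{int(fields['LN'])}")
    return bed_lines


# Hypothetical header text, for illustration only
example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:1000\n"
    "@SQ\tSN:chrM\tLN:16569\n"
)

for record in header_to_bed(example_header):
    print(record)
```

The resulting BED could then be given to bedtools getfasta; as noted in the question, that tool rewrites the FASTA headers to include coordinates, so the names may need cleaning up afterwards.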

Kind regards,
Igor

Thanks @igor
Since the fasta, 2bit, and len files are already in Galaxy, a more direct way of getting these into the history would be great.

Hi @microfuge

There is a complicated way of getting the genome into the history that I can explain if needed. It involves finding the data in our CVMFS resource, which is public, so URLs can be copied into the Upload tool. If you share the dbkey (“database”) of the genome, I can point you to where to find it, and that will show how to find others later on.

But I’m wondering what your use case is. Maybe there is a simpler way to solve it. Would you be able to describe the goal a bit more? What tool requires the indexed reference genome directly in the history?

Hi @jennaj , Hi @igor ,
Thanks!

Now that you mention it, I understand, and I also think the use case is quite niche.

Users on our local instance want this in order to:

  • Run awk scripts in Galaxy on the fasta
  • Run other tools (not available in Galaxy) in the command line on another machine.
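
As an illustration of the kind of per-record processing such a script might do once the FASTA is available, here is a minimal Python sketch that computes sequence lengths, similar in spirit to a len file. The function name `fasta_lengths` and the example records are hypothetical, not part of any Galaxy tool.

```python
def fasta_lengths(lines):
    """Return {sequence_name: length} for an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the sequence name
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


# Made-up records, for illustration only
example = [">chr1 test", "ACGT", "AC", ">chr2", "GGG"]

for name, length in fasta_lengths(example).items():
    print(f"{name}\t{length}")
```

For a real file, the same function could be called on an open file handle, e.g. `fasta_lengths(open(path))`, where `path` points to the downloaded FASTA.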

I will try to write a custom tool in Galaxy to do this in our local instance.

Thanks Again
