Dear Galaxy Team,
I am trying to run Extract Genomic DNA on the Daphnia magna reference genome (fasta, assembly ASM2063170v1.1) and the corresponding GTF file. I downloaded both files from NCBI and uploaded them to Galaxy, and I have specified the comment lines for the GTF file.
Next, I tried to launch Extract Genomic DNA by selecting the genome fasta and the GTF file from my history, but the tool refuses to launch and shows this warning message, which I cannot fix:
Unspecified genome build, click the pencil icon in the history item to set the genome build
Fetch sequences for intervals in
Since I uploaded the files to Galaxy myself, there is of course no built-in genome build available for Daphnia magna. I have also noticed that Galaxy marks my genome fasta as (unavailable). Why?
Can someone please help me figure out what is going on here? I think it might have to do with the custom genome build. Btw, I made sure the fasta is correctly formatted by using the built-in fasta harmonizing tool NormalizeFasta in Galaxy.
Thanks a lot in advance.
Agree. The screenshot is showing that the reference genome fasta is not available for some reason.
This usually comes up when using the rerun function while the active history either doesn’t contain that same input data at all, or contains it but not in a usable state.
What to do:
- Go to User → Histories and switch to the history with both the fasta and GTF datasets
- Confirm that the fasta file is a usable dataset: the datatype is fasta (not fasta.gz or anything else)
- If needed, you can copy datasets between histories, see Copy a dataset between histories
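If you still have the downloaded file locally, the datatype check above can be roughly approximated before uploading. This is only a minimal sketch, not how Galaxy itself assigns datatypes: the function name `looks_like_plain_fasta` is hypothetical, and it merely tests for the gzip magic bytes and a leading `>` header line.

```python
# Hypothetical helper: a rough local check, not Galaxy's own datatype sniffer.
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip-compressed file


def looks_like_plain_fasta(path):
    """Return True if the file is uncompressed and its first
    non-blank line starts with '>' (a fasta header)."""
    with open(path, "rb") as handle:
        if handle.read(2) == GZIP_MAGIC:
            return False  # compressed, i.e. fasta.gz rather than fasta
        handle.seek(0)
        for raw in handle:
            line = raw.strip()
            if line:
                return line.startswith(b">")
    return False  # empty file
```

A file that fails this check would need to be uncompressed or fixed before it can serve as a plain fasta dataset.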
An alternative tool choice is bedtools MaskFastaBed. However, you’ll still need to make sure the target fasta and GTF are both in the active history.
Hope that helps but let us know! A share link to the history would be one very quick way for others to help more. Sharing your History
Thanks for your suggestions. The fasta and GTF files are correctly formatted and are interpreted as such by Galaxy. However, I noticed that the fasta contains both lower- and upper-case letters within the sequence.
Could this be the reason why the fasta is not recognized as a custom genome?
Also, thank you for suggesting bedtools MaskFastaBed. However, I have the same problem there: Galaxy does not recognize my fasta as a usable reference genome sequence, although it is correctly identified as fasta.
I will sit down and make a new and clean history of my work and will share it here, hoping that someone might be able to identify the problem. Thx @jennaj for sharing this valuable information with me.
Thanks for posting back the format of the fasta. It looks good and the upper/lower case format is not related to what is going on.
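For anyone curious about the mixed case: in genome fasta files, lowercase bases commonly mark soft-masked (for example, repeat) regions, and the bases themselves are unchanged. A minimal sketch for summarizing the case composition of a fasta is below; the function name `case_summary` is hypothetical and it takes any iterable of lines.

```python
# Hypothetical helper: counts upper/lower-case sequence characters in a fasta.
def case_summary(fasta_lines):
    """Count upper-case, lower-case, and other characters in the
    sequence lines of a fasta, skipping header and blank lines."""
    counts = {"upper": 0, "lower": 0, "other": 0}
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # header or blank line, not sequence
        for ch in line:
            if ch.isupper():
                counts["upper"] += 1
            elif ch.islower():
                counts["lower"] += 1
            else:
                counts["other"] += 1  # gaps or other non-letter symbols
    return counts
```

For a local file, this could be called as `case_summary(open("genome.fasta"))`, where the file name is a placeholder.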
Starting over in a new history is a good plan for troubleshooting. Let us know how that goes!