hg38 patch 11 fa.gz error

Hi,
I’m following the tutorial for alternative splicing and trying to generate a fasta file with the gffread tool (Hands-on: Genome-wide alternative splicing analysis / Transcriptomics)

I’m using the hg38 fasta.gz as reference genome. However, I get the following error when running the tool:

Warning: couldn’t find fasta record for ‘chr14_KZ208920v1_fix’!
Error: no genomic sequence available (check -g option!).

I have seen online that chr14_KZ208920v1_fix is from a patch update.
Therefore, I uploaded an hg38patch11.fa.gz file, but I get the same error.
I notice that patch 11 only has 123 sequences. Is there a way to integrate/merge the patch file into the main hg38 fa.gz file? Or is there any other way to fix this error so I can run this tool?
Thank you!

Hi @jnguyen1

I replied to your bug report directly, but for others reading I also put the content into a helper guide on this forum. Please see → Reference genomes at public Galaxy servers: GRCh38/hg38 example

In short, it is important to use a reference annotation that matches the reference genome: every sequence name in the annotation needs a record with the same name in the fasta. There are at least two primary methods to accomplish the data preparation, and both are covered in the guide above. And while that guide has extra information about human genomes, the FAQs it links to apply to any grouping of genomic reference files.
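For anyone who wants to check this kind of mismatch outside of Galaxy first, here is a minimal Python sketch. It is only an illustration, not necessarily one of the methods from the guide: it lists the sequence names used in an annotation that have no record in the fasta, and can drop those annotation lines. The function names and the tiny in-memory example data are made up for the illustration, and it assumes a tab-separated GTF/GFF with the sequence name in the first column.

```python
import gzip
import io


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def fasta_names(handle):
    """Collect sequence names (first word after '>') from a FASTA handle."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def annotation_names(handle):
    """Collect sequence names (column 1) from a GTF/GFF handle."""
    names = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip comment and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def filter_annotation(handle, keep):
    """Yield comment lines and feature lines whose sequence name is in `keep`."""
    for line in handle:
        if line.startswith("#") or line.split("\t", 1)[0] in keep:
            yield line


# Tiny in-memory example (hypothetical data, not real hg38 content).
example_fasta = ">chr14 example description\nACGT\n>chr1\nGG\n"
example_gtf = (
    "#example header\n"
    'chr14\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "a";\n'
    'chr14_KZ208920v1_fix\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "b";\n'
)

genome = fasta_names(io.StringIO(example_fasta))
annotated = annotation_names(io.StringIO(example_gtf))
missing = annotated - genome
print(sorted(missing))  # ['chr14_KZ208920v1_fix']

# Keep only the annotation lines that the genome can actually serve.
kept = list(filter_annotation(io.StringIO(example_gtf), genome))
```

With real files, the same functions can be called on `open_text("genome.fa.gz")` and `open_text("annotation.gtf")` (placeholder file names). Anything that shows up in `missing` is a sequence the tool would warn about, so either the annotation is filtered down to what the genome contains, or a genome that includes those sequences is used instead.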