Prokka - gbk file looks misaligned when downloaded as txt -- Data is in NCBI's gbk format

Hi all,

I was using Prokka on the public Galaxy Main server [https://usegalaxy.org] to annotate my bacterial genome. Everything went smoothly. However, the wording and spacing in the gbk output file, which downloads in txt format, appear misplaced and misaligned.

Is there any way to solve this, or can we download the gbk file in a different format (rather than txt only)? Thanks


Download the data in .gbk format and open it this way:

https://fileinfo.com/extension/gbk

Yes Jenna, but is it possible to download it as .gbk? I tried, but it came out as .txt. Is there any way to do it (and again, sorry for my stupid question!)?


Hi @czs

No problem, let me try to explain in more detail.

I downloaded a test dataset (output from Prokka) and the gbk formatting is correct. It is plain text, not “tabular”.

The format was designed and specified by NCBI, and every record follows a fixed layout in which the spacing is part of the format.

Compare your output to this example and you’ll notice that it matches up: https://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html
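If comparing by eye is tedious, a rough structural check is possible. This is only a sketch, not a validator: it assumes the usual GenBank flat-file convention that each record starts with a `LOCUS` line and ends with a `//` line. The function name `looks_like_genbank` and the tiny sample record are made up for illustration.

```python
def looks_like_genbank(text):
    """Rough check: the text starts with LOCUS and every record is closed by //."""
    # Ignore blank lines so trailing newlines do not affect the result.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return False
    # Count record openers (LOCUS at column 1) and record terminators (//).
    starts = sum(1 for ln in lines if ln.startswith("LOCUS"))
    ends = sum(1 for ln in lines if ln.strip() == "//")
    return lines[0].startswith("LOCUS") and starts == ends


# Made-up, heavily shortened record used only to exercise the check.
sample = (
    "LOCUS       contig_1   120 bp    DNA     linear   01-JAN-2020\n"
    "FEATURES             Location/Qualifiers\n"
    "     source          1..120\n"
    "ORIGIN\n"
    "        1 atgc\n"
    "//\n"
)
```

The check only looks at line prefixes, so the leading spaces that make the file look "misaligned" in a text viewer do not matter to it.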

If you want to use some tool that requires the .gbk extension, change the file extension after downloading the data. Tool examples are in my original reply.
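Renaming can be done in any file manager. If you prefer to script it, here is a minimal sketch using Python's standard `pathlib`. The helper name `rename_to_gbk` and the file name `prokka_output.txt` are placeholders, not anything Galaxy produces by that name. Only the extension changes; the contents are left exactly as downloaded.

```python
from pathlib import Path


def rename_to_gbk(path):
    """Rename a downloaded file so it ends in .gbk, leaving its contents untouched."""
    src = Path(path)
    dst = src.with_suffix(".gbk")  # replaces the final suffix, e.g. .txt -> .gbk
    if dst.exists():
        # Refuse to overwrite an existing file with the target name.
        raise FileExistsError(f"{dst} already exists")
    src.rename(dst)
    return dst


# Example call (hypothetical file name):
# rename_to_gbk("prokka_output.txt")
```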

The Prokka tool creates many outputs in Galaxy and some are parsed into other formats (fasta, tabular, txt). So if you don’t specifically need .gbk, review those.

Note: Any data in txt format may contain spaces, tabs, or missing values in some columns. These are often custom report styles; it is simply how the original tool authors formatted the output. Preserving that formatting is important in many downstream use cases.

Hi @jennaj

I see! Thanks for your help!
