Convert 23andMe to VCF

vcf
tsv
23andme
bcftools

#1

I'm trying to convert a 23andMe TSV file to VCF using the bcftools convert to vcf tool. I'm building a lesson for an undergraduate class.

I get this error: Fatal error: Exit code 255 ()

--tsv2vcf requires the --samples option

python: : Unknown error 59090864

Can someone specify the proper options to get the bcftools conversion to work?


#2

This PyPI package might give you better results.

If that fails, there’s also a Perl script that I have used successfully.


#3

I knew about the PyPI package and the Perl script, but I was hoping to get my students to run this within Galaxy if at all possible.


#4

To use the Galaxy-wrapped BCFtools convert to vcf tool with TSV input, both the reference genome and the sample information need to be specified at runtime. This is noted on the tool form and highlighted in the screenshot attached below.

Sample names can be entered directly on the form or supplied as a “list” dataset in tabular format. Both include and exclude modes are possible. For help with what the sample content represents and how to format it, please see: http://samtools.github.io/bcftools/bcftools.html#convert (also linked from the bottom of the tool form).
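
For anyone curious what those form fields map to outside Galaxy, here is a rough sketch in Python that only assembles the equivalent bcftools command line, without running it. The file names and the sample name are placeholders, and I am assuming the usual 23andMe column order (rsid, chromosome, position, genotype); the Galaxy wrapper may build its own command differently.

```python
import shlex


def tsv2vcf_command(tsv_path, fasta_path, sample_name, out_path):
    """Build (but do not run) a bcftools command for 23andMe TSV -> VCF."""
    return [
        "bcftools", "convert",
        "--tsv2vcf", tsv_path,       # the 23andMe raw data file
        "-c", "ID,CHROM,POS,AA",     # column order assumed for 23andMe exports
        "-f", fasta_path,            # reference genome fasta
        "-s", sample_name,           # sample name to write into the VCF
        "-Ov", "-o", out_path,       # uncompressed VCF output
    ]


# Placeholder names, for illustration only.
cmd = tsv2vcf_command("genome.txt", "ref.fa", "Student1", "genome.vcf")
print(shlex.join(cmd))
```

The `-s` and `-f` entries correspond to the sample and reference genome fields on the form; the "--tsv2vcf requires the --samples option" message in the first post suggests the sample name was the missing piece.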

If you run into problems even with those entered, double-check that both of the following are true:

  1. The reference genome fasta has exactly the same chromosome identifiers on the “>” title lines as the VCF uses for mapping positions. The title lines should carry no extra whitespace or description content, and the sequence should be consistently wrapped at 40-80 bases. The tool NormalizeFasta can reformat fasta data correctly in most cases; the final formatting is the same as that of a custom genome.

  2. The sample names are exactly the same between the form entry/list and the VCF content. The list is a file that you create and upload to Galaxy, or that you create directly with the Upload tool’s “paste” function (use the gear icon option to “convert spaces to tabs” to ensure proper tabular formatting).
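
The identifier check in the first point can be done by eye, but a small script makes it easier to show students. This is only a sketch: it assumes the 23andMe raw layout (lines starting with "#" are comments, and the chromosome is the second column), and the toy data below is made up.

```python
def fasta_ids(lines):
    """Identifiers from '>' title lines: text up to the first whitespace."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    }


def tsv_chromosomes(lines):
    """Chromosome column (2nd field), skipping '#' comments and blank lines."""
    chroms = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split()
        if len(fields) >= 2:
            chroms.add(fields[1])
    return chroms


def missing_from_reference(tsv_lines, fasta_lines):
    """Chromosome names used in the TSV that have no matching fasta title."""
    return sorted(tsv_chromosomes(tsv_lines) - fasta_ids(fasta_lines))


# Made-up example: the fasta uses "chr" prefixes, the TSV does not.
fasta = [">chr1", "ACGT", ">chrMT", "ACGT"]
tsv = [
    "# rsid\tchromosome\tposition\tgenotype",
    "rs1\t1\t100\tAA",
    "rs2\tMT\t5\tG",
]
print(missing_from_reference(tsv, fasta))  # ['1', 'MT']
```

An empty list means every chromosome name in the data has a matching title line; anything listed points to a naming mismatch such as "1" versus "chr1". With real files, pass open file handles in place of the lists.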


Hope this works out!