How to find the genes with the most polymorphic variants in a VCF file

Hi, I have a VCF file with polymorphic variants and I am trying to find the 5 genes with the highest number of these variants. I have intersected the VCF file with a BED file of gene data from UCSC. However, I can’t figure out how to find the genes from there. I tried to convert the resulting VCF file to pgSnp so that I could then work with intervals, but the conversion fails with the following error:

bad variant nt AACACACACACACACACACAAACAT,AACACACACACACACACACACACAAACAT for nt 2 at /opt/galaxy/shed_tools/ line 95, line 173

Any input?

Thanks for the help.

Have you tried SnpSift Intervals?
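If you would rather tally the variants per gene yourself, here is a minimal Python sketch of the idea. It is not what SnpSift does internally, just a plain overlap count. It assumes a tab-separated BED file with the gene name in the 4th column and a standard VCF; the function name `count_variants_per_gene` is made up for illustration. It is a brute-force scan, so it may be slow for very large inputs.

```python
from collections import Counter


def count_variants_per_gene(vcf_lines, bed_lines):
    """Count how many VCF records start inside each BED gene interval."""
    genes = []
    for line in bed_lines:
        # Skip blank lines and BED header/track lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, name = line.rstrip("\n").split("\t")[:4]
        genes.append((chrom, int(start), int(end), name))

    counts = Counter()
    for line in vcf_lines:
        # Skip blank lines and VCF meta/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos = fields[0], int(fields[1])
        # BED is 0-based half-open, VCF POS is 1-based,
        # so a position is inside an interval when start < POS <= end.
        hits = {name for c, s, e, name in genes if c == chrom and s < pos <= e}
        # Using a set means a variant counts once per gene name,
        # even if the BED file lists several transcripts of that gene.
        counts.update(hits)
    return counts
```

With real files, something like `count_variants_per_gene(open("variants.vcf"), open("genes.bed")).most_common(5)` would list the 5 genes with the most variants (both file names are placeholders).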

Alternatively, but more complex: you could use SnpEff to annotate your VCF with genomic effects. The annotations include the gene names together with other details. Just make sure you suppress upstream/downstream annotations, so that variants lying outside a gene are not counted toward it.
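Once the VCF is annotated, the gene names can be pulled out of the INFO column and counted. The sketch below assumes SnpEff wrote its standard `ANN` field, where each comma-separated annotation is a `|`-separated record with the gene name in the 4th position. The helper name `top_genes_from_ann` is hypothetical, and each variant is counted once per gene even if it has several annotations for that gene.

```python
from collections import Counter


def top_genes_from_ann(vcf_lines, n=5):
    """Return the n genes with the most annotated variants as (gene, count) pairs."""
    counts = Counter()
    for line in vcf_lines:
        # Skip blank lines and VCF meta/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        info = line.rstrip("\n").split("\t")[7]
        genes = set()
        for entry in info.split(";"):
            if not entry.startswith("ANN="):
                continue
            # One variant can carry several annotations, separated by commas.
            for ann in entry[len("ANN="):].split(","):
                parts = ann.split("|")
                # Assumed ANN layout: Allele|Annotation|Impact|Gene_Name|...
                if len(parts) > 3 and parts[3]:
                    genes.add(parts[3])
        counts.update(genes)
    return counts.most_common(n)
```

Run on the annotated file, e.g. `top_genes_from_ann(open("annotated.vcf"))` (placeholder file name), this gives the top 5 genes directly.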

Finally, I would strongly recommend excluding indels from this type of analysis. Polymorphic indels have a high chance of being alignment artifacts, and they also complicate things because one indel at a site may affect a gene while another one falls just outside it.
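Dropping indels can be done before any of the counting. A rough Python sketch, assuming a standard VCF with REF in column 4 and ALT in column 5: a record is kept only if REF and every ALT allele are a single base. The function names are made up for illustration, and note the design choice that a multi-allelic record is dropped entirely if any of its ALT alleles is an indel.

```python
def is_snv_only(ref, alt):
    """True if REF and every comma-separated ALT allele are single bases."""
    bases = set("ACGTN")
    alleles = [ref] + alt.split(",")
    return all(len(a) == 1 and a.upper() in bases for a in alleles)


def drop_indels(vcf_lines):
    """Yield header lines unchanged and only the SNV-only records."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        if not line.strip():
            continue
        fields = line.split("\t")
        if is_snv_only(fields[3], fields[4]):
            yield line
```

The output of `drop_indels` can be written back to a file or passed straight into whatever counting step you use.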