Hi!
I generated a VCF file and then annotated it using bcftools.
The header columns are: CHROM POS ID REF ALT QUAL FILTER INFO FORMAT data
Here is an example record (the important part is the GENE=___ tag at the end of the INFO field):
chr16 868776 . CGGGGGGGGGGGGC CGGGGGGGGGGGC 61.251 . AB=0.363636;ABP=4.78696;AC=1;AF=0.5;AN=2;AO=4;CIGAR=1M1D12M;DP=11;DPB=11.1429;DPRA=0;EPP=11.6962;EPPR=5.18177;GTI=0;LEN=1;MEANALT=6;MQM=37.5;MQMR=42;NS=1;NUMALT=1;ODDS=5.44144;PAIRED=1;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=136;QR=32;RO=1;RPL=0;RPP=11.6962;RPPR=5.18177;RPR=4;RUN=1;SAF=4;SAP=11.6962;SAR=0;SRF=1;SRP=5.18177;SRR=0;TYPE=del;technology.ILLUMINA=1;GENE=ENST00000262301.15
How do I group the variants by their GENE value and count how many variants fall in each gene? Any help would be greatly appreciated.
Cheers,
Glenn
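
One possible approach, as a minimal sketch rather than a definitive answer: read the records in Python, pull the GENE tag out of the INFO column (the 8th column), and tally the values with a Counter. The function names (gene_of, count_variants_by_gene) and the "(no GENE tag)" label are made up for illustration. The sketch assumes an uncompressed VCF with one GENE value per record and an INFO field containing no whitespace.

```python
import sys
from collections import Counter


def gene_of(info):
    """Return the GENE value from a VCF INFO string, or None if it is absent."""
    for field in info.split(";"):
        key, sep, value = field.partition("=")
        if key == "GENE" and sep:
            return value
    return None


def count_variants_by_gene(lines):
    """Count variant records per GENE value.

    `lines` is any iterable of VCF lines. Header lines (starting with '#')
    and blank lines are skipped. Records without a GENE tag are counted
    under the hypothetical label '(no GENE tag)'.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on any whitespace so both tab- and space-separated
        # records (as pasted in the post) are handled.
        columns = line.split()
        if len(columns) < 8:
            continue  # not a complete VCF record
        gene = gene_of(columns[7])  # INFO is the 8th column
        counts[gene if gene is not None else "(no GENE tag)"] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_genes.py annotated.vcf   (file name is an example)
    with open(sys.argv[1]) as handle:
        for gene, n in count_variants_by_gene(handle).most_common():
            print(f"{gene}\t{n}")
```

For a bgzipped file, gzip.open(path, "rt") can replace open(path). If the GENE tag is declared in the VCF header, something like bcftools query -f '%INFO/GENE\n' annotated.vcf | sort | uniq -c should give the same tally from the shell. Note that the value in the example record (ENST00000262301.15) looks like an Ensembl transcript ID, so the counts would be per transcript unless the IDs are mapped to genes first.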