How to filter rare variants (10%) out

Lidia_Ryabova · September 17, 2019, 10:22am

Hi, I have a huge 107 GB .bam file. It is pooled data consisting of 50 whole genomes.
I need to filter the rare variants (10%) out of this file.
Which tool should I use for this, please?
Thank you in advance.

Lidia

gbbio · September 17, 2019, 11:05am

Can’t give a full solution right now, but you could start by looking at FreeBayes.

jxtx · September 17, 2019, 1:57pm

If you are starting from a BAM file, you need to both call and filter variants to get to all sites that are more than 10% variable.

Since you know the number of genomes, you can use `--pooled-discrete` with a `--ploidy` of 100. Or you can use `--pooled-continuous`.
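To make the two options concrete, here is a minimal sketch that only assembles the FreeBayes command line, without running it. The helper `build_freebayes_command` and the file names `reference.fa` and `pooled.bam` are made up for illustration, and the ploidy of 100 assumes the 50 pooled genomes are diploid (50 × 2).

```python
from typing import List


def build_freebayes_command(
    reference_fasta: str,
    bam_path: str,
    n_genomes: int,
    genome_ploidy: int = 2,  # assumption: each pooled individual is diploid
    discrete: bool = True,
) -> List[str]:
    """Assemble (but do not run) a FreeBayes call for a pooled sample."""
    cmd = ["freebayes", "--fasta-reference", reference_fasta]
    if discrete:
        # Known number of genomes: total chromosome copies in the pool.
        pooled_ploidy = n_genomes * genome_ploidy
        cmd += ["--ploidy", str(pooled_ploidy), "--pooled-discrete"]
    else:
        # Unknown or very large pool: frequency-based model.
        cmd += ["--pooled-continuous"]
    cmd.append(bam_path)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; replace with your own reference and BAM.
    print(" ".join(build_freebayes_command("reference.fa", "pooled.bam", n_genomes=50)))
```

On Galaxy the same choices are made in the FreeBayes tool form rather than on a command line, so treat this as a reading aid for the option names.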

Then you can filter using VCFFilter; the AF INFO field in the VCF contains the frequency information.
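As an illustration of what that filtering step does, here is a rough pure-Python sketch that keeps VCF records whose AF value is above 10%. It is not VCFFilter itself: the function names are made up, it assumes the caller populated AF in the INFO column, and for multi-allelic sites it keeps the record if any alternate allele passes.

```python
from typing import Iterable, Iterator, List


def allele_frequencies(info: str) -> List[float]:
    """Return the AF values from a VCF INFO string (empty list if absent)."""
    for entry in info.split(";"):
        if entry.startswith("AF="):
            return [float(v) for v in entry[3:].split(",") if v not in ("", ".")]
    return []


def keep_common_variants(lines: Iterable[str], min_af: float = 0.10) -> Iterator[str]:
    """Yield header lines plus records with at least one AF above min_af."""
    for line in lines:
        if line.startswith("#"):
            yield line  # keep the VCF header untouched
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        if any(af > min_af for af in allele_frequencies(fields[7])):
            yield line


if __name__ == "__main__":
    # Made-up records, only to show the behaviour.
    example = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "chr1\t100\t.\tA\tG\t50\t.\tDP=200;AF=0.25",
        "chr1\t200\t.\tC\tT\t50\t.\tDP=180;AF=0.02",
    ]
    for kept in keep_common_variants(example):
        print(kept)
```

For a file of this size, streaming the VCF line by line (as the generator does) avoids loading everything into memory.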

Lidia_Ryabova · September 18, 2019, 10:16am

Hi, thank you very much. This is my very first time working with Galaxy, and I have to support a PhD student. May I ask whether there is a tutorial I can follow to call and filter variants, please?

jxtx · September 18, 2019, 2:57pm

It will need to be adjusted because you are using pooled samples, but this is probably a good start: https://training.galaxyproject.org/training-material/topics/variant-analysis/tutorials/dip/tutorial.html

The non-diploid tutorial uses prokaryotic examples, but describes the pooled options: https://training.galaxyproject.org/training-material/topics/variant-analysis/tutorials/non-dip/tutorial.html

Other variant analysis tutorials are here: https://training.galaxyproject.org/training-material/topics/variant-analysis/
