Comparing SNPs across samples and sample groups

VI_Rodriguez · July 16, 2024, 5:03am

I have the VCF outputs from running FreeBayes and SnpEff on individually mapped transcriptomes, but I want to find variants that are common between samples (e.g., individual 1 vs. individual 2). Is there a tool that does this and accounts for all the genes in the transcriptome? (Note that I do not have specific genes in mind, so I must go through the whole transcriptomes.) Additionally, what can I use to translate the relevant sequences into their amino acid sequences?

igor · July 17, 2024, 2:59am

Hi @VI_Rodriguez
Can you provide more information on your pipeline? Did you map the reads to different transcriptomes, or did you map RNA-Seq data to a reference genome?

Different Galaxy servers have different sets of tools. Maybe search the tool panel (it has a search/filter box at the top) with “translate” or “orf” or “TransDecoder”.

Hope this helps.
Kind regards,
Igor

VI_Rodriguez · July 18, 2024, 11:34pm

I mapped the reads from the different samples to the mm10 reference genome using RNA STAR.

jennaj · July 19, 2024, 6:29pm

Hi @VI_Rodriguez

Jumping in since it is the weekend now in Australia (where @igor is).

Thanks for posting the extra details!

If your samples are all mapped to the same reference, you can run FreeBayes in batch mode. The merged VCF output will have one column per input sample (BAM), and it will include any variant called in any sample.
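Once you have a merged multi-sample VCF, finding variants common to two individuals comes down to comparing the genotype (GT) columns row by row. Here is a minimal sketch of that idea in plain Python, outside of Galaxy. The sample names `ind1` and `ind2`, the function names, and the example records are all made up for illustration, and it assumes every record has a GT entry in its FORMAT column.

```python
def alt_alleles(sample_field, gt_index):
    """Return the set of non-reference allele indexes in one sample column."""
    parts = sample_field.split(":")
    if gt_index >= len(parts):
        return set()
    genotype = parts[gt_index].replace("|", "/")  # treat phased like unphased
    return {a for a in genotype.split("/") if a not in ("0", ".")}


def shared_variants(vcf_lines, sample_a, sample_b):
    """List (chrom, pos, ref, alt) sites where both samples carry the same ALT allele."""
    col_a = col_b = None
    shared = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample columns start after the nine fixed VCF columns.
            col_a = 9 + fields[9:].index(sample_a)
            col_b = 9 + fields[9:].index(sample_b)
            continue
        if col_a is None:
            raise ValueError("no #CHROM header line seen before the first record")
        gt_index = fields[8].split(":").index("GT")
        common = alt_alleles(fields[col_a], gt_index) & alt_alleles(fields[col_b], gt_index)
        if common:
            shared.append((fields[0], int(fields[1]), fields[3], fields[4]))
    return shared


# Invented example records (not real data), joined with tabs like a VCF file.
header = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT", "ind1", "ind2"]
rows = [
    ["chr1", "100", ".", "A", "G", "50", ".", ".", "GT:DP", "0/1:10", "1/1:12"],
    ["chr1", "200", ".", "C", "T", "50", ".", ".", "GT:DP", "0/0:8", "0/1:9"],
    ["chr2", "300", ".", "G", "A", "50", ".", ".", "GT:DP", ".", "1/1:5"],
    ["chr2", "400", ".", "T", "C,G", "50", ".", ".", "GT:DP", "1/1:7", "2/2:6"],
]
example_vcf = ["##fileformat=VCFv4.2"] + ["\t".join(r) for r in [header] + rows]

print(shared_variants(example_vcf, "ind1", "ind2"))
```

Note that the last example record is not reported: both individuals are non-reference there, but they carry different ALT alleles. Whether that should count as "common" depends on your question, so treat the intersection rule as one possible choice.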


For your questions about nucleotide translation to protein, there are several tool options. Search the tool panel with “tran” to find these.
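To show what those translation tools do under the hood, here is a small sketch using the standard genetic code. It is only an illustration, not the implementation of any particular Galaxy tool. It assumes the input is a coding sequence that is already in frame and on the coding strand; the `translate` helper and its `to_stop` option are names I made up for this example.

```python
from itertools import product

# Standard genetic code, with codons ordered by T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, to_stop=False):
    """Translate an in-frame coding sequence into one-letter amino acid codes.

    Unknown codons (for example ones containing N) become "X", and a trailing
    partial codon is ignored. With to_stop=True, translation ends at the first
    stop codon and the "*" is not included.
    """
    cds = cds.upper().replace("U", "T")  # accept RNA as well as DNA letters
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCTAA"))                # full translation, stop shown as *
print(translate("ATGGCCTAA", to_stop=True))  # same sequence, cut at the stop
```

For real data you would first need the coding sequence for each transcript (correct strand, introns removed), which is where the dedicated tools or SnpEff annotations save a lot of work.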

But if your overall goal is to learn how variants impact protein translation, a tool like SnpEff would probably be interesting for you to explore.
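SnpEff writes its predictions into the `ANN` entry of the VCF INFO column as comma-separated annotations, each with pipe-separated subfields (allele, effect, impact, gene, and so on, including the protein-level change as `HGVS.p`). A rough sketch of pulling those out with plain Python follows. The `parse_ann` helper and the example INFO string, with its gene and transcript names, are invented for illustration, and the sketch assumes the default `ANN` layout rather than the older `EFF` format.

```python
# First eleven ANN subfields in SnpEff's default order; later ones are ignored here.
ANN_KEYS = [
    "Allele", "Annotation", "Annotation_Impact", "Gene_Name", "Gene_ID",
    "Feature_Type", "Feature_ID", "Transcript_BioType", "Rank",
    "HGVS.c", "HGVS.p",
]


def parse_ann(info):
    """Return one dict per annotation found in the ANN entry of an INFO string."""
    for entry in info.split(";"):
        if entry.startswith("ANN="):
            return [
                dict(zip(ANN_KEYS, annotation.split("|")))
                for annotation in entry[len("ANN="):].split(",")
            ]
    return []  # record has no ANN entry


# Invented INFO value with two annotations (hypothetical genes and transcripts).
example_info = (
    "DP=20;ANN="
    "G|missense_variant|MODERATE|GeneA|GeneA_id|transcript|TX1|protein_coding|2/5|"
    "c.100A>G|p.Thr34Ala|100/900|100/600|34/199||,"
    "G|upstream_gene_variant|MODIFIER|GeneB|GeneB_id|transcript|TX2|protein_coding||"
    "c.-50A>G|||||50|"
)

for ann in parse_ann(example_info):
    print(ann["Gene_Name"], ann["Annotation"], ann["HGVS.p"] or "(no protein change)")
```

Filtering these dicts on `Annotation_Impact` or on a non-empty `HGVS.p` is one way to focus on variants that change the protein.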

Both FreeBayes and SnpEff have tutorials that show how this works, plus the common intermediate steps/tools (and why those were used). Find these linked at the bottom of the tool forms, or review the Variants topic at the GTN directly.

Hope this helps and @igor can add more.
