Finding the Frequency of SAPs in Illumina Data from a Short Viral Genome

Hello,

I am trying to find the frequency of single amino acid polymorphisms (SAPs) in my sample of Illumina reads from a short viral genome (1682 bp) that encodes one gene. My current workflow is to run FastQC, trim my reads, align with Bowtie2, call variants with FreeBayes using a coverage cutoff of 30, and annotate the variants with SnpEff eff. I was then hoping to use “SnpEff to peptide”, but I cannot get it to work.

Currently, my input to “SnpEff to peptide” is my SnpEff eff output and a FASTA file of the amino acid sequence encoded by my 1682 bp reference genome. Is the “Ensembl all_pep.fa” input in a different format from a plain FASTA file? Or is there any reason I should not be using “SnpEff to peptide” for my sample?
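
For context, the number I am ultimately after could be computed roughly as in the sketch below. It assumes the annotated VCF keeps the per-site `DP` (depth) and `AO` (alternate observation count) INFO fields that FreeBayes normally writes, and that SnpEff wrote its annotations in the `ANN` format, where the protein change (HGVS.p) is the 11th pipe-separated subfield. The function name `sap_frequencies` and the example record are made up for illustration, so this is a rough outline and not a tested pipeline step.

```python
def sap_frequencies(vcf_lines, min_depth=30):
    """Yield (position, protein_change, frequency) for missense variants.

    Assumes FreeBayes-style DP/AO INFO fields and a SnpEff ANN field.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        pos = int(fields[1])
        alts = fields[4].split(",")
        # Parse key=value pairs from the INFO column (flags are ignored).
        info = dict(
            item.split("=", 1) for item in fields[7].split(";") if "=" in item
        )
        depth = int(info["DP"])
        if depth < min_depth:
            continue  # same coverage cutoff as in the variant-calling step
        alt_counts = [int(x) for x in info["AO"].split(",")]
        for ann in info.get("ANN", "").split(","):
            parts = ann.split("|")
            # parts[0] = allele, parts[1] = effect, parts[10] = HGVS.p
            if len(parts) < 11 or "missense_variant" not in parts[1]:
                continue
            if parts[0] not in alts:
                continue
            count = alt_counts[alts.index(parts[0])]
            yield pos, parts[10], count / depth


# Hypothetical record: 50 of 200 reads support the alternate allele.
example = (
    "ref\t100\t.\tA\tG\t500\t.\t"
    "DP=200;AO=50;RO=150;"
    "ANN=G|missense_variant|MODERATE|gene1|gene1|transcript|tx1|"
    "protein_coding|1/1|c.100A>G|p.Thr34Ala|100/1682|100/1500|34/500||"
)
for pos, change, freq in sap_frequencies([example]):
    print(pos, change, freq)  # prints: 100 p.Thr34Ala 0.25
```

If the annotations are in SnpEff's older `EFF` format instead, the subfield positions differ and the parsing above would need to change.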

Thank you very much in advance.