As you may know, we use amplicon-based NGS for whole-genome sequencing of the COVID-19 virus (SARS-CoV-2). A total of 100 amplicons cover the whole viral genome. Each amplicon is defined by two primers, a forward and a reverse primer, so 100 amplicons involve 200 primers. These primer sequences should be removed from the NGS reads before aligning to the reference. How can I clip primer sequences using your Galaxy tools?
There are many primer-trimming tools, but I don't think many of them accept an input file of primer sequences. With cutadapt you can supply your forward and reverse primers as separate files.
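As a rough sketch of what "primers as files" could look like outside Galaxy: cutadapt can read adapter or primer sequences from a FASTA file via its `file:` prefix. The Python snippet below formats two primer sets as FASTA text and builds a matching paired-end command line. The primer names, sequences, and file names are made up for illustration, and whether `-g`/`-G` (5' primers on R1 and R2) is the right pairing depends on your library prep.

```python
import shlex

# Toy primer sets; names and sequences are placeholders, not a real scheme.
FORWARD_PRIMERS = {"amp1_LEFT": "ACGTACGTACGTACGTACGT", "amp2_LEFT": "TTGACCAGTCAAGGCTTACC"}
REVERSE_PRIMERS = {"amp1_RIGHT": "GGCTTACCGATAGGTTGACC", "amp2_RIGHT": "CCATGGTTAACCGGTTAACC"}


def primers_to_fasta(primers):
    """Format a {name: sequence} mapping as FASTA text."""
    return "".join(f">{name}\n{seq.upper()}\n" for name, seq in primers.items())


def cutadapt_command(fwd_fasta, rev_fasta, r1, r2, out1, out2):
    """Build a paired-end cutadapt call that reads 5' primers from FASTA files."""
    return [
        "cutadapt",
        "-g", f"file:{fwd_fasta}",  # 5' primers to remove from read 1
        "-G", f"file:{rev_fasta}",  # 5' primers to remove from read 2
        "-o", out1,
        "-p", out2,
        r1, r2,
    ]


if __name__ == "__main__":
    print(primers_to_fasta(FORWARD_PRIMERS), end="")
    print(primers_to_fasta(REVERSE_PRIMERS), end="")
    # File names below are hypothetical.
    cmd = cutadapt_command("fwd.fasta", "rev.fasta", "R1.fastq.gz", "R2.fastq.gz",
                           "trimmed_R1.fastq.gz", "trimmed_R2.fastq.gz")
    print(shlex.join(cmd))
```

In Galaxy itself you would upload the two FASTA files as datasets and select them in the cutadapt tool form instead of building a command line.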
Assuming you are using the ARTIC primers, you can download them in BED format (https://github.com/artic-network/primer-schemes/tree/master/nCoV-2019).
Tools like ivar trim and samtools ampliconclip can "trim" primers after mapping, based on a BED file, so that might be an option for you. I personally don't have experience with these tools.
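To illustrate the general idea of trimming by BED coordinates after mapping, here is a deliberately simplified Python sketch. It is not the actual algorithm of ivar trim or samtools ampliconclip; it only shows how an aligned read's ends can be compared against primer intervals to decide how many reference bases to clip. The function name and the coordinates are invented for this example.

```python
def bases_to_clip(read_start, read_end, primers):
    """Return (left_clip, right_clip) in reference bases for one aligned read.

    read_start/read_end: 0-based, half-open alignment interval on the reference.
    primers: list of (start, end, strand) tuples, 0-based half-open, as in BED.
    """
    left = right = 0
    for p_start, p_end, strand in primers:
        # Read begins inside a forward primer: clip up to the primer's end.
        if strand == "+" and p_start <= read_start < p_end:
            left = max(left, p_end - read_start)
        # Read ends inside a reverse primer: clip back to the primer's start.
        if strand == "-" and p_start < read_end <= p_end:
            right = max(right, read_end - p_start)
    return left, right


if __name__ == "__main__":
    # Toy primer pair for a single amplicon (illustrative coordinates).
    toy_primers = [(30, 54, "+"), (385, 410, "-")]
    print(bases_to_clip(30, 410, toy_primers))   # read spanning the full amplicon
    print(bases_to_clip(100, 300, toy_primers))  # read that touches no primer
```

The real tools work on BAM records and also have to deal with CIGAR strings, soft versus hard clipping, and reads that match no primer, which this sketch ignores.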
You can use ivar trim, which is also the tool we use in the Galaxy Project COVID-19 workflows.
We also maintain primer scheme files for use with this tool on Galaxy Europe and at https://zenodo.org/record/5888324.
You might also be interested in this tutorial: Mutation calling, viral genome reconstruction and lineage/clade assignment from SARS-CoV-2 sequencing data.
samtools ampliconclip should be a good alternative to ivar trim, but I haven’t tested it myself.
Thanks. How can I make primer files in BED format myself?
Use the existing examples to guide you, I'd say.
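If you only have the primer sequences, one possible approach is to locate each primer in the reference and write out a tab-separated BED line per primer. The sketch below assumes a six-column layout (chrom, start, end, name, score, strand) with 0-based, half-open coordinates, and assumes reverse primers are given 5' to 3' as ordered, so they are reverse-complemented before searching. The function names, the toy reference, and the score value of 60 are placeholders; compare the result against an existing scheme file to see what your trimming tool expects in column 5.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA sequence (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def primer_to_bed(chrom, reference, name, primer_seq, strand, score=60):
    """Return one 6-column BED line for a primer found by exact match.

    Raises ValueError if the primer does not occur in the reference.
    Only the first exact match is reported.
    """
    primer_seq = primer_seq.upper()
    target = primer_seq if strand == "+" else reverse_complement(primer_seq)
    start = reference.upper().find(target)
    if start == -1:
        raise ValueError(f"primer {name} not found in {chrom}")
    end = start + len(target)  # BED end coordinate is exclusive
    return "\t".join([chrom, str(start), str(end), name, str(score), strand])


if __name__ == "__main__":
    # Toy reference and primers, for illustration only.
    toy_reference = "TTGACCAGTCAAGGCTTACCGATAGG"
    toy_primers = [
        ("toy_1_LEFT", "GACCAG", "+"),
        ("toy_1_RIGHT", "CTATCG", "-"),
    ]
    for name, seq, strand in toy_primers:
        print(primer_to_bed("toy_ref", toy_reference, name, seq, strand))
```

For a real scheme you would read the reference from its FASTA file, use the reference's sequence name as it appears in your alignments for the first column, and handle primers with mismatches or multiple hits, which exact string matching does not cover.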