I would like to analyze Nanopore-generated sequence data for 24 PCR fragments from a mammalian species with a published reference genome, and a 25th amplicon from a species without a published reference genome. I have the Sanger sequence for the 25th fragment, which I designate here as 25-Ref. The 25 PCR fragments were pooled in equimolar quantities and run on one flow cell. The sequence data were returned to us as a 20 GB folder containing ~1600 fastq files. The combined length of the 25 fragments is < 100 kb, so a very high depth of coverage is expected. My objectives and questions are stated below:
Objectives: to perform local de novo assembly of regions that are difficult to sequence or assemble, to call variants on the sequenced amplicons, and to generate a proof of concept for future projects.
- I plan to use the Nanopore tools on UseGalaxy.eu and UseGalaxy.org to analyze the above data. What is the best way to upload the 1600+ files to UseGalaxy.eu / UseGalaxy.org via FTP: as one folder of files or as individual files?
- Since the above Nanopore files are in fastq format, can I analyze them in Galaxy with the same tools used for Illumina fastq files, e.g. BWA-MEM, with the first 24 amplicons aligned to the known reference genome and the 25th aligned to 25-Ref?
- Does anyone know of tutorials or methods for analyzing Nanopore sequence data from short amplicons (fragments under 10 kb)? My googling yielded few useful leads.
Thank you in advance for your help. I am new to Nanopore data processing.