Need help with contig assembly

I am new to bioinformatics and want to show the sequence variety within a 600 bp region of an organism's genome. I used two primer sets, each amplifying 415 bp of the 600 bp region, which gives an overlapping region of roughly 200 bp.
I pooled the amplicons from the two primer sets into one sample and sent it for sequencing.
Now I have an “R1” fastq.gz file and an “R2” fastq.gz file.
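
As a quick sanity check on the design above, here is a small sketch of the overlap arithmetic. It assumes the two 415 bp amplicons sit flush with opposite ends of the 600 bp region, which the post does not state. The read lengths are placeholders, because the sequencing platform is not given. The function names are made up for illustration.

```python
def amplicon_overlap(region_len: int, amplicon_len: int) -> int:
    """Overlap between two equal-length amplicons anchored at opposite
    ends of the target region (assumption: they start at the region ends)."""
    return max(0, 2 * amplicon_len - region_len)


def read_pair_overlap(amplicon_len: int, read_len: int) -> int:
    """Bases shared by R1 and R2 when both are read from the ends of one
    amplicon. Zero means the pair cannot be merged by overlap."""
    return max(0, min(amplicon_len, 2 * read_len - amplicon_len))


print(amplicon_overlap(600, 415))    # 230
print(read_pair_overlap(415, 250))   # 85 (hypothetical 2x250 run)
print(read_pair_overlap(415, 150))   # 0  (hypothetical 2x150 run)
```

Under these assumptions the amplicon overlap comes out at 230 bp, in line with the roughly 200 bp quoted above. Whether R1 and R2 of a single amplicon overlap at all depends on the read length of the run.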

My planned analysis is:

  1. Merge the PE reads of the amplicon from each primer set
  2. Assemble a contig from the two merged amplicons
  3. Show the variety of sequence features in the contigs
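
To make step 1 concrete, here is a minimal sketch of what merging one read pair means: reverse-complement R2, then look for an overlap between the end of R1 and the start of that sequence. It is not how any particular Galaxy tool does it. Real mergers also use base qualities and handle adapters. `merge_pair`, `min_overlap` and `max_mismatch_frac` are hypothetical names chosen for this example.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def merge_pair(r1: str, r2: str, min_overlap: int = 10,
               max_mismatch_frac: float = 0.1):
    """Merge a forward read and its reverse mate by their overlap.

    Returns the merged sequence, or None if no acceptable overlap exists.
    """
    r2_rc = reverse_complement(r2)
    # Try the longest candidate overlap first: suffix of r1 vs prefix of r2_rc.
    for overlap in range(min(len(r1), len(r2_rc)), min_overlap - 1, -1):
        tail = r1[-overlap:]
        head = r2_rc[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * overlap:
            # Keep r1's bases in the overlap; real tools choose by quality.
            return r1 + r2_rc[overlap:]
    return None


# Toy example: a 46 bp fragment read 30 bp from each end.
fragment = "ACGACGGACAGCAGGC" + "TTAGCTAGGCTTAACCGGATCGATTACGGA"
r1 = fragment[:30]
r2 = reverse_complement(fragment[16:])
merged = merge_pair(r1, r2, max_mismatch_frac=0.0)
print(merged == fragment)
```

Trying the longest overlap first is a simplification. On repetitive sequence it can pick a wrong overlap, which is one reason to use a dedicated merging tool on real data.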

I have tried several things but have not managed to get a result. Could I ask for some guidance?

Thank you so much

Hi, what exactly have you tried, and what failed? Are you following any tutorial? What is the sequencing platform/organism/data quality?

Hello David,

Thank you so much for the reply. I am very late in answering; I have since figured something out, but I ran into another problem, which I will describe in a new post.

Thank you again.

Hi @Prakit_Saingam,
could you share the solution to this question with us? It could help other colleagues.

Thank you!


@jennaj’s answer on this thread may help you.

Hi!
The approach I used was to map the reads to the reference genome and follow up with variant calling. I followed a Galaxy tutorial (From NCBI's Sequence Read Archive (SRA) to Galaxy: SARS-CoV-2 variant analysis). Before this approach, I tried genome assembly (Genome Assembly of MRSA using Illumina MiSeq Data); however, the assembly only produced a single consensus sequence, which is not what I wanted.
Variant calling serves my purpose, but I am still working out how to present the results (i.e. trying the pileup table export).
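
For anyone wondering what a pileup-style table contains, here is a rough sketch of the idea: count the bases observed at each reference position and list the non-reference bases above a frequency cutoff. It is not the output format of the Galaxy tools or the tutorial workflow. It assumes reads are already aligned without gaps and come with a 0-based start position. `pileup_counts`, `variant_table` and `min_fraction` are hypothetical names.

```python
from collections import Counter


def pileup_counts(reference: str, aligned_reads):
    """Count observed bases at each reference position.

    aligned_reads: iterable of (start, sequence) with a 0-based start and
    no insertions or deletions (a simplifying assumption).
    """
    counts = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                counts[pos][base] += 1
    return counts


def variant_table(reference: str, counts, min_fraction: float = 0.2):
    """Return rows (1-based position, ref base, alt base, alt count, depth)
    for non-reference bases at or above min_fraction of the depth."""
    rows = []
    for pos, (ref_base, ctr) in enumerate(zip(reference, counts)):
        depth = sum(ctr.values())
        if depth == 0:
            continue
        for base, n in sorted(ctr.items()):
            if base != ref_base and n / depth >= min_fraction:
                rows.append((pos + 1, ref_base, base, n, depth))
    return rows


# Toy data: half of the reads carry a C instead of G at position 3.
reference = "ACGTACGT"
reads = [(0, "ACGTAC"), (2, "GTACGT"), (0, "ACCTAC"), (2, "CTACGT")]
for row in variant_table(reference, pileup_counts(reference, reads)):
    print(row)
```

Each row gives the alternative base together with its count and the depth, so the mix of sequences at a position stays visible instead of being collapsed into one consensus base.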

I am a newbie in bioinformatics. Please let me know if anything I said is inaccurate or incorrect.

Hope this helps!
Thank you!
