I have four R1 and four R2 fastq.gz files of a bacterial genome sequence. Do I need to concatenate each of four R1 or each of four R2 files directly?
Hello, the title of your post and the text contain different questions.
Do I need to concatenate each of four R1 or each of four R2 files directly?
Mostly not. What do you want to do with the files? What kind of analyses do you want to perform?
This can also be a good place to start:
Hi gbbio, I would like to assemble the reads, so I need to combine the fastq files. Thanks
Do you need to do a de novo assembly or a “reference-based” assembly/mapping? In other words, do you have a reference, or do you know whether a genome is already available for the bacterium you sequenced? Either way, you mostly don’t need to concatenate your files unless you are doing some sort of in silico pooling. The assembly/mapping tools have separate R1 and R2 input fields. If you really want to concatenate, you can use a tool called: Concatenate datasets tail-to-head (cat). In that case, concatenate the R1 files and the R2 files separately and in the same order, so that the read pairs stay matched.
Sorry for the late reply, and thanks for your help.
Could you help me determine the fold coverage, please?
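If you do end up concatenating outside of Galaxy, here is a minimal Python sketch of the same idea. It relies on the fact that gzip files appended byte-for-byte form a valid multi-member gzip file, so nothing has to be decompressed first. The function names `concatenate_gzip` and `count_reads` are made up for this illustration, and the read counter assumes plain four-line FASTQ records.

```python
import gzip
import shutil


def concatenate_gzip(inputs, output):
    """Append gzip files byte-for-byte into one multi-member gzip file."""
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


def count_reads(fastq_gz):
    """Count FASTQ records, assuming exactly four lines per record."""
    with gzip.open(fastq_gz, "rt") as handle:
        return sum(1 for _ in handle) // 4


# Hypothetical usage (file names are placeholders, not from the thread):
# r1_files = ["L1_R1.fastq.gz", "L2_R1.fastq.gz", "L3_R1.fastq.gz", "L4_R1.fastq.gz"]
# r2_files = ["L1_R2.fastq.gz", "L2_R2.fastq.gz", "L3_R2.fastq.gz", "L4_R2.fastq.gz"]
# concatenate_gzip(r1_files, "all_R1.fastq.gz")
# concatenate_gzip(r2_files, "all_R2.fastq.gz")
# assert count_reads("all_R1.fastq.gz") == count_reads("all_R2.fastq.gz")
```

Listing the R1 and R2 inputs in the same order keeps the read pairs matched, and comparing the read counts of the two merged files is a cheap sanity check before assembly.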
The fold coverage is mostly not that interesting, but as far as I know you can estimate it from the FastQC report with a small calculation of your own: the total number of sequenced bases (number of reads times read length, summed over R1 and R2) divided by the genome size. Other tools you could take a look at are Samtools coverage and Samtools depth, which work on mapped reads. If you search for “depth” or “coverage” in the tool menu there are even more options.
I am looking for a solution to the same problem.
@gbbio thanks, I’ll try
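As a rough illustration of that calculation, the sketch below counts the bases directly from the fastq.gz files and divides by a genome size that you supply yourself. The function names `total_bases` and `fold_coverage` are invented for this example, the genome size is an assumption (for instance, the size of a related reference), and the result is only the raw sequencing depth, not the per-position depth you would get from mapped reads.

```python
import gzip


def total_bases(fastq_gz_paths):
    """Sum the lengths of all sequence lines (line 2 of each 4-line record)."""
    total = 0
    for path in fastq_gz_paths:
        with gzip.open(path, "rt") as handle:
            for line_number, line in enumerate(handle):
                if line_number % 4 == 1:
                    total += len(line.strip())
    return total


def fold_coverage(n_bases, genome_size):
    """Raw fold coverage: sequenced bases divided by genome size in bases."""
    return n_bases / genome_size


# Hypothetical usage (file names and genome size are placeholders):
# files = ["all_R1.fastq.gz", "all_R2.fastq.gz"]
# print(fold_coverage(total_bases(files), genome_size=5_000_000))
```

Pass both the R1 and the R2 files, since both mates contribute bases to the coverage.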