Hello!
We have a pool of approximately 600,000 reads of about 150 bp each (give or take a few bases), generated by NGS.
From this pool, we want to count the number of times a certain sequence occurs. Is there a way to do this using Galaxy?
I should add that we have approximately 12 reference sequences, and we want to determine how prevalent each one is in the pool.
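For anyone wanting to sanity-check the counts outside Galaxy, here is a minimal Python sketch of the idea, assuming "identify" means an exact substring match on the same strand. The function name `count_reference_hits` and the example sequences are made up for illustration; mismatches, indels, and reverse-complement hits are not handled.

```python
def count_reference_hits(reads, references):
    """Count how many reads contain each reference as an exact substring.

    reads: iterable of read sequences (strings).
    references: dict mapping a reference name to its sequence.
    Assumption: exact, same-strand matching only.
    """
    # Normalise case once so the comparison is case-insensitive.
    refs = {name: seq.strip().upper() for name, seq in references.items()}
    counts = {name: 0 for name in refs}
    for read in reads:
        read = read.strip().upper()
        for name, seq in refs.items():
            # Each read is counted at most once per reference.
            if seq in read:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Toy data, not real reads.
    example_reads = ["ACGTACGTTT", "TTTACGTAAA", "GGGGCCCC"]
    example_refs = {"ref1": "ACGT", "ref2": "CCCC"}
    print(count_reference_hits(example_reads, example_refs))
```

With 600,000 reads and about 12 references this brute-force loop should be quick enough, but a proper aligner or mapping tool is the better choice if you need to tolerate sequencing errors.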
Hello,
Thanks for your response.
Our data files are CSV files in which each sequencing read is listed as a row. Do you have any suggestions for how to use this format, given that the tool requires FASTA.GZ files?
Many thanks
The .csv datatype means a “comma separated values” type of data file.
Try this:
Tutorial → NGS data logistics
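If it is easier to convert the file before uploading, the following is a rough Python sketch of turning a CSV with one read per row into a gzipped FASTA file. It assumes the sequence sits in the first column and that there is no header row; adjust `seq_column` and `has_header` if your layout differs. The function name `csv_to_fasta_gz` and the `read_N` identifiers are invented for this example.

```python
import csv
import gzip


def csv_to_fasta_gz(csv_path, fasta_gz_path, seq_column=0, has_header=False):
    """Write one FASTA record per CSV row and return the number written.

    Assumptions: one read per row, sequence in column `seq_column`,
    no read names in the CSV (records are named read_1, read_2, ...).
    """
    written = 0
    with open(csv_path, newline="") as src, gzip.open(fasta_gz_path, "wt") as dst:
        reader = csv.reader(src)
        if has_header:
            next(reader, None)  # skip the header row if present
        for row in reader:
            # Skip rows that are too short or have an empty sequence cell.
            if len(row) <= seq_column:
                continue
            seq = row[seq_column].strip()
            if not seq:
                continue
            written += 1
            dst.write(f">read_{written}\n{seq}\n")
    return written
```

A call such as `csv_to_fasta_gz("reads.csv", "reads.fasta.gz")` (file names are placeholders) would produce a file you can upload to Galaxy with the fasta.gz datatype.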