I am trying to use RepeatExplorer but am having some issues with it. I would be really grateful if someone could guide me through it.
I have paired-end reads from a 500 bp insert library, as this is the simplest kind of dataset we have. We have been discussing the idea of using low-coverage sequencing to evaluate repeat content. Our goal is to estimate the percentage of repeats in the genome. The genome size of my insect (a beetle) is around 2.5 Gb. I am planning to subsample reads from this low-coverage data and then feed them into RepeatExplorer to estimate the repeat content of the genome.
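To make the plan concrete, here is a rough sketch of how many reads a subsample of a given size would need. The 100 bp read length is an assumption (it is not stated above), coverage is read as a fraction of the genome size (so 0.1% means 0.001x), and the function name `reads_for_coverage` is only illustrative.

```python
def reads_for_coverage(genome_size_bp, coverage_fraction, read_length_bp):
    """Number of single reads whose total length equals
    coverage_fraction * genome_size_bp."""
    return round(genome_size_bp * coverage_fraction / read_length_bp)


GENOME_SIZE_BP = 2_500_000_000   # ~2.5 Gb beetle genome (from the post)
READ_LENGTH_BP = 100             # ASSUMPTION: read length is not given

# Coverage levels expressed as fractions of the genome size.
for fraction in (0.001, 0.01, 0.02, 0.05, 0.10):
    n_reads = reads_for_coverage(GENOME_SIZE_BP, fraction, READ_LENGTH_BP)
    n_pairs = n_reads // 2       # paired-end: two reads per fragment
    print(f"{fraction:.1%} coverage: {n_reads} reads ({n_pairs} pairs)")
```

Under these assumptions, 0.1% of a 2.5 Gb genome is 2.5 Mb of sequence, which is about 25,000 reads of 100 bp. The numbers scale inversely with the real read length.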
My question is whether such a simple dataset will give me a correct estimate of the repeat content. I know that if I have RepeatExplorer runs at varying coverage (0.1%, 1%, 2%, 5%, 10%), I should be able to observe a saturation curve for the repetitive content of my genome. But I am not sure whether the subsampling has to be done from high-coverage data or whether low-coverage data will also work.
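For reference, the subsampling step I have in mind could look roughly like the sketch below. It is not part of RepeatExplorer; it is a plain reservoir sampler that keeps mates together, and the function names (`fastq_records`, `subsample_pairs`) are just illustrative. It assumes uncompressed FASTQ with exactly four lines per record and the two files in the same read order.

```python
import random
from itertools import islice


def fastq_records(lines):
    """Group an iterable of FASTQ lines into 4-line records."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def subsample_pairs(r1_lines, r2_lines, n_pairs, seed=0):
    """Reservoir-sample n_pairs read pairs in one pass.

    Mates are drawn together, so every sampled R1 record keeps
    its matching R2 record.
    """
    rng = random.Random(seed)
    reservoir = []
    pairs = zip(fastq_records(r1_lines), fastq_records(r2_lines))
    for i, pair in enumerate(pairs):
        if i < n_pairs:
            reservoir.append(pair)      # fill the reservoir first
        else:
            j = rng.randint(0, i)       # inclusive on both ends
            if j < n_pairs:
                reservoir[j] = pair     # replace with probability n_pairs/(i+1)
    return reservoir
```

With real data, the two arguments would be open file handles for the R1 and R2 files, and each sampled pair would be written back out to a new pair of files. Using a different `seed` per run gives independent subsamples for each coverage level.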