I’m running RepeatExplorer on ~12 populations. Due to quota limitations, I upload only one population at a time, concatenate the sequences, and then have RepeatExplorer randomly sample 2,000,000 reads. On the first run, it sampled only 520,000 reads, which I assumed was because it automatically downsamples to fit within the reserved memory (4 GB). However, each subsequent run sampled fewer reads: 370,000, 200,000, 130,000, and 70,000. I understand that it may downsample due to the repetitiveness of the genome, but the last few populations have the smallest genomes and, at least according to the output, much lower repeat content.
Can someone explain why it appears to be behaving this way?