This error message usually means that there is a data or parameter problem to resolve, and it can happen with any tool. In short, the requested analysis overwhelmed the underlying tool, and it quit without reporting more details.
The root problem is that the simplified processing in the tutorial you are following doesn’t include all of the steps needed for working with read data. That is fine for showing you the basic outline of steps with the tutorial example data, but with full-sized data you’ll need to do a bit more, or there will be a lot of noise in the data that this tool doesn’t know how to handle.
For read QA see this tutorial, or you can use a workflow that applies the same manipulations but for batch data (one or more pairs in a collection).
Go to Workflows → Public workflows and search with the keyword quality. You can run the workflow directly without even importing it, and input the paired-end collection output from the NCBI Faster tool that you already have. One pair or 100 pairs, this will all work fine.
The output will be some reports about your data, plus the trimmed reads ready to use with BWA-MEM. If the reports look good, you are ready for the next step. In your case that is mapping the reads.
If the reports do not look good, import the workflow, add more filters to Cutadapt, save, and rerun until you like the results!
You could optionally try this: map with BWA-MEM and toggle on the read groups (defaults are fine with a single sample), then use a tool like Filter BAM with parameters like the ones below, followed by MarkDuplicates, before proceeding to FreeBayes. More tutorials for all of these tools are linked from the tool forms.
Tool Parameters

| Input Parameter | Value |
|---|---|
| BAM dataset(s) to filter | MyBam.bam |
| Select BAM property to filter on | isProperPair |
| Select properly paired reads | true |
| Select BAM property to filter on | isPrimaryAlignment |
| Select BAM records for primary alignments | true |
| Select BAM property to filter on | mapQuality |
| Filter on read mapping quality (phred scale) | >=20 |
| Would you like to set rules? | true |
| Enter rules here | Not available. |
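If it helps to see the logic, here is a rough Python sketch of what those three filters mean for each alignment record. This is not the Filter BAM tool's own code, only an illustration that reads plain SAM text. The function names (`keep_record`, `filter_sam_lines`) and the example reads are made up for this sketch, and I am assuming the three conditions are combined with AND. The flag bits (0x2 for proper pair, 0x100 for secondary alignment) come from the SAM specification.

```python
# Illustrative sketch only: hypothetical helper names, not the Filter BAM tool's code.
# SAM flag bits as defined in the SAM specification.
FLAG_PROPER_PAIR = 0x2    # read is mapped in a proper pair
FLAG_SECONDARY = 0x100    # secondary (not primary) alignment

MIN_MAPQ = 20  # same threshold as the mapQuality filter (>=20)


def keep_record(flag: int, mapq: int) -> bool:
    """Return True if a record passes all three filters (assumed to be ANDed)."""
    is_proper_pair = bool(flag & FLAG_PROPER_PAIR)
    is_primary = not (flag & FLAG_SECONDARY)
    return is_proper_pair and is_primary and mapq >= MIN_MAPQ


def filter_sam_lines(lines):
    """Yield header lines unchanged, plus alignment lines that pass the filters."""
    for line in lines:
        if line.startswith("@"):
            yield line  # header lines are always kept
            continue
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])  # SAM columns 2 (FLAG) and 5 (MAPQ)
        if keep_record(flag, mapq):
            yield line


if __name__ == "__main__":
    # Made-up records, only to exercise each filter.
    example = [
        "@HD\tVN:1.6\tSO:coordinate",
        "r1\t99\tchr1\t100\t60\t5M\t=\t200\t105\tACGTA\tIIIII",   # proper pair, primary, MAPQ 60
        "r2\t97\tchr1\t150\t60\t5M\t=\t900\t755\tACGTA\tIIIII",   # paired, but not a proper pair
        "r3\t355\tchr1\t300\t60\t5M\t=\t400\t105\tACGTA\tIIIII",  # secondary alignment
        "r4\t99\tchr1\t500\t5\t5M\t=\t600\t105\tACGTA\tIIIII",    # MAPQ below 20
    ]
    for kept in filter_sam_lines(example):
        print(kept.split("\t")[0])
```

Only the header and `r1` survive in this toy example. The other three reads each fail exactly one of the filters, which is the kind of noise the filtering step is meant to remove before variant calling.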
Finally, you will have mapping results and variant calls created from reads that have had a bit of polishing, with the BAM filtered in a scientifically appropriate way. Variant calls derived from that BAM will be much more meaningful, and the CustomProDB tool should be able to process them.
Please give this a try and let us know if it actually helps or if you need more help.
I’m running a test with your original HISAT2 BAM file through some of those modified steps. That will take some time to run, and you can see what I did if it happens to turn out OK! HISAT2 is a fine choice, or you can try with BWA-MEM instead.
Then I have another test using the tutorial for the tool as the baseline. The sample data processes fine through those steps. This is because the initial reads were cleaned up a bit to make the flow easier.