Errors with NCBI Blast+: cannot index custom database and dc-megablast throws window size errors.

I have recently been advising a colleague on how to mine gene-specific sequences from their metagenomic sequencing data using the Blast+ tools on the web version of Galaxy.
I had them convert their fastqs to fasta and then advised them to make a custom Blast database with these sequences. While Galaxy appears to make the makeblastdb tool available, there is no tool for indexing the database (makembindex). As such, when we try to blast our known gene against our database we get the following error:
BLAST Database error: No alias or index file found for nucleotide database [/scratch/03166/xcgalaxy/main/staging//30576019/inputs/dataset_45082029_files/blastdb] in search path [/scratch/03166/xcgalaxy/main/staging/30576019/working::]
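
For anyone reproducing this outside Galaxy: as far as I can tell, this message means BLAST could not find a .nin index file or a .nal alias file next to the database prefix it was given. That appears to be separate from the optional megablast index that makembindex builds. The Python sketch below only illustrates that check under this assumption. The helper name has_nucl_db_files is hypothetical, and the prefix is whatever path was passed as the database.

```python
from pathlib import Path


def has_nucl_db_files(db_prefix: str) -> bool:
    """Return True if a nucleotide index (.nin) or alias (.nal) file exists for db_prefix.

    Hypothetical helper: it only looks for the two file types named in the
    error message and does not validate the database contents.
    """
    # BLAST appends the extension to the whole prefix, so concatenate strings
    # instead of using Path.with_suffix(), which would replace an existing suffix.
    return any(Path(f"{db_prefix}{ext}").is_file() for ext in (".nin", ".nal"))
```

If this returns False for the prefix Galaxy reports, the database files were most likely never written to (or staged into) that location, which would point at the makeblastdb step rather than at the search itself.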

As an alternative, I then advised them to just blast directly against the fasta file without making a database. This works for blastn or tblastx queries, but we again run into a wall with dc-megablast.
As our subject is microbial community sequencing data and we are hoping to grab sequences from different species that have homologous genes to our known gene query, dc-megablast should be the most appropriate tool for this job. Yet we keep running into the following two errors:

  1. If we don’t explicitly set a window size (i.e., we try to use the default), running dc-megablast against a fasta file returns the following error:
    Error: Argument "-window_size". Value is missing Error: (CArgException::eNoArg) Argument "-window_size". Value is missing

  2. If we attempt to set a window size (either 0 or 40, the dc-megablast default), we instead get this error:
    Error: Cannot convert string '-window_size' to Int8 (m_Pos = 1) Error: Argument "window_size". Argument cannot be converted: -window_size'
    Error: (CArgException::eConvert) Argument "window_size". Argument cannot be converted: -window_size'
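
Both messages read as if the command line was assembled with a -window_size flag that has no usable value: in the first case nothing follows the flag, and in the second the token that follows is again -window_size. The Python sketch below is a guess at what a well-formed call would look like, not the Galaxy wrapper's actual code. The function name build_dc_megablast_cmd and the file names are hypothetical. It only builds the argument list, and it adds -window_size solely when a value is supplied, so the flag can never appear without its value.

```python
from typing import List, Optional


def build_dc_megablast_cmd(query: str, subject: str,
                           window_size: Optional[int] = None) -> List[str]:
    """Build a blastn dc-megablast command that searches a FASTA subject directly.

    Hypothetical helper for illustration; it does not run BLAST.
    """
    cmd = [
        "blastn",
        "-task", "dc-megablast",
        "-query", query,
        "-subject", subject,  # search a FASTA file directly, no database needed
        "-outfmt", "6",       # tabular output
    ]
    # Emit the flag together with its value, or omit both so blastn
    # falls back to its own default for the chosen task.
    if window_size is not None:
        cmd += ["-window_size", str(window_size)]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; print the command line for inspection.
    print(" ".join(build_dc_megablast_cmd("known_gene.fasta", "reads.fasta", 40)))
```

On a machine with Blast+ installed, the returned list could be handed to subprocess.run to check whether the same search succeeds outside Galaxy.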

Any advice is appreciated; it could just be me making an obvious bonehead error. I know this can easily be done on an HPC cluster, but I want this scientist to be able to run these analyses on their own, and scripting is not their specialty/focus.

Please see this related Q&A. In short, corrections have been made and you can try again if working at UseGalaxy.org.