I’m running a blastn search to identify best hits for a large batch of sequences. I would like the tabular output to include the scientific name of each best hit. I am using the following command:
blastn -query '/anvil/scratch/x-xcgalaxy/main/staging/70028283/inputs/dataset_921a7f8f-68ab-47c1-a502-b5de65826183.dat' -db '"/cvmfs/data.galaxyproject.org/byhand/blastdb/nt/2023-09-01/nt"' -task 'blastn' -evalue '1e-05' -out '/anvil/scratch/x-xcgalaxy/main/staging/70028283/outputs/dataset_eb92d3c7-bb52-43d1-b0bb-2ad4416fbb01.dat' -outfmt '6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore staxids sscinames scomnames' -num_threads "${GALAXY_SLOTS:-8}" -strand both -dust yes -max_hsps '3'
This works as expected, except that the final two columns containing the species names (14 and 15, sscinames and scomnames) come back as “N/A”.
Upon closer inspection, I am getting the following warning:
Warning: [blastn] Taxonomy name lookup from taxid requires installation of taxdb database with ftp://ftp.ncbi.nlm.nih.gov/blast/db/taxdb.tar.gz
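As far as I understand, on a machine I control the fix would be roughly the sketch below: unpack the taxdb archive so that taxdb.btd and taxdb.bti sit in a directory listed in the BLASTDB environment variable. The TAXDB_DIR location and the has_taxdb helper are my own placeholders, not anything Galaxy provides, and I have not been able to try this on the Galaxy server itself.

```shell
# Hypothetical helper: succeeds only if both taxdb files exist in the given directory.
has_taxdb() {
    [ -f "$1/taxdb.btd" ] && [ -f "$1/taxdb.bti" ]
}

# Assumed location; on a shared server only an administrator could choose this.
TAXDB_DIR="${TAXDB_DIR:-$HOME/blastdb/taxdb}"

if has_taxdb "$TAXDB_DIR"; then
    echo "taxdb found in $TAXDB_DIR"
else
    echo "taxdb missing from $TAXDB_DIR"
    # To install it (needs network access), something like:
    #   mkdir -p "$TAXDB_DIR"
    #   curl -o "$TAXDB_DIR/taxdb.tar.gz" ftp://ftp.ncbi.nlm.nih.gov/blast/db/taxdb.tar.gz
    #   tar -xzf "$TAXDB_DIR/taxdb.tar.gz" -C "$TAXDB_DIR"
fi

# Append the taxdb directory to BLASTDB (colon-separated) so blastn can look there.
export BLASTDB="${BLASTDB:+$BLASTDB:}$TAXDB_DIR"
```

On the public Galaxy server I have no way to set BLASTDB or write next to the nt database, which is why I am asking here.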
How can the taxonomic database be added to the Galaxy servers?
Thanks!
Joel