Database error: No alias or index file -- BLAST+

BLAST does not find my Galaxy-hosted and Galaxy-created database

There have been previous reports of problems with locally hosted databases, but this is for a Galaxy-hosted database.
The error message is
LibCvmfs version 2.4, revision 25 BLAST Database error: No alias or index file found for nucleotide database [/scratch/03166/xcgalaxy/main/staging//29564501/inputs/dataset_43031652_files/blastdb] in search path [/scratch/03166/xcgalaxy/main/staging/29564

This database was created on Galaxy using the ‘make BLAST database’ tool. I ran the tool twice, and in both cases the database appears to lack an alias/index file.
Is this a Galaxy software problem?

Advice?

Hi @richard22

Apologies for the delayed reply. This was a server-side problem.

Please see this related Q&A. In short, corrections have been made and you can try again if working at UseGalaxy.org.

Dear Jennifer

I’m afraid the problem has not been fixed.

I tried to run again, same result.

Regards

Richard

Prof. Richard Lathe

University of Edinburgh Medical School

Hi @jennaj,

I was initially able to run makeblastdb successfully (see my previous post), but now I am having the same issue as @richard22. When I run blastn against the database I created with makeblastdb, it produces the error:

"BLAST Database error: No alias or index file found for nucleotide database [pathname] "

I checked the files and there are no .pal or .pl files (apparently this is the index file that is missing?).
Is there possibly a way to make my own index file? Unfortunately, this issue doesn’t seem to be resolved.

I tried running makeblastdb again but I get the error:
BLAST Database creation error: Multi-letters chain PDB id is not supported in v4 BLAST DB

Hi @richard22

^^ This was part of the prior issue (the database wasn’t created correctly when makeblastdb was run before).

^^ This is an error generated by BLAST indicating a formatting problem with the sequence identifiers in your fasta custom genome input to makeblastdb. BLAST is either actually having trouble parsing the identifiers (content on the “>” title lines) – OR – there is some other format problem and this is a fall-back error BLAST produced.

Let’s check your sequence identifiers first.

  1. Would you please post back ~5 “>” title lines
  2. And post back the first ~3 or so lines from the first sequence (including the title line and sequence lines)
  • Tool: Select (direct link Galaxy)
  • Options:
    • Select lines from: your_fasta_custom_genome
    • that: leave at the default (“Matching”)
    • the pattern: ^>

Run Select, then copy/paste the first few title lines into a reply, along with a short sample of a single sequence. If you don’t want to post the content publicly for some reason, please reply in a direct message to me here.
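
For anyone working outside Galaxy, the same check can be sketched in a few lines of Python. This is only an illustration of the Select step with the `^>` pattern: the function names `title_lines` and `first_record_sample` are made up for this example, and the sample text is placeholder data, not the poster's file.

```python
import re


def title_lines(fasta_text, limit=5):
    """Return up to `limit` lines that match the pattern ^> (FASTA title lines)."""
    pattern = re.compile(r"^>")
    hits = [line for line in fasta_text.splitlines() if pattern.search(line)]
    return hits[:limit]


def first_record_sample(fasta_text, n_lines=3):
    """Return the first `n_lines` lines of the first record (title line plus sequence lines)."""
    sample = []
    for line in fasta_text.splitlines():
        # A second title line means the first record has ended.
        if line.startswith(">") and sample:
            break
        # Start collecting at the first title line, then keep its sequence lines.
        if sample or line.startswith(">"):
            sample.append(line)
        if len(sample) == n_lines:
            break
    return sample


# Placeholder data for illustration only.
example = ">seq_1 first record\nACGT\nTTGA\n>seq_2 second record\nGGCC\n"
print(title_lines(example))
print(first_record_sample(example))
```

To run it on a real file, read the file into a string first and pass that string to both functions.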

You could also send in a bug report from your error dataset in case we need to look closer. Please do not delete the inputs/outputs, include a link to this Galaxy Help topic in the comments, and let us know you sent that in when posting back the sequence/title data.

Let’s start troubleshooting more from there :slight_smile:

Thanks for a bit of direction @jennaj. Please see below the sequence ids and one full sequence from the data set I am trying to BLAST against my database:
^>BLA_1[54-113]
^>BLA_2[144-203]
^>BLA_3[8-229]
^>BLA_4[242-334]
TTGACGAGCCCAATCCCTTGGTCGAAGCTGAGACTGAACCCGAAGTCAGGTCACTTGAAGCAAAGCAACCTGTACCTAAAAAAAGCACTCCAC
^>BLA_5[384-413]

For the database, I have only been successful with version 2.7.1 of makeblastdb. Below is a sampling of seqids and a sequence from that file. This was from the nt file downloaded directly from the NCBI ftp site and unzipped on Galaxy (local instance). I tried to run megablast on 2.7.1 to match the version I made the database with, but I am still having issues.

^>X17276.1 Giant Panda satellite 1 DNA
GATCCTCCCCAGGCCCCTACACCCAATGTGGAACCGGGGTCCCGAATGAAAATGCTGCTGTTCCCTGGAGGTGTTTTCCT
GGACGCTCTGCTTTGTTACCAATGAGAAGGGCGCTGAATCCTCGAAAATCCTGACCCTTTTAATTCATGCTCCCTTACTC
ACGAGAGATGATGATCGTTGATATTTCCCTGGACTGTGTGGGGTCTCAGAGACCACTATGGGGCACTCTCGTCAGGCTTC
CGCGACCACGTTCCCTCATGTTTCCCTATTAACGAAGGGTGATGATAGTGCTAAGACGGTCCCTGTACGGTGTTGTTTCT
GACAGACGTGTTTTGGGCCTTTTCGTTCCATTGCCGCCAGCAGTTTTGACAGGATTTCCCCAGGGAGCAAACTTTTCGAT
GGAAACGGGTTTTGGCCGAATTGTCTTTCTCAGTGCTGTGTTCGTCGTGTTTCACTCACGGTACCAAAACACCTTGATTA
TTGTTCCACCCTCCATAAGGCCGTCGTGACTTCAAGGGCTTTCCCCTCAAACTTTGTTTCTTGGTTCTACGGGCTG.
^>X51700.1 Bos taurus mRNA for bone Gla protein
^>X68321.1 B.taurus mRNA for cyclin A
^>X55027.1 Bovine mRNA for chromogranin B
^>Z12029.1 B.indicus gene for alpha-lactalbumin

I sent in a bug report as you requested, with a link to this Help topic.

Thanks again!

Andrea

Just closing this out.

As communicated via email from the bug report, the fasta dataset had duplicated identifiers.

Every “>” identifier must be unique within the same fasta for BLAST to generate an index.
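
A quick way to check a fasta for this before running makeblastdb is sketched below. It assumes the identifier is the text between “>” and the first whitespace on the title line, which is how identifiers are commonly parsed but is an assumption here, not something confirmed in this thread. The function name `duplicate_ids` and the sample lines are hypothetical.

```python
from collections import Counter


def duplicate_ids(fasta_lines):
    """Return {identifier: count} for identifiers that appear more than once.

    Assumption: the identifier is the first whitespace-delimited token
    after '>' on each title line.
    """
    counts = Counter()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            counts[seq_id] += 1
    return {seq_id: n for seq_id, n in counts.items() if n > 1}


# Placeholder data for illustration only: "seq_1" appears twice.
example_lines = [
    ">seq_1 first copy",
    "ACGT",
    ">seq_2 another record",
    "TTGA",
    ">seq_1 second copy",
    "GGCC",
]
print(duplicate_ids(example_lines))
```

An empty result means no identifier is repeated under that assumption. For a real dataset, pass an open file handle in place of the list, since the function only iterates over lines.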