Hi all!
I have a bunch of sequences in FASTA format, like
>test2
TCCTGGTTCTTATATCTGCCACAAAATAATTTGAATTTTTTAATGAGTTCTTGGTGGCCC
I used NCBI BLAST+ blastn to search this against the following databases:
A) Homo Sapiens GRCh37 (hg19) ncRNA+CDS
>>> which had been imported from “Shared data” to my history
B) Homo_sapiens.GRCh38.cdna.all and Homo_sapiens.GRCh38.ncrna.fa
which I had uploaded in .gz format from Ensembl (ftp://ftp.ensembl.org/pub/release-97/fasta/homo_sapiens/cdna/Homo_sapiens.GRCh38.cdna.all.fa.gz)
I chose blastn among the BLAST flavors. However, when I BLAST this very same set of sequences at NCBI, I get full-length alignments (query coverage = 100%) with no mismatches. This is not the case with the Galaxy version of BLAST. Where am I going wrong?
Blastn @ NCBI gives
XR_001756008.2
NR_047631.1
NR_047618.1
All three with no mismatches and 100% query cover.
Galaxy BLAST+ blastn against Homo_sapiens.GRCh38.cdna.all.fa gives Ensembl transcripts with a query cover of 38/60.
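To be explicit about what I mean by query cover: it is the share of the 60 query bases that fall inside the aligned region(s). Below is a minimal sketch of that arithmetic. The function name `query_coverage` and the coordinates 1..38 are hypothetical, chosen only to reproduce the 38/60 figure; they are not copied from my actual Galaxy output.

```python
def query_coverage(hsps, query_length):
    """Percent of query bases covered by at least one alignment (HSP).

    hsps: iterable of (qstart, qend) pairs, 1-based and inclusive,
          in the style of the qstart/qend columns of tabular BLAST output.
    query_length: length of the query sequence in bases.
    """
    covered = set()
    for qstart, qend in hsps:
        # Coordinates may be reversed, so normalise the order first.
        lo, hi = sorted((qstart, qend))
        covered.update(range(lo, hi + 1))
    return 100.0 * len(covered) / query_length


# Hypothetical coordinates: one alignment spanning query bases 1..38
# of a 60-base query, i.e. the 38/60 case.
partial = query_coverage([(1, 38)], 60)

# A full-length hit, as reported by the NCBI web BLAST.
full = query_coverage([(1, 60)], 60)

print(f"partial hit: {partial:.1f}% query cover")
print(f"full hit:    {full:.1f}% query cover")
```

So the Galaxy hits cover roughly 63% of each query, versus 100% at NCBI.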
Is this issue related to the databases, or am I doing something wrong in Galaxy?
Thank you very much.