NCBI BLAST+ blastx

saman_ghodsi · October 10, 2024, 9:08am

Hello,

I have a 3-frame translated database of the planarian flatworm derived from its transcriptome. I want to exclude potential non-coding RNA that has been mistakenly translated in my database. I was considering performing a BLASTx search against a non-coding RNA database, and based on the E-value, identifying the non-coding RNA sequences that have been mistakenly translated.

First, I would like to know if this approach is valid, and second, if it is feasible to perform BLASTx against an indexed non-coding RNA database in Galaxy, which I haven’t been able to find. Unfortunately, I no longer have access to the original transcriptome, so conducting a BLAST against that is not an option for me. Could you suggest what I can do to obtain this database? Can I upload it myself, considering its large size?

Thank you very much in advance for your reply.

jennaj · October 10, 2024, 5:02pm

Hi @saman_ghodsi

As far as I know, nucleotide sequences will work much better for this. Could you recover them by mapping your translations back to the original reference genome (determine the coordinates, then extract the sequence)?
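
To make the "determine the coordinates, extract the sequence" step concrete, here is a minimal Python sketch. It is only an illustration, not a Galaxy tool: the function names and the toy genome are made up, and it assumes the hit coordinates are 1-based and inclusive, with start greater than end for a minus-strand hit (the convention I would expect from BLAST-style tabular output, but please check your own results).

```python
# Hypothetical helpers for illustration; not part of Galaxy or BLAST+.
# Assumption: coordinates are 1-based and inclusive, and start > end
# signals a hit on the minus strand.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_region(genome, contig, start, end):
    """Return the nucleotides covered by one hit on the given contig."""
    lo, hi = min(start, end), max(start, end)
    region = genome[contig][lo - 1:hi]  # convert to 0-based, end-exclusive slice
    return reverse_complement(region) if start > end else region


# Toy genome: one contig held in memory as a plain string.
genome = {"contig1": "AAATGGCCTTTGGG"}

plus = extract_region(genome, "contig1", 3, 8)   # plus-strand hit
minus = extract_region(genome, "contig1", 8, 3)  # same span, minus strand
print(plus, minus)
```

For a real genome you would load the FASTA into the dictionary first, or use the Galaxy tools that extract sequence by interval instead of writing this yourself.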

Galaxy has tools for all of this, and you can usually use whatever target/reference sequences you want with any tool using Custom Reference Genome functions (see custom-genome).

Our tutorials include protocols for this kind of work, and we host other tools that are not covered by a Galaxy-specific tutorial, but those tend to involve sequencing reads. Maybe these still help for context? If you are interested, please start here. → Transcriptomics / Tutorial List

Let’s start there, and you can ask more questions.