Searching for protein homology in an unknown sequence


I'm searching for the protein homology of an unknown sequence and was wondering what the best approach would be. I have already identified the type of organism the unknown sequence comes from, aligned the sequence to the parent organism, and found the ORF of the aligned sequence. Could someone please tell me whether this was the correct approach and, if it was, what the next step would be? Thank you and please have a great day!

Is your organism hosted at UCSC? If so, you could visualize the hit in the browser and see which other annotation tracks overlap it. This is the quicker route, with more detail and context, if you have just one or a few sequences.

Other options include:

  • Load annotation tracks into Galaxy (from any public source) and compare their coordinates with your sequence's hit coordinates (tool groups: Bedtools, Operate on Genomic Intervals). This is the better choice if you have many sequences.
  • Visualize those data in IGV or UCSC, as a built-in or custom genome.
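
To make the coordinate comparison concrete, here is a minimal Python sketch of the overlap test that interval tools perform. It is only an illustration, not how Bedtools or the Galaxy tools are implemented. The `Interval` type, the function names, and the contig and gene names are all hypothetical, and the coordinates are assumed to be BED-style (0-based start, exclusive end).

```python
from typing import Dict, List, NamedTuple


class Interval(NamedTuple):
    """A genomic interval; assumes BED-style coordinates (0-based start, exclusive end)."""
    chrom: str
    start: int
    end: int
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """Two intervals overlap if they share a sequence and at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def annotate_hits(hits: List[Interval], features: List[Interval]) -> Dict[str, List[str]]:
    """Map each hit name to the names of the annotation features it overlaps."""
    return {
        hit.name: [f.name for f in features if overlaps(hit, f)]
        for hit in hits
    }


# Hypothetical example data: one alignment hit and three annotated features.
hits = [Interval("contig_1", 1200, 2100, "unknown_seq_hit")]
features = [
    Interval("contig_1", 1000, 1500, "geneA"),  # overlaps the hit
    Interval("contig_1", 2100, 2600, "geneB"),  # starts where the hit ends, so no overlap
    Interval("contig_2", 1200, 2100, "geneC"),  # same coordinates, different contig
]

print(annotate_hits(hits, features))
```

With these made-up inputs, only `geneA` is reported for the hit. For real data with many features, the dedicated interval tools will be much faster than this pairwise loop.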

There is more you can do to perform homology analysis. What type of sequence do you have? How was it mapped? We can give more advice with that additional info. Or, see our tutorials here:

Hello Jennaj!

I do not believe my bacterium is in the UCSC database. Also, the unknown sequence is in a FASTA file and is a merged assembly.