Check whether genes exist in a FASTA file

Hello everyone,
I am a software developer and new to the bioinformatics field. I have a sample file, which is a FASTA file, and I want to check whether any of my gene sequences exist in this FASTA file.
I tried searching directly for the gene sequence within the contig sequences, but I always get 0 matches.
I have heard that it is not necessary to find the whole gene sequence, and that a part (percentage) of it may be enough, but unfortunately I don’t know how to get this percentage.

I am using C# in my code. So, if anyone knows how to resolve this, or has a library/add-on that can be used from my code, please let me know.
Also, an article describing the process would be very helpful.

I appreciate your help.

Welcome, @Mohammed_Ramadan

How to perform the analysis depends on what kind of data you are starting with.

Is that FASTA file a reference genome sequence? DNA? If yes, maybe start by reviewing our resources here to see if they add some useful context about how this is usually approached.

And, I’ll just add that if you source a FASTA for a reference genome from a public repository, they probably also have a reference annotation file (or files) with known or predicted gene bounds and potentially other genomic features available. You’ll find that sort of data referenced in the tutorials above, too.
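
On the "0 matches" from a direct search: assuming the file is DNA, two common causes are that FASTA sequences are usually wrapped across several lines (so a gene can straddle a line break) and that the gene may sit on the reverse strand. Here is a minimal sketch in Python, which should port to C# fairly directly. All function names (`read_fasta`, `reverse_complement`, `kmer_fraction`, `find_gene`) and the `k` parameter are made up for illustration. The "percentage" it reports is only the share of the gene's k-mers seen in a contig, a rough stand-in and not a real alignment score; alignment tools such as BLAST are what people normally use to get percent identity and coverage.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs, joining wrapped sequence lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks).upper()
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks).upper()


# Only plain DNA letters are complemented; other symbols pass through unchanged.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def _kmers(seq, k):
    """Set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_fraction(gene, contig, k=11):
    """Share of the gene's distinct k-mers found in the contig (either strand).

    Assumption: this is a crude proxy for "how much of the gene is there",
    not an alignment-based percent identity.
    """
    gene, contig = gene.upper(), contig.upper()
    gene_kmers = _kmers(gene, k)
    if not gene_kmers:
        return 0.0
    contig_kmers = _kmers(contig, k) | _kmers(reverse_complement(contig), k)
    return len(gene_kmers & contig_kmers) / len(gene_kmers)


def find_gene(gene, lines, k=11):
    """Return (header, exact_match, kmer_fraction) for every contig."""
    gene = gene.upper()
    gene_rc = reverse_complement(gene)
    results = []
    for header, seq in read_fasta(lines):
        exact = gene in seq or gene_rc in seq
        results.append((header, exact, kmer_fraction(gene, seq, k)))
    return results
```

You would call it with an open file, for example `find_gene(my_gene, open("contigs.fasta"))`, where the file name is a placeholder. A contig with `exact_match` false but a high fraction is a candidate for a partial or diverged copy, which is the point where a proper alignment tool is worth running.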

You can ask follow-up questions! You probably will not get much coding support here, since our topics focus more on the usage side of the Galaxy Project. But, if you are developing a tool for Galaxy, you can try explaining a bit more about what you are doing, and maybe we can help to route you to the correct place or bring people in here.

Let’s start there! :slight_smile: