Hello,
I downloaded a GTF file from NCBI and uploaded it to Galaxy. I ran HISAT2 on my sample, but I got an error when I used the GTF file with StringTie.
Error: no valid ID found for GFF record
I even removed the header comments and ran StringTie against my HISAT2 result, but I still got the error.
Does anyone know what the problem is?
Hi @Faaiza_Ibrahim_B.Sc
Maybe consider providing a brief description of your goal(s). Are you after gene annotation or read counting/gene expression?
You are using HISAT2 and StringTie on bacterial data. HISAT2 is a gapped aligner that can align reads across introns. StringTie is often used for prediction of genes in eukaryotic genomes. On the other hand, many bacterial genes are organized in operons. Read splitting (spliced alignment) can be disabled in HISAT2. I am not sure StringTie is the best option for annotating bacterial genes, but I am not familiar with the topic.
For read counting, maybe consider alternative tools such as featureCounts or htseq-count. You may need to tweak the settings, depending on the annotation file and your goals. Read counting tools count reads on the features whose type is given in the 3rd column and aggregate the results using one of the attributes from the last column. By default, many read counting tools count reads against exons, but exon annotations may be absent from some bacterial datasets (no "exon" in the 3rd column). In that case you need to choose another feature type, for example CDS or gene. You may get different results with different settings. The same applies to the attributes.
As I don't know what you are trying to achieve and cannot check the data, it is hard to answer your question.
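To see which feature types and attributes a particular annotation file offers, a quick tabulation can help before choosing settings. Below is a minimal Python sketch, not part of any of the tools mentioned. The function names, the sample records, the `contig1` sequence name, and the `GV_0001` identifier are all made up for illustration. It also counts records where a chosen attribute (here `transcript_id`) is missing or empty, on the assumption that such records may be one reason a tool cannot find a usable ID; I have not verified this against the actual file.

```python
import io
import re
from collections import Counter

# Matches GTF attributes such as: gene_id "GV_0001"; exon_number 1;
_ATTRIBUTE = re.compile(r'(\w+)\s+(?:"([^"]*)"|([^;\s]+))')


def parse_gtf_attributes(text):
    """Return a dict of attribute key -> value from the 9th GTF column."""
    attributes = {}
    for match in _ATTRIBUTE.finditer(text):
        quoted, bare = match.group(2), match.group(3)
        attributes[match.group(1)] = quoted if quoted is not None else bare
    return attributes


def summarize_annotation(handle, id_attribute="transcript_id"):
    """Tabulate feature types, attribute keys, and records lacking id_attribute."""
    feature_types = Counter()   # values seen in the 3rd column
    attribute_keys = Counter()  # number of records carrying each attribute key
    missing_id = Counter()      # per feature type: id_attribute absent or empty
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a well-formed GTF record
        feature = fields[2]
        attributes = parse_gtf_attributes(fields[8])
        feature_types[feature] += 1
        attribute_keys.update(attributes.keys())
        if not attributes.get(id_attribute):
            missing_id[feature] += 1
    return feature_types, attribute_keys, missing_id


# Hypothetical two-record example; real files will differ.
SAMPLE = (
    "#gtf-version 2.2\n"
    'contig1\tRefSeq\tgene\t1\t900\t.\t+\t.\t'
    'gene_id "GV_0001"; transcript_id ""; gbkey "Gene";\n'
    'contig1\tRefSeq\tCDS\t1\t897\t.\t+\t0\t'
    'gene_id "GV_0001"; transcript_id "unassigned_transcript_1"; '
    'product "hypothetical protein";\n'
)

types, keys, missing = summarize_annotation(io.StringIO(SAMPLE))
print("feature types:", dict(types))
print("attribute keys:", dict(keys))
print("records without transcript_id:", dict(missing))
```

If the summary shows no "exon" entries, that would be the case described above where CDS or gene has to be selected instead, and the attribute counts indicate which attribute is present on every record of the chosen type.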
Thank you for the quick response!
Problem solved. My data is RNA-seq, and I hope to determine gene expression in my samples.
I have run the analysis up to DESeq2 and found some novel genes (MSTRG) as well as some genes annotated as hypothetical proteins, and I am not sure how to annotate them.
I plan to retrieve the FASTA sequences for the novel genes and perform a two-step BLAST:
1. BLAST against the specific bacterium (in this case, Gardnerella vaginalis)
2. BLAST against organisms related to Gardnerella
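The sequence retrieval step could be sketched roughly as follows in Python. This is only an illustration under assumptions: it assumes the novel transcripts appear as "transcript" records whose gene_id starts with "MSTRG", that the genome FASTA uses the same sequence names as the GTF, and that taking the full transcript span is acceptable (no splicing). The function names and the tiny example data are made up.

```python
import io
import re

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(handle):
    """Return {sequence_name: sequence} from a FASTA stream."""
    sequences, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                sequences[name] = "".join(chunks)
            # keep only the first word of the header as the name
            name, chunks = line[1:].split()[0], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


def novel_transcript_sequences(gtf_handle, genome, prefix="MSTRG"):
    """Yield (transcript_id, sequence) for transcripts of genes named with prefix."""
    for line in gtf_handle:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "transcript":
            continue
        # \b keeps "ref_gene_id" from being mistaken for "gene_id"
        gene = re.search(r'\bgene_id "([^"]*)"', fields[8])
        transcript = re.search(r'\btranscript_id "([^"]*)"', fields[8])
        if not gene or not transcript or not gene.group(1).startswith(prefix):
            continue
        start, end = int(fields[3]), int(fields[4])  # GTF: 1-based, inclusive
        sequence = genome[fields[0]][start - 1:end]
        if fields[6] == "-":
            sequence = sequence.translate(_COMPLEMENT)[::-1]  # reverse complement
        yield transcript.group(1), sequence


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{sequence}\n" for name, sequence in records)


# Hypothetical example data; real inputs would be the genome and the assembled GTF.
GENOME = read_fasta(io.StringIO(">contig1 example\nAACCG\nGTTAC\n"))
GTF = (
    'contig1\tStringTie\ttranscript\t2\t5\t.\t-\t.\t'
    'gene_id "MSTRG.1"; transcript_id "MSTRG.1.1";\n'
    'contig1\tStringTie\ttranscript\t7\t10\t.\t+\t.\t'
    'gene_id "GV_0001"; transcript_id "GV_0001.t1";\n'
)
print(to_fasta(novel_transcript_sequences(io.StringIO(GTF), GENOME)), end="")
```

The resulting FASTA text could then be used as the query for both BLAST steps.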
For the hypothetical proteins, I have not decided on a procedure yet.