Hi! I am fairly new to bioinformatics (genome annotation of a non-model fish). Our team has finished structural annotation using the BRAKER3 pipeline, so I now have outputs such as braker.aa, etc.
I used eggNOG-mapper and InterProScan for functional annotation. For InterProScan, I used my braker.aa file as input with the following parameters:
InterProScan database: 5.59-91.0
Applications to run:
TIGRFAM: protein families based on hidden Markov models (HMMs)
FunFam: Prediction of functional annotations for novel, uncharacterized sequences.
SFLD: a database of protein families based on hidden Markov models (HMMs)
SUPERFAMILY: database of structural and functional annotation for all proteins and genomes
PANTHER: Protein ANalysis THrough Evolutionary Relationships
Gene3d: Structural assignment for whole genes and genomes using the CATH domain structure database
HAMAP: High-quality Automated Annotation of Microbial Proteomes
PROSITE Profiles: protein domains, families and functional sites as well as associated profiles to identify them
Coils: Prediction of Coiled Coil Regions in Proteins
SMART: identification and analysis of domain architectures based on Hidden Markov Models or HMMs
SMART: protein domains and families based on well-annotated multiple sequence alignment models
PRINTS: group of conserved motifs (fingerprints) used to characterise a protein family
PIRSR: protein families based on hidden Markov models (HMMs) and Site Rules
PROSITE Pattern: protein domains, families and functional sites as well as associated patterns to identify them
AntiFam: a resource of profile-HMMs designed to identify spurious protein predictions.
Pfam: protein families, each represented by multiple sequence alignments and hidden Markov models
MobiDBLite: Prediction of intrinsically disordered regions in proteins
PIRSF: non-overlapping clustering of UniProtKB sequences into a hierarchical order (evolutionary relationships)
Use applications with restricted license, only for non-commercial use?: true
Applications to run:
Phobius: combined transmembrane topology and signal peptide predictor
SignalP (eukaryotes): signal peptide cleavage sites in amino acid sequences for eukaryotes
TMHMM: Prediction of transmembrane helices in proteins
Include pathway information: true
Include Gene Ontology (GO) mappings: true
Provide additional mappings: true
Output format: Tab-separated values format (TSV), GFF3, XML, JSON
When I viewed my TSV results, I noticed that columns 12-15 (InterPro accession, InterPro description, GO annotations with their source, and pathway annotations) are blank (see picture). Is this normal output? If not, what could cause it, and how do I troubleshoot it?
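One way to narrow this down is to check whether column 12 is blank for every row or only for certain analyses. The snippet below is a minimal sketch, not an official InterProScan utility. It assumes the standard 15-column InterProScan TSV layout (column 4 = analysis, column 12 = InterPro accession) and that a missing value appears either as an empty field or as "-". The function name and the file name in the usage comment are placeholders.

```python
import csv
from collections import Counter


def summarize_interpro_columns(tsv_path):
    """Tally, per analysis (column 4), rows with and without an
    InterPro accession (column 12) in an InterProScan TSV file.

    Assumes the standard tab-separated layout with no header line.
    """
    with_ipr = Counter()
    without_ipr = Counter()
    with open(tsv_path, newline="") as handle:
        # QUOTE_NONE: descriptions may contain quote characters.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 4:
                continue  # skip blank or malformed lines
            analysis = row[3]
            # Rows may be shorter than 15 columns if trailing fields are absent.
            ipr = row[11].strip() if len(row) > 11 else ""
            if ipr and ipr != "-":
                with_ipr[analysis] += 1
            else:
                without_ipr[analysis] += 1
    return with_ipr, without_ipr


# Example usage (the file name is a placeholder):
# with_ipr, without_ipr = summarize_interpro_columns("braker.aa.tsv")
# for analysis in sorted(set(with_ipr) | set(without_ipr)):
#     print(analysis, with_ipr[analysis], without_ipr[analysis])
```

If only some analyses lack accessions, the blanks may just reflect signatures that are not integrated into an InterPro entry. If every row is blank, it is more likely that the InterPro/GO/pathway lookup was not applied in the run.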
I hope you can help me with this.
Thank you in advance