Question [InterProScan Output: Missing Columns 12-15]

Hi! I am fairly new to bioinformatics (genome annotation of a non-model fish). Our team finished the structural annotation using the BRAKER3 pipeline, so I now have its outputs (braker.aa, etc.).

I used eggNOG-mapper and InterProScan for functional annotation. For InterProScan, I used my braker.aa file as input with the following parameters:

InterProScan database: 5.59-91.0

Applications to run:

TIGRFAM: protein families based on hidden Markov models (HMMs)
FunFam: Prediction of functional annotations for novel, uncharacterized sequences
SFLD: a database of protein families based on hidden Markov models (HMMs)
SUPERFAMILY: database of structural and functional annotation for all proteins and genomes
PANTHER: Protein ANalysis THrough Evolutionary Relationships
Gene3d: Structural assignment for whole genes and genomes using the CATH domain structure database
HAMAP: High-quality Automated Annotation of Microbial Proteomes
PROSITE Profiles: protein domains, families and functional sites as well as associated profiles to identify them
Coils: Prediction of Coiled Coil Regions in Proteins
SMART: identification and analysis of domain architectures based on Hidden Markov Models or HMMs
SMART: protein domains and families based on well-annotated multiple sequence alignment models
PRINTS: group of conserved motifs (fingerprints) used to characterise a protein family
PIRSR: protein families based on hidden Markov models (HMMs) and Site Rules
PROSITE Pattern: protein domains, families and functional sites as well as associated patterns to identify them
AntiFam: a resource of profile-HMMs designed to identify spurious protein predictions
Pfam: protein families, each represented by multiple sequence alignments and hidden Markov models
MobiDBLite: Prediction of intrinsically disordered regions in proteins
PIRSF: non-overlapping clustering of UniProtKB sequences into a hierarchical order (evolutionary relationships)

Use applications with restricted license, only for non-commercial use?: true

Applications to run:

Phobius: combined transmembrane topology and signal peptide predictor
SignalP (eukaryotes): signal peptide cleavage sites in amino acid sequences for eukaryotes
TMHMM: Prediction of transmembrane helices in proteins

Include pathway information: true
Include Gene Ontology (GO) mappings: true
Provide additional mappings: true
Output format: Tab-separated values format (TSV), GFF3, XML, JSON

When I viewed my TSV results, I noticed that columns 12-15 (InterPro annotations - accession, InterPro annotations - description, GO annotations with their source, Pathways annotations) are blank (see picture). I guess what I want to ask is: is this normal output? If not, what causes this issue/error, and how do I troubleshoot it?

I hope you can help me with this.

Thank you in advance

Hi @KeemAntonio

If the information for a result hit is available and the options are toggled on, it will be reported. If no information is available for that hit in the InterProScan database, the field is simply left blank. The database was sourced from EBI here: https://www.ebi.ac.uk/interpro/download/
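If you want to check whether the blanks affect every row or only some hits, a quick tally per analysis can help. Below is a minimal Python sketch, not an official tool. It assumes the standard InterProScan TSV layout, with the analysis name in column 4 and the InterPro accession in column 12, and it treats a missing column, an empty field, or a lone "-" as blank. The function names and the file name in the usage note are made up for illustration.

```python
from collections import Counter

# 1-based column positions, assuming the standard InterProScan TSV layout.
ANALYSIS_COL = 4        # which member database produced the hit (e.g. Pfam)
INTERPRO_ACC_COL = 12   # InterPro accession (IPR...), blank when unavailable


def summarize_interpro_columns(lines):
    """Count, per analysis, the rows with and without an InterPro accession."""
    with_entry = Counter()
    without_entry = Counter()
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < ANALYSIS_COL:
            continue  # skip empty or malformed lines
        analysis = fields[ANALYSIS_COL - 1]
        # Rows may simply stop before column 12, so guard the lookup.
        if len(fields) >= INTERPRO_ACC_COL:
            accession = fields[INTERPRO_ACC_COL - 1].strip()
        else:
            accession = ""
        if accession and accession != "-":
            with_entry[analysis] += 1
        else:
            without_entry[analysis] += 1
    return with_entry, without_entry


def summarize_file(path):
    """Convenience wrapper: run the tally on a TSV file on disk."""
    with open(path, encoding="utf-8") as handle:
        return summarize_interpro_columns(handle)
```

For example, `summarize_file("interproscan.tsv")` (a hypothetical file name for the TSV downloaded from Galaxy) returns two counters. If some analyses show up in the first counter, the lookup is working and the blank rows are most likely hits with no InterPro information available. If nothing at all has an accession, the run settings are worth a second look.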

You could also upload the JSON output from Galaxy to their site to compare and explore the results that way: https://www.ebi.ac.uk/interpro/result/InterProScan/#table. Maybe try a few examples from your data to see how that works?

Hope this helps and we can follow up more! 🙂