Salmon Output File "Name" column gene id format

I am using Salmon to TPM-normalize some count data and was able to get a .tabular output file. When reading the file, I saw that the first column is labeled Name and lists genes that start with 5HIN. The first one is “5HIN8:01339:11853”. I am a little confused by the formatting. Is there a table that converts this naming format to gene IDs? Thanks!

The gene values are coming from the reference annotation you are using.

What happens if you run with default settings and also provide a GTF? The gene abundance output should then contain only the original values of the “gene_id” attribute from the GTF.
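To show what I mean by the “gene_id” attribute, here is a minimal Python sketch that pulls those values out of GTF lines. The sample lines, the `demo` source name, and the function name `gene_ids_from_gtf` are made up for illustration; this is not Salmon's own code, just a way to see which values you should expect in the Name column.

```python
import re

# Matches the gene_id attribute in the ninth GTF column, e.g. gene_id "GENE1";
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def gene_ids_from_gtf(lines):
    """Return the distinct gene_id values in order of first appearance."""
    seen = []
    for line in lines:
        # Skip comment/header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF feature line has nine tab-separated columns.
        if len(fields) < 9:
            continue
        match = GENE_ID_RE.search(fields[8])
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Hypothetical GTF content, invented for this example.
example_gtf = [
    "#!hypothetical example\n",
    'chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "GENE1"; transcript_id "TX1";\n',
    'chr1\tdemo\texon\t300\t400\t.\t+\t.\tgene_id "GENE1"; transcript_id "TX2";\n',
    'chr2\tdemo\texon\t50\t150\t.\t-\t.\tgene_id "GENE2"; transcript_id "TX3";\n',
]

print(gene_ids_from_gtf(example_gtf))
```

If the names in your output do not look like anything in the GTF you supplied, they are coming from somewhere else in the reference.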



Thank you for your response. I am using the Ion Torrent server, which only gives FASTQ files, and used Salmon to TPM-normalize them. The Ion Torrent server does not provide a GTF file. Here is what the result looked like:

Ah, Ok, thanks for clarifying.

That implementation is totally different. This forum is for troubleshooting usage issues in Galaxy.

All I have are guesses about the encoding: the gene, then coordinates for some flavour of sub-footprint, all combined into a single ID. You already know the gene ID. The rest you’ll need help interpreting, which is exactly what you were asking about originally, but I hadn’t “gotten” it yet. :)
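If it helps while you look for the real documentation, here is a rough Python sketch that splits each Name value on the colons so you can inspect the pieces separately. It assumes the table has the usual Salmon columns (Name, Length, EffectiveLength, TPM, NumReads) and is tab-separated. Only the Name value is taken from your post; the numbers are placeholders, and the function name `split_names` is hypothetical. It does not tell you what the pieces mean, it only separates them.

```python
import csv
import io

# Hypothetical excerpt of a Salmon quantification table. Only the Name value
# comes from the question; the numeric values are placeholders.
example_table = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "5HIN8:01339:11853\t150\t100.0\t0.0\t0.0\n"
)


def split_names(handle):
    """Split every value in the Name column on ':' and return the parts."""
    parts = []
    for record in csv.DictReader(handle, delimiter="\t"):
        parts.append(record["Name"].split(":"))
    return parts


print(split_names(io.StringIO(example_table)))
```

For a real file, pass an open file handle instead of the `io.StringIO` wrapper.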

Try contacting the people who run the service you are working on instead. The service itself probably has documentation somewhere, possibly a vignette with examples. Failing that, search for user docs or Q&A at a general bioinformatics help forum.

