Salmon quant did not work using my reference transcriptome

Hi @lida-soltanii

Thanks for sharing your history, this made it so much easier to help with exactly what is going wrong!

This is your message from the tool in the job logs (find these logs using the "i" icon inside of a dataset).

The tool is stating that it found two or more transcripts with the same sequence identifier. Extract all of the identifiers and count them to find the duplicates, then rename or remove the duplicated records. Don’t forget to update your transcripts-to-genes mapping data as well, or you will run into more problems in downstream steps.
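If you want to check a copy of the file outside of Galaxy, the counting step could look something like the sketch below. It assumes the identifier is the header text after `>` up to the first whitespace, which is how I understand Salmon to read transcript names by default. The function names and the tiny example records are made up for illustration; they are not from your history.

```python
from collections import Counter


def fasta_ids(lines):
    """Yield one identifier per FASTA record.

    Assumption: the identifier is the header text after '>' up to the
    first whitespace character.
    """
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            yield fields[0] if fields else ""


def duplicate_ids(lines):
    """Return {identifier: count} for identifiers seen more than once."""
    counts = Counter(fasta_ids(lines))
    return {name: n for name, n in counts.items() if n > 1}


# Hypothetical example data: 'tx1' appears twice with different descriptions.
example = """>tx1 gene=A
ACGT
>tx2
GGCC
>tx1 gene=B
TTAA
""".splitlines()

print(duplicate_ids(example))  # {'tx1': 2}
```

For a real file, pass an open file handle instead of `example`, for instance `duplicate_ids(open("transcriptome.fasta"))`, where the file name is a placeholder.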

This recent post also has more about Salmon in general.


What to do from here

  1. Double check that you do not have any sequences in your transcriptome that have the same name: the tool thinks that you have at least one duplicate, so at a minimum that needs to be solved.

  2. Consider incorporating reference annotation at the Salmon step. You will need that “transcript-to-gene” mapping file when using DESeq2 later anyway. Both tool forms have details about what the data is and how it is formatted, and we have prior Q&A about it, but please ask more questions if you get stuck.

  3. You have already been manipulating your FASTA file to create the hybrid transcriptome. If that wasn’t done in Galaxy, you can do it in Galaxy, too! Converting to a tabular format, making changes, then converting back to FASTA format is a pretty common way to do this. Your GTF or tabular transcripts-to-gene data is already tabular.
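For reference, the tabular round trip from the last step can also be sketched in plain Python. This is only an illustration under a few assumptions: the identifier is the first whitespace-delimited word of the header, and the first record for each identifier is the one to keep. You may prefer to rename duplicates instead, and either way the transcripts-to-genes mapping needs the matching change. All function names here are hypothetical helpers.

```python
def fasta_to_rows(lines):
    """Convert FASTA lines into (header, sequence) rows, one row per record."""
    rows = []
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                rows.append((header, "".join(seq)))
            header, seq = line[1:], []
        elif header is not None:
            seq.append(line)
    if header is not None:
        rows.append((header, "".join(seq)))
    return rows


def drop_duplicate_ids(rows):
    """Keep only the first row for each identifier (assumed policy)."""
    seen = set()
    kept = []
    for header, seq in rows:
        parts = header.split()
        name = parts[0] if parts else ""
        if name not in seen:
            seen.add(name)
            kept.append((header, seq))
    return kept


def rows_to_fasta(rows, width=60):
    """Convert (header, sequence) rows back to FASTA text, wrapping sequences."""
    out = []
    for header, seq in rows:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"


# Hypothetical example data with one duplicated identifier.
records = [">tx1 gene=A", "ACGT", "ACGT", ">tx2", "GGCC", ">tx1 gene=B", "TTAA"]
cleaned = drop_duplicate_ids(fasta_to_rows(records))
print(rows_to_fasta(cleaned), end="")
```

Working on rows rather than raw lines makes it easy to inspect or edit the header column before writing the file back out.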

Hope this helps! Let us know if you get this working, or have more questions :scientist: