Salmon output issue: differences in line numbers of Quantification and Gene Quantification output files

Hi @Egle

Try these two human files as an example of a matched pair of reference data.

These are direct URLs, so they can be copied and pasted into the Upload tool. Leave all the auto settings at their defaults.

For “sanity checks” you can copy one of the transcript_id values and use a tool like “Select lines that match an expression”. Use that transcript_id value as the search term against both of these files. Then check: does that transcript_id appear only once in the fasta file? Do all of the matching lines in the GTF describe features for that same transcript_id? At a minimum, every line with that transcript_id should be associated with the same gene_id.
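If you would rather run the same checks over every transcript_id at once outside of Galaxy, here is a rough Python sketch. It is only an illustration: the function names are made up for this example, it assumes the transcript_id is the first whitespace-delimited token on each fasta `>` line, and it assumes the GTF attributes look like `gene_id "..."; transcript_id "...";`. Some providers format fasta headers differently (for example with `|` separators), so adjust the parsing if yours do.

```python
import re
from collections import defaultdict

# Assumed GTF attribute style: gene_id "X"; transcript_id "Y";
TRANSCRIPT_RE = re.compile(r'\btranscript_id "([^"]+)"')
GENE_RE = re.compile(r'\bgene_id "([^"]+)"')


def count_fasta_ids(fasta_lines):
    """Count how many times each identifier appears on a '>' header line."""
    counts = defaultdict(int)
    for line in fasta_lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                # Assumption: the first token is the transcript_id.
                counts[tokens[0]] += 1
    return dict(counts)


def genes_per_transcript(gtf_lines):
    """Collect the set of gene_id values seen for each transcript_id."""
    mapping = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        transcript = TRANSCRIPT_RE.search(fields[8])
        gene = GENE_RE.search(fields[8])
        if transcript and gene:
            mapping[transcript.group(1)].add(gene.group(1))
    return dict(mapping)


def duplicated_ids(counts):
    """Identifiers that appear more than once in the fasta."""
    return sorted(name for name, n in counts.items() if n > 1)


def conflicting_transcripts(mapping):
    """transcript_id values associated with more than one gene_id."""
    return sorted(t for t, genes in mapping.items() if len(genes) > 1)
```

To use it on real files, pass open file handles, e.g. `count_fasta_ids(open("transcripts.fa"))` and `genes_per_transcript(open("annotation.gtf"))` (both file names are placeholders). Empty results from `duplicated_ids` and `conflicting_transcripts` are what you are hoping to see.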

It should definitely be possible to get this kind of data from other data providers. If the files at a particular site are confusing, you could write to them and ask how others get this data. You want a fasta file whose sequence identifiers are transcript_id values, plus a GTF or two-column file that maps each transcript_id to, at a minimum, its gene_id.
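If a provider only offers a GTF and you want the simpler two-column form, something like the following Python sketch could derive it. Treat it as an assumption-laden example rather than the official way: the function names are invented here, it expects GTF attributes written as `gene_id "..."; transcript_id "...";`, and it writes the transcript_id in the first column and the gene_id in the second, separated by a tab.

```python
import re

# Assumed GTF attribute style: gene_id "X"; transcript_id "Y";
ATTR_RE = re.compile(r'\b(gene_id|transcript_id) "([^"]+)"')


def transcript_gene_pairs(gtf_lines):
    """Return sorted (transcript_id, gene_id) pairs, one per transcript."""
    pairs = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = dict(ATTR_RE.findall(fields[8]))
        if "transcript_id" in attrs and "gene_id" in attrs:
            # Keep the first gene_id seen for each transcript_id.
            pairs.setdefault(attrs["transcript_id"], attrs["gene_id"])
    return sorted(pairs.items())


def format_two_column(pairs):
    """Render pairs as tab-separated lines: transcript_id, then gene_id."""
    return "".join(f"{transcript}\t{gene}\n" for transcript, gene in pairs)
```

For example, `open("tx2gene.tsv", "w").write(format_two_column(transcript_gene_pairs(open("annotation.gtf"))))` would write the table (both file names are placeholders). Lines without a transcript_id, such as gene-level records, are skipped.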