I am a novice trying to reproduce the analysis from a published study to better understand my own similar data. The paper uses Salmon followed by edgeR, for which I have not found a direct tutorial. However, I am able to use Salmon and DESeq2 together, in what appears to be a successful analysis that yields no significantly changed genes. That is somewhat surprising, but it is also what I saw with a similarly designed experiment.

The trouble I am having is that I can’t seem to get the data to load into edgeR: the program fails with both the quantification file and the gene quantification file. Poking around, it looks like folks who have tried to do this analysis first converted the Salmon output using tximport, which also doesn’t work for me when I use the GTF file as the annotation. I read the info on GitHub for this tool, and they recommend making a “tx2gene” file that has two columns: transcript ID and gene ID. So my questions are:
- What is the best way to make such a file? BioMart at Ensembl? Will that be formatted correctly?
- This process seems to be handled in DESeq2 as part of running the “package”. I found this out when checking whether DESeq2 would accept the “gene level” quantification. It spat out the error below.
How is DESeq2 able to use the GTF to generate this information for tximport?
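To make the first question concrete, here is a minimal sketch of the kind of two-column table I mean, built directly from a GTF in Python. It assumes an Ensembl-style GTF where the ninth column carries `gene_id "..."` and `transcript_id "..."` attributes. The function names and the `TXNAME`/`GENEID` header are just my own placeholders, not something the tools require, and the example lines are made up for illustration.

```python
import csv
import re

# Matches GTF attribute pairs such as: gene_id "G1";
ATTRIBUTE_PATTERN = re.compile(r'(\S+) "([^"]*)"')


def tx2gene_from_gtf_lines(lines):
    """Return sorted, unique (transcript_id, gene_id) pairs from GTF lines."""
    pairs = set()
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attributes = dict(ATTRIBUTE_PATTERN.findall(fields[8]))
        transcript_id = attributes.get("transcript_id")
        gene_id = attributes.get("gene_id")
        # Gene-level rows carry no transcript_id, so they are skipped here.
        if transcript_id and gene_id:
            pairs.add((transcript_id, gene_id))
    return sorted(pairs)


def write_tx2gene(pairs, path):
    """Write the pairs as CSV: transcript ID first, gene ID second."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["TXNAME", "GENEID"])  # placeholder header names
        writer.writerows(pairs)


# Made-up GTF lines, only to show the expected input shape.
example_gtf = [
    "#!genome-build example\n",
    '1\tsrc\tgene\t1\t100\t.\t+\t.\tgene_id "G1"; gene_name "A";\n',
    '1\tsrc\ttranscript\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    '1\tsrc\texon\t1\t50\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; exon_number "1";\n',
    '1\tsrc\ttranscript\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T2";\n',
]

pairs = tx2gene_from_gtf_lines(example_gtf)
for transcript_id, gene_id in pairs:
    print(transcript_id, gene_id)
```

For a real annotation I would pass an open file handle instead of the example list and then call `write_tx2gene(pairs, "tx2gene.csv")`. As far as I can tell, the transcript IDs also have to match the names in the Salmon output exactly, including any version suffix, so that is something I would still need to check.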
Obviously, if I am way off base, please point me in the right direction.
Thanks!