Infer Experiment results - Unstranded or Stranded (Reverse)?

Hello!

My data is single-end, and I am analyzing gene expression. The Infer Experiment graph shows that 12 files look Unstranded, while 6 show an indication of Stranded (Reverse).
What conclusion should I draw and which option should I provide to HISAT2 - Unstranded or Reverse?
I’m really wondering how to define it.

Thanks in advance.

Hi @Seahorse

Keep in mind that whether you can apply the same strand setting to all samples depends on how and where the samples were sourced. Questions about that would start with: were all of them sequenced by the same group, in the same experiment, with the same protocol? Is there anything else special about the data that you can find? Public data usually comes with details about this, and possibly a publication.

This FAQ is related: In 'infer experiments' I get unequal numbers, but in the IGV it looks like it is unstranded. What does this mean?
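To make the "unequal numbers" idea concrete, here is a rough sketch of how the two fractions reported by Infer Experiment could be turned into a call per sample. The function name and the cutoffs (0.8 for stranded, 0.4 to 0.6 for unstranded) are my own assumptions for illustration, not values taken from the tool or the FAQ. For single-end data, the "++,--" fraction corresponds to forward and the "+-,-+" fraction to reverse, which would be `--rna-strandness F` or `R` on the HISAT2 command line.

```python
def call_strandness(forward_frac, reverse_frac, stranded_min=0.8, balanced_halfwidth=0.1):
    """Classify one single-end sample from Infer Experiment fractions.

    forward_frac: fraction of reads explained by "++,--"
    reverse_frac: fraction of reads explained by "+-,-+"
    stranded_min and balanced_halfwidth are assumed cutoffs, not tool defaults.
    """
    determined = forward_frac + reverse_frac
    if determined == 0:
        return "undetermined"

    # Rescale so that reads that failed to be determined are ignored.
    fwd = forward_frac / determined

    if fwd >= stranded_min:
        return "forward"
    if fwd <= 1 - stranded_min:
        return "reverse"
    if abs(fwd - 0.5) <= balanced_halfwidth:
        return "unstranded"
    # Somewhere in between: worth a look in a genome browser.
    return "ambiguous"


# Hypothetical fractions, made up for this example.
for name, fwd, rev in [("sample_1", 0.49, 0.50), ("sample_2", 0.02, 0.96), ("sample_3", 0.30, 0.68)]:
    print(name, call_strandness(fwd, rev))
```

Samples that land in the "ambiguous" range are the ones to inspect by eye before deciding on a setting.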

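Next, maybe load the mapped files into a genome browser such as IGV along with an annotation, and check whether the reads sit on the same strand as the annotated genes or on the opposite one. UCSC already has annotation if your genome is supported there.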

You could also run the analysis twice, once with each strand setting, and compare the results. The wrong setting will be noticeable in the output: the counting step will fail or produce odd counts compared to the other run, or the problem might show up in the differential expression (DE) step.
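As a sketch of what comparing the two runs might look like, the snippet below totals the per-gene counts from each run and reports how the assigned reads split between them. It assumes you have already parsed each count table into a gene-to-count dictionary; the function names, gene names, and numbers are all hypothetical. For a truly stranded library, the wrong setting usually assigns far fewer reads, while for an unstranded library the two totals tend to be similar.

```python
def summarize_counts(counts):
    """Return (total assigned reads, number of genes with at least one read)."""
    total = sum(counts.values())
    detected = sum(1 for c in counts.values() if c > 0)
    return total, detected


def compare_strand_runs(forward_counts, reverse_counts):
    """Compare per-gene counts from a forward-setting run and a reverse-setting run."""
    f_total, f_detected = summarize_counts(forward_counts)
    r_total, r_detected = summarize_counts(reverse_counts)
    grand_total = f_total + r_total
    return {
        "forward_total": f_total,
        "reverse_total": r_total,
        "forward_genes_detected": f_detected,
        "reverse_genes_detected": r_detected,
        # Share of all assigned reads that came from the reverse-setting run.
        "reverse_share": r_total / grand_total if grand_total else None,
    }


# Made-up counts for one sample, counted once per strand setting.
forward_run = {"geneA": 12, "geneB": 0, "geneC": 3}
reverse_run = {"geneA": 950, "geneB": 410, "geneC": 120}

result = compare_strand_runs(forward_run, reverse_run)
print(result)
```

A reverse share close to 1 would point to Reverse, close to 0 to Forward, and around 0.5 to Unstranded.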

If you are using a workflow, that should be easy to do. And if you don’t have a workflow yet, the transcriptomics tutorials all have at least one that can be imported and adjusted. Some include mixed samples in the examples.

Hope that helps! :slight_smile:
