Error with IsoformSwitchAnalyzeR step with alternative splicing analysis

Hello,

I’m stuck at the Isoform switching analysis with IsoformSwitchAnalyzeR in the Alternative splicing hands-on.

Since I didn't work with data collections, I skipped the part that splits the collection between the 2 conditions (Hands-on: Genome-wide alternative splicing analysis / Transcriptomics (galaxyproject.org)).

I then continued with the “Hands-on: Import data with IsoformSwitchAnalyzeR” part without any trouble. However, when I start the pre-processing step (Hands-on: Genome-wide alternative splicing analysis / Transcriptomics (galaxyproject.org)), I get the following error.

Error:
Warning message:
In Sys.setlocale(“LC_MESSAGES”, “en_US.UTF-8”) :
OS reports request to set locale to “en_US.UTF-8” cannot be honored
The filtering removed 1597 ( 84.54% of ) transcripts. There is now 292 isoforms left
Error in isoformSwitchTestDEXSeq(SwitchList, alpha = args$alpha, dIFcutoff = args$dIFcutoff, :
A statistical test cannot be performed without replicates. Please remove all conditions with only 1 replicate and try again.
This can either be done before creating the switchAnalyzeRlist in the first place or with the subsetSwitchAnalyzeRlist() function

If I understand the error correctly, is it because the data has no replicates?
In the previous step (Import data with IsoformSwitchAnalyzeR), my input was the normal sample as factor level 1 and the treated sample as factor level 2. Both are transcriptome quantification data from the StringTie isoform quantification step (Hands-on: Genome-wide alternative splicing analysis / Transcriptomics (galaxyproject.org)).

Is this error because I skipped the split collection part (since I worked with individual files, not a collection)?
Could someone please help me?

Thank you in advance!

Hi @jnguyen1

It sounds like you have one sample per factor level but this tool requires at least two samples per factor level. The tool would have the same requirement if used outside of Galaxy.

The rationale from the Bioconductor tool authors is included in this FAQ: Extended Help for Differential Expression Analysis Tools as a general reference, and you can find more discussion at their website, both in the vignettes and support forum. Example search → Bioconductor Forum

So, the problem is the number of input samples (four samples is the minimum, two per factor level, not just two in total), and it is not related to collection manipulations. Please explain more if I am misunderstanding.
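If it helps, here is a minimal Python sketch of the replicate check described above: count the samples per factor level and list any level with fewer than two. It is only an illustration, not part of the Galaxy tool or of IsoformSwitchAnalyzeR. The sample names, condition labels, and function names are all made up for the example.

```python
from collections import Counter

MIN_REPLICATES = 2  # minimum per factor level, as stated in the reply above


def replicates_per_condition(design):
    """Count samples per factor level in a list of (sample_id, condition) pairs."""
    return Counter(condition for _sample, condition in design)


def conditions_lacking_replicates(design, minimum=MIN_REPLICATES):
    """Return the factor levels that have fewer than `minimum` samples."""
    counts = replicates_per_condition(design)
    return sorted(cond for cond, n in counts.items() if n < minimum)


# Hypothetical designs: sample IDs and condition labels are invented.
one_per_level = [("sample_1", "normal"), ("sample_2", "treated")]
two_per_level = [
    ("sample_1", "normal"), ("sample_2", "normal"),
    ("sample_3", "treated"), ("sample_4", "treated"),
]

print(conditions_lacking_replicates(one_per_level))  # ['normal', 'treated']
print(conditions_lacking_replicates(two_per_level))  # []
```

The first design mirrors the situation in the question (one sample per level), so both levels are flagged. The second is the smallest design that passes the check.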

Hope this helps!

Thanks for the reply!

That makes sense; I guess that was the problem (see screenshot).
[Screenshot: Isoform switch]

So that means I need at least 2 healthy samples and 2 treated samples?
Do they have to be of the same cell type? (E.g. I have 1 healthy sample from cell line A and 1 other healthy sample from cell line B; can I use both as input for the first factor level?)

Yes.

It depends on the underlying purpose of comparing “healthy” to “treated”.

All replicates in the same factor level should meet the analysis criteria for that group. So … for the very minimum usage: at least two “healthy” and at least two “treated”.

Technically, three replicates for each is “best” for serious analysis but you can read more about the tool at the Bioconductor resources for the exact protocol logic.

In short, the tool measures the differences between replicates in the same factor level group to generate some metrics it then uses when comparing the overall differences between the factor levels.
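To illustrate why a single replicate is not enough, here is a small Python sketch. It is not the model the tool uses (DEXSeq fits its own statistical model); it only shows that a within-group variance cannot be estimated from one value. The numbers and the function name are made up.

```python
from statistics import StatisticsError, variance


def within_group_variance(values):
    """Return the sample variance of one group, or None if it cannot be estimated."""
    try:
        return variance(values)
    except StatisticsError:
        # statistics.variance needs at least two data points.
        return None


# Invented isoform-fraction values for one factor level.
single_replicate = [0.40]
two_replicates = [0.40, 0.50]

print(within_group_variance(single_replicate))  # None
print(within_group_variance(two_replicates))    # a small positive number
```

With one sample per factor level there is no spread to measure, so there is nothing to compare the between-group difference against.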

This second question is a bit broad for me to give more detailed advice… but others can comment if they want to. You could also check publications that include this tool to review what others have done or have considered when making design decisions.

Then I would say … maybe try it and see what results? Put the tools in a workflow so you can run it 20 times fast and compare.

Thanks, I have combined two similar samples into one factor level and now it works!

However, some of the samples gave this error:

Warning message:
In Sys.setlocale(“LC_MESSAGES”, “en_US.UTF-8”) :
OS reports request to set locale to “en_US.UTF-8” cannot be honored
The filtering removed 5274 ( 75.44% of ) transcripts. There is now 1717 isoforms left
Step 1 of 2: Testing each pairwise comparisons with DEXSeq (this might be a bit slow)…
Estimated run time is: 1 min
Step 2 of 2: Integrating result into switchAnalyzeRlist…
Isoform switch analysis was performed for 556 gene comparisons (100%).
Error in isoformSwitchTestDEXSeq(SwitchList, alpha = args$alpha, dIFcutoff = args$dIFcutoff, :
No signifcant switches were found with the supplied cutoffs whereby we cannot reduce the switchAnalyzeRlist to only significant genes (with consequence potential)
Warning message:
In DESeqDataSet(rse, design, ignoreRank = TRUE) :
some variables in design formula are characters, converting to factors

Does that mean that there are too few differences (= no significant differences) between the two factor levels, so the tool couldn't continue and gave an error?

Again, thanks so much for replying and helping me

Yes, probably. It is more of a warning about the data content than an error. That message is from the underlying tool, not the Galaxy wrapper. Maybe try searching with the error messages at the Bioconductor forum for exact interpretation directly from the tool authors? You might also find discussion in the associated vignettes or publications.
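As a rough illustration of what "no significant switches with the supplied cutoffs" can mean, here is a Python sketch that filters made-up results by a significance threshold and an effect-size threshold. The names `alpha` and `dIFcutoff` come from the error message above. Everything else is an assumption for the example: the cutoff values, the record fields, the data, and the exact filtering rule. Please check the tool documentation for what it really does.

```python
def significant_switches(results, alpha=0.05, dif_cutoff=0.1):
    """Keep records whose q-value is below alpha and whose |dIF| exceeds the cutoff.

    The default values here are illustrative assumptions, not taken from the thread.
    """
    return [
        r for r in results
        if r["q_value"] < alpha and abs(r["dIF"]) > dif_cutoff
    ]


# Invented results: each record fails at least one of the two cutoffs.
results = [
    {"isoform": "iso_1", "q_value": 0.20, "dIF": 0.30},   # large change, not significant
    {"isoform": "iso_2", "q_value": 0.01, "dIF": 0.02},   # significant, tiny change
    {"isoform": "iso_3", "q_value": 0.60, "dIF": -0.01},  # neither
]

kept = significant_switches(results)
print(len(kept))  # 0
```

An empty result like this is a statement about the data, not a crash in the logic, which matches the reading that the message is closer to a warning than an error.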