Why is the `plotDEXSeq` output PDF empty?

Hey everyone,

I have a short question about the input parameters of the tool plotDEXSeq (Visualization of the per gene DEXSeq results (Galaxy Version 1.28.1.0)).
I am not sure about two of the input parameters in the following screenshot:

  1. Gene identifier: Which gene identifier should I use? I used StringTie in the upstream analysis, so I only have MSTRG tags for novel isoforms. But when I run the tool with an MSTRG tag from the list (e.g. MSTRG.9077 in the screenshot above), the output PDF is empty, and no error is reported. I also tried gene names and Ensembl IDs of known genes in the list. Nothing worked; the output is always empty.
    Do you know what is going wrong here?

  2. Specify the primary factor name in the DEXSeqResults object: Just to be sure, should I use the name of Factor 1 in my DEXSeq experiment (in this case: Genotype)?

Thank you!

Hannah

I’ve been having the same problem. It persists whether I select a single gene or a list of genes. Can anyone please help me sort this out? I’m new to Galaxy and DEXSeq, so I can’t really figure out where I’m going wrong.
When I click on the plotDEXSeq file, it says:
[1] "en_US.UTF-8"
null device
1
R version 3.5.1 (2018-07-02)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
The pdf is empty.
It would be nice if someone could help me out with this.
Thanks
Ayushi

Hi @ayushirehman and @HanMue

Here are a few items specific to DEXSeq analysis that may help with usage.

  1. DEXSeq requires that the first “FactorName” is actually named “condition”.
    • This is just a quirk of how the tool works. The DE troubleshooting FAQ linked below includes a note about this, and you can find much discussion about it at the Bioconductor support help site (also linked into the FAQ).
    • Meaning, the “primary factor name” will always be “condition” when using plotDEXSeq.
    • If it isn’t named that right now, expect problems. You may need to rerun DEXSeq with the labels adjusted. But read the next part, too: you want the GTF input to be correct as well before rerunning.
  2. The tool DEXSeq-Count has two “modes of operation”. Both need to be run.
    1. First: “Prepare annotation” - Run this on the GTF annotation input you want to use for the next step/mode (counting).
      • This could be a GTF from a public annotation source that is based on the same reference genome the reads were mapped against.
      • OR it could be based on a GTF resulting from StringTie analysis.
      • Note: a StringTie-merge GTF can include both known AND novel isoforms discovered in your read data with StringTie, or only novel isoforms if you did not incorporate a known GTF. The processing up to this point would go through steps similar to those in this tutorial. The important part is that there is one more step of intermediate processing: running GFFcompare. Meaning: be sure to “prepare” the GFFcompare output GTF and use that for the counting step.
    2. Second: “Count reads” - Run this using the “prepared” annotation and the mapped BAMs.

At this point, the DEXSeq-generated counts will be summarized by gene. Some may be known genes and some may be novel genes, or all may be novel genes (it depends on whether you incorporated a known GTF annotation when running StringTie or later with StringTie merge).

Examine the DEXSeq count result datasets. The gene identifier is the first column of data. That value will match the “gene_id” value in the attributes (9th column) of the prepared GTF’s data lines with feature type “aggregate_gene” (3rd column).

The important part is that the tool is filtering on gene identifiers, not transcript identifiers. If you attempt to plot using “transcripts” values (found in the “prepared” GTF’s data lines of feature type = exonic_part), there will be no data to graph, and an empty plot could result.
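If you want to check an identifier outside of Galaxy before plotting, the distinction above can be sketched in a few lines of Python. This is only a rough illustration, not part of the Galaxy tool: the function names `collect_ids` and `classify` are made up for this example, the two annotation lines are invented, and the attribute layout (`gene_id "..."`, and `transcripts "..."` with several IDs joined by `+`) is an assumption about how the “prepared” annotation usually looks, so please compare it against your own file.

```python
import re

# Attribute patterns assumed for a DEXSeq "prepared" annotation file.
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')
TRANSCRIPTS_RE = re.compile(r'transcripts "([^"]+)"')


def collect_ids(gff_lines):
    """Return (gene_ids, transcript_ids) found in prepared-annotation lines."""
    gene_ids, transcript_ids = set(), set()
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete 9-column annotation line
        feature, attributes = fields[2], fields[8]
        if feature == "aggregate_gene":
            match = GENE_ID_RE.search(attributes)
            if match:
                gene_ids.add(match.group(1))
        elif feature == "exonic_part":
            match = TRANSCRIPTS_RE.search(attributes)
            if match:
                # Assumption: multiple transcripts are joined with "+".
                transcript_ids.update(match.group(1).split("+"))
    return gene_ids, transcript_ids


def classify(identifier, gene_ids, transcript_ids):
    """Say whether an identifier is a gene ID, a transcript ID, or neither."""
    if identifier in gene_ids:
        return "gene"
    if identifier in transcript_ids:
        return "transcript"
    return "unknown"


# Invented example lines; replace with the lines of your own prepared file.
EXAMPLE = [
    'chr1\tprep\taggregate_gene\t100\t900\t.\t+\t.\tgene_id "MSTRG.9077"',
    'chr1\tprep\texonic_part\t100\t300\t.\t+\t.\t'
    'transcripts "MSTRG.9077.1+MSTRG.9077.2"; exonic_part_number "001"; '
    'gene_id "MSTRG.9077"',
]

genes, transcripts = collect_ids(EXAMPLE)
for candidate in ("MSTRG.9077", "MSTRG.9077.1", "Gapdh"):
    print(candidate, "->", classify(candidate, genes, transcripts))
```

Only identifiers reported as "gene" here would be expected to produce a plot. Anything reported as "transcript" or "unknown" is a likely cause of an empty PDF.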

Other problems could be present. Example: A known annotation GTF incorporated at some point in the analysis wasn’t actually a match for the genome/transcriptome the reads were mapped against. But the above should help with basic usage, and this FAQ can help with the rest: Extended Help for Differential Expression Analysis Tools

Please review this and tune your analysis processing as needed. If either of you still has problems after all of this is correctly set up, we can troubleshoot from there.

Thanks!

Thanks for the reply! I’ll look into what I’ve performed and try rectifying the mistakes. Will get back to you in case of any problem.

@jennaj,

thank you very much for the helpful answer!
I corrected my workflow by processing the StringTie-merge GTF with GFFcompare and then using the GFFcompare annotated transcripts GTF in “prepare annotation” mode in DEXSeq-Count.
In DEXSeq, I named the first “FactorName” ‘condition’.
Up to now, the analysis is fine and I would like to annotate my DEXSeq results.
The tool Annotate DESeq2/DEXSeq output tables (Galaxy Version 1.1.0) recommends using the GTF file that was used for counting. In my case, this would be the GFFcompare GTF output after applying ‘prepare annotation’ mode. But this GTF does not include any gene_names. Here is what it looks like:


Going back to the GFFcompare annotated transcripts GTF before running ‘prepare annotation’, it looks like this (only the feature type ‘transcript’ has a gene_name and ref_gene_id in the GTF, not the feature type ‘exon’):

Only the original Stringtie-merge output contains gene_names and ref_gene_ids for both feature types (exon and transcript):

My question is now, which GTF is appropriate for Annotate DESeq2/DEXSeq output tables?
My DEXSeq result file looks like this:

I intended to find out which line corresponds to which gene (ensembl ID and gene_name) and I tried the following setting in Annotate DESeq2/DEXSeq output tables using the unprocessed Stringtie-merge GTF:
Some MSTRG-tags are annotated afterwards, some not:
- How can I identify the unannotated MSTRG-tags? Do they represent novel splice forms found in StringTie?

Sorry for the screenshot-spam - I hope it helps setting the parameters for the annotation of the DEXSeq results.

Best regards

Hannah