Decoupler on AnnData object from Scanpy - task fails


I am trying to do pseudobulk analyses on my single cell dataset. I think the only tool in Galaxy that allows this is decoupler. I just don’t understand how to get it to work - I’ve searched for workflows or histories with “pseudobulk” and “decoupler” to try and see how other people used it but am coming up blank.

I have an AnnData object from Scanpy. I used the processed cell data to filter the pre-normalisation file down to the cells I want. Individuals are denoted by “patient_no” and I want to analyse differential expression between the conditions in the column “LN_status”, so I think patient_no should go in sample key and LN_status in groupby column? I think the field that’s throwing it off is the “Layer” one. Although it is marked as optional, if I leave it blank I get “TypeError: ‘>=’ not supported between instances of ‘NoneType’ and ‘int’”. If I put “counts” in there, as I’ve seen in most examples of using decoupler, then it says counts doesn’t exist as a layer.

Do I need to give my AnnData object a counts layer? And how would I do that? Or is there something else I’m doing wrong here?

History is here:

Any help much appreciated.


Hi @miRlyKayleigh

Apologies for the delay … your use case is interesting and doesn’t have an obvious solution!

Let’s try to get some advice from some of the scientists who designed the Decoupler pseudo-bulk tool (Galaxy version). Would you have time to help, @pavanvidem or @pcm32? Or could you suggest someone who could? Thanks! :slight_smile:


Hi there,

For pseudo-bulk you need to have raw counts somewhere in your AnnData. After a standard single-cell analysis, adata.X usually holds normalised values rather than raw counts. So while someone could contribute a fix to the tool/wrapper, the fastest fix would be to have a layer in your AnnData with the raw counts (and give that layer's name in the Layer field).

I hope that helps!


Hi Kayleigh,
Please set Minimum Counts and Minimum Total Counts parameters and try again. You may set these values to 1 if you have already filtered your data.

It is good practice to store the raw counts, e.g. in a counts layer, before processing. The AnnData Operations tool should help with that. Alternatively, you can use Copy AnnData to .raw and enable Use raw in the decoupler tool. I think both approaches should work. For more details on the sequence of steps, please follow the tutorial from the developers: Pseudo-bulk functional analysis — decoupler 1.7.1 documentation

@pcm32 I tried to update but ended up with this issue: Error when running get_pseudobulk(): AttributeError: 'csr_matrix' object has no attribute 'A' · Issue #141 · saezlab/decoupler-py · GitHub
Let’s wait until the conda package with this fix is out before updating.
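For anyone hitting the same AttributeError in their own scripts: as far as I know, the `.A` shorthand was removed from SciPy sparse matrices in recent releases (1.14, if I recall correctly), and `.toarray()` is the replacement that works across versions. A small illustration with a made-up matrix, not the decoupler code itself:

```python
import numpy as np
from scipy import sparse

# Small sparse count matrix: 2 cells x 3 genes (values are made up).
counts = sparse.csr_matrix(np.array([[0, 2, 0], [1, 0, 3]]))

# Older code often wrote `counts.A`; `toarray()` does the same conversion
# and is available in both old and new SciPy versions.
dense = counts.toarray()

# Per-gene totals: sum over cells, then flatten the 1 x n_genes result.
gene_totals = np.asarray(counts.sum(axis=0)).ravel()
```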

Hope your history will be green again :slight_smile: