Decoupler on AnnData object from Scanpy - task fails

miRlyKayleigh · June 20, 2024, 9:05am

Hello

I am trying to do pseudobulk analyses on my single cell dataset. I think the only tool in Galaxy that allows this is decoupler. I just don’t understand how to get it to work - I’ve searched for workflows or histories with “pseudobulk” and “decoupler” to try and see how other people used it but am coming up blank.

I have an AnnData object from Scanpy: I used the processed cell data to filter the pre-normalisation file down to the cells I want. Individuals are denoted by “patient_no” and I want to analyse differential expression between the conditions in the column “LN_status”, so I think patient_no should go in the sample key field and LN_status in the groupby column? I think the field that’s throwing it off is the “Layer” one: despite it being marked as optional, if I leave it blank I get “TypeError: ‘>=’ not supported between instances of ‘NoneType’ and ‘int’”. If I put “counts” in there, as I’ve seen in most examples of using decoupler, it says counts doesn’t exist as a layer.

Do I need to give my AnnData object a counts layer? And how would I do that? Or is there something else I’m doing wrong here?

History is here: https://singlecell.usegalaxy.eu/u/kayleigh.s/h/20240603-megadataset-just-naive-cells-harmony

Any help much appreciated.

Kayleigh

jennaj · July 11, 2024, 6:40pm

Hi @miRlyKayleigh

Apologies for the delay … your use case is interesting and doesn’t have an obvious solution!

Let’s try to get some advice from some of the scientists who designed the Decoupler pseudo-bulk tool (Galaxy version). Would you have time to help @pavanvidem or @pcm32 ? Or suggest someone who could? Thanks!

References:

Pseudo-bulk functional analysis (decoupler 1.7.1 documentation)
Galaxy Tool Shed repository: decoupler_pseudobulk
Development repository for decoupler_pseudobulk: ebi-gene-expression-group/container-galaxy-sc-tertiary on GitHub (tools/tertiary-analysis/decoupler, main branch)

pcm32 · July 11, 2024, 7:44pm

Hi there,

For pseudobulk you need to have raw counts somewhere in your AnnData. After a typical single-cell analysis, that is usually not what you have in adata.X, which tends to hold normalised values. So while someone could contribute a fix to the tool/wrapper, the fastest fix would be to have a layer in your AnnData with the raw counts (and give that layer's name in the Layer field).

I hope that helps!
Pablo

pavanvidem · July 12, 2024, 3:27pm

Hi Kayleigh,
Please set Minimum Counts and Minimum Total Counts parameters and try again. You may set these values to 1 if you have already filtered your data.
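The TypeError quoted in the first post is the message Python gives when an unset value (None) is compared with a number, which would be consistent with a blank threshold reaching a comparison somewhere inside the tool. That link to the tool's internals is an assumption on my part; the snippet below only reproduces the message itself, using a hypothetical variable name.

```python
# Hypothetical stand-in for a threshold parameter that was left blank.
min_counts = None

try:
    passes = min_counts >= 10
except TypeError as err:
    # Same wording as the error reported in the first post.
    message = str(err)
    print(message)

# With a value set, the comparison is well defined.
min_counts = 1
passes = min_counts >= 1
```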

It is good practice to store the raw counts in a layer (for example, one named counts) before processing. The AnnData Operations tools should help with this. Alternatively, you can use Copy AnnData to .raw and enable Use raw in the decoupler tool. I expect either approach to work. For more details on the sequence of steps, please follow the tutorial from the developers: Pseudo-bulk functional analysis (decoupler 1.7.1 documentation)
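To make the role of the raw counts and the thresholds more concrete, here is a rough sketch of what pseudobulking does conceptually: raw counts are summed per sample and group, and profiles with too few total counts are dropped. It uses plain pandas rather than decoupler itself; the toy values and the min_total_counts name are made up, and decoupler's actual filtering may differ in detail.

```python
import pandas as pd

# Toy raw counts: 5 cells x 3 genes (made-up values for illustration).
counts = pd.DataFrame(
    {
        "geneA": [1, 0, 4, 2, 0],
        "geneB": [0, 2, 1, 2, 1],
        "geneC": [3, 1, 0, 2, 0],
    },
    index=["cell0", "cell1", "cell2", "cell3", "cell4"],
)
# Per-cell metadata, using the column names from the first post.
obs = pd.DataFrame(
    {
        "patient_no": ["p1", "p1", "p2", "p2", "p3"],
        "LN_status": ["pos", "pos", "neg", "neg", "pos"],
    },
    index=counts.index,
)

# Sum raw counts over all cells sharing the same sample and condition.
pseudobulk = counts.groupby([obs["patient_no"], obs["LN_status"]]).sum()

# Hypothetical threshold: drop profiles with too few counts in total.
min_total_counts = 2
kept = pseudobulk[pseudobulk.sum(axis=1) >= min_total_counts]
```

Summing only makes sense on raw counts, which is why normalised values in adata.X are not a suitable input here.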

@pcm32 I tried to update but ended up with this issue: “Error when running get_pseudobulk(): AttributeError: 'csr_matrix' object has no attribute 'A'” (Issue #141, saezlab/decoupler-py on GitHub).
Let’s wait until the conda package with this fix is out before updating.

Hope your history will be green again
Pavan
