salmonKallistoMtxTo10x not working

I have been trying to run a scRNA-seq analysis by following the Galaxy tutorial "Generating a single cell matrix using Alevin", and I keep getting stuck at the SalmonKallistoMtxTo10x step. The failure happens both with the data set from the tutorial and with a smaller data set taken directly from 10x Genomics. Details of the data set and run parameters are below. Any suggestions on how to get past this would be helpful. The error output files are empty.

1k PBMCs from a Healthy Donor (v3 chemistry)
Single Cell Gene Expression Dataset by Cell Ranger 3.0.0
Peripheral blood mononuclear cells (PBMCs) from a healthy donor (the same cells were used to generate pbmc_1k_v2, pbmc_10k_v3). PBMCs are primary cells with relatively small amounts of RNA (~1pg RNA/cell).
• 1,222 cells detected
• Sequenced on Illumina NovaSeq with approximately 54,000 reads per cell
• 28bp read1 (16bp Chromium barcode and 12bp UMI), 91bp read2 (transcript), and 8bp I7 sample barcode
• run with --expect-cells=1000
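
To make the read 1 layout listed above concrete, here is a minimal sketch that splits a 28 bp read into its 16 bp barcode and 12 bp UMI. The function name `split_read1` and the example sequence are made up for illustration. Alevin handles this itself when the 10x v3 protocol is selected, so nothing like this needs to be run by hand.

```python
def split_read1(seq, barcode_len=16, umi_len=12):
    """Split a 10x v3 read 1 sequence into (cell barcode, UMI).

    Assumes the barcode comes first and the UMI follows it directly,
    as in the dataset description above (16 bp barcode + 12 bp UMI).
    """
    if len(seq) < barcode_len + umi_len:
        raise ValueError("read 1 is shorter than barcode + UMI")
    barcode = seq[:barcode_len]
    umi = seq[barcode_len:barcode_len + umi_len]
    return barcode, umi


# Made-up sequence: 16 A's as the "barcode", 12 C's as the "UMI".
example_barcode, example_umi = split_read1("A" * 16 + "C" * 12)
```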


GTF2GeneList extracts a complete annotation table or subsets thereof from an Ensembl GTF using rtracklayer (Galaxy Version 1.42.1+galaxy6)
Ensembl GTF file
Feature type for which to derive annotation: transcript
Field to place first in output table: transcript_id
Suppress header line in output? Yes
transcript_id,gene_id
Append version to transcript identifiers? Yes
Flag mitochondrial features? No
Filter a FASTA-format cDNA file to match annotations? Yes
Annotation field to match with sequences: transcript_id

Rename the annotation table to Map
Rename the uncompressed filtered FASTA file to Filtered FASTA


Alevin Quantification and analysis of 3’ tagged-end single-cell sequencing data (Galaxy Version 1.3.0+galaxy2)
Tool Parameters
Input Parameter Value
Select a reference transcriptome from your history or use a built-in index? history

Transcripts fasta file 10 Filtered FASTA uncompressed (Hidden)

Kmer length 31
Perfect Hash False
Single or paired-end reads? paired
4 pbmc_1k_v3_S1_L001_R1_001.fastq.gz
5 pbmc_1k_v3_S1_L001_R2_001.fastq.gz
Relative orientation of reads within a pair Mates are oriented toward each other (I = inward)
Specify the strandedness of the reads read comes from the reverse strand (SR)
protocol 10x chromium v3 Single Cell protocol
Transcript to gene map file 9 Map
Retrieve all output files True
optional
Whitelist file
noDedup False

dumpBfh False
dumpFeatures True
dumpUmiGraph False
dumpMtx True
forceCells Not available.
expectCells Not available.
numCellBootstraps Not available.
minScoreFraction Not available.
keepCBFraction 1.0
lowRegionMinNumBarcodes Not available.
maxNumBarcodes Not available.
freqThreshold 3


SalmonKallistoMtxTo10x Transforms .mtx matrix and associated labels into a format compatible with tools expecting old-style 10X data (Galaxy Version 0.0.1+galaxy5)
Tool Parameters
Input Parameter Value
.mtx-format matrix 11 quants_mat.mtx
Tab-delimited genes file 13 quants_mat_cols.txt
Tab-delimited barcodes file 14 quants_mat_rows.txt
Prefix to prepend to cell names / barcodes Empty
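
For context on what this step is meant to do: Alevin's matrix appears to be stored cells-by-genes (the rows file holds barcodes and the columns file holds genes, as in the inputs above), while old-style 10X output is genes-by-cells. The sketch below shows that reshaping on a MatrixMarket coordinate file using only the standard library. It is an illustration under those assumptions, not the tool's actual code, and `transpose_mtx_lines` is an invented name.

```python
def transpose_mtx_lines(lines):
    """Transpose a MatrixMarket coordinate matrix given as text lines.

    Comment/banner lines (starting with '%') pass through unchanged.
    The size line and every entry line get their row and column swapped.
    Assumes a 'general' (non-symmetric) coordinate matrix.
    """
    out = []
    size_seen = False
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("%"):
            out.append(line)
            continue
        fields = line.split()
        if not size_seen:
            n_rows, n_cols, n_entries = fields
            out.append(f"{n_cols} {n_rows} {n_entries}")
            size_seen = True
        else:
            row, col, value = fields
            out.append(f"{col} {row} {value}")
    return out


# Toy matrix: 2 cells (rows) x 3 genes (columns), two non-zero counts.
example = [
    "%%MatrixMarket matrix coordinate real general",
    "2 3 2",
    "1 3 5.0",
    "2 1 1.0",
]
transposed = transpose_mtx_lines(example)
```

After the swap, the toy matrix is 3 genes by 2 cells, which is the orientation that tools expecting old-style 10X data read.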

Hello! Yes absolutely - can you please share your history?


Hey @Wendy_B, I’ve been experiencing the same problem (details of the error below). I used the data from the tutorial Generating a single cell matrix using Alevin:
https://zenodo.org/record/4574153/files/Experimental_Design.tabular
https://zenodo.org/record/4574153/files/Mus_musculus.GRCm38.100.gtf.gff
https://zenodo.org/record/4574153/files/Mus_musculus.GRCm38.cdna.all.fa.fasta
https://zenodo.org/record/4574153/files/SLX-7632.TAAGGCGA.N701.s_1.r_1.fq-400k.fastq
https://zenodo.org/record/4574153/files/SLX-7632.TAAGGCGA.N701.s_1.r_2.fq-400k.fastq

Can you suggest why there’s an error here?

DETAILS OF ERROR:
Execution resulted in the following messages:

Fatal error: Exit code 1 ()
Tool generated the following standard error:

Traceback (most recent call last):
  File "/cvmfs/main.galaxyproject.org/shed_tools/toolshed.g2.bx.psu.edu/repos/ebi-gxa/salmon_kallisto_mtx_to_10x/e42c217a450f/salmon_kallisto_mtx_to_10x/salmonKallistoMtxTo10x.py", line 71, in <module>
    mmwrite('%s/matrix.mtx' % mtx_out, umi_counts.transpose())
  File "/usr/local/lib/python3.8/site-packages/scipy/io/mmio.py", line 101, in mmwrite
    MMFile().write(target, a, comment, field, precision, symmetry)
  File "/usr/local/lib/python3.8/site-packages/scipy/io/mmio.py", line 451, in write
    stream, close_it = self._open(target, 'wb')
  File "/usr/local/lib/python3.8/site-packages/scipy/io/mmio.py", line 323, in _open
    stream = open(filespec, mode)
OSError: [Errno 30] Read-only file system: './/matrix.mtx'
Galaxy job runner generated the same standard error (identical traceback to the one above).
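
Reading the traceback, the script fails when `mmwrite` tries to create `matrix.mtx` under a relative path (`.//matrix.mtx`), and the file system at that location is reported as read-only. The following is a hypothetical sketch, not the tool's code and not its eventual fix, of how a script can check that an output directory is writable before writing and fall back to a temporary directory otherwise. Both helper names are invented for illustration.

```python
import os
import tempfile


def writable_output_dir(preferred):
    """Return `preferred` if files can be created in it, else a new temp dir."""
    if os.path.isdir(preferred) and os.access(preferred, os.W_OK | os.X_OK):
        return preferred
    # Fall back to a fresh directory under the system temp location.
    return tempfile.mkdtemp(prefix="mtx_out_")


def write_placeholder(out_dir, name="matrix.mtx"):
    """Write a minimal 0 x 0 MatrixMarket file into `out_dir`; return its path."""
    path = os.path.join(out_dir, name)
    with open(path, "w") as handle:
        handle.write("%%MatrixMarket matrix coordinate real general\n0 0 0\n")
    return path
```

A call such as `write_placeholder(writable_output_dir("."))` would then write into the current directory when that is possible and into a temporary directory when it is not. On a shared server the real fix still has to come from the tool and the server configuration, since the job's outputs must end up where Galaxy expects them.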


Hi @pa.saunders and @rituu-vermaa

The SalmonKallistoMtxTo10x tool needs changes from the tool developers, followed by administrative changes, before it will work at UseGalaxy.org. I suspect Alevin has the same problem (not confirmed yet).

Please try running this tutorial, or any workflows based on it, at the UseGalaxy.eu server instead for now.

If you need to move existing data between the servers, the how-to is in this FAQ.

P.S. @rituu-vermaa, thanks for submitting the bug reports. I have already replied to some of them regarding mixed-up inputs. That is fairly common with this tutorial, since it repeats groups of steps several times with slightly different criteria. This is arguably confusing, but unavoidable when doing the work step by step rather than with a workflow and collections. You could consider adding dataset #tags to help keep track of the different runs (the tutorial includes an example screenshot with tags). Later on, when using a workflow and collections, with or without tags, mix-ups will be less likely. That is one reason why both features are popular and worth learning about (search the GTN tutorials with the keywords “collection” and/or “workflow” for help).

Hope that helps, and apologies for the confusing tool trouble on top of a complicated tutorial!