Difficulty in getting SRA into Galaxy

mohammed · December 11, 2019, 8:52am

Hi,

I’m trying to upload the following data set to Galaxy: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE89509

I select it in the SRA Run Selector, make a note of the SRR number, and go to Get Data > Download and Extract Reads in BAM, but I get the following notice once the job completes: “could not display BAM file error: file was not sequenced defined (mod=‘rb’) is it a SAM/BAM format?”

Not sure what the problem is?

Thanks.

Mohammed

jennaj · December 12, 2019, 4:46pm

Welcome, @mohammed!

Try downloading the data in fastq format using one of these tools. The first is simpler to use, but the second has more built-in grouping options. If the differences are not clear, you could try both on a small subset of SRR accessions, review the results, and then decide which to use for the entire data series (or for individual accessions).

Download and Extract Reads in FASTA/Q format from NCBI SRA (Galaxy Version 2.10.4)
Faster Download and Extract Reads in FASTQ format from NCBI SRA (Galaxy Version 2.10.4)

In most cases, there is no reason to extract reads in BAM format. The data would then need to be converted back to fastq before it could be used with tools in Galaxy, which adds steps to the analysis and consumes more quota space.

Hope that helps!

mohammed · December 12, 2019, 6:48pm

Thank you. Currently using the “Download and Extract Reads in FASTA/Q format from NCBI SRA” (Galaxy Version 2.10.4) tool, so hopefully it works. Thanks again.

mohammed · December 13, 2019, 4:40pm

Hi,

I’ve downloaded a small set of this data via “Download and Extract Reads in FASTA/Q…”, but how do I now convert this file into SAM/BAM? And do I use Bowtie2 to align these BAM files once generated? I’m not sure what the stepwise plan should be when getting data from GEO: Download and Extract Reads in FASTA/Q > conversion to BAM or SAM? > Bowtie > featureCounts, etc.

Thanks,

Mohammed

mohammed · December 13, 2019, 4:44pm

Hi,

Forgot to add that the original sequencing platform listed in GEO is: AB 5500 Genetic Analyzer (Mus musculus).

Thanks,

Mohammed

jennaj · December 13, 2019, 5:38pm

@mohammed

The data represents single-end RNA-seq fastq reads. This tool will extract the fastqsanger (Sanger Phred+33) quality score scaled version of the data. These reads need to be mapped to produce a BAM.
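As a rough illustration of what the Sanger Phred+33 scaling means: each character in a fastq quality line encodes a score as its ASCII code minus 33. The sketch below is not part of the Galaxy tool; the function name and the sample quality string are hypothetical.

```python
def phred33_to_scores(quality_line: str) -> list[int]:
    """Convert a Sanger (Phred+33) FASTQ quality string to integer scores."""
    # Each character's ASCII code minus 33 gives the Phred quality score.
    return [ord(ch) - 33 for ch in quality_line]


if __name__ == "__main__":
    # Hypothetical quality string, not taken from the GSE89509 data.
    print(phred33_to_scores("!5I"))  # prints [0, 20, 40]
```

A score of 20 corresponds to an estimated 1 in 100 chance that the base call is wrong, and 40 to 1 in 10,000.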

Upstream QA/QC should be done first (FastQC and Trimmomatic), then either HISAT2 or RNA STAR can be used for the spliced mapping step.

Please review the “Transcriptomics” tutorials here for example workflows:

mohammed · December 18, 2019, 3:41pm

Thanks! I’ve tried but no luck. It seems that the original fastq file is not in the standard fastq format (letters): I see numbers when I view it via the “eye symbol”. Is there any way of identifying what file type this is? For example:

@2_8_524_F3/1
T…223…012…000…2002.0230.3011.1233…102…113.2001.0221.122

It seems the original/host file has not been uploaded correctly. Will I need to request fastq files from the authors? Does it matter if it’s SOLiD sequencing?

Thanks,

Mohammed
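The sample read above (a leading base followed by digits and dots, with an `_F3` read name) looks consistent with SOLiD color-space output, which is what an AB 5500 instrument produces, rather than a corrupted upload. A minimal sketch of one way to check a fastq file for this is below. It assumes a color-space read is one primer base followed only by color calls 0-3 or `.` (no call); the function names and sample records are hypothetical, not taken from the dataset.

```python
import re

# Assumed pattern: one primer base, then only color calls 0-3 or '.' (no call).
COLOR_SPACE_READ = re.compile(r"[ACGT][0-3.]+")


def looks_like_color_space(sequence_line: str) -> bool:
    """Return True if a single sequence line matches the assumed pattern."""
    return COLOR_SPACE_READ.fullmatch(sequence_line.strip()) is not None


def fastq_looks_color_space(fastq_text: str, max_records: int = 100) -> bool:
    """Check the sequence line (2nd of every 4 lines) of the first records."""
    sequences = fastq_text.splitlines()[1::4][:max_records]
    return bool(sequences) and all(looks_like_color_space(s) for s in sequences)


if __name__ == "__main__":
    # Hypothetical records for illustration only.
    color_record = "@read1\nT0123.0120\n+\n!!!!!!!!!\n"
    base_record = "@read2\nACGTACGT\n+\nIIIIIIII\n"
    print(fastq_looks_color_space(color_record))
    print(fastq_looks_color_space(base_record))
```

Note that the read as pasted in the post contains ellipsis characters, which this check would not accept; it is meant for the raw file contents. If the data is color-space, it generally needs a color-space-aware mapper or workflow, since most base-space tools will not interpret the digits correctly.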
