Fatal error in galaxy program: Problems retrieving SRA data from NCBI

fatal error: An error occurred (404) when calling the HeadObject operation: Key “run/SRR8198053/SRR8198053” does not exist

Hello guys, I was trying to extract sequence data using Galaxy and got this message in the fasterq-dump log. Does anyone know what it means and why it is occurring? It would also be helpful if you could suggest ways to fix this error.

Hi @Ashwin_G.k

The data does exist at NCBI’s SRA: GSM3476303: RA1: RA patient1; Homo sapiens; RNA-Seq - SRA - NCBI

Tool for reference: Faster Download and Extract Reads in FASTQ format from NCBI SRA (Galaxy Version 2.10.9+galaxy0)

There are two possibilities:

  1. Rule out usage problems: Check to make sure you are entering only the accession ID into the tool form. Any extra content will cause problems. Note that it is possible to input a list of accessions (more than one). That list should be in tabular format, with one accession per line and no extra whitespace or tabs.

  2. Test to see if there is a technical problem: NCBI may be busy at this time. Trying a rerun usually resolves this type of issue. If you keep having problems, you can copy the data URL directly from NCBI and paste that into the Upload tool as an alternative data retrieval method. Find the data link using the first link I posted above.
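
For the accession list described in point 1, here is a minimal Python sketch of how a list could be cleaned up before it is uploaded. It is not part of Galaxy or the tool. The function name `clean_accessions` is made up for illustration, the pattern assumes run accessions look like SRR/ERR/DRR followed by digits, and `SRR0000001` is a placeholder rather than a real run.

```python
import re

# Assumption: run accessions are SRR, ERR, or DRR followed by digits only.
RUN_ACCESSION = re.compile(r"[SED]RR\d+")


def clean_accessions(raw_text):
    """Split raw text into valid run accessions and rejected entries.

    Leading/trailing spaces and tabs are stripped and blank lines are skipped,
    so the valid list can be written back out as one accession per line.
    """
    valid, rejected = [], []
    for line in raw_text.splitlines():
        token = line.strip()
        if not token:
            continue  # ignore empty lines
        if RUN_ACCESSION.fullmatch(token):
            valid.append(token)
        else:
            rejected.append(token)  # e.g. a GSM sample ID instead of a run ID
    return valid, rejected


# Example input with a trailing space, a blank line, a leading tab,
# and a GEO sample ID that is not a run accession.
raw = "SRR8198053 \n\n\tSRR0000001\nGSM3476303\n"
valid, rejected = clean_accessions(raw)
print("\n".join(valid))
```

Anything that ends up in `rejected` is worth checking by hand before running the tool, since the tool form expects run accessions only.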

If a rerun still fails, please write back and note:

  • Where you are working: the URL of the public Galaxy server, or a description of the deployment if it is some other type.
  • The full tool name, including version, that you are using.
  • Did copy/pasting the data URL work as an alternative?
  • A screenshot of the tool form. Use the “rerun” double-circle icon to bring up the tool form as it was configured when the result was an error.

Note: NCBI is reorganizing its data into a new cloud storage platform. This impacts some accessions but not all, and which accessions are impacted changes over time. Galaxy cannot control or predict if that is what is going on … but there is usually a way to get around those transient data retrieval issues. Let’s eliminate tool usage/technical issues on the Galaxy side, and try the Upload workaround if needed, as first-pass solutions.

Let’s start there.

Thanks!

Hello Jennaj, fortunately the data I extract are working. I do get single and paired-end data, and I’m also able to run quality checks on them, but this error still appears in all of my datasets. I even get the info on spots, reads read, and reads written.

Just wanted to note that I’ve also been experiencing this behavior, and like @Ashwin_G.k, I also find that fasterq-dump does manage to download some (all?) of the actual data despite the error message. (I’m using Galaxy Main, fasterq 2.10.9+galaxy0, and fwiw manually downloading the sra archive and then extracting with fasterq does seem to work.)
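
For anyone wanting to try the manual route outside Galaxy, a rough sketch in Python follows. It assumes the SRA Toolkit commands `prefetch` and `fasterq-dump` are installed and on the PATH, and that running `fasterq-dump` with the accession in the same working directory picks up the archive that `prefetch` downloaded. The helper names `build_commands` and `download_and_extract` are made up for illustration, and no toolkit options beyond the accession are used.

```python
import shutil
import subprocess


def build_commands(accession):
    """Return the two commands: download the archive, then extract FASTQ."""
    return [
        ["prefetch", accession],      # download the SRA archive locally
        ["fasterq-dump", accession],  # extract reads from the local archive
    ]


def download_and_extract(accession, workdir="."):
    """Run both steps in workdir, stopping at the first failure."""
    for cmd in build_commands(accession):
        if shutil.which(cmd[0]) is None:
            # Assumption: the SRA Toolkit is expected to be installed already.
            raise RuntimeError(f"{cmd[0]} was not found on PATH")
        subprocess.run(cmd, cwd=workdir, check=True)


# Usage (not run here, since it needs the toolkit and network access):
# download_and_extract("SRR8198053")
```

Because `check=True` raises on a non-zero exit code, a failed download stops the script before the extraction step is attempted.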

Thanks for the feedback @dmz! Good to learn that more workarounds are possible, and appreciate that you posted it back to the community :sunglasses:

There is not much we can do on the Galaxy side. I expect this will continue until NCBI SRA is done reorganizing their data.