Downloading Large Files - 1 GB Limit (Galaxy Main | wget | curl | API Key)

I’m a relatively savvy Galaxy user, but just today we started experiencing issues downloading any file larger than 1 GB from Galaxy Main. Irrespective of the file’s actual size, the resulting downloaded archive is 1.08 GB and fails to expand on opening.

I am downloading over a good internet connection (University-based) and am doing nothing different from usual!

Please help!

Best wishes,

D


So, I’ve checked locally and there is no traffic/network shaping in place. I’ve also repeated the steps below on two different computers:

(1) Tried to download a collection >1 GB using my API key as before, with both curl and wget - all attempts fail at 1.08 GB. Attempting to download a single file from within the collection without the API key fails due to authentication.

(2) Tried to download a single dataset directly with wget/curl, without the API - wget fails, but curl works (roughly the forms sketched just after this list)

(3) Tried three different web browsers to download either a collection or a single file >1 GB - all fail at 1.08 GB
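For (2), the two forms I mean are roughly the following - the dataset ID here is only a placeholder, not one of mine, and whether they work without a key presumably depends on the dataset being accessible to anonymous users:

wget -O dataset.fastq.gz 'https://usegalaxy.org/datasets/DATASET_ID_HERE/display?to_ext=data'

curl -o dataset.fastq.gz 'https://usegalaxy.org/datasets/DATASET_ID_HERE/display?to_ext=data'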

Using macOS Mojave


Hi @KEELE

Sorry that you are having problems and thanks for doing all of the troubleshooting.

Which Galaxy server are you working at? You state Galaxy Main but we’d like to confirm that with the URL.

Also, to help expedite the troubleshooting, please share back some of the command strings that you used and note which failed/worked. It is OK to mask out the actual dataset HTTP address and API keys with xxx, but leave all other content, including punctuation, intact. Use “Pre-formatted” text so that any spaces, etc., are preserved.

We can follow up from there, either here publicly or in a direct message, based on your reply.

Many thanks for your response :slight_smile: I have done some further investigation:

(1) I’m using usegalaxy.org

Command to download a collection [FAILS]:

wget -O MyTestArchive.tgz 'https://usegalaxy.org/api/dataset_collections/bcaddaXXXe5aafbb/download'?key=90a1876062110873XXXXX38a426cca8a

NB: I’ve redacted the dataset ID and my API key with XXX - I can confirm the correct details were provided in the command.
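For completeness, the curl form I would try next is something like the one below - same redacted IDs, with the whole URL (key included) kept inside one set of quotes as a precaution, and -v added so it is visible where the transfer stops:

curl -v -o MyTestArchive.tgz 'https://usegalaxy.org/api/dataset_collections/bcaddaXXXe5aafbb/download?key=90a1876062110873XXXXX38a426cca8a'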

Running this command twice results in two different file sizes, shown below, neither of which is anywhere near the expected size - I have 20 files in the collection, many of which are several GB each:

MyTestArchive.tgz [ <=> ] 175.44M 6.60MB/s in 43s

MyTestArchive.tgz [ <=> ] 208.08M 3.18MB/s in 48s
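To gauge how far each attempt got, I can also list the contents of the partial archive (assuming it really is a gzipped tar, as the .tgz name suggests); a truncated download should list the members that did arrive and then stop with an unexpected end-of-file error:

tar -tzvf MyTestArchive.tgz | tail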

Command to download a single file from within a collection [WORKS]:

wget -O SingleFileWithinCollection.tgz 'https://usegalaxy.org/datasets/bbdXXXXXcb8906b5fc7e7eced1d68b4e/display?to_ext=data&hdca_id=bcadda227e5aafbb&element_identifier=ID1-DZ_A_TTACCGAC-CGTATTCG_L008.fastq.gz'?key=90a1876062110873XXXXX38a426cca8a

Conclusions: individual files from within a collection appear to be accessible and download correctly using wget or curl. However, when attempting to download the entire collection (by capturing the collection URL from the disk icon), the download stops at an apparently random point. Yesterday this was ~1.08 GB, whereas today it appears to be ~200 MB.
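In the meantime, a possible stopgap is to script the single-file route that does work. This is only a sketch: elements.txt is a hypothetical two-column file (dataset ID, output name) I would have to fill in by hand from each element’s download link, the key is redacted as above, and I’ve dropped the hdca_id/element_identifier parameters on the assumption they aren’t required for access:

while read id name; do
  wget -O "$name" "https://usegalaxy.org/datasets/${id}/display?to_ext=data&key=90a1876062110873XXXXX38a426cca8a"
done < elements.txt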

Your help is most appreciated.

Deleted for clarity; only partial first response published.

For everyone’s benefit, adding a hard line to an email cuts off the response! :slight_smile:


Can anyone help with the download of collections please?