Uploading a large number of files from a shared data store to Galaxy

Hi there

I have 1923 bacterial samples (3846 files) in a folder on our HPC cluster that I need to get into our Galaxy server (ultimately these need to become paired collections). I have “FTP Upload” configured so that a directory on the HPC serves as the upload directory for the Galaxy server (i.e. the Galaxy server uses the same shared storage as the cluster), but when I try to use the upload form it fails after a while with a ‘504 Gateway Time-out’ error. What is an efficient way to get this data into Galaxy?

Ideally I’d love to use the rule-based uploader to organise this data as I upload it.

Is scripting this with parsec perhaps the best option? Or would allowing users to import data libraries via the user_library_import_dir option be a better approach?
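
For concreteness, here is the kind of script I have in mind, sketched with BioBlend (the Python library that parsec wraps). The server URL, API key, history name, and directory path are placeholders, and the `*.fastq.gz` pattern is just an assumption about how the files are named:

```python
# Minimal sketch: import files already sitting in the configured FTP/import
# directory into a Galaxy history via the API, instead of the web upload form.
# URL, API key, and paths below are placeholders, not real values.
from pathlib import Path

from bioblend.galaxy import GalaxyInstance

gi = GalaxyInstance(url="https://galaxy.example.org", key="YOUR_API_KEY")

# Create (or reuse) a destination history for the samples.
history = gi.histories.create_history(name="bacterial-samples")

# File names are given relative to the FTP upload directory as Galaxy sees it.
ftp_dir = Path("/path/to/ftp/upload/dir")  # placeholder path on the shared store
ftp_files = sorted(p.name for p in ftp_dir.glob("*.fastq.gz"))

for name in ftp_files:
    # upload_from_ftp() asks Galaxy to pull the file from the FTP upload
    # directory itself, so nothing is streamed through the browser and the
    # web proxy timeout is not in the picture.
    gi.tools.upload_from_ftp(name, history["id"])
```

Once the datasets are in the history, I imagine the pairing could still be done interactively (rule-based uploader or “Build List of Dataset Pairs”) or scripted with BioBlend’s collection helpers.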


I later reconnected to the Galaxy server and discovered that the samples had all been added to my history. In other words, even though the UI showed a timeout, the upload continued in the background and eventually completed. On Gitter the advice was to increase the uWSGI timeout in the nginx config, which I subsequently did.
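
For anyone hitting the same thing, the change was along these lines (illustrative only: the exact directives depend on whether nginx talks to Galaxy via uwsgi_pass or proxy_pass, and the location, upstream name, and values here are just examples):

```nginx
# Illustrative sketch: let nginx wait longer for Galaxy to respond, so a
# long-running upload doesn't surface as a 504 in the browser.
location / {
    uwsgi_pass          galaxy;   # placeholder upstream; proxy_pass setups use proxy_* directives
    uwsgi_read_timeout  600s;     # how long nginx waits for a response from uWSGI
    uwsgi_send_timeout  600s;     # how long nginx waits while sending the request
}
```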
