usegalaxy.eu: gz file getting corrupted during ftp import cycle

Any ideas on what can be causing the following file corruption?

Path 1

  • I have an example.fastq.gz file on a remote server.
  • I upload it to usegalaxy.eu using curl -T {"example.fastq.gz"} --user user@name.de --ssl ftp://ftp.usegalaxy.eu
  • I import it into Galaxy, setting its datatype to fastq.gz.
  • I run FastQC on it and get the error "Ran out of data in the middle of a fastq entry. Your file is probably truncated or ID line didn't start with '@'".
  • I download example.fastq.gz from Galaxy to disk.
  • I run gunzip on the file and get "unexpected end of file" and "uncompress failed".
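
To narrow down at which step in Path 1 the file breaks, it may help to test the archive before the upload and again after the download without extracting it. The following is only a minimal sketch: `check_gz` is a hypothetical helper name, and it assumes a standard `gzip` that supports the `-t` (test) option.

```shell
# Hypothetical helper: report whether a gzip file decompresses cleanly.
# gzip -t reads the whole compressed stream and checks its integrity
# without writing any output, so a truncated file fails here.
check_gz() {
    if gzip -t "$1" 2>/dev/null; then
        echo "OK: $1"
    else
        echo "CORRUPT: $1"
        return 1
    fi
}

# Usage (file name taken from the steps above):
#   check_gz example.fastq.gz
```

If the file passes on the remote server but fails after the round trip through Galaxy, the damage happened during the upload, the import, or the download.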

Path 2

  • I have an example.fastq.gz file on a remote server (the identical file from Path 1).
  • I use scp to copy it to my local disk
  • I use gunzip to extract it
  • I run fastQC locally
  • All works fine

Edit: the MD5 checksum of the file has changed.
Original:

md5sum CB1A1_S1_R2_001.fastq.gz
33462a19ef9cb2595ffa06a849b062ff  CB1A1_S1_R2_001.fastq.gz

Downloaded from usegalaxy.eu:

md5 CB1A1_S1_R2_001.fastq.gz
MD5 (CB1A1_S1_R2_001.fastq.gz) = 45284f73109d8866cfe81f3cdb2014ca
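
Since the two checksums above come from different tools (`md5sum` and `md5`) with different output formats, a small wrapper can make the comparison less error-prone. This is just a sketch under assumptions: `file_md5` and `same_md5` are hypothetical helper names, and it assumes either GNU `md5sum` or the BSD/macOS `md5 -q` is installed.

```shell
# Hypothetical helper: print only the MD5 hex digest of a file, using
# md5sum (GNU coreutils) when present and md5 -q (macOS/BSD) otherwise.
file_md5() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum "$1" | cut -d ' ' -f 1
    else
        md5 -q "$1"
    fi
}

# Hypothetical helper: succeed only if both files have the same digest.
same_md5() {
    [ "$(file_md5 "$1")" = "$(file_md5 "$2")" ]
}

# Usage (paths are placeholders):
#   same_md5 original/CB1A1_S1_R2_001.fastq.gz downloaded/CB1A1_S1_R2_001.fastq.gz \
#       && echo "identical" || echo "checksums differ"
```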

@marten Does that also happen when you upload the file via the web interface?

@marten Does it happen only with a specific file? Can you share that file with me?

I tried once with the same file and it worked fine.

It happened with many files in that batch of 192 datasets. I’ll share a history with you.

Update: I followed the same steps as in Path 1 with the whole set (182 files) at usegalaxy.org, and everything worked without an error.

I uploaded a set of 57 fastq.gz files, ranging from 400 MB to 12 GB, to our FTP server using curl from three different locations in Europe (DE, IT, UK). All of them were successfully transferred and verified.
As I told you, I think something really weird is happening on the network between you and us.
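
For anyone who wants to repeat this kind of batch check, one possible approach is to write a checksum manifest before the transfer and verify the files against it afterwards. This is only a sketch of that idea, not necessarily how the check above was done: `make_manifest` and `verify_manifest` are hypothetical names, and it assumes GNU `md5sum` with its `-c` option.

```shell
# Hypothetical helper: write an MD5 manifest for every .fastq.gz file
# in directory $1 to the manifest file $2.
make_manifest() {
    ( cd "$1" && md5sum *.fastq.gz ) > "$2"
}

# Hypothetical helper: check the files in directory $1 against the
# manifest $2. Prints one status line per file and returns non-zero
# if any file is missing or its checksum differs.
verify_manifest() {
    ( cd "$1" && md5sum -c ) < "$2"
}

# Usage (directory names are placeholders):
#   make_manifest   ./originals  batch.md5
#   verify_manifest ./downloaded batch.md5
```

Any file that fails the check can then be re-transferred individually instead of repeating the whole batch.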


Thank you very much for checking. That is a disturbing find nevertheless. :confused: