FastQC doesn't work

fastqc
error
ftp-upload
#1

Hello,
I’m trying to run FastQC (“FastQC Read Quality reports”) on this file: NGS13_S1_L001_R1_001.fastq.gz

But there is always an error:

Fatal error: Exit code 1 ()
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/galaxy-repl/main/jobdir/022/411/22411229/_job_tmp -Xmx7g -Xms256m
Failed to process file NGS13_S1_L001_R1_001_fastq_gz.gz
uk.ac.babraham.FastQC.Sequence.SequenceFormatException: Ran

What can I do to get FastQC to run successfully?

thank you

#2

Welcome, @delphineboulet!

Click on the job details icon for the error dataset.

Then click on the stderr link inside the report. You’ll probably see something like this for the full error message:

uk.ac.babraham.FastQC.Sequence.SequenceFormatException: Ran out of data in the middle of a fastq entry. Your file is probably truncated

If that is included in your error, then either the upload of your data was not complete, the data was truncated before uploading to Galaxy, or the data is malformed somewhere in the middle (from some upstream transfer or manipulation).

Try re-uploading the data first (use FTP if the file is large), make sure it loaded completely, and see if that resolves the problem. If not, examine the data locally to see whether the original is truncated or malformed, and fix it before uploading to Galaxy.
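
For the local check mentioned above, here is a minimal Python sketch that reads a gzip-compressed file to the end and reports whether it decompressed cleanly. The function name `gzip_is_complete` is made up for illustration, and the check only covers the compression layer, not the fastq content itself.

```python
import gzip
import zlib


def gzip_is_complete(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream at `path` decompresses cleanly.

    A truncated file typically raises EOFError, and a damaged one raises
    OSError (including gzip.BadGzipFile) or zlib.error.
    """
    try:
        with gzip.open(path, "rb") as handle:
            # Read in chunks so large fastq files do not fill memory.
            while handle.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error):
        return False
    return True
```

Running it on the local copy before uploading, for example `gzip_is_complete("NGS13_S1_L001_R1_001.fastq.gz")`, shows whether the truncation already exists in the original file.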

We can troubleshoot more from there with help from the Galaxy EU team if needed.

Thanks!

#3

Hello,
Maybe this observation can help us work through this issue.
Both compressed (.gz) and uncompressed files end up with an added .gz extension.
If I upload a compressed file, file.fastq.gz, the uploaded file is displayed in Galaxy as file.fastq.gz.gz (.gz twice).
If I upload an uncompressed file, file.fastq, the uploaded file is displayed in Galaxy as file.fastq.gz (.gz once).
This happens with every file I have tried to upload, including files that previously worked.
thanks
delphine

#4

The displayed name of the file once loaded into Galaxy is not used by the application – it is just a label. What is important is the assigned datatype, and this should be a match for the data content. There are built-in checks to assign datatype when “autodetect” is used with the Upload tool.

Uncompressed fastq data should be given an uncompressed datatype and compressed fastq a compressed datatype.
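
Since the name is only a label, one way to see what a local file really contains is to look at its first two bytes: gzip streams start with the magic bytes 0x1f 0x8b. This is a small sketch with a hypothetical helper name, not how Galaxy's own datatype detection is implemented.

```python
def looks_gzipped(path):
    """Return True if the file starts with the gzip magic bytes (0x1f 0x8b).

    The file name and extension are ignored on purpose: only the content counts.
    """
    with open(path, "rb") as handle:
        return handle.read(2) == b"\x1f\x8b"
```

A file named file.fastq.gz for which this returns False is really uncompressed, and should get an uncompressed fastq datatype.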

What datatype is assigned to your data in each case? And did you check to make sure these files are complete and can be uncompressed/re-compressed locally to ensure they are intact? Meaning, the upload was successful/complete for a full dataset (not partial or truncated)?

The EU server has been undergoing some changes – ping @hxr for administrator level help in interpreting your feedback in that context.

Thanks!

#5

Ping @bjoern.gruening for compressed datatype help. FTP is currently unavailable.

#6

I also have a problem with an Illumina WES x100 forward fq.gz file when it is unzipped in my Galaxy history to run FastQC:
Fatal error: Exit code 1 ()

gzip: /data/dnb02/galaxy_db/files/dataset_8664883.dat: unexpected end of file

The reverse fq.gz file runs through FastQC fine.
I imported the 2 files at the same time from my sequencing.com account, where I had checked that the fastq files were complete and had run bowtie2/samtools (EvE app) to generate a GRCh38 annotated VCF (~218,000 variants).
In my usegalaxy.org account, only the hisat2 aligner succeeded against GRCh38, and bcftools call showed almost the same variants as sequencing.com’s bowtie2/EvE VCF.
I want to run BWA MEM in Galaxy to pile up all the fastq reads (raw BCF), but whenever I try BWA or Bowtie2 in Galaxy, the job crashes.

#7

Can you share one of those files? I’m unable to reproduce the upload issue. If I upload a “foo.fastq” file, it gets added as “foo.fastq” and no .gz is added. Are you using usegalaxy.eu or usegalaxy.org?

#8

Hello,
I’m using https://usegalaxy.org/. What type of file do I have to use to run a FastQC job?
X.fastq, or another type?
thanks

#9

Fastq (including compressed) and SAM/BAM are all accepted inputs.

  • The tool will fail if the inputs are truncated. This usually occurs when an Upload did not completely transfer a file, although the input could also have been truncated during an earlier step/transfer upstream of Galaxy. If data that was completely uploaded (confirmed by FTP) still fails with tools, go back and check the data locally.

  • For how to use FTP with larger files and/or slower connections, see the FAQ: Loading Data >> https://galaxyproject.org/ftp-upload/

  • The possibility of an incompatible fastq compression format is also something to check – loading uncompressed data usually resolves errors for those cases.
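
If the compression format is the suspect, an uncompressed copy can be made locally before uploading. A minimal sketch, assuming the file is gzip-compressed (other formats such as bzip2 would need a different module); the function name and paths are only examples.

```python
import gzip
import shutil


def decompress_fastq(gz_path, out_path):
    """Write an uncompressed copy of a gzip-compressed fastq file."""
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        # Stream the data so large files are never held fully in memory.
        shutil.copyfileobj(src, dst)
```

If this raises an error partway through, the compressed original is itself truncated or damaged, which points back to the first bullet.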

If those don’t help, what is the error? Click on the bug icon to review the problem and post back the complete top of that form - tool version + error message (screenshot or text). You could also submit the problem from the form to our internal mailing list (if working at usegalaxy.org). If you choose that, please include a link to this post in the bug report comments so we can link the two.

#10

Hello,
“Click on the bug icon to review the problem and post back the complete top of that form - tool version + error message (screenshot or text).”

That information is in my first post in this topic. I managed to run FastQC (on a small data file), but the Per tile sequence quality graph (and only that one) is always completely blue, without any curve.

Thank you for help

#11

If that is still your error, then the data is probably truncated or has corrupted compression. If all else fails, load uncompressed data. The tool Fastq Groomer can be used for basic format QC (use all default settings) – if any reads are malformed (deviate from fastq format), the first occurrence will be reported in detail.
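
A similar basic format check can also be done locally. The sketch below is not Fastq Groomer's logic, only an illustration of the idea: it assumes plain 4-line fastq records (no wrapped sequence lines) and reports the first record that breaks the format. For .gz input it also assumes the compression itself is intact.

```python
import gzip


def first_malformed_record(path):
    """Return (record_number, reason) for the first bad record, or None if all pass."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        record = 0
        while True:
            header = handle.readline()
            if not header:
                return None  # clean end of file
            if not header.strip():
                continue  # tolerate blank lines between records
            record += 1
            seq = handle.readline()
            plus = handle.readline()
            qual = handle.readline()
            if not (seq and plus and qual):
                return (record, "file ends in the middle of a record")
            if not header.startswith("@"):
                return (record, "header line does not start with '@'")
            if not plus.startswith("+"):
                return (record, "third line does not start with '+'")
            if len(seq.rstrip("\r\n")) != len(qual.rstrip("\r\n")):
                return (record, "sequence and quality lengths differ")
```

A result such as record N with "file ends in the middle of a record" matches the truncation case FastQC complains about.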

#12

hello,
I can’t find the error message. Here is a screenshot of my problem. The message in the small box is:
“Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/galaxy-repl/main/jobdir/023/241/23241389/_job_tmp -Xmx7g -Xms256m”


thanks for all
delphine

#13

Hi @delphineboulet

The FastQC job itself was successful. My guess is that the input fastq sequence identifiers are not in the full, original, Illumina format anymore. That identifier information is needed to run the analysis in this module. If the identifiers are modified, then the module “passes through”.

FastQC Docs
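
To check whether the identifiers still carry tile information, a rough heuristic is to count the colon-separated fields in the read name. This sketch assumes the usual Illumina naming conventions (tile as the 5th field in the current 7-field format, or the 3rd field in the older 5-field format). It is not taken from the FastQC source, and the function name and example headers are only illustrative.

```python
def tile_from_header(header):
    """Return the tile number from a fastq header line, or None if none is found.

    Heuristic only: assumes Illumina-style, colon-separated read names.
    """
    parts = header.lstrip("@").split()
    if not parts:
        return None
    fields = parts[0].split(":")  # ignore the comment after the first space
    if len(fields) >= 7:
        tile = fields[4]  # instrument:run:flowcell:lane:tile:x:y
    elif len(fields) >= 5:
        tile = fields[2]  # older style, instrument:lane:tile:x:y
    else:
        return None
    return int(tile) if tile.isdigit() else None


# Example headers, made up for illustration.
print(tile_from_header("@M01234:55:000000000-A1B2C:1:1101:15589:1332 1:N:0:1"))
print(tile_from_header("@SRR000001.1"))
```

If this returns None for the first few headers in the file, the identifiers were probably rewritten upstream, which would fit the empty Per tile sequence quality plot.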

#14

I will try
thank you