I am trying to get acquainted with Galaxy by working through tutorials, but I keep running into problems.
For example, I am trying the "Genome Assembly of a bacterial genome (MRSA) sequenced using Illumina MiSeq Data" tutorial. After loading the data and renaming the datasets, the FastQC step fails on one of the two datasets (DRR187559_2):
" An error occurred with this dataset:
format txt database ?
application/bz2 Failed to process file".
I did not find the tool "FastQC (Galaxy version 0.74+galaxy0)" as shown in the tutorial. I used the closest one I found: "FastQC Read Quality reports (Galaxy Version 0.74+galaxy0)".
Is that the problem? What should I have done differently?
Thanks for sharing the history and explaining what is going on. The tool is fine, it seems the problem was introduced during Upload.
The tool is reporting that the input fastq data is truncated. You probably need to upload it again (since it is from a tutorial, and should be complete). Later on, when using your own data, you might also need to check it from the source to find out where the problem was introduced.
Ran out of data in the middle of a fastq entry. Your file is probably truncated
Those types of messages might be in the job logs (click into the job details using the “i” icon), and sometimes directly on the expanded dataset in the history view. It depends on the tool, but the inputs, parameters, and logs are always the best places to start troubleshooting. Check whether you can find the message for this job: your example shows the same message in a few places.
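If you have a copy of the file on your own computer, you can run a similar check outside Galaxy. The following is only a minimal sketch, not what FastQC does internally: the function names are made up for illustration, the compression is guessed from the file extension, and it assumes the common four-line FASTQ layout (no wrapped sequence lines).

```python
import bz2
import gzip


def open_fastq(path):
    """Open a plain, gzip, or bzip2 FASTQ file as text, chosen by extension."""
    if path.endswith(".bz2"):
        return bz2.open(path, "rt")
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def check_fastq(path):
    """Return (complete_records, problem); problem is None if nothing looks wrong.

    Assumes four lines per record: @header, sequence, +, quality.
    """
    records = 0
    try:
        with open_fastq(path) as handle:
            while True:
                header = handle.readline()
                if header == "":
                    return records, None  # clean end of file
                if header.strip() == "":
                    continue  # tolerate blank lines between records
                seq = handle.readline()
                plus = handle.readline()
                qual = handle.readline()
                if seq == "" or plus == "" or qual == "":
                    return records, "file ends in the middle of a record"
                if not header.startswith("@") or not plus.startswith("+"):
                    return records, f"record {records + 1} is malformed"
                if len(seq.rstrip("\r\n")) != len(qual.rstrip("\r\n")):
                    return records, (
                        f"record {records + 1}: sequence and quality lengths differ"
                    )
                records += 1
    except (EOFError, OSError) as exc:
        # Raised when the file cannot be read or the compressed stream is damaged.
        return records, f"could not read file: {exc}"
```

Calling check_fastq("DRR187559_2.fastqsanger.bz2") on a local download returns the number of complete records read and a short description of the first problem found, or None if the file was read to the end without trouble.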
thank you!
It's helpful to see these error messages and understand that the problem occurred during file upload.
I loaded the file directly from NCBI SRA this time and FastQC worked. Then I tried again with the one from the tutorial link (https://zenodo.org/record/4534098/files/DRR187559_2.fastqsanger.bz2) and it failed again. The file sizes differ between the two upload methods. Is it possible that the file on zenodo.org is already corrupted?
The fastq file seems OK, so I suspect the problem is with the compression on that second file. We’ll figure out exactly what is going on and fix it in the tutorials/Zenodo.
Meanwhile, try this:
Load up the two files for the tutorial into a new history. You can delete the history you were working in before if you need to recover space, or just don’t want to get mixed up.
Click on the pencil icon for each of those two files, one at a time, and convert to the uncompressed format → fastqsanger
Run through the tutorial, or use the tutorial’s workflow, with the uncompressed versions of the data instead of the bz2 compressed files.
The steps below show what to select: pencil icon → Edit Attributes → Datatypes tab. For this use case, uncompress after the data is loaded into Galaxy. Do the same action on both files. Later on, you can learn how to put files into a collection folder and adjust the datatype the same way, but as a batch action.
Step 1 – Click on the pencil icon
Step 2 – Click into the Datatypes tab of the Edit Attributes form in the center panel, and review the convert choices
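For anyone who would rather uncompress a downloaded copy before uploading it, a rough local equivalent of the convert step might look like the sketch below. It runs outside Galaxy, the function name is hypothetical, and it only uses Python's standard bz2 module, so it will not necessarily behave the same way as Galaxy's own converter on a problem file.

```python
import bz2
import shutil


def decompress_bz2(src, dest):
    """Stream-decompress the bzip2 file src into dest; return True on success."""
    try:
        with bz2.open(src, "rb") as compressed, open(dest, "wb") as plain:
            # Copy in chunks so large read files do not need to fit in memory.
            shutil.copyfileobj(compressed, plain)
    except (EOFError, OSError) as exc:
        # EOFError: the compressed stream stops early; OSError: invalid data or I/O error.
        print(f"Could not decompress {src}: {exc}")
        return False
    return True
```

For example, decompress_bz2("DRR187559_2.fastqsanger.bz2", "DRR187559_2.fastqsanger") would write the plain FASTQ next to the download, and a False result points at the compressed file itself rather than at the upload.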
Thanks to the ticket filed by @jennaj we have created a new record which will work correctly going forward. Thanks for filing this issue @M_Plank, I had not had enough reason to track down the why before and it was an interesting result.