Nanopore data upload, format/datatype, and unexpected results from tools

Hi,

I have an existing dataset from a nanopore run in fast5 format, and I’m using Galaxy to convert it to fastq format. The tools I’ve tried from the tool list are: 1) extract fastq in tabular format from a set of fast5 files and 2) extract read in fasta or fastq format from nanopore files.

These runs produced either a single line or a 0-byte file, so there was no usable output.

Does anyone have any idea why? Which step did I miss, or what did I input incorrectly?

Thank you.


Hi @jas

This result indicates that a tar archive was loaded and used as the input. In many cases, when a tar archive is uploaded, only the first file in the archive is loaded. That first file can be an index, which won’t contain any reads.
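If you want to check this on your own computer before uploading, here is a minimal sketch using Python's standard `tarfile` module. It only lists what is inside a local archive, so you can see whether the first member is really a fast5 file or something else such as an index. The function names `list_tar_members` and `looks_like_fast5` are made up for this illustration, and matching on the `.fast5` file extension is an assumption, not how Galaxy itself detects datatypes.

```python
import tarfile


def list_tar_members(path):
    """Return (name, size_in_bytes) for each regular file in a tar archive,
    in the order the files are stored in the archive."""
    # tarfile.open() detects gzip/bzip2/xz compression automatically.
    with tarfile.open(path) as archive:
        return [(m.name, m.size) for m in archive.getmembers() if m.isfile()]


def looks_like_fast5(name):
    """Assumption for this sketch: fast5 files carry a .fast5 extension."""
    return name.lower().endswith(".fast5")
```

Calling `list_tar_members` on your archive and looking at the first entry shows which file would be picked up if only the first member is loaded. If `looks_like_fast5` is false for that first name, that would be consistent with the empty output described above.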

These tools are working at UseGalaxy.org using the tool’s test data (raw data URL link). Review to see if/how your input dataset differs and make adjustments as needed. The input was directly loaded by URL with the datatype “autodetected”. Example history: Galaxy | Accessible History | nanopore tar upload test

If you get odd results again after adjusting the input (as needed), or if you cannot figure out what adjustments are even needed, please write back and specify:

  1. Which public Galaxy server you are working at (server URL)? The topic was posted under the UseGalaxy.org category, but sometimes that is misassigned.
    • Or, if using a different type of Galaxy deployment, please describe it: Galaxy version, how/where it was sourced and installed, and the last date it was updated.
    • Update your own Galaxy if needed. There are point-updates between full releases, and those are usually important to capture (they tend to be bug fixes important enough to include before the next full release). See the Releases page of the Galaxy Project 20.09 documentation. Please be aware that the next full release is expected to be published in the next week or so (if all goes as planned!). The pre-release 21.01 is already installed at UseGalaxy.org for integration testing, but don’t install that version yet on your own server.
    • How and from where tools were installed also matters. The most current version should be sourced from the Galaxy Main ToolShed https://toolshed.g2.bx.psu.edu/. Install directly from the Admin area of the Galaxy application and use the built-in dependency resolution function, which is the simplest way to make sure the install is complete.
  2. What is the “datatype” currently assigned to the dataset when the tools failed to produce output?
  3. Was that “datatype” autodetected during Upload? Or did you assign it directly? What datatype is assigned if you try to “detect datatype” under the function: pencil icon > Edit attributes > Datatypes? Does a rerun work with that datatype assigned (if it differs)?

Let’s start troubleshooting more from there. We may ask you to generate a share link to the history containing the work so we can review it in more detail and provide the best feedback. That can be posted into a branched direct message (we’ll start one up, as you won’t be able to yet since you are new to this forum), unless you are OK with posting the share link publicly in a reply to this topic.

Thanks!

Hi there,

Good day to you.

Thank you for your reply.

  1. I tried the nanopore tar upload test history, but I can’t seem to get it to detect my dataset, named simulationdataset.fast5.
  2. I’m working on usegalaxy.org (e.g. Galaxy | Accessible History | nanopore tar upload test).
  3. For the Galaxy version, I run the tests in a normal browser. I can see that it is version 21.01.rc1, commit (followed by a long string of letters and numbers).
  4. For tools, I’m a bit confused. The list of tools from the toolshed link you shared seems different from the tools I see in the left panel of my Galaxy browser page…?
  5. Because I’m very new to this, I didn’t change or edit anything. The datatype was autodetected as h5. Does this have an effect on the failed run?

I hope I have answered most of your questions.
Your help is much appreciated.
I look forward to your reply.
Thank you.


Hi @jas

Data in h5 format with that datatype assigned should work fine with these tools.
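For anyone who wants to sanity-check a file locally before uploading: fast5 files are HDF5 containers, and an HDF5 file normally begins with a fixed 8-byte signature. The sketch below tests only for that signature at the very start of the file. The function name `starts_with_hdf5_signature` is made up for this illustration, this is not Galaxy's own datatype detection, and a file that passes can still have content problems inside.

```python
# The 8-byte HDF5 format signature.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def starts_with_hdf5_signature(path):
    """Return True if the file at `path` begins with the HDF5 signature.

    Simplification for this sketch: only offset 0 is checked. HDF5 also
    allows the signature at later offsets (512, 1024, ...) when a user
    block precedes it, and that case is not handled here.
    """
    with open(path, "rb") as handle:
        return handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

If this returns False for a file such as simulationdataset.fast5, the file may be truncated, wrapped in an archive, or not HDF5 at all, which would be worth mentioning when asking for help.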

I’m going to send you a direct message explaining how to share your history privately so we can review it and provide better feedback. We won’t post anything private back to the public topic, but we may summarize general help or any tool issues uncovered, depending on what the problem is (data vs server/tool issue).

Thanks!

Update:

The dataset had some content (and perhaps format) issues. Those are being worked out. There are no details or workarounds to share with the community yet, should anyone else run into the same type of odd results/errors from these nanopore parsing tools.
