UMI-tools deduplicate

Hello @Dania_Shikhani

You can try again whenever SRA rejects a query. Their server just gets busy and it impacts everyone, not just Galaxy users.

The short answer is to try again! We know that these accessions are valid since they worked before.

I’m going to use this opportunity to explain the details for anyone else reading along. :slight_smile:

How to interpret an error from Faster Download and Extract Reads in FASTQ format from NCBI SRA

  1. Review the job logs. → FAQ: Troubleshooting errors

  2. Click on the i-info icon for one of the red datasets.

  3. Scroll down into the detailed Tool Standard Output (stdout) log. These are technical/processing errors discovered by the Galaxy wrapper.

  4. Also see the Tool Standard Error (stderr). This is where to find reports about the processing details discovered by the underlying tool. Examples are content and parameter issues.

  5. These sections expand if you click on them!

  6. If the stderr has content, go into the Error tab and see if the Galaxy Wizard can describe what is happening.

  7. The Wizard did answer this one correctly (there wasn’t a fastq file to sort into a collection) but the message could be clearer about why and what to do, so I’m glad you asked! We’ll get that tuned up!

  8. Example of what to review. Whenever this is seen, the problem is either with the accessions (they do not exist) or with the SRA service itself.

    stdout

    Downloading accession: SRR19543607…
    Failed to call external services.
    Prefetch attempt 1 of 3 exited with code 1
    Failed to call external services.
    Prefetch attempt 2 of 3 exited with code 1
    Failed to call external services.
    Prefetch attempt 3 of 3 exited with code 1

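If you copy the stdout text out of Galaxy and want to check it in bulk, here is a minimal Python sketch of the same reasoning. It only looks for the "Prefetch attempt N of M exited with code X" lines shown above. The function name `all_prefetch_attempts_failed` is my own invention for illustration, not part of the Galaxy wrapper or the SRA toolkit.

```python
import re

# Matches log lines such as "Prefetch attempt 2 of 3 exited with code 1".
# Assumption: the wording is exactly as in the stdout example above.
ATTEMPT_RE = re.compile(r"Prefetch attempt (\d+) of (\d+) exited with code (\d+)")


def all_prefetch_attempts_failed(stdout_text):
    """Return True when every allowed prefetch attempt ended with a non-zero code."""
    attempts = [
        tuple(int(group) for group in match.groups())
        for match in ATTEMPT_RE.finditer(stdout_text)
    ]
    if not attempts:
        # No attempt lines at all: this log does not show the pattern.
        return False
    last_attempt, allowed_attempts, _ = attempts[-1]
    used_all_attempts = last_attempt == allowed_attempts
    every_attempt_failed = all(code != 0 for _, _, code in attempts)
    return used_all_attempts and every_attempt_failed


sample_log = """Downloading accession: SRR19543607…
Failed to call external services.
Prefetch attempt 1 of 3 exited with code 1
Failed to call external services.
Prefetch attempt 2 of 3 exited with code 1
Failed to call external services.
Prefetch attempt 3 of 3 exited with code 1
"""

if all_prefetch_attempts_failed(sample_log):
    print("All prefetch attempts failed: check the accession, or wait and rerun.")
```

A `True` result only tells you the download never succeeded. It cannot tell you whether the accession is wrong or the SRA service was busy, so the checks in the next section still apply.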

What to do

  1. How to confirm that the accession is valid? Reviewing at NCBI is one way.
  2. How to confirm that the file is formatted correctly?
     • the datatype should be txt → FAQ: Changing the datatype
     • one accession per line
     • extra whitespace (tabs, lines) will be stripped by our wrapper, but you could also clean it up with a tool like Convert delimiters to TAB followed by Cut to isolate a single column
  3. If this is all correct or this same query worked previously, you can proceed directly to trying again! Waiting 10-15 minutes is usually enough. :rocket:
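If you would rather tidy the accession list before uploading it, here is a rough Python sketch of the formatting rules above: one accession per line, no stray whitespace, first column only. The `clean_accessions` helper and the SRR/ERR/DRR pattern are my assumptions for illustration. This is not what the Galaxy wrapper runs internally.

```python
import re

# Assumption: run accessions look like SRR, ERR, or DRR followed by digits.
ACCESSION_RE = re.compile(r"^[SED]RR\d+$")


def clean_accessions(raw_text):
    """Split raw text into (valid, rejected) accession lists, one per input line."""
    valid, rejected = [], []
    for line in raw_text.splitlines():
        stripped = line.strip()
        if not stripped:
            # Skip blank lines entirely.
            continue
        # Keep only the first whitespace-delimited column, similar to
        # running Convert delimiters to TAB followed by Cut on column 1.
        token = stripped.split()[0]
        if ACCESSION_RE.match(token):
            valid.append(token)
        else:
            rejected.append(token)
    return valid, rejected


raw = "SRR19543607\n\n  SRR19543608 \tsome note\nnot_an_accession\n"
valid, rejected = clean_accessions(raw)

# Write the cleaned list back out as one accession per line.
cleaned_text = "\n".join(valid) + "\n"
```

Anything in `rejected` is worth checking by hand at NCBI before you rerun the tool.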

Please give this a try and see how it works now!

Note: I do see a problem with the final tool in my testing history above. Now that the tool is finding UMIs, it needs to know how to group them. How to group is a scientific decision for the protocol. I had used the exact same parameter as you were using, and the log message states that a different parameter combination is needed. I would try the suggestion! Once it works, you can modify a workflow to suit your goals (using my template or extracting your own!).