Galaxy ATAC-seq final step error

Hi, I’m running through the ATAC-seq analysis tutorial and I’m on the final step, plotting with pyGenomeTracks. That tool fails with an error, and I can’t work out why.

Here’s the link to the tutorial: https://training.galaxyproject.org/training-material/topics/epigenetics/tutorials/atac-seq/tutorial.html

And my history: https://usegalaxy.org/u/edwin5588/h/atac-seq-try-5

I’m sure my pyGenomeTracks parameters match the ones in the tutorial. Can I get some guidance on what I got wrong?

Thanks,
Edwin

Hi @Edwin_Huang

Thanks for sharing those details, definitely helps!

This is the error message. It suggests that the input file being processed at that step (reference peak file) didn’t contain what was expected.

So, I then compared your data with the tutorial and noticed that your file had a different datatype assigned. Datatypes are special in Galaxy and not just labels: they “tell” a tool what the data content represents. Try adjusting the datatype to match the tutorial, then rerun.

To see all of this yourself, click the “i” icon for a dataset. That view lists the inputs and parameters in a table, plus all of the job logs.

  1. Tutorial step
    https://training.galaxyproject.org/training-material/topics/epigenetics/tutorials/atac-seq/tutorial.html#get-data

  2. Tutorial step screenshot

  3. Your data for the failed job, as it was used by the tool (converted to “bed”)

  4. Your data for this failed job in its original state in the history (“interval” format). Interval is a very generic file type (on purpose!). Using the more specific “encodepeak” datatype informs the tool about some extra columns, so they don’t get dropped during minor format conversions.
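To make the “extra columns” point in item 4 concrete, here is a minimal Python sketch. It assumes the peak file follows the ENCODE narrowPeak layout (10 tab-separated columns). The example line is made up, and `truncate_to_bed` is a hypothetical stand-in for a lossy conversion. It is not Galaxy’s actual converter, and keeping exactly 6 columns is only an assumption for illustration.

```python
# ENCODE narrowPeak column names (10 tab-separated fields).
NARROWPEAK_COLUMNS = [
    "chrom", "chromStart", "chromEnd", "name", "score",
    "strand", "signalValue", "pValue", "qValue", "peak",
]


def describe_columns(line):
    """Pair each tab-separated field with its narrowPeak column name."""
    fields = line.rstrip("\n").split("\t")
    # zip stops at the shorter sequence, so missing columns simply drop out.
    return dict(zip(NARROWPEAK_COLUMNS, fields))


def truncate_to_bed(line, keep=6):
    """Hypothetical lossy conversion: keep only the first `keep` columns.

    This only illustrates the idea; it is NOT what Galaxy runs internally.
    """
    fields = line.rstrip("\n").split("\t")
    return "\t".join(fields[:keep])


# Made-up example peak, not taken from the history in this thread.
example = "chr22\t100\t250\tpeak_1\t85\t.\t6.2\t10.5\t8.1\t75"

full = describe_columns(example)
converted = describe_columns(truncate_to_bed(example))
lost = [name for name in NARROWPEAK_COLUMNS if name not in converted]

print("columns before:", len(full))
print("columns after: ", len(converted))
print("lost:", lost)
```

With the assumed 6-column cutoff, the four peak-specific columns (`signalValue`, `pValue`, `qValue`, `peak`) are the ones that disappear, so a tool expecting them would find the file incomplete.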

So – why does Galaxy convert formats??

Because Galaxy hosts all sorts of tools written by different people, connecting data between them is not so easy (which is also why labs tend to have dedicated bioinformatics people!).

Galaxy attempts to smooth that out by providing strict specifications for each “datatype”. Converting between similar but different formats is sometimes needed, depending on what the tool is expecting to work with.

That is mostly what the Galaxy wrapper around the underlying third-party tools is doing: aiding the flow of data between tools. But it isn’t perfect, of course! If you get an error, go back to the start and walk through the input steps to spot where things went wrong. Everyone has to learn what files should “look” like. Training your eye to notice these tiny details, along with some quick data manipulations to make tools happy, comes with experience. The GTN tutorials can help with that. Later on, when you are using workflows, you can tune your workflow to do what is needed for these pickier tools.