Problem calling Galaxy History dataset in Python notebook

Hello
I read the “Use Jupyter notebooks in Galaxy” tutorial and opened a new notebook, but I don’t know how to access a dataset from my Galaxy history in this notebook.
I used the get() command with the corresponding dataset number in parentheses, but I got an error.
Can you guide me?

Hi @maryam-gh99

The instructions are here: JupyterLab in Galaxy

Things to check:

  1. You have a dataset in your history with a plain text datatype (tabular would be an example).
  2. The command used is get(N) where N is that dataset’s number in the active history from where Jupyter was launched.

Check those two things, and if you need more help, share some screenshots showing the error and a (very small!) shared history so we can test it if needed. How to generate both: Troubleshooting errors. The share link will also confirm which server you are working on (EU?).

Let’s start there :slight_smile:

I am very grateful for your constant guidance.
How nice that I got to know the Galaxy platform and its good team 😊

On Friday, 23 June 2023 at 0:47, Jennifer Hillman-Jackson via Galaxy Community Help <notifications@galaxy.discoursemail.com> wrote:

Hello,
I have another question.
I found that the get() command downloads the requested file, stores it in the import path, and returns the storage path. I want to load the dataset content itself and perform operations on the values of the expression matrix.
Could you guide me further?

Hum. The actual dataset content is supposed to be fully accessible in the environment and in the history when using the gx_get() and gx_put() commands.
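
If get() is handing back a path, as described, then reading the file at that path is one way to work with the values themselves. Here is a minimal sketch using only the Python standard library. It assumes a tab-separated matrix with a header row, row names in the first column, and numeric values everywhere else. The function name load_matrix and the dataset number in the comment are made up for illustration.

```python
import csv


def load_matrix(path, delimiter="\t"):
    """Read a delimited text matrix from `path`.

    Returns (column_names, rows), where rows maps each row name
    (first column) to a list of floats (remaining columns).
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader)  # first line holds the column names
        rows = {}
        for record in reader:
            if not record:  # skip blank lines
                continue
            rows[record[0]] = [float(value) for value in record[1:]]
    return header[1:], rows


# In the notebook (assumption: get() returns the path of the imported file,
# and 3 is a hypothetical dataset number in the active history):
# samples, matrix = load_matrix(get(3))
```

If pandas is installed in the notebook environment, pandas.read_csv(path, sep="\t", index_col=0) would give a DataFrame instead, which is usually more convenient for matrix operations.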

Or, maybe I am misunderstanding! Some follow-up questions are below. If you have sorted this out already, please do share what worked, for others who find this post.

  • Are you able to reference/access the data imported with gx_get() within the environment?

    • The Galaxy “datatype” metadata is not considered within the Python notebook.
    • Instead, you’ll need to make the tools you are using happy with file extensions or command-line flags or both (requirements vary by tool).
  • Does the reverse work with the gx_put() command, meaning the dataset exported into the history is accessible as data content? Or is it not, and that is the problem (my best guess so far)?

    • Does that dataset have a datatype assigned? Which? You might need to adjust it.
    • An expression matrix in plain text would probably be best understood with the datatype tabular.
    • An expression matrix in a compressed format would have a dedicated datatype, and that can vary. Auto-detect might not work well, and it needs to be directly assigned.
    • A datatype of txt or tsv is the same as tabular. Some tools will require that tabular is specifically assigned.
    • A datatype of data is a fall-back. That could be compressed data, and would be unreadable until given a more specific datatype.
    • Keep in mind that it is possible to create compressed files whose datatype is not defined in Galaxy yet, or for which Galaxy always uses the uncompressed version. Simple example: convert gtf.gz → gtf, then gx_put() the plain text gtf version.
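
The gtf.gz → gtf example in the last bullet could look something like this inside the notebook. This is a sketch only: gunzip_to and the file names are hypothetical, and the final export call is left as a comment because that helper only exists inside Galaxy's notebook environment.

```python
import gzip
import shutil


def gunzip_to(src_path, dest_path):
    """Stream-decompress the gzip file at src_path into dest_path.

    Returns dest_path so the result can be passed straight to another call.
    """
    with gzip.open(src_path, "rb") as compressed, open(dest_path, "wb") as plain:
        shutil.copyfileobj(compressed, plain)  # copies in chunks, not all at once
    return dest_path


# In the notebook (assumptions: the file names are examples, and the export
# helper is named as written in this thread; check what your environment offers):
# plain_gtf = gunzip_to("annotation.gtf.gz", "annotation.gtf")
# gx_put(plain_gtf)
```

After the export, the new history dataset may still need its datatype set to gtf by hand if auto-detect does not pick it up.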

This is new and not perfect yet, but below is a listing of the datatype attributes available at a specific server. Adjust the URL for different servers, and know that some may not have it yet. If the tool you were using is also available as a wrapped Galaxy tool, its form could provide more help about which datatype to assign.

https://usegalaxy.org/datatypes