get_object.py mentioned in cloud configuration

https://galaxyproject.org/authnz/cloud/demo/

Step 10 of the demo runs python get_object.py. Where is this script located? I looked for it in the Galaxy GitHub repositories but could not find it. Has it been renamed?

python get_object.py [API KEY]

Ultimately my goal is for Galaxy to be able to import sequences residing within my S3 bucket(s).

  • I have installed a local instance of Galaxy within my AWS account. It is functioning well, with our bioinformatics group running various tools.
  • Within one of my AWS S3 buckets I am storing my sequences and would like Galaxy to import these sequences (> 1000 RNA-Seq files).
  • I found the following two documents, which appear to fit my Galaxy-S3 bucket requirements, although the second is just a demo: https://galaxyproject.org/authnz/cloud/aws/ and https://galaxyproject.org/authnz/cloud/demo/
  • I was able to define an AWS role and generate the API keys as described, but I could not run the Python script get_object.py from step 10. I could not find the script, which is understandable since the document is only a demo.
  • Of note, I did find an object store config file which seems able to accept an S3 secret key and access key, but I think using a role is a better solution.

The demo runs on a pre-baked branch of Galaxy, where a number of configurations are already made. The get_object.py script is also available from that branch; it sends a POST request to the cloud storage API to copy a demo dataset stored in an S3 bucket into the demo history.

To use this functionality on your own Galaxy instance (not the pre-baked demo branch), you may need to follow the cloud storage documentation; in general:

  1. Set up Galaxy to let users log in with their Google account;

  2. Log in with your Google account, and get your authnz_id by sending a GET request to the /authnz/ controller;

  3. Set up an AWS IAM Role and define it in Galaxy;

  4. Send a POST request to the /api/cloud/storage/get API with the following payload (docs):

    {
        "history_id": "...",
        "authz_id": "...",
        "bucket": "...",
        "objects": [
            "your S3 object 1",
            "your S3 object 2",
            ...
        ]
    }
    

Please note that this flow might be broken on your branch due to some recent changes. The issue is known and we’re working on it.
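
Step 4 above can be sketched in Python roughly as follows. This is only an illustration, not the get_object.py script from the demo branch: the helper names are hypothetical, and passing the Galaxy API key in an x-api-key header is an assumption about how your instance authenticates API calls. The payload keys are the ones listed above.

```python
import json
import urllib.request


def build_cloud_get_request(galaxy_url, api_key, history_id, authz_id, bucket, objects):
    """Build (but do not send) the POST request for /api/cloud/storage/get.

    Hypothetical helper; the payload keys mirror the payload shown above.
    """
    payload = {
        "history_id": history_id,
        "authz_id": authz_id,
        "bucket": bucket,
        "objects": list(objects),
    }
    return urllib.request.Request(
        url=galaxy_url.rstrip("/") + "/api/cloud/storage/get",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the API key is accepted in the x-api-key header.
            "x-api-key": api_key,
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and return the raw response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Example usage (placeholders, not real IDs):
# request = build_cloud_get_request(
#     "http://localhost:8080", "YOUR_API_KEY", "HISTORY_ID", "AUTHZ_ID",
#     "your-bucket", ["your S3 object 1", "your S3 object 2"],
# )
# print(send(request))
```

Building the request separately from sending it makes it easy to inspect the URL and payload before anything reaches Galaxy, which may help while the flow is unstable on your branch.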

I did find an object store config file which seems able to accept an S3 secret key and access key, but I think using a role is a better solution.

There are a number of important distinctions between the objectstore-based approach and the cloud storage API:

  • ObjectStore is an admin-level setting: the buckets you define are used to store data belonging to all users of that Galaxy instance (although users will not see your configuration or buckets). The API-based approach is user-specific: one user's configuration is not visible to other users, and each user can have their own buckets.
  • You can configure ObjectStore to store all the data added to Galaxy (e.g., uploaded or generated) on S3. However, you cannot configure it to import data already stored in your S3 bucket into one of your histories. In other words, ObjectStore cannot read your bucket containing RNA-seq data and load the files into a history. For this we developed the cloud storage API.

The ObjectStore documentation is available from: Galaxy ObjectStore - Galaxy Community Hub


Thank you Vahid! I am assuming the authnz ID is the same as the API key generated in Preferences/create new key?

No, you can get the authnz_id by sending a GET request to the /authnz/ controller (e.g., http://localhost:8080/authnz/)
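
A minimal sketch of that GET request in Python, under some assumptions: the helper names are hypothetical, the request is authenticated with the browser session cookie from your Google login (the cookie name galaxysession is an assumption about your instance), and the response format is not described here, so the body is returned as raw text.

```python
import urllib.request


def build_authnz_request(galaxy_url, session_cookie):
    """Build (but do not send) a GET request to the /authnz/ controller.

    Hypothetical helper. session_cookie is the full Cookie header value,
    e.g. "galaxysession=..." copied from a logged-in browser session
    (assumption: the controller relies on the logged-in session).
    """
    return urllib.request.Request(
        url=galaxy_url.rstrip("/") + "/authnz/",
        headers={"Cookie": session_cookie},
        method="GET",
    )


def fetch_authnz(galaxy_url, session_cookie):
    """Send the request and return the raw response body as text."""
    request = build_authnz_request(galaxy_url, session_cookie)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Example usage (placeholder cookie value):
# print(fetch_authnz("http://localhost:8080", "galaxysession=YOUR_SESSION_COOKIE"))
```

Opening http://localhost:8080/authnz/ in the browser where you are already logged in should show the same response without any scripting.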
