Timeout (504): unable to display ToolShed categories

Hello,

We are installing a new Galaxy instance (release 23.0) with Ansible. Everything works fine except
for installing tools through the GUI (“Install or uninstall” in the admin tool panel).

We get a 504 error (timeout) before the categories are displayed in the GUI.
We get the same error when we try with Ephemeris, both with SSL and without SSL.

The log shows only this line:

urllib3.connectionpool WARNING 2023-09-21 14:41:58,067 [pN:main.2,p:4120242,tN:WSGI_0] Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fe4d0b3c280>: Failed to establish a new connection: [Errno 110] Connection timed out')': /api/categories

Do you have any idea what could be causing this error or where to look for more information?

Thanks

Hi @ehirchaud

There hasn’t been any downtime recently at the Galaxy Main ToolShed according to our status page here https://status.galaxyproject.org/.

I’ve asked others at the development Matrix chat. They may reply here or there, and you are welcome to join the chat.

If you want to post more details, these would help:

  1. Please confirm that you are installing tools from the Galaxy Main ToolShed.
  2. If the tool source is from somewhere different, where and details about how you configured that would be relevant.
  3. Have you previously installed tools successfully from this same source? Meaning, is this a new failure for something that previously worked, or is this the first time you are doing this on the new server?
  4. Would you please share one example of a tool that timed out? The MTS link would be enough.
  5. How many tools are you attempting to install concurrently? For some people, attempting to install large batches of tools/data all at once overwhelms their own internet connection.

Screenshots of the admin UI, local configuration files (with secrets redacted!!), your command line for the query, or anything else you think might be involved can also be posted here for more context.

Let’s start there, thanks! :slight_smile:

Update:

A couple of our developers have also noticed some timeouts and delays over the last few days. Exactly why is not clear yet, and it could just be that the general load on the service is high.

Trying again is the best advice we have right now.

Thanks for reporting the problem, and feel free to also post updates back here.

Update2:

We were able to find the problem, and the Main ToolShed’s connection load is much improved now. So, if you still had problems later last week, please try again now. The correction was applied yesterday.

Details if interested → We had a bot (bytespider) that started targeting the service more, and it doesn’t respect the usual controls. This is one blog post about it: PSA | Bytedance and Bytespider Bots | Recommend Blocking | WordPress.org. In short, it seems that soon, if not right now, every service everywhere will need to specifically block these two bots (Bytedance and Bytespider).

Thanks for asking about this! The sporadic delays were confusing us too, and community feedback certainly helps with determining scope of impact! :hammer_and_wrench:

Hi @jennaj

Thank you for your answers.

My problem is still occurring.

I answer your questions below:

1: Yes, I install from the Galaxy Main ToolShed.

Here is a screenshot of my admin UI when I try to install a tool:

(screenshot: galaxy_admin_tool)

I have an “old” Galaxy instance (22.01) on another computer, and I have no problem there.

I use this config file for tool_sheds_conf:

<?xml version="1.0"?>
<tool_sheds>
    <tool_shed name="Galaxy main tool shed" url="https://toolshed.g2.bx.psu.edu/"/>
    <tool_shed name="Galaxy test tool shed" url="https://testtoolshed.g2.bx.psu.edu/"/>
</tool_sheds>

3: This 23.0 instance is a new install. I think I have misconfigured gunicorn (it is new to me) or nginx. I’m behind a corporate proxy, and I can’t find anything in the logs (nginx or gunicorn).

galaxy.yml:

gravity:
    celery:
        concurrency: 2
        loglevel: DEBUG
    galaxy_root: /exterieur/galaxy/server
    galaxy_user: galaxy
    gunicorn:
        bind: unix:/exterieur/galaxy/var/config/gunicorn.sock
        extra_args: --forwarded-allow-ips="*"
        preload: true
        workers: 2
    handlers:
        handler:
            pools:
            - job-handlers
            - workflow-schedulers
            processes: 2
    log_dir: /exterieur/galaxy/var/config/galaxy_gravity
    process_manager: systemd
    virtualenv: /exterieur/galaxy/venv

...

galaxy_systemd_env: [DRMAA_LIBRARY_PATH="/usr/lib/slurm-drmaa/lib/libdrmaa.so.1", HTTP_PROXY="{{proxy}}",HTTPS_PROXY="{{proxy}}",http_proxy="{{proxy}}",https_proxy="{{proxy}}"]


nginx conf:

upstream galaxy {
#    server unix:/exterieur/galaxy/var/config/gunicorn.sock;
    server unix:/exterieur/galaxy/var/config/gunicorn.sock;

    # Or if you serve galaxy at a path like http(s)://fqdn/galaxy
    # Remember to set galaxy_url_prefix in the galaxy.yml file.
    # server unix:/exterieur/galaxy/var/config/gunicorn.sock:/galaxy;
}

server {
    # Listen on port443
    listen        *:443 ssl default_server;
   # listen        [::]:80;
    # The virtualhost is our domain name
    server_name   "galaxy-gvb.test.anses.fr";

    # Our log files will go here.
    access_log /exterieur/galaxy/var/log/acces.log;
    error_log  /exterieur/galaxy/var/log/error.log;


    # The most important location block, by default all requests are sent to gunicorn
    # If you serve galaxy at a path like /galaxy, change that below (and all other locations!)
    location / {
        # This is the backend to send the requests to.
       # proxy_pass http://galaxy;
        proxy_pass http://galaxy;

        proxy_set_header Host $http_host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Upgrade $http_upgrade;
    }

    # Static files can be more efficiently served by Nginx. Why send the
    # request to Gunicorn which should be spending its time doing more useful
    # things like serving Galaxy!
    location /static {
        alias /exterieur/galaxy/server/static;
        expires 24h;
    }

    # In Galaxy instances started with run.sh, many config files are
    # automatically copied around. The welcome page is one of them. In
    # production, this step is skipped, so we will manually alias that.
    location /static/welcome.html {
        alias /exterieur/galaxy/server/static/welcome.html.sample;
        expires 24h;
    }
    location /_x_accel_redirect {
        internal;
        alias /;
    }

Hi @ehirchaud

Are you following the tutorials to set up the new server? If not, maybe walk through those steps and compare to what you have done already to find the configuration problem.

It seems that your server is not being found, so I agree that the timeout is probably on that side (instead of the reverse, where the timeout is with the MTS). The empty logs are another clue that things are not connecting up, so thanks for clarifying that part too.
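
To narrow down which side is timing out, one option is to request the same endpoint from the Galaxy host, as the galaxy user, outside of Galaxy. Below is a minimal sketch in Python, not a definitive diagnostic. The helper names (`categories_url`, `proxy_settings`, `can_reach`) are made up for illustration, the URL is the Main ToolShed entry from the tool_sheds_conf posted above, and `/api/categories` is the path from the log line. It uses only the standard library, whose `urlopen` picks up proxy variables from the environment.

```python
import os
import urllib.error
import urllib.request
from urllib.parse import urljoin

# Main ToolShed URL as listed in the tool_sheds_conf posted above.
TOOLSHED_URL = "https://toolshed.g2.bx.psu.edu/"

# Proxy variable names as they appear in galaxy_systemd_env above.
PROXY_VARS = ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY")


def categories_url(toolshed_url):
    """Build the categories endpoint that appears in the log line."""
    return urljoin(toolshed_url, "api/categories")


def proxy_settings(env=None):
    """Return the proxy variables present in an environment mapping.

    Defaults to the environment of the current process.
    """
    env = os.environ if env is None else env
    return {name: env[name] for name in PROXY_VARS if name in env}


def can_reach(url, timeout=10):
    """Try one GET request and return (ok, detail).

    urlopen honors the *_proxy variables of the current environment,
    so the result depends on how this process was started.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except (urllib.error.URLError, OSError) as exc:
        return False, str(exc)
```

Calling `proxy_settings()` and then `can_reach(categories_url(TOOLSHED_URL))` from a shell, once with and once without the proxy variables exported, shows whether the proxy makes the difference. If the request succeeds from the shell while Galaxy still times out, the Galaxy processes are probably not seeing the same environment.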

One thing I noticed: The galaxy_root variable is set to the server directory, not the root where Galaxy itself lives. The server directory is captured in a different variable when following this tutorial. You could try removing the trailing /server … just keep in mind that might not be enough. I’m just showing how the tutorials can help.

Galaxy Admin Tutorials: Galaxy Training!

And, good to know your older server connects after we swatted that bad bot away!

Hi @jennaj

I am following this tutorial.

In the tutorial I see two galaxy_root variables: one in the main scope and another in the gravity scope:

gravity:
  galaxy_root: "{{ galaxy_root }}/server"

This is the one I showed. I think the trailing /server is OK for this part.

I’m going to go through the tutorial again step by step, and I will let you know if I missed something.

I have finally found the solution to my problem. My Galaxy instance is behind a corporate proxy. With the transition to Gravity, the proxy environment variables are no longer taken into account when Galaxy runs. In the file venv/lib/python3.8/site-packages/gravity/state.py I found a variable named DEFAULT_GALAXY_ENVIRONMENT, and I added my proxy parameters to it. Now it works. However, I’m not sure how to override this variable from the Ansible playbook.
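
For anyone hitting the same thing, one way to confirm that the running Galaxy processes do not see the proxy settings is to read their environment from /proc. This is a rough sketch, assuming a Linux host and permission to read the process's environ file. The function names are illustrative and are not part of Galaxy or Gravity.

```python
from pathlib import Path

# Lower-cased names of the variables of interest; matching is case-insensitive.
PROXY_VARS = {"http_proxy", "https_proxy", "no_proxy"}


def parse_environ(raw):
    """Parse the NUL-separated KEY=VALUE entries of /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" not in entry:
            continue  # skip the empty trailing entry or malformed data
        key, _, value = entry.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def proxy_vars(env):
    """Keep only the proxy-related variables, whatever their case."""
    return {k: v for k, v in env.items() if k.lower() in PROXY_VARS}


def process_proxy_vars(pid):
    """Return the proxy variables a process was started with (Linux only)."""
    raw = Path(f"/proc/{pid}/environ").read_bytes()
    return proxy_vars(parse_environ(raw))
```

Calling `process_proxy_vars(pid)` with the PID of a gunicorn worker (the `p:` field in the log line above appears to be the process ID) returns an empty dict when no proxy variables were set at process start. The file reflects the environment the process was launched with, so it needs to be checked again after a restart.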

Hi @ehirchaud

I cross-posted your question over to the Admin chat. They may reply here or there, and feel free to join or use this chat for detailed admin questions that may not be covered in the tutorials.

Have a look at the Gravity configuration options. You can specify an environment dict under the gunicorn section like:

gravity:
    gunicorn:
        bind: ...
        environment:
            FOO: foo
            BAR: bar