How to install kraken2 database

Hello,
I have installed Galaxy locally using the Galaxy Helm chart from here: [GitHub - galaxyproject/galaxy-helm: Minimal setup required to run Galaxy under Kubernetes]

I have installed Kraken, Kraken2, and other necessary tools successfully. Then I went on to install the kraken2 database using the data_manager_build_kraken_database from the Install Tool section. Then, after uploading a fastq file for classification, I could not find any database to classify the reads.
Please see the attached photo below:


Can anyone please tell me how to install a Kraken database on a local Galaxy? I am using the 23.01 version.
Thank you very much.

Welcome, @Anik_Du

Does the database show up on the Kraken tool form? I’m asking since the indexes are specific to the tool release and your screenshot looks like Kraken2.

Later on, be sure to not mix up data outputs generated between the two – they have been incompatible in the past.

The ToolShed is undergoing maintenance today, but once it is back up you could also get the Kraken2 data manager and use it to install indexes for the matching tool.

Hope this helps! :slight_smile:

Hi @jennaj Thank you very much for the reply. Unfortunately, it did not work.
I have now installed kraken2 and also the data_manager for building the kraken2 database. However, after installing it, I do not know how to run the data manager, because it does not show up anywhere in my local Galaxy interface.
So I still do not see any database for running kraken2.

This is how I installed Kraken2 and the following screenshot shows my data_manager installation.

I do not understand how to run the data manager when it does not show up.

This is what my values.yml file looks like for launching Galaxy locally.


galaxy:
  fullnameOverride: galaxy
  nameOverride: galaxy
  revisionHistoryLimit: 3
  images:
    galaxy:
      repository: quay.io/galaxyproject/galaxy-min
      tag: "23.1" # Value must be quoted
      pullPolicy: IfNotPresent

  refdata:
    enabled: false
    type: cvmfs
    pvc:
      size: 10Gi
  cvmfs:
    deploy: false
    storageClassName: "{{ $.Release.Name }}-cvmfs"

  persistence:
    enabled: true
    existingClaim: "galaxy-k3s-rdloc-galaxy-pvc"
    size: 200Gi

  rabbitmq:
    enabled: true
    deploy: true
    persistence:
      storageClassName: freenas-nfs-csi

  celery:
    concurrency: 1

  postgresql:
    enabled: true
    deploy: true
    galaxyDatabaseUser: postgres
    galaxyDatabasePassword: password
    galaxyConnectionParams: ""
    persistence:
      enabled: true
      storageClass: freenas-iscsi-csi

  configs:
    galaxy.yml:
      galaxy:
        tool_config_file: "/galaxy/server/config/tool_conf.xml{{if .Values.setupJob.downloadToolConfs.enabled}},{{ .Values.setupJob.downloadToolConfs.volume.mountPath }}/config/shed_tool_conf.xml{{end}}"
        shed_tool_config_file: "/galaxy/server/config/mutable/editable_shed_tool_conf.xml"
        admin_users: xxxx@xxx.com

  ingress:
    #- Should ingress be enabled. Defaults to `true`
    enabled: true
    #-
    ingressClassName: nginx
    canary:
      enabled: true
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-production
      kubernetes.io/tls-acme: "true"
      nginx.ingress.kubernetes.io/ssl-passthrough: "false"
      nginx.ingress.kubernetes.io/backend-protocol: HTTP
      nginx.ingress.kubernetes.io/proxy-body-size: "0"
    path: /galaxy
    hosts:
      - host: galaxy.rdloc.xxxx.cloud
        paths:
          - path: "/galaxy"
          - path: "/training-material"
    tls:
     - secretName: galaxy.rdloc.xxxx.cloud
       hosts:
         - galaxy.rdloc.xxxx.cloud

Can you please give any suggestions on how I can make the database work?

Hi @Anik_Du

These data administration tools will not show up in the “regular” tool panel.

Instead, find this under the Admin masthead menu while logged into your administrative account. This is where the Kraken (1) data manager would have been, too. The section is Admin → Server → Local Data.
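If the data manager job finishes but the tool form still shows no database, one thing worth checking is whether a row was actually written to the tool data table. Galaxy data tables are backed by tab-separated `.loc` files in which lines starting with `#` are comments. Below is a minimal sketch for reading one. The three-column layout (`value`, `name`, `path`) is an assumption about the Kraken2 table, so check it against the `tool_data_table_conf.xml` of your install. The file content and the `parse_loc` helper are made up for illustration.

```python
from typing import Dict, List, Sequence

# Assumed column order for the Kraken2 data table; verify it against
# the tool_data_table_conf.xml that came with your Kraken2 install.
COLUMNS = ("value", "name", "path")


def parse_loc(text: str, columns: Sequence[str] = COLUMNS) -> List[Dict[str, str]]:
    """Parse the text of a Galaxy .loc file into a list of row dicts."""
    rows = []
    for line in text.splitlines():
        # Skip blank lines and comment lines, which start with '#'.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < len(columns):
            continue  # malformed row; ignored in this sketch
        rows.append(dict(zip(columns, fields)))
    return rows


# Hypothetical file content, for illustration only.
example = (
    "#value\tname\tpath\n"
    "k2_demo\tDemo Kraken2 DB\t/data/kraken2/k2_demo\n"
)
databases = parse_loc(example)
for db in databases:
    print(db["value"], db["name"], db["path"])
```

An empty result for the real file would suggest the data manager never registered an index, which is a different problem from the tool not picking the table up.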

Let us know if you find it :slight_smile:

Hi @jennaj thanks a lot. Yes, I have found it now and it is working.

But I have another question:
I was trying to build the database from Pre-built RefSeq Indexes, and the most recent index files date back to June 2022. Is there a way to get updated indices? I can see here https://benlangmead.github.io/aws-indexes/k2 that they continuously update the database indices.
Please let me know.
Thank you very much for the cooperation.

Hi @Anik_Du

There isn’t a current way to capture the daily/weekly updates from RefSeq (that I know of!) but if this is something that you want to help with – meaning, improve the data manager to handle that – I’m sure others would be interested.

Part of the reason the public servers don’t do this is practical. Any data that is ever hosted is then retained forever and kept available for reproducibility reasons. Obviously you can set your own data retention policies on a private server to do this differently! :slight_smile:
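For anyone who wants to experiment with this on a private server, here is a rough sketch of one small piece of the problem: deciding whether a listed index is newer than the one already installed. It assumes the archives are named `<collection>_<YYYYMMDD>.tar.gz`; that naming scheme, the filenames, and the helper functions are all assumptions for illustration. This is not how the data manager currently works.

```python
import re
from datetime import date, datetime
from typing import Iterable, List, Optional, Tuple

# Assumed naming scheme: <collection>_<YYYYMMDD>.tar.gz
PATTERN = re.compile(r"^(?P<collection>.+)_(?P<stamp>\d{8})\.tar\.gz$")


def parse_index_name(filename: str) -> Optional[Tuple[str, date]]:
    """Return (collection, build date), or None if the name does not match."""
    match = PATTERN.match(filename)
    if match is None:
        return None
    try:
        built = datetime.strptime(match.group("stamp"), "%Y%m%d").date()
    except ValueError:
        return None  # eight digits that are not a valid calendar date
    return match.group("collection"), built


def newer_than(filenames: Iterable[str], collection: str, installed: date) -> List[str]:
    """List archives of one collection built after the installed date, oldest first."""
    candidates = []
    for name in filenames:
        parsed = parse_index_name(name)
        if parsed and parsed[0] == collection and parsed[1] > installed:
            candidates.append((parsed[1], name))
    return [name for _, name in sorted(candidates)]


# Hypothetical listing and install date, for illustration only.
listing = [
    "k2_standard_20220607.tar.gz",
    "k2_standard_20230314.tar.gz",
    "k2_viral_20230314.tar.gz",
]
updates = newer_than(listing, "k2_standard", date(2022, 6, 30))
print(updates)
```

Getting the real listing, downloading an archive, and registering it in the data table are the harder parts and are left out here.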