Setup for Annovar

Hi,

I am trying to set up the tool "Annotate VCF with functional information using ANNOVAR" (Galaxy Version 0.2) on a local Galaxy server running release 24.2.3.

As the tool README suggested, I downloaded the software and the databases, and listed the databases in tool-data/annovar_index.loc, for example:

clinvar_20240917        hg19    filter  /home/galaxy/galaxy/database/dependencies/annovar/humandb/

and set up tool_data_table_conf.xml from tool_data_table_conf.xml.sample. After I restarted the server, none of the databases showed up in the wrapper. I checked galaxy.log, but no failure or error is recorded.

I can run ANNOVAR, and the job fails with an error indicating that the correct ANNOVAR script was called but that its input was wrong. That is no surprise when no annotation database is given, but it does show that Galaxy can find the ANNOVAR scripts.

It is probably a configuration error, but the available information seems limited.

Thanks!

Hi @casio

Yes, this tool is a bit special. The data is protected now, and we don’t even host it on the UseGalaxy server anymore. We only had one version of the index data, and it was only deployed to the ORG server as a sort of prototype that never progressed further. The tool wrapper itself might have problems with the current Galaxy releases, too.

From there, I’m wondering if the 2024 indexes are incompatible with the older tool wrapper. You might be able to find release notes about table structure/content changes, and they might still have older indexes available? You could try with data circa 2016 since that’s when the wrapper was last updated.

More details about creating reference data “by hand” are in these older docs.

Other than these items, one common issue with custom data tables is minor formatting problems in the loc file itself. So, double check for extra spaces, that your tabs are actually tabs, that there are no extra newlines, that kind of thing. I’m sure you checked; I’m mostly listing this for anyone else reading about custom index creation later on. These tables are read into the database, so format really matters.
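
For anyone who wants to automate that check, here is a minimal Python sketch. It assumes each non-comment line should carry exactly four tab-separated fields, as in the annovar_index.loc example earlier in this thread, and that lines starting with # are comments. The function name check_loc_lines is made up for illustration and is not part of Galaxy.

```python
def check_loc_lines(lines, expected_fields=4):
    """Return a list of (line_number, message) for lines that look malformed.

    The default of four fields is an assumption taken from the
    annovar_index.loc example in this thread.
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        if line.startswith("#"):
            continue  # comment lines are not checked
        fields = line.split("\t")
        if len(fields) != expected_fields:
            problems.append(
                (number, f"expected {expected_fields} tab-separated fields, found {len(fields)}")
            )
            continue
        for field in fields:
            if field != field.strip():
                problems.append((number, f"leading or trailing whitespace in {field!r}"))
    return problems


# Two sample lines: the first uses tabs, the second uses spaces only.
sample = [
    "clinvar_20240917\thg19\tfilter\t/home/galaxy/galaxy/database/dependencies/annovar/humandb/\n",
    "clinvar_20240917    hg19    filter  /home/galaxy/galaxy/database/dependencies/annovar/humandb/\n",
]
for number, message in check_loc_lines(sample):
    print(f"line {number}: {message}")
```

Only the second sample line is reported, because without tabs the whole line counts as a single field. To check a real file, pass an open file handle instead of the sample list.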

Hope this helps! :slight_smile:

Did I understand the docs correctly that I can download the latest index data from their website and just tell the tool where it is stored via the tool-data/annovar.loc file? The software itself is the latest version, so if anything were to fail, it would be the call to the software itself, and that seems to work.

Hi @casio

Ok, glad you were able to get that far! The tool wrapper is deprecated by our team, but I’m glad you were able to get the wrapper updated. Maybe some connection with the data table isn’t quite updated enough.

For some reason I thought this tool and its data were now restricted, but perhaps your server is for private use. I would suggest reviewing the license agreement to be sure, since they are the definitive owners. From what I remember, hosting this in a Galaxy server for use by multiple people was a problem at some point.

Thanks! :slight_smile:

It took me a while to figure out how to get Annovar running. I’m documenting the process here in case anyone else needs this tool:

  • Install the wrapper from the devteam:
    Galaxy | Tool Shed

  • Download Annovar from their website. According to my understanding of the license, it’s free for personal and research use. I interpret this to mean I can install it on our group’s non-public Galaxy server. Commercial use requires purchasing a license.
    Download link: ANNOVAR website

  • Download the indices provided by the Annovar website:
    Download ANNOVAR - ANNOVAR Documentation
    Select indices according to your requirements and store them, preferably in the /humandb folder within Annovar.

  • Modify the wrapper by removing the dependency on the non-existent Annovar package from Conda. Instead, manually add dependencies for Perl and Python 2.7.

  • Locate the call for python replace_NA.py in the wrapper and replace it with the absolute path to replace_NA.py. For example:
    /home/galaxy/galaxy/database/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/table_annovar/d4e292ddda05/table_annovar/replace_NA.py

  • Go to your Galaxy root directory, navigate to the config folder, and create the file tool_data_table_conf.xml, or copy it from the wrapper location if all your tools are managed via the Tool Shed. This file, together with the .loc file in the next step, tells Galaxy where the downloaded Annovar indices are located.

  • Navigate to the Galaxy root directory and change into the tool-data folder. Edit annovar_index.loc to match your needs. Ensure you use only tabs to separate the four entries per line. Check formatting with:
    cat -T annovar_index.loc

  • From the Galaxy root directory, edit the run.sh file and add the absolute path to the Annovar folder at the top:
    export SCRIPT_PATH=$SCRIPT_PATH:/home/galaxy/galaxy/database/dependencies/annovar

  • Restart Galaxy. Verify entries under the admin area:
    Local Data → Tool Data Tables → annovar_indexes

  • Run Annovar and verify its operation. Even if the task turns green, it may have failed silently. A clear sign of failure is if the result has 0 lines. Inspect Command Line and Standard Error outputs for troubleshooting.
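
The "0 lines" check from the last step can be scripted. This is only a rough Python sketch: count_lines and looks_failed are hypothetical helper names, and it assumes you have the path to the downloaded or on-disk result file. An empty or missing file is treated as a failed run.

```python
from pathlib import Path


def count_lines(path):
    """Return the number of non-blank lines in a file, or 0 if it is missing."""
    file_path = Path(path)
    if not file_path.is_file():
        return 0
    with file_path.open(encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.strip())


def looks_failed(path):
    """Treat a result with no non-blank lines as a silently failed run."""
    return count_lines(path) == 0
```

A green job whose result makes looks_failed return True is worth a closer look at the Command Line and Standard Error outputs.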

This allowed me to successfully set up Annovar on my local lab Galaxy server. There might be simpler ways to handle absolute paths or generalize the setup.
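
If the entries do not appear under Tool Data Tables, one thing to rule out is that tool_data_table_conf.xml does not declare the annovar_indexes table at all. Below is a small Python sketch for that check. The XML in SAMPLE_CONF follows the usual Galaxy data table layout, but its column names and loc path are my assumptions, so compare them with the .sample file shipped with the wrapper. find_table is a hypothetical helper, not a Galaxy function.

```python
import xml.etree.ElementTree as ET

# Assumed example content; the real file comes from the wrapper's .sample file.
SAMPLE_CONF = """
<tables>
    <table name="annovar_indexes" comment_char="#">
        <columns>value, dbkey, type, path</columns>
        <file path="tool-data/annovar_index.loc" />
    </table>
</tables>
""".strip()


def find_table(conf_xml_text, table_name="annovar_indexes"):
    """Return (columns, loc_path) for the named table, or None if it is absent."""
    root = ET.fromstring(conf_xml_text)
    for table in root.iter("table"):
        if table.get("name") != table_name:
            continue
        columns_el = table.find("columns")
        file_el = table.find("file")
        columns = []
        if columns_el is not None and columns_el.text:
            columns = [name.strip() for name in columns_el.text.split(",")]
        loc_path = file_el.get("path") if file_el is not None else None
        return columns, loc_path
    return None


print(find_table(SAMPLE_CONF))
```

The number of columns reported here should match the number of tab-separated entries per line in annovar_index.loc, and the loc path should point at the file you edited.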
