BWA-MEM Index Error - fail to open file

Hello,

I am pretty new here, and I sincerely apologize if this is a duplicate question. I tried to search for this for two days but did not find a similar topic here.

I recently set up a local Galaxy server on my office Mac, installed on an external hard drive. I am trying to run a metadata analysis, so I need to process a large dataset.

Everything worked fine until I tried to build a BWA-MEM index. I first got data_manager_fetch_genome_dbkeys_all_fasta and used it to fetch hg38. After the download finished and I had checked that hg38 was on my drive, I got data_manager_bwa_mem_index_builder.
When I run the BWA-MEM index builder, it immediately fails with this error message:

[bwa_idx_build] fail to open file '/Volumes/Backup/galaxy/database/jobs_directory/000/32/dataset_116_files/hg38.fa' : No such file or directory
Error building index.

{"param_dict": {"index_algorithm": "bwtsw", "datatypes_config": "/Volumes/Backup/galaxy/database/jobs_directory/000/32/registry.xml", "GALAXY_DATA_INDEX_DIR": "/Volumes/Backup/galaxy/tool-data", "userId": "1", "userEmail": "", "dbkey": "?", "get_data_table_entry": "<function get_data_table_entry at 0x1291cfe60>", "admin_users": "", "all_fasta_source": "hg38", "user": "galaxy.model:SafeStringWrapper(galaxy.model.User:<class 'galaxy.tools.wrappers.ToolParameterValueWrapper'>,<class 'galaxy.util.object_wrapper.SafeStringWrapper'>,<class 'numbers.Number'>,<type 'NoneType'>,<type 'NotImplementedType'>,<type 'bool'>,<type 'bytearray'>,<type 'ellipsis'>)", "input": "lt__function input at 0x12a86e758__gt", "app": "galaxy.app:UniverseApplication", "user_email": "", "sequence_name": "", "local_working_directory": "/Volumes/Backup/galaxy/database/jobs_directory/000/32", "GALAXY_DATATYPES_CONF_FILE": "/Volumes/Backup/galaxy/database/jobs_directory/000/32/registry.xml", "user_name": "", "sequence_id": "", "tool_directory": "/Volumes/Backup/galaxy/database/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/data_manager_bwa_mem_index_builder/46066df8813d/data_manager_bwa_mem_index_builder/data_manager", "new_file_path": "/Volumes/Backup/galaxy/database/tmp", "user_id": "1", "out_file": "/Volumes/Backup/galaxy/database/files/000/dataset_116.dat", "GALAXY_ROOT_DIR": "/Volumes/Backup/galaxy", "tool_data_path": "/Volumes/Backup/galaxy/tool-data", "root_dir": "/Volumes/Backup/galaxy", "chromInfo": "/Volumes/Backup/galaxy/tool-data/shared/ucsc/chrom/?.len"}, "output_data": [{"extra_files_path": "/Volumes/Backup/galaxy/database/jobs_directory/000/32/dataset_116_files", "ext": "data_manager_json", "out_data_name": "out_file", "hda_id": 116, "file_name": "/Volumes/Backup/galaxy/database/files/000/dataset_116.dat", "dataset_id": 116}], "job_config": {"GALAXY_ROOT_DIR": "/Volumes/Backup/galaxy", "TOOL_PROVIDED_JOB_METADATA_FILE": "galaxy.json", "GALAXY_DATATYPES_CONF_FILE": "/Volumes/Backup/galaxy/database/jobs_directory/000/32/registry.xml"}}

I also tried the older genome fetcher and the older BWA-MEM index builder, but had no luck. I even tried hg19, but that did not work either.

What am I doing wrong? Is there a way to fix it?

Thanks,
In

Hi @incho

In the application, log in to your admin account, then click on the “Admin” tab.

Under the “Server” section are links to the installed data managers and index tables. Start by examining “Local Data”. The data managers will be listed with their associated tables.

Examine those tables. Problems you may find are:

  1. Missing content in tables. Given your error, that can indicate a path problem in your config. Is the data actually on disk, and does the “galaxy user” have read/write permissions for the exact path reported? /Volumes/Backup/galaxy/database/jobs_directory/000/32/dataset_116_files/hg38.fa
  2. A “dbkey” assigned to more than one row per tool. If you ran a DM (data manager) twice on the exact same genome (dbkey/database), the duplicate entries can cause problems. Removing duplicates is non-trivial but certainly possible. There is no automatic “undo” for data managers.
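
If you prefer to run the two checks above from a terminal, here is a minimal Python sketch. It is not part of Galaxy: the helper names `describe_path` and `duplicate_dbkeys` are made up for illustration, the `.loc` file location is a guess based on your `tool_data_path`, and the assumption that the dbkey sits in the second tab-separated column should be verified against the table shown under “Local Data”. Run it as the same OS account that runs Galaxy so the permission results are meaningful.

```python
import os
from collections import Counter


def describe_path(path):
    """Report whether a path exists and whether the current user can read/write it."""
    return {
        "exists": os.path.exists(path),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }


def duplicate_dbkeys(loc_lines, dbkey_column=1):
    """Return dbkeys that appear on more than one row of a tab-separated table.

    Assumes '#' starts a comment line and that the dbkey is in column
    `dbkey_column` (0-based). Adjust this for the table you are checking.
    """
    counts = Counter()
    for line in loc_lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) > dbkey_column:
            counts[fields[dbkey_column]] += 1
    return sorted(key for key, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Exact path taken from the error message.
    fasta = "/Volumes/Backup/galaxy/database/jobs_directory/000/32/dataset_116_files/hg38.fa"
    print(fasta, describe_path(fasta))

    # Hypothetical table location; use the path listed under "Local Data".
    loc_file = "/Volumes/Backup/galaxy/tool-data/bwa_mem_index.loc"
    if os.path.exists(loc_file):
        with open(loc_file) as handle:
            print("duplicate dbkeys:", duplicate_dbkeys(handle))
```

If `exists` comes back `False` for the FASTA path, check the parent directories the same way to see where the path stops being valid.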

More help, example usage, and tutorials are in this prior post, which may also help to resolve the problem: Indexing reference genomes with Data Managers: Resources, tutorials, troubleshooting

Let’s start there :slight_smile: