And if it isn’t exactly like UCSC’s hg38, then you definitely should not assign that dbkey to the data; otherwise, expect problems.
I think my hg38.analysisSet.fa is from /goldenPath/hg38/bigZips/analysisSet and is slightly different from hg38.fa. What is a dbkey, and what should I type in this field? And how can I use files from my history as inputs?
I commented out my changes to all_fasta.loc and restarted Galaxy. However, my hg38.analysisSet.fa is still present in the genome selection list (and the hg38.analysisSet.fa from my history is still absent). When I choose this hg38.analysisSet.fa in “Create DBKey and Reference Genome”, it also gives an error:
python '/home/transgen/galaxy/database/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/data_manager_fetch_genome_dbkeys_all_fasta/4d3eff1bc421/data_manager_fetch_genome_dbkeys_all_fasta/data_manager/data_manager_fetch_genome_all_fasta_dbkeys.py' '/home/transgen/galaxy/database/objects/c/c/5/dataset_cc5ffff9-35ab-4878-9080-05e9d2b53732.dat' --dbkey_description 'hg38.analysisSet.fa'
Traceback (most recent call last):
File "/home/transgen/galaxy/database/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/data_manager_fetch_genome_dbkeys_all_fasta/4d3eff1bc421/data_manager_fetch_genome_dbkeys_all_fasta/data_manager/data_manager_fetch_genome_all_fasta_dbkeys.py", line 497, in <module>
main()
File "/home/transgen/galaxy/database/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/data_manager_fetch_genome_dbkeys_all_fasta/4d3eff1bc421/data_manager_fetch_genome_dbkeys_all_fasta/data_manager/data_manager_fetch_genome_all_fasta_dbkeys.py", line 478, in main
tmp_dir=tmp_dir)
File "/home/transgen/galaxy/database/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/data_manager_fetch_genome_dbkeys_all_fasta/4d3eff1bc421/data_manager_fetch_genome_dbkeys_all_fasta/data_manager/data_manager_fetch_genome_all_fasta_dbkeys.py", line 300, in download_from_ucsc
url = _get_ucsc_download_address(params, dbkey)
File "/home/transgen/galaxy/database/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/data_manager_fetch_genome_dbkeys_all_fasta/4d3eff1bc421/data_manager_fetch_genome_dbkeys_all_fasta/data_manager/data_manager_fetch_genome_all_fasta_dbkeys.py", line 260, in _get_ucsc_download_address
path_contents = _get_files_in_ftp_path(ftp, ucsc_path)
File "/home/transgen/galaxy/database/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/data_manager_fetch_genome_dbkeys_all_fasta/4d3eff1bc421/data_manager_fetch_genome_dbkeys_all_fasta/data_manager/data_manager_fetch_genome_all_fasta_dbkeys.py", line 65, in _get_files_in_ftp_path
ftp.retrlines('MLSD %s' % (path), path_contents.append)
File "/home/transgen/galaxy/database/dependencies/_conda/envs/__python@3.7/lib/python3.7/ftplib.py", line 468, in retrlines
with self.transfercmd(cmd) as conn, \
File "/home/transgen/galaxy/database/dependencies/_conda/envs/__python@3.7/lib/python3.7/ftplib.py", line 399, in transfercmd
return self.ntransfercmd(cmd, rest)[0]
File "/home/transgen/galaxy/database/dependencies/_conda/envs/__python@3.7/lib/python3.7/ftplib.py", line 365, in ntransfercmd
resp = self.sendcmd(cmd)
File "/home/transgen/galaxy/database/dependencies/_conda/envs/__python@3.7/lib/python3.7/ftplib.py", line 273, in sendcmd
return self.getresp()
File "/home/transgen/galaxy/database/dependencies/_conda/envs/__python@3.7/lib/python3.7/ftplib.py", line 246, in getresp
raise error_perm(resp)
ftplib.error_perm: 550 /goldenPath/hg38.analysisSet.fa/bigZips/: No such file or directory
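Reading the last line of the traceback, the directory requested from the FTP server seems to be built from whatever was selected as the dbkey. Below is a minimal sketch of that reading, assuming the template is /goldenPath/<dbkey>/bigZips/. The template is inferred only from the error message, not from the data manager's source, and `ucsc_bigzips_path` is a hypothetical helper, not a function of the tool.

```python
def ucsc_bigzips_path(dbkey: str) -> str:
    """Build the FTP directory for a dbkey.

    Assumption: the '/goldenPath/<dbkey>/bigZips/' template is inferred
    from the 550 error message above, not taken from the tool's code.
    """
    return "/goldenPath/%s/bigZips/" % dbkey


# A file name used as the dbkey reproduces the path from the 550 error.
bad = ucsc_bigzips_path("hg38.analysisSet.fa")
# A plain database name gives the usual bigZips directory.
good = ucsc_bigzips_path("hg38")

print(bad)
print(good)
```

If that reading is right, the 550 reply only means that no directory named after the file exists on the server; it says nothing about the FASTA file itself.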
Run the first 4 core DMs (used by many tools in non-obvious ways), in this exact order:
Fasta fetcher – has an option to pick UCSC as the data source.
SAM indexer
Picard indexer
2bit (twoBit) indexer
I could not find “SAM indexer” or “2bit (twoBit) indexer”, only “data_manager_sam_fasta_index_builder” and “data_manager_twobit_builder” (or “sam_fasta_index_builder” and “twobit_builder_data_manager”, as they are called in the version list). I ran “Create DBKey and Reference Genome” and then “Picard index” (I had not yet discovered the other data managers at that point), and both worked well on sacCer3 and hg38. Then I tried MergeBamAlignment with the newly acquired hg38, and it gave me the error “Do not use this function to merge dictionaries with different sequences in them. Sequences must be in the same order as well”, but I think that is because the input files were generated using hg38.analysisSet.fa instead of hg38.fa.
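One way to check that suspicion would be to compare the sequence names, lengths and order of the two FASTA files. The sketch below is only an illustration: `fasta_contigs` and `first_difference` are hypothetical helpers, the toy records are made up, and it compares names, lengths and order only, not the bases themselves.

```python
from itertools import zip_longest


def fasta_contigs(lines):
    """Return [(name, length), ...] in file order from FASTA lines.

    `lines` can be an open text file or any iterable of strings.
    """
    contigs = []
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                contigs.append((name, length))
            fields = line[1:].split()
            # Keep only the first word of the header as the sequence name.
            name, length = (fields[0] if fields else ""), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        contigs.append((name, length))
    return contigs


def first_difference(a, b):
    """Return (index, entry_a, entry_b) for the first mismatch, else None."""
    for i, (x, y) in enumerate(zip_longest(a, b)):
        if x != y:
            return i, x, y
    return None


# Toy data standing in for the two references (made-up names and bases).
ref_a = [">chr1", "ACGT", "AC", ">chr2", "GG"]
ref_b = [">chr1", "ACGT", "AC", ">chr2", "GG", ">chrExtra", "TTT"]

diff = first_difference(fasta_contigs(ref_a), fasta_contigs(ref_b))
print(diff)
```

For real files, the same functions can be called as `fasta_contigs(open("hg38.fa"))` and `fasta_contigs(open("hg38.analysisSet.fa"))`, with the paths adjusted to wherever the files live. A result other than `None` would be consistent with the dictionary error above.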
I also found “Generate GATK-sorted Picard indexes”, but I am not sure whether I need it (our workflow uses GATK extensively, but it is GATK4, which has no native Galaxy support as far as I know).