"Auto-detect" data type of uploaded files is set to ZIP for TXT, CSV, FASTA, TSV, and more

We just finished the upgrade to Galaxy 20.01 (from 19.09) and everything works well, except that all uploaded files default to the data type of ZIP. I’ve tested CSV, TSV, FASTA, TXT, and a couple more, and they all have “format: zip” once uploaded/processed and in my history.
I can, of course, manually change each item’s format using the “edit attributes” option, but this is not desirable.

Any ideas why this may be happening? Thanks in advance.

Can you share an example of such a dataset as you try to upload it? Note that zip archives should never be auto-selected on a normal release of 20.01; we only introduced a zip sniffer in Galaxy release 20.05. Do you have a custom datatypes_conf.xml file or a zip datatype installed from the tool shed?


@mvdbeek - Thanks for the reply. I don’t have any custom datatypes_conf.xml files; however, I did have a package from the tool shed installed called archive_datatypes (author is cmonjeau). It seems to have its own datatypes_conf.xml. As soon as I removed that, all was resolved.

FASTA files now upload and auto-detect as FASTA, same for TXT, TSV, CSV (all I’ve tested so far).

Interesting how in 19.09 this tool from tool shed didn’t cause any issues, but as soon as we upgraded it did. Glad it’s resolved now!
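
For anyone hitting the same thing, here is a rough sketch of how one might scan a datatypes config for archive-related registrations and sniffers before uninstalling anything. It assumes the usual `<registration>`/`<sniffers>` layout of a Galaxy datatypes config. The sample XML, the `find_archive_entries` helper, and the keyword list are made up for illustration and are not taken from the archive_datatypes package.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample config for illustration only; a real check would
# read the XML text from the installed repository's config file instead.
SAMPLE_CONF = """
<datatypes>
  <registration>
    <datatype extension="fasta" type="galaxy.datatypes.sequence:Fasta"/>
    <datatype extension="zip" type="galaxy.datatypes.compressed:Zip"/>
  </registration>
  <sniffers>
    <sniffer type="galaxy.datatypes.compressed:Zip"/>
  </sniffers>
</datatypes>
"""


def find_archive_entries(xml_text, keywords=("zip", "tgz", "tbz2")):
    """Return (tag, extension, type) for entries that mention an archive keyword."""
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        # Only datatype registrations and sniffer declarations are of interest.
        if elem.tag not in ("datatype", "sniffer"):
            continue
        ext = elem.get("extension", "")
        type_ = elem.get("type", "")
        haystack = (ext + " " + type_).lower()
        if any(k in haystack for k in keywords):
            hits.append((elem.tag, ext, type_))
    return hits


if __name__ == "__main__":
    for tag, ext, type_ in find_archive_entries(SAMPLE_CONF):
        print(tag, ext or "-", type_)
```

A `sniffer` entry for a zip type is the one to look at most closely, since sniffers are what drive auto-detection on upload.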


Great, thanks for confirming! We’ve fixed a bug that could result in datatypes installed from the tool shed not loading; the fix is in 20.05 and was backported to 20.01. I guess that this is what caused the problem for you.


@mvdbeek - Not sure if this is related or not, but now when I run some tools, I get a bunch of datatype-related errors in the stderr (see below). The tools still finish successfully and I get all my outputs, none of which are any sort of compressed file (zip/tar/tgz/etc…).

EDIT - looks like restarting Galaxy fixed this. I guess it had to recompile some files.

ERROR:galaxy.datatypes.registry:Error importing datatype module galaxy.datatypes.compressed
Traceback (most recent call last):
  File "galaxy/lib/galaxy/datatypes/registry.py", line 247, in load_datatypes
    module = __import__(datatype_module)
ImportError: No module named compressed
ERROR:galaxy.datatypes.registry:Error importing datatype module galaxy.datatypes.compressed
Traceback (most recent call last):
  File "galaxy/lib/galaxy/datatypes/registry.py", line 247, in load_datatypes
    module = __import__(datatype_module)
ImportError: No module named compressed
ERROR:galaxy.datatypes.registry:Error importing datatype module galaxy.datatypes.compressed
Traceback (most recent call last):
  File "galaxy/lib/galaxy/datatypes/registry.py", line 247, in load_datatypes
    module = __import__(datatype_module)
ImportError: No module named compressed
ERROR:galaxy.datatypes.registry:Error importing datatype module galaxy.datatypes.compressed
Traceback (most recent call last):
  File "galaxy/lib/galaxy/datatypes/registry.py", line 247, in load_datatypes
    module = __import__(datatype_module)
ImportError: No module named compressed
ERROR:galaxy.datatypes.registry:Error importing datatype module galaxy.datatypes.compressed
Traceback (most recent call last):
  File "galaxy/lib/galaxy/datatypes/registry.py", line 247, in load_datatypes
    module = __import__(datatype_module)
ImportError: No module named compressed
ERROR:galaxy.datatypes.registry:Error importing datatype class for 'galaxy.datatypes.compressed:Zip'
Traceback (most recent call last):
  File "galaxy/lib/galaxy/datatypes/registry.py", line 493, in load_datatype_sniffers
    module = __import__(datatype_module)
ImportError: No module named compressed
ERROR:galaxy.datatypes.registry:Error importing datatype class for 'galaxy.datatypes.compressed:Tgz'
Traceback (most recent call last):
  File "galaxy/lib/galaxy/datatypes/registry.py", line 493, in load_datatype_sniffers
    module = __import__(datatype_module)
ImportError: No module named compressed
ERROR:galaxy.datatypes.registry:Error importing datatype class for 'galaxy.datatypes.compressed:Tbz2'
Traceback (most recent call last):
  File "galaxy/lib/galaxy/datatypes/registry.py", line 493, in load_datatype_sniffers
    module = __import__(datatype_module)
ImportError: No module named compressed
ERROR:galaxy.datatypes.registry:Error importing datatype class for 'galaxy.datatypes.compressed:Fastqgz'
Traceback (most recent call last):
  File "galaxy/lib/galaxy/datatypes/registry.py", line 493, in load_datatype_sniffers
    module = __import__(datatype_module)
ImportError: No module named compressed
ERROR:galaxy.datatypes.registry:Error importing datatype class for 'galaxy.datatypes.compressed:Fastqbz2'
Traceback (most recent call last):
  File "galaxy/lib/galaxy/datatypes/registry.py", line 493, in load_datatype_sniffers
    module = __import__(datatype_module)
ImportError: No module named compressed
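
The traceback above shows the registry calling `__import__` on the module part of each `module:Class` type string. As a hedged sketch (not Galaxy's own code), something like the following could be used to check ahead of time whether the modules named in a config are importable in the current environment. The helper names `split_type` and `module_is_importable` are hypothetical.

```python
import importlib.util


def split_type(type_string):
    """Split a 'package.module:ClassName' string into (module, class)."""
    module_name, _, class_name = type_string.partition(":")
    return module_name, class_name


def module_is_importable(module_name):
    """Return True if Python can locate the module without fully importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # find_spec imports parent packages; a missing parent raises
        # ModuleNotFoundError, which is a subclass of ImportError.
        return False


if __name__ == "__main__":
    # Type strings taken from the error messages in the log above.
    for type_string in (
        "galaxy.datatypes.compressed:Zip",
        "galaxy.datatypes.compressed:Tgz",
    ):
        module_name, class_name = split_type(type_string)
        status = "ok" if module_is_importable(module_name) else "MISSING"
        print(f"{module_name} ({class_name}): {status}")
```

This only tells you whether the module can be found; it does not check that the named class exists inside it.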