"An error occurred setting metadata for this data set"-STR DETECTION

libinglin · February 14, 2022, 1:47am

I am getting error with the STR DETECTION. This file cannot be used for further analysis.
error-1

/corral4/main/jobs/040/701/40701606/galaxy_40701606.sh: line 123: galaxy-set-metadata: command not found

jennaj · February 14, 2022, 11:21pm

Did you already click on the link to retry “auto-detection”? If not, do that first.

This looks like a job that failed for a technical reason. It was probably a cluster hiccup or there might be some problem with the format of the dataset. Common reasons include: partially uploaded data or possibly an uneven number of columns per row or the upstream tool failing to fully output the result. If the latter, you should try a rerun of that upstream tool.

libinglin · February 15, 2022, 3:07am

Hello
yeah, I click on the link to retry “auto-detection”. I Add an annotation to this dataset,but the next “count” missing columns of this database.

In addition, I found that all the data in this database are right aligned. Is there a problem with the format?

Thanks

libinglin · February 15, 2022, 3:13am

The figure below
No erroneous data was previously analyzed

libinglin · February 16, 2022, 3:09am

Dear Jennaj
yeah, I click on the link to retry “auto-detection”. I Add an annotation to this dataset,but the next “count” missing columns of this database.
In addition, I found that all the data in this database are right aligned. Is there a problem with the format?

jennaj · February 16, 2022, 7:46pm

Hi @libinglin

It looks like you are using a custom reference fasta for hg38 in some steps, and in other steps, you are assigning the built-in index for hg38.

Mixing up which reference genome is used in the same analysis almost never works out.

Try using the same exact reference genome for all steps. If you really need to use a custom genome, promote it to a custom build (with a custom “database aka dbkey” that isn’t already in use), then assign that custom database to your data.

Please note that using a very large custom genome fasta will cause some tools to run out of memory, including the “custom build” functionality. “Very large” includes the human genome, most other mammalian genomes, some plant genomes. For these cases, using the built-in index is a better choice.

I’ve added a tag to your post that links to other Q&A that describes how to format a custom genome plus how to create a custom build from one. You should start with the fasta formatting first, then try to create a build from it (will likely fail but you can try). Or, you can go directly to using the built-in indexed version of that same genome instead.

libinglin · February 18, 2022, 2:33am

thanks, I rerun several parameter options of “auto-detection” and the database format is ok. Thank you very much!

jennaj · February 18, 2022, 4:45am

Hi, The format may be technically correct, but it is not matching the built-in index exactly. This doesn’t mean your fasta data in the history is incorrect – just different in small technical ways from the indexed version and the tools cannot interpret the difference (most likely “>” title line content is different).

Please try again using the built-in index for all steps instead of using a custom reference fasta. That will avoid technical problems, and is known to work best when working with such a large genome.

Topic		Replies	Views
Metadata generation failed usegalaxy.org support troubleshooting	1	574	April 9, 2024
chimera.vsearch usegalaxy.org support	0	497	December 9, 2018
Please add the STR detection tool usegalaxy.eu support workflow , microsatellite	19	280	May 16, 2024
BWA-MEM results in "An error occurred setting metadata for this dataset" usegalaxy.org support metadata , mapping	10	2794	January 21, 2019
issues with the pipeliine DNA Methylation data analysis usegalaxy.org support reference-index	1	217	December 13, 2023

"An error occurred setting metadata for this data set"-STR DETECTION

Related topics