"An error occurred setting metadata for this data set"-STR DETECTION

I am getting error with the STR DETECTION. This file cannot be used for further analysis.
error-1

/corral4/main/jobs/040/701/40701606/galaxy_40701606.sh: line 123: galaxy-set-metadata: command not found

1 Like

Hi @libinglin

Did you already click on the link to retry “auto-detection”? If not, do that first.

This looks like a job that failed for a technical reason. It was probably a cluster hiccup or there might be some problem with the format of the dataset. Common reasons include: partially uploaded data or possibly an uneven number of columns per row or the upstream tool failing to fully output the result. If the latter, you should try a rerun of that upstream tool.

Hello
yeah, I click on the link to retry “auto-detection”. I Add an annotation to this dataset,but the next “count” missing columns of this database.


In addition, I found that all the data in this database are right aligned. Is there a problem with the format?


Thanks

The figure below
No erroneous data was previously analyzed

Dear Jennaj
yeah, I click on the link to retry “auto-detection”. I Add an annotation to this dataset,but the next “count” missing columns of this database.
In addition, I found that all the data in this database are right aligned. Is there a problem with the format?

Hi @libinglin

It looks like you are using a custom reference fasta for hg38 in some steps, and in other steps, you are assigning the built-in index for hg38.

Mixing up which reference genome is used in the same analysis almost never works out.

Try using the same exact reference genome for all steps. If you really need to use a custom genome, promote it to a custom build (with a custom “database aka dbkey” that isn’t already in use), then assign that custom database to your data.

Please note that using a very large custom genome fasta will cause some tools to run out of memory, including the “custom build” functionality. “Very large” includes the human genome, most other mammalian genomes, some plant genomes. For these cases, using the built-in index is a better choice.

I’ve added a tag to your post that links to other Q&A that describes how to format a custom genome plus how to create a custom build from one. You should start with the fasta formatting first, then try to create a build from it (will likely fail but you can try). Or, you can go directly to using the built-in indexed version of that same genome instead.

thanks, I rerun several parameter options of “auto-detection” and the database format is ok. Thank you very much!

Hi, The format may be technically correct, but it is not matching the built-in index exactly. This doesn’t mean your fasta data in the history is incorrect – just different in small technical ways from the indexed version and the tools cannot interpret the difference (most likely “>” title line content is different).

Please try again using the built-in index for all steps instead of using a custom reference fasta. That will avoid technical problems, and is known to work best when working with such a large genome.