Regex Problem using Import tool on QIIME2 for Galaxy

I’m trying to use the ‘qiime2 tools import’ for SampleData[SequencesWithQuality] with ‘Single Lane Per Sample Single End Fastq Directory Format’.
I think I’ve correctly built the manifest and metadata files, but I keep getting the same persistent error regarding the element ‘name’.
This parameter asks me for “Filename to import the data as. Must match regex: .+_.+_L[0-9][0-9][0-9]_R[12]_001.fastq.gz”, but no matter which name I try, it always seems to fail, so Galaxy doesn’t even execute my job.
I’ve also tried RegEx matchers to check whether the name I created would be suitable, but they didn’t help.
Can you help me in this matter? I’m not really a coder hahaha but I really want to understand this.

p.s. everything runs perfectly when I’m using QIIME2 on the command line, so the problem seems to be specific to Galaxy… (which I prefer because buttons)

Update:

There is more discussion of the import function, cross-posted at the EU Matrix chat → You're invited to talk on Matrix


Welcome, @sashaphex

The “element identifiers” are the names of the files inside the collection. When creating a collection, the file extensions are usually removed – that is why the extra option to add them back in is also listed.

Tools in the Collection Operations tool group can be used for renaming element identifiers. Is that what you are doing?

Example for using the tool when run in Galaxy. Note that the naming is more specific than command line (technical reasons…). If sample ids are included in any other inputs, the naming should be consistent.

SampleIdIndexedSingleEndPerSampleDirFmt

Single-end reads in fastq.gz files where base filename is the sample id

The full file name, minus the extension (.fastq.gz) is the sample id.

  • sample-42_S1_L001_R1_001.fastq.gz is sample-42_S1_L001_R1_001
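As a rough illustration of the rule above, here is a minimal Python sketch that strips the `.fastq.gz` extension to get the sample id. The helper name `sample_id_from_filename` is made up for this example and is not part of QIIME 2 or Galaxy.

```python
def sample_id_from_filename(filename: str, extension: str = ".fastq.gz") -> str:
    """Return the file name with the trailing extension removed.

    Hypothetical helper for illustration only; it mirrors the rule
    "full file name minus .fastq.gz is the sample id".
    """
    if not filename.endswith(extension):
        raise ValueError(f"{filename!r} does not end with {extension!r}")
    # Slice off the extension and keep everything before it.
    return filename[: -len(extension)]


print(sample_id_from_filename("sample-42_S1_L001_R1_001.fastq.gz"))
# sample-42_S1_L001_R1_001
```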

Please post back a few of the element identifier names if you would like more help. You could also post a shared history link. See also: Troubleshooting errors

Hey @jennaj
Thanks for such a quick response.
My problem lies here:


I can’t seem to progress after this step because I can’t even execute my job…
Do you know how I could solve this, or whether there is a way to work around it? Maybe the collection method you mentioned?

p.s. thank you for the access to the EU Matrix chat, I will also check there for insight

Meanwhile, people in the Matrix chat already helped me with this problem.

So for the data “ARA.SOIL.1fastq.gz”, the name “ARA_S1_L001_R1_001.fastq.gz” doesn’t create an error.

Hope this helps anyone having the same problem.
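For anyone who wants to test a name before pasting it into the tool form, here is a small Python sketch using the regex quoted in the tool's help text. One assumption: it treats "must match" as "the whole name must match" (`re.fullmatch`). How Galaxy applies the pattern internally isn't confirmed here. Read left to right, the pattern wants: something, an underscore, something, `_L` plus three digits, `_R1` or `_R2`, then `_001.fastq.gz`.

```python
import re

# Pattern copied as-is from the help text of the 'name' parameter.
PATTERN = re.compile(r".+_.+_L[0-9][0-9][0-9]_R[12]_001.fastq.gz")


def is_valid_name(name: str) -> bool:
    # Assumption: the entire name has to match, so fullmatch is used.
    return PATTERN.fullmatch(name) is not None


candidates = [
    "ARA_S1_L001_R1_001.fastq.gz",        # the name that worked in this thread
    "ARA.SOIL.1fastq.gz",                 # the original data name
    "sample-42_S1_L001_R1_001.fastq.gz",  # the example from the reply above
]
for candidate in candidates:
    print(candidate, is_valid_name(candidate))
```

The first and third names pass; the original data name fails because it has none of the underscore-separated parts the pattern requires.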

