umi-tools extract help with complex regex on Galaxy

Hi everyone,
I am running a UMI-tools extract command that works on HPC:

(example of nextflow command)
umi_tools extract --extract-method=regex \
--bc-pattern=".+(?P<discard_1>AACTGTAGGCACCATCAAT){s<=2}(?P<umi_1>.{12})$" \
-I ${sample_name}_trimmed.fastq \
-S ${sample_name}_umi_cleaned.fastq > ${sample_name}_umi_tools.log

but it does not run properly on Galaxy.

I need to specify the following regex so that UMI-tools detects the UMI at the end of the read:
.+(?P<discard_1>AACTGTAGGCACCATCAAT){s<=2}(?P<umi_1>.{12})$

However, when I run it on GA, the + and the $ seem to be dropped:
In the log, the pattern comes up as: .(?P<discard_1>AACTGTAGGCACCATCAAT){s<=2}(?P<umi_1>.{12})

Although UMI-tools runs, it fails to correctly detect the adapter and the UMI at the end of the read. I end up with a few hundred reads that are 1-2 bp long instead of several million reads with their UMI properly extracted.
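
For illustration, here is a minimal Python sketch of why losing the + and the $ could produce exactly this symptom. It makes several assumptions: the pattern is applied anchored at the start of the read (as Python's `match` does), the fuzzy `{s<=2}` part is left out because the standard `re` module does not support it, and the `extract` helper, the insert and the UMI sequence are hypothetical stand-ins, not UMI-tools code or real data.

```python
import re

ADAPTER = "AACTGTAGGCACCATCAAT"

# Intended pattern (fuzzy {s<=2} omitted: stdlib `re` has no fuzzy matching).
intended = re.compile(r".+(?P<discard_1>" + ADAPTER + r")(?P<umi_1>.{12})$")
# Pattern as it appears in the Galaxy log, with "+" and "$" missing.
mangled = re.compile(r".(?P<discard_1>" + ADAPTER + r")(?P<umi_1>.{12})")


def extract(pattern, read):
    """Hypothetical helper: return (remaining_read, umi), or None if no match."""
    m = pattern.match(read)  # assumption: matching is anchored at the read start
    if m is None:
        return None
    # Remove the discard and UMI groups, keep everything else.
    spans = sorted([m.span("discard_1"), m.span("umi_1")])
    kept, pos = [], 0
    for start, end in spans:
        kept.append(read[pos:start])
        pos = end
    kept.append(read[pos:])
    return "".join(kept), m.group("umi_1")


insert = "TGAGGTAGTAGGTTGTATAGTT"  # hypothetical 22 nt insert
umi = "ACGTACGTACGT"               # hypothetical 12 nt UMI
typical_read = insert + ADAPTER + umi
short_read = "T" + ADAPTER + umi   # adapter starts at the second base

print(extract(intended, typical_read))  # insert kept, UMI extracted
print(extract(mangled, typical_read))   # no match at all
print(extract(mangled, short_read))     # only a single base is left
```

Under these assumptions, the pattern from the log can only match reads in which the adapter begins at the second base, which would leave a handful of very short reads and discard everything else.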

I also tried escaping with backslashes: .\+(?P<discard_1>AACTGTAGGCACCATCAAT){s<=2}(?P<umi_1>.{12})\$
and an asterisk:
`.*(?P<discard_1>AACTGTAGGCACCATCAAT){s<=2}(?P<umi_1>.{12})`

But both failed.

Thanks for your help

Hi @abracarambar,
I have opened a PR to fix it.

Regards