Renaming reads with barcodes using Cutadapt

Hi there,

I’m trying to cut barcodes out of my reads and append them to the read name. The Cutadapt documentation shows how to do this by combining -u with --rename and setting barcode={cut_prefix} (or {cut_suffix}, in my case). The documentation is here: Cutadapt Documentation.
Unfortunately, this functionality seems to be missing from the Galaxy wrapper. You can add a prefix or suffix to the read name, but it only seems to accept a single fixed text value. Does anyone know of a workaround, or am I doing something wrong?

Barcode splitter seems to separate every barcode into a different file. I have a set of random 9 bp sequences, so there would be tens of thousands of files with one sequence each.
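For readers unfamiliar with the operation being asked for, here is a minimal Python sketch of it: cut a fixed-length barcode off the end of a read and append it to the read name. This is roughly what combining -u with a negative length and a --rename template containing {cut_suffix} is described as doing, not Cutadapt's own code. The function name `move_suffix_to_name`, the `barcode=` label, and the example read are assumptions made for illustration.

```python
def move_suffix_to_name(name, seq, qual, length=9):
    """Cut the last `length` bases off a read and append them to its name.

    Hypothetical helper for illustration; not part of Cutadapt or Galaxy.
    """
    if len(seq) < length:
        # Read is too short to contain a full barcode: leave it unchanged.
        return name, seq, qual
    barcode = seq[-length:]
    # Trim the sequence and the quality string by the same amount.
    return f"{name} barcode={barcode}", seq[:-length], qual[:-length]


# Made-up example read: 8 bases of insert followed by a 9-base barcode.
new_name, new_seq, new_qual = move_suffix_to_name(
    "read1", "ACGTACGTTTGACCATG", "IIIIIIIIIIIIIIIII"
)
print(new_name, new_seq, new_qual)
```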


Which Galaxy server are you using?
Maybe you can concatenate the data from the info file to reconstruct the reads and play with the adapter names:

The info file contains information about the found adapters. The output is a tab-separated text file. Each line corresponds to one read of the input file.

Columns contain the following data:

1st: Read name
2nd: Number of errors
3rd: 0-based start coordinate of the adapter match
4th: 0-based end coordinate of the adapter match
5th: Sequence of the read to the left of the adapter match (can be empty)
6th: Sequence of the read that was matched to the adapter
7th: Sequence of the read to the right of the adapter match (can be empty)
8th: Name of the found adapter
9th: Quality values corresponding to sequence left of the adapter match (can be empty)
10th: Quality values corresponding to sequence matched to the adapter (can be empty)
11th: Quality values corresponding to sequence to the right of the adapter (can be empty)
The concatenation of columns 5-7 yields the full read sequence. Column 8 identifies the found adapter. Adapters without a name are numbered starting from 1. Fields 9-11 are empty if quality values are not available. Concatenating them yields the full sequence of quality values.

If no adapter was found, the format is as follows:

Read name
The value -1
The read sequence
Quality values
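As a rough sketch of how the two line layouts described above could be read, the Python function below splits one tab-separated info-file line and rebuilds the full read from columns 5-7 and 9-11. The function name `parse_info_line`, the dictionary keys, and the example lines are assumptions for illustration; only the column meanings come from the description above.

```python
def parse_info_line(line):
    """Parse one tab-separated line of the info file into a dict.

    Hypothetical helper; handles both the 11-column layout (adapter found)
    and the short layout (second column is -1, no adapter found).
    """
    # Strip only the newline: trailing empty tab-separated fields matter.
    fields = line.rstrip("\n").split("\t")
    name = fields[0]
    if fields[1] == "-1":
        # No adapter: read name, -1, read sequence, quality values.
        qualities = fields[3] if len(fields) > 3 else ""
        return {"name": name, "adapter": None, "matched": "",
                "sequence": fields[2], "qualities": qualities}
    # Pad so that missing quality columns become empty strings.
    fields += [""] * (11 - len(fields))
    left, matched, right = fields[4:7]
    return {
        "name": name,
        "adapter": fields[7],
        "matched": matched,
        "sequence": left + matched + right,   # columns 5-7
        "qualities": "".join(fields[8:11]),   # columns 9-11
    }


# Made-up example lines in the two layouts.
hit = parse_info_line(
    "read1\t0\t8\t17\tACGTACGT\tTTGACCATG\t\tbarcode\tIIIIIIII\tIIIIIIIII\t\n"
)
miss = parse_info_line("read2\t-1\tACGTACGT\tIIIIIIII\n")
print(hit["sequence"], miss["adapter"])
```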

I am using the EU and US servers (mostly EU during the daytime). They have versions 1.16.5 and 1.16.6 of the tool, respectively, but neither has the --rename functionality.

Where do I find the info file? Can I feed it back into Galaxy? Or do I have to process it with another platform?

Sorry, I’m a chemist, so this is all very new to me!


Rick, you can find a button to enable the info file as output of cutadapt in both Galaxy servers you’re using.

Perfect! There is also a wildcard output file! I put in my 3’ adapter as ADAPTERNNNNNNNNN. The wildcard output contains the barcode sequences matched by the N’s, along with the read names. Now I just need to figure out how to merge this with the trimmed sequences.
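One way the merge could look outside Galaxy is sketched below in Python. It assumes each wildcard-file line holds the barcode followed by the read name, separated by whitespace, and that the trimmed reads are plain four-line FASTQ records; both are assumptions worth checking against your own files. The function names `read_barcodes` and `annotate_fastq` and the `barcode=` label are made up for illustration.

```python
def read_barcodes(wildcard_lines):
    """Map read ID -> barcode from wildcard-file lines.

    Assumes the layout 'BARCODE READNAME' per line (an assumption).
    """
    barcodes = {}
    for line in wildcard_lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        barcodes[parts[1]] = parts[0]
    return barcodes


def annotate_fastq(fastq_lines, barcodes):
    """Append ' barcode=...' to the header of each four-line FASTQ record."""
    out = []
    for i, line in enumerate(fastq_lines):
        line = line.rstrip("\n")
        if i % 4 == 0:
            # Header line: match on the read ID (text before the first space).
            tokens = line[1:].split()
            read_id = tokens[0] if tokens else ""
            if read_id in barcodes:
                line = f"{line} barcode={barcodes[read_id]}"
        out.append(line)
    return out


# Made-up example data.
wildcards = ["TTGACCATG read1\n", "GGCATTACA read2\n"]
fastq = ["@read1\n", "ACGTACGT\n", "+\n", "IIIIIIII\n",
         "@read3\n", "TTTTCCCC\n", "+\n", "IIIIIIII\n"]
annotated = annotate_fastq(fastq, read_barcodes(wildcards))
print("\n".join(annotated))
```

Reads without an entry in the wildcard file (read3 here) are passed through unchanged, so the output keeps the same number of records as the input.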

Thank you!

You’re welcome. Galaxy has tools you can use for this, such as merge columns, cut columns, awk, etc.

Hi @Rickpatbrown,
I’m currently updating cutadapt in order to include the --rename option. It will be available in a few days.

Hi @Rickpatbrown, we have recently updated cutadapt. The latest version will be available in a few days.



@gallardoalba Fantastic! Thanks!
