Stacks2: process_radtags won’t find paired-end data

Hi everyone!

I’m trying to use the Stacks2: process_radtags tool to demultiplex paired-end data. I’ve got 2 huge files, R1 and R2, each about 45 Gb, in fastqsanger.gz format. The problem is that when I select “paired-end files” in the first line of the tool, it can’t find my data files. On the other hand, if I leave it at “Single-end files”, it finds both files with no problem. Should I interlace my two files for this tool to work?

Can anyone help me with this small issue?

Thanks!

Hi @Joelle99,
you need to create a paired collection. To do so, select “Operations on multiple datasets”, then select your paired files, and finally choose “Build a Dataset Pair”.

Regards

Hi, and thank you for your answer!
Sadly, I had already tried beforehand to build a dataset pair, but it still can’t find my file with my two reads…

Have you double checked that the format that was assigned by Galaxy is what the tool requires?

Yes, I did! Here’s what they ask for: [screenshot]

Both of my files are in fastqsanger.gz, which should then be fine I guess…

Is that the value Galaxy assigned to the uploaded files? You can double check by clicking the edit attributes pencil icon on the dataset, then going to the “Datatypes” tab

Yes, it is! Here’s one file, for example: [screenshot]

Can you do a screen cap of the collection you had created?

There you go!

I’ll admit I have zero experience working with paired collections. Are you sure this is a ‘list:paired’ collection and not somehow a plain ‘paired’ one?
Someone else is going to have to weigh in on this.

Well, I followed @gallardoalba’s instructions, so I guess it must be okay? To be honest, this is my first time analysing sequencing data, so I’m just starting to learn how Galaxy (and RAD-seq analysis) works…

Hi Joelle!

Thank you for reporting your issue. It seems something is going wrong somewhere… It would help us to know whether you are using the latest version of the tool, and on which Galaxy instance.

First, with the process_radtags tool, if you have paired-end data, you need to select the R1 file(s) (or collection) in the first “paired-end reads infile(s)” box and the R2 file(s) (or collection) in the second “paired-end reads infile(s)” box. Here, it seems you created a collection containing both the R1 and R2 files, so it is a valid dataset pair ;), but with this tool you need to supply R1 and R2 separately.

That said, the tool normally does not look at the name or content of the files when you are just selecting them, so if the dataset formats are OK, the datasets should appear in the select box. One explanation could be that there is an issue with fastqsanger.gz when it is in a collection. So maybe you need to use fastqsanger or fastqsanger.gz input files outside a collection, or fastqsanger files if they are in a collection… I have to test it!

So, please, can you try simply selecting your fastqsanger.gz files from your history (not in a collection)? And please note that your files need to be named something like “name_R1_001.fastq” and “name_R2_001.fastq” for the tool to run correctly :wink:
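For anyone who wants to check their file names before uploading, here is a minimal Python sketch. It assumes the naming rule is exactly the pattern suggested by the two example names above (`name_R1_001.fastq` and `name_R2_001.fastq`); the optional `.gz` suffix and the helper names `pair_reads` and `incomplete_samples` are assumptions made for illustration, not something the tool itself defines.

```python
import re

# Assumed pattern, modeled on the example names "name_R1_001.fastq" and
# "name_R2_001.fastq". Accepting a trailing ".gz" is an extra assumption.
READ_NAME = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])_001\.fastq(\.gz)?$")


def pair_reads(filenames):
    """Group names into {sample: {"R1": name, "R2": name}}; also return non-matching names."""
    pairs = {}
    unmatched = []
    for name in filenames:
        match = READ_NAME.match(name)
        if match is None:
            unmatched.append(name)
            continue
        pairs.setdefault(match["sample"], {})["R" + match["read"]] = name
    return pairs, unmatched


def incomplete_samples(pairs):
    """Return the samples that are missing either their R1 or their R2 file."""
    return sorted(s for s, reads in pairs.items() if set(reads) != {"R1", "R2"})
```

Calling `pair_reads` on the list of file names and then `incomplete_samples` on the result shows which names would need renaming and which samples lack a mate.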

Additionally, don’t hesitate to share your history so I can have a look.

Hi! Thanks a lot for your reply.

So I first used Stacks: process_radtags (Version 1) with no problem with the same files. But, for some reason, Stacks2: process_radtags (Version 2) won’t detect my files, even when the files are separate (i.e. not in a collection). I am working on the EU instance of Galaxy. Here are my two files:

Thank you! :slight_smile:

OK! It appears there is a problem with the latest Stacks2 process_radtags tool… And, I don’t know why, the process_radtags tool is no longer listed under the name of the last one I used. Can you try to access and use this version, which works for me on EU but seems to be more or less hidden: the last process_radtags tool that works properly for me

Hi again :slight_smile:

Yes, this is also the version I used (1.46), and it works well! I based my analysis on this one since I wasn’t able to make the new one (version 2.4) work. I just wanted to be able to use the new one in the future, as it seems a few improvements were made. But if there’s a true bug, I at least know that it’s not my fault if I can’t get it to work!

I’ll keep checking in the future to see if the bug is resolved and the tool usable for my analysis.

Thank you! :slight_smile:

Hi again :wink:

In fact, it appears that to work properly with the 2.4 tool version, you need to use the “Build list of Dataset Pairs” function, not “Build a Dataset Pair”, even if you only have one pair… I think this will be ok!
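To make the difference between the two collection types concrete, here is a rough Python sketch of how a single pair and a list of pairs differ in structure. The dictionary layout follows my understanding of the payload Galaxy's API accepts when creating collections (`collection_type`, `element_identifiers`, `src`), so treat the exact field names as an assumption to check against the API documentation; the dataset IDs and function names are hypothetical.

```python
def paired_payload(name, forward_id, reverse_id):
    """A single 'paired' collection: one forward and one reverse dataset."""
    return {
        "collection_type": "paired",
        "name": name,
        "element_identifiers": [
            {"name": "forward", "src": "hda", "id": forward_id},
            {"name": "reverse", "src": "hda", "id": reverse_id},
        ],
    }


def list_paired_payload(name, samples):
    """A 'list:paired' collection: a list whose elements are themselves pairs.

    `samples` maps a sample name to a (forward_id, reverse_id) tuple.
    """
    return {
        "collection_type": "list:paired",
        "name": name,
        "element_identifiers": [
            {
                "name": sample,
                "src": "new_collection",
                "collection_type": "paired",
                "element_identifiers": [
                    {"name": "forward", "src": "hda", "id": fwd},
                    {"name": "reverse", "src": "hda", "id": rev},
                ],
            }
            for sample, (fwd, rev) in samples.items()
        ],
    }
```

Even with only one sample, the list of pairs has an extra level of nesting, which is presumably why a tool that expects a list of pairs does not offer a plain pair as input.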

It does work!!!

Thank you very much for your time and answer! I’ll be able to test this new version now :slight_smile:

Have a good day/evening!
