Create a collection from a tabular file

Hi!
As input, I have a single tabular file with several columns, as in this example:

col1 col2 col3
1 red dog
2 red cat
3 blue cat
4 blue dog
5 green dog

I would like to create a collection from this single tabular file. The collection should consist of separate datasets, each containing the rows of the input that share the same value in a given column.
For example, splitting on column 2 would give a collection like this:

Dataset_1
col1 col2 col3
1 red dog
2 red cat

Dataset_2
col1 col2 col3
3 blue cat
4 blue dog

Dataset_3
col1 col2 col3
5 green dog
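
To make the intended operation concrete, here is a minimal Python sketch of the same grouping done outside Galaxy. It is only an illustration, not how any Galaxy tool is implemented: the function name `split_by_column` and its parameters are hypothetical, and it assumes the first line is a header that should be repeated in every group.

```python
def split_by_column(text, column=2, delimiter=None):
    """Group the data rows of a table by the value in `column` (1-based).

    Hypothetical helper for illustration only.
    delimiter=None splits on any whitespace; pass "\t" for a strict tab file.
    Returns a dict mapping each column value to table text (header + rows).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    groups = {}
    for row in rows:
        key = row.split(delimiter)[column - 1]
        # Start each new group with the header, then append matching rows.
        groups.setdefault(key, [header]).append(row)
    return {key: "\n".join(group) for key, group in groups.items()}


example = """col1 col2 col3
1 red dog
2 red cat
3 blue cat
4 blue dog
5 green dog"""

datasets = split_by_column(example, column=2)
for name, table in datasets.items():
    print(f"--- {name} ---")
    print(table)
```

Each key of `datasets` (here `red`, `blue`, `green`) corresponds to one dataset of the collection, in the order the values first appear in the input.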

Hi @rmassei

Have you tried using this tool? Split file to dataset collection (link at ORG).

You could pre-filter the input, or do more rearrangement with the other collection tools as needed, but I think this operation should be possible. You can put all of that into your workflow for reuse.

We have some great guides about using data manipulation utilities. Most are the same as those used on the command line, so if you are familiar with them, searching the tool panel will find them. If this is new to you, the tutorial guides can help you get started.

We also have collection manipulation tutorials.

You have a nice puzzle to solve but I think this is definitely possible! If you get stuck, you can share back the file, your workflow, and the history with the current outputs, and we can try to give some more specific tips. :scientist:

Hi @jennaj,
thanks for your nice and comprehensive answer!
While searching for Split file to dataset collection, the Split to Group tool also came up, and it does exactly what I need :slight_smile:
Thank you for all the training material, I will check it thoroughly!
