Unfold bug, missing newline char

I’m having a problem with the Unfold tool.

I uploaded this file to Galaxy and used Unfold.

gene kegg
gene1 map1,map2
gene2 -
gene3 map1,map2,map3

I was expecting to have this format:

gene1 map1

gene1 map2

gene2 -

gene3 map1

gene3 map2

gene3 map3

but I get this type of output:

gene kegg
gene1 map1gene1 map2
gene2 -
gene3 map1gene3 map2gene3 map3

this is unfold version (Galaxy Version 9.5+galaxy3) at usegalaxy eu

Welcome @Alper_Yilmaz

Would you like to share your history with the example? A few representative lines are enough if you don’t want to share the whole thing (tool: Select first)! We can help with adjusting the parameters, and maybe suggest a tool to use first (to standardize the delimiters with one of the Replace+ tools).

Each custom manipulation can differ! To see an example, this topic is a good place to start. → How to solve problems with data formats: GTN Data Manipulation Olympics tutorials! with more at text-manipulation

For yours, I am interested in what the current input looks like exactly, including any whitespace, and what the parameter choices were.

I’ll also try to start up a little test independently using what I think your inputs look like based on your shared lines so far. Maybe that dash is triggering some odd behavior! More soon and I’ll watch for your reply.

the url is usegalaxy eu then: u/alperyilmaz/h/unfold-problem (that’s a problem with forums, it does not allow sharing links.. how can we share histories if it does not allow sharing links?)

I tried test.tsv (linux end of line) and test_dos.tsv (windows end of line) and both failed with the “unfold” tool..

The expected result is:
gene kegg
gene1 map1
gene1 map2
gene2 -
gene3 map1
gene3 map2
gene3 map3

but the output is wrong..

Great, thanks @Alper_Yilmaz

I can reproduce this now, and it is a bug with at least two use cases, both overlapping in part with your example.

This is revealed if the “split column” is the last column of the file. I’ve ticketed the issue here with the full details. → Enhancement: adjust newline handling in tp_unfold_column_tool when unfolding the final column · Issue #1852 · bgruening/galaxytools · GitHub
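
One way to picture the symptom, offered as a guess and not as a reading of the tool's source: if a line is split while its trailing newline is still attached, only the last piece of the final column carries a line ending. The Python sketch below is hypothetical (the name `naive_unfold` is made up for illustration) and only reproduces the output pattern reported above.

```python
def naive_unfold(line, col, sep=","):
    """Hypothetical illustration: split one column without stripping the newline first."""
    fields = line.split("\t")           # the trailing "\n" stays on the last field
    out = []
    for piece in fields[col].split(sep):
        row = fields[:col] + [piece] + fields[col + 1:]
        out.append("\t".join(row))      # no newline is added per piece
    return "".join(out)

# Splitting the last column: the two rows run together on one line.
print(repr(naive_unfold("gene1\tmap1,map2\n", 1)))

# Splitting a middle column: every row still ends with the original newline.
print(repr(naive_unfold("a\tx,y\tb\n", 1)))
```

Under this assumption, the problem only shows up when the split column is the last one, which matches what the test histories show.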

From here, you can do one of these:

  1. Use Add column to create a column in your file, run this tool, then remove those extra lines from your output with a filter tool (easier if the added column contains distinctive text to search on, such as REMOVEME).

  2. Try a different tool such as the AWK tool → Text reformatting with awk (link at EU, link at GTN Training)

    Shared history to see the exact how-to details. → https://usegalaxy.eu/u/jenj/h/help-awk-ghelp-17687

    Select your file, then paste something like this into AWK Program, leaving everything else at the default options. The "," is the delimiter portion.

    BEGIN { FS=OFS="\t" } { n=split($2,a,","); for (i=1; i<=n; i++) { $2=a[i]; print } }

    Or, to always keep a header intact, try:

    BEGIN { FS=OFS="\t" } NR==1 { print; next } { n=split($2,a,","); for (i=1; i<=n; i++) { $2=a[i]; print } }
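
If you want to double-check the result outside Galaxy, here is a minimal Python sketch of the same unfolding logic as the header-preserving AWK program. The function name `unfold` and its parameters are illustrative assumptions, not part of any Galaxy tool, and the sample text assumes your file is tab-separated as in the shared history.

```python
def unfold(text, col=1, sep=",", keep_header=True):
    """Repeat each row once per item in column `col` (0-based), split on `sep`."""
    out = []
    # splitlines() drops the line ending, so Linux and Windows files behave the same
    for i, line in enumerate(text.splitlines()):
        if keep_header and i == 0:
            out.append(line)            # pass the header through untouched
            continue
        fields = line.split("\t")
        for piece in fields[col].split(sep):
            out.append("\t".join(fields[:col] + [piece] + fields[col + 1:]))
    return "\n".join(out) + "\n"

# Sample input copied from the question, assuming tabs between the columns
sample = "gene\tkegg\ngene1\tmap1,map2\ngene2\t-\ngene3\tmap1,map2,map3\n"
print(unfold(sample), end="")
```

Each output row gets its own line ending here, even though the split column is the last one, so the result can be compared line by line with what the AWK tool returns.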


Please give this a try and let us know how this goes! Thank you for following up! Very helpful and appreciated! :slight_smile:


ps: Everyone should be able to post a generated share link to a public Galaxy history, but brand new users cannot post links from anywhere else. We would rather not have any filters at all, but the spam attacks have been fierce since the start of the year! Anyone else with an issue posting links, please start up your own topic and ask so we can help!