Transpose operation reports an extra column that does not exist

Hello
When trying to use the transpose function, I’m getting the following error message:

datamash: transpose input error: line 2 has 11693 fields (previous lines had 11694);
see --help to disable strict mode

I have had this error before, and after downloading the file and checking it in Excel, I can see that the message is not true. Further proof is the total number of columns, seen in the first five lines displayed here:

image

Any suggestions?


Hi @Diogo_de_Moraes!

I suspect that one of these is going on:

  • Empty values for some of the data values. Those can be replaced with a default value (for example, if the data is numerical, 0 can be a good choice for many cases).
  • Data values that contain whitespace (spaces). Those spaces would need to be replaced. Common in text data, especially in the headers.
  • Hidden characters. These are very easily introduced by moving data from Excel > export as tabular > use anywhere else. “Soft returns” in particular have been my nemesis in the past.
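To narrow down which of these applies, it can help to count the fields on each line and list any non-printable characters. The sketch below is only an illustration, assuming a tab-delimited file; the function name `inspect_rows` and the sample rows are made up for the example and are not part of Galaxy or datamash.

```python
from collections import Counter


def inspect_rows(lines, delimiter="\t"):
    """Count fields per line and collect non-printable characters.

    Returns (counts, hidden): counts maps a field count to the number of
    lines that have it; hidden maps a line number to the set of
    non-printable characters found on that line.
    """
    counts = Counter()
    hidden = {}
    for line_no, raw in enumerate(lines, start=1):
        # Strip only "\n" so a stray "\r" from Windows line endings stays visible.
        line = raw.rstrip("\n")
        counts[len(line.split(delimiter))] += 1
        odd = {ch for ch in line if ch != delimiter and not ch.isprintable()}
        if odd:
            hidden[line_no] = odd
    return counts, hidden


# Hypothetical sample: line 2 ends with a carriage return, line 3 is one field short.
sample = ["id\ts1\ts2\n", "g1\t1\t2\r\n", "g2\t3\n"]
counts, hidden = inspect_rows(sample)
print(dict(counts))
print(hidden)
```

For a real file, opening it with `open(path, newline="")` keeps Python from translating line endings, so carriage returns are still there to be reported (`path` here stands for wherever the downloaded dataset was saved).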

Tools in the group “Text manipulation” plus the tool “Select” can usually repair data. Many are the same as command-line utilities with the same or a similar name; others combine several command-line utilities. I can’t tell you exactly which to try without seeing the raw data and what is going on inside it content-wise.
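Outside of Galaxy, the same kinds of repair can be sketched in a few lines of Python. This is a rough illustration only, assuming tab-delimited data where 0 is an acceptable default and underscores are an acceptable stand-in for spaces; `clean_field` and `clean_rows` are hypothetical helper names, and the right choices depend on what is actually in the data.

```python
def clean_field(value, fill="0"):
    """Remove hidden characters, replace spaces, and fill empty values."""
    # Drop non-printable characters such as a stray "\r".
    kept = "".join(ch for ch in value if ch.isprintable())
    # Trim the ends and replace inner spaces (assumption: "_" is acceptable).
    kept = kept.strip().replace(" ", "_")
    # Empty values get the default (assumption: "0" suits numerical data).
    return kept or fill


def clean_rows(lines, delimiter="\t", fill="0"):
    """Apply clean_field to every field of every line."""
    cleaned = []
    for raw in lines:
        fields = raw.rstrip("\n").split(delimiter)
        cleaned.append(delimiter.join(clean_field(f, fill) for f in fields))
    return cleaned


# Hypothetical sample: a header with a space and a "\r", and a row with an empty value.
sample = ["id\tsample one\ts2\r\n", "g1\t\t2\n"]
result = clean_rows(sample)
print(result)
```

Note that this does not add missing fields to short rows; where a row is genuinely a column short, the data needs a closer look before anything is padded.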

Options:

  • Download the dataset from Galaxy and open it in a command-line editor that will allow you to view hidden characters (vi, vim, emacs) and find and replace empty fields (as needed).
  • Send in a bug report from the error dataset (if you are actually working at Galaxy Main https://usegalaxy.org). Please include a link to this post so I can find it quickly and associate the post + bug report into a single issue. I’m an admin at this server and bug reports are private to our admin team.
  • If working on some other public Galaxy server, a share link to the history can be generated (history menu gear icon > “Share or Publish” – be sure to check the box for “sharing included datasets” or I won’t be able to review in enough detail with a non-admin account). Then post that back here publicly or send it to me via a private direct message. Be sure to note the numbers of the input and error datasets just in case they are not obvious.

The graphic below shows how to send (and find!) direct messages here at Galaxy Help. My user name is @jennaj

ghelp-direct-message