Transcriptomics troubleshooting

Hi @Priyanka_Bhandary

For this kind of data, I usually start by checking the keys that the inputs have in common. Computers are literal: Chr1, chr1, and 1 are three different identifiers.

Also check for: extra whitespace, empty values, blank trailing lines, values with a .N suffix (where N is a version) being compared to values without the version, values that contain whitespace (check your fasta > lines) … and more.
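To make the "common keys" check concrete, here is a minimal Python sketch. It is only an illustration, not a Galaxy tool: the function names and the example identifiers are made up, and it assumes you have already pulled the identifier column out of each input into a list.

```python
import re

def normalize(identifier, strip_version=False):
    """Trim whitespace, keep only the first whitespace-delimited token,
    and optionally drop a trailing .N version suffix."""
    parts = identifier.strip().split()
    token = parts[0] if parts else ""
    if strip_version:
        token = re.sub(r"\.\d+$", "", token)
    return token

def compare_keys(keys_a, keys_b, strip_version=False):
    """Report which identifiers are shared and which appear in only one input."""
    a = {normalize(k, strip_version) for k in keys_a} - {""}
    b = {normalize(k, strip_version) for k in keys_b} - {""}
    return {"common": a & b, "only_a": a - b, "only_b": b - a}

# Hypothetical identifiers: one input carries version suffixes, the other does not.
counts_ids = ["AT1G01010.1", "AT1G01020.2 ", "AT1G01030.1"]
lengths_ids = ["AT1G01010", "AT1G01020", "AT1G01040"]

exact = compare_keys(counts_ids, lengths_ids)
relaxed = compare_keys(counts_ids, lengths_ids, strip_version=True)
print("exact matches:", len(exact["common"]))
print("matches after stripping versions:", len(relaxed["common"]))
print("still unmatched:", sorted(relaxed["only_a"]), sorted(relaxed["only_b"]))
```

If the exact comparison finds few or no shared keys but the relaxed one finds many, the mismatch is probably a formatting difference rather than a real difference in content.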

If the basic formats are intact, further checks usually involve:

  1. Genome assembly/build – use the exact same source for all inputs or expect problems. Or, you can get fancy and convert between builds – and that would be extra, upstream data prep steps (common manipulations, but not really “simple”).
  2. Chromosome names
  3. Gene names
  4. Transcript names
  5. Computed values – scientific notation versus plain decimals, and whether one style is used consistently

The error you got is probably about missing/empty values, or values that are not matching up (not unique, or become not unique during processing/data reduction).
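A quick way to look for empty or repeated keys is sketched below in Python. This is an assumption-laden example: it expects tab-delimited text with the identifier in the first column, and the function name and sample data are hypothetical.

```python
import io
from collections import Counter

def check_key_column(handle, key_col=0, delimiter="\t"):
    """Return the line numbers with an empty key and the keys seen more than once."""
    empty_lines = []
    seen = Counter()
    for line_number, line in enumerate(handle, start=1):
        fields = line.rstrip("\n").split(delimiter)
        key = fields[key_col].strip() if len(fields) > key_col else ""
        if key:
            seen[key] += 1
        else:
            empty_lines.append(line_number)
    duplicates = sorted(k for k, n in seen.items() if n > 1)
    return empty_lines, duplicates

# Hypothetical table: line 3 has no gene name, line 5 is blank, geneA repeats.
example = "geneA\t10\ngeneB\t0\n\t5\ngeneA\t7\n\n"
empty_lines, duplicates = check_key_column(io.StringIO(example))
print("lines with an empty key:", empty_lines)
print("duplicated keys:", duplicates)
```

Running the same check on the data before and after each reduction step can show where keys stop being unique.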

One example I’ve seen: columns of 0 values without a header that labels the sample with a singleUniqueWord. Maybe look at the data immediately upstream of GOSEQ first? If you run the tool directly, does it work? Or does it fail the same way the workflow did?
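The zero-column and header problem can be screened for with a few lines of Python. This is a rough sketch under stated assumptions: a tab-delimited count table with a header line, identifiers in the first column, and one column per sample. The function names and the example table are invented for illustration.

```python
import re

def is_zero(value):
    """True when the value parses as a number equal to zero."""
    try:
        return float(value) == 0
    except ValueError:
        return False

def check_count_table(lines, delimiter="\t"):
    """Return sample headers that are not a single word, and columns that are all zero."""
    header = lines[0].rstrip("\n").split(delimiter)
    rows = [l.rstrip("\n").split(delimiter) for l in lines[1:] if l.strip()]
    # A usable label here means one run of letters, digits, or underscores.
    bad_headers = [h for h in header[1:] if not re.fullmatch(r"\w+", h)]
    all_zero = []
    for i, name in enumerate(header[1:], start=1):
        values = [row[i] for row in rows if len(row) > i]
        if values and all(is_zero(v) for v in values):
            all_zero.append(name)
    return bad_headers, all_zero

# Hypothetical table: one header contains a space, and the last column has no header.
table = [
    "gene\tsample1\tsample 2\t\n",
    "geneA\t5\t0\t0\n",
    "geneB\t3\t0\t0\n",
]
bad_headers, all_zero = check_count_table(table)
print("headers to fix:", bad_headers)
print("all-zero columns:", all_zero)
```

An all-zero column is not always an error, but together with a missing or multi-word header it is worth a closer look before the table goes into a downstream tool.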

Or, try what I usually do instead: check that the inputs make sense first, then review the workflow steps in order to make sure some stray setting isn’t causing the problem. Why do I back all the way up? Diagnosing technical problems from scientific results is much harder, and a problem might go unnoticed right up until the final data reduction. Or it might be missed entirely! Not every problem makes a tool fail; some just produce weird results.

Most of the Q&A on this forum involves some variation of the problems above. The solutions vary but are still similar, and the tools in this tutorial (more of a guide, actually) can help to find content problems: Data Manipulation Olympics

If you are still stuck after checking the above, try to reproduce the error with the smallest data possible at a public server. Then share the history back and we’ll take a look :slight_smile:
Troubleshooting errors