An error occurred when I used Trinity to analyze my RNA-seq data (clean paired-end fq.gz files)

Hi, I encountered an error when using the Trinity tool on the Galaxy website.
First, I uploaded my clean paired-end RNA-seq data from my computer (12 files in total, covering 3 control samples and 3 test samples, such as control1_1.fq.gz, control1_2.fq.gz, control2_1.fq.gz, control2_2.fq.gz, …), which had been quality-controlled and filtered with fastp. Then I found Trinity by typing “trinity” in the tool search bar in the upper left corner of the Galaxy homepage. I left all options at their defaults, except that I selected ‘paired end’. Under “Left/Forward strand reads” I selected all six X_1.fq.gz files (X stands for control1, control2, control3, test1, test2, test3). Under “Right/Reverse strand reads” I selected all six X_2.fq.gz files. Afterwards, I clicked the “run tools” button to start the analysis. However, the results turned red. One of them, “Trinity on data 47, data 45, and others: Gene to transcripts map”, shows: An error occurred with this dataset: format tabular database [?]. The other, “Trinity on data 47, data 45, and others: Assembled Transcripts”, shows: An error occurred with this dataset: format fasta database [?]
Could you help me solve this problem? Thank you in advance.

Hi @luweidong
Try transcript assembly for a single sample. Does it work?
Check the standard output and standard error logs by clicking the info icon of any output from the failed job. In the middle window, find the log files and expand the black boxes. Do you see any meaningful messages at the end of the log files?
Kind regards,
Igor

Dear Igor:
Thank you for your kind help. As you suggested, I tried to use the Trinity tool on the Galaxy website to assemble a single paired-end sample (two files, the forward and reverse reads), but the error still appeared, and I could not find the reason for the failure. I have uploaded the error log information. Could you help me solve the problem? Thanks in advance!

Best wishes,

Weidong Lu

Hi @luweidong

The screenshot you posted is from the bug icon, not the job information icon. The latter has many more details that help with solving problems, and is what @igor was asking for to help more.

How/where to find job logs: Troubleshooting errors

When you get a red dataset in your history, it means something went wrong. But how can you find out what it was? And how can you report errors?

When something goes wrong in Galaxy, there are a number of things you can do to find out what it was. Error messages can help you figure out whether it was a problem with one of the settings of the tool, or with the input data, or maybe there is a bug in the tool itself and the problem should be reported. Below are the steps you can follow to troubleshoot your Galaxy errors.

  1. Expand the red history dataset by clicking on it.
  • Sometimes you can already see an error message here
  2. View the error message by clicking on the bug icon
  3. Check the logs. Output (stdout) and error logs (stderr) of the tool are available:
  • Expand the history item
  • Click on the details icon
  • Scroll down to the Job Information section to view the 2 logs:
    • Tool Standard Output
    • Tool Standard Error
  • For more information about specific tool errors, please see the Troubleshooting section
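
Once the Tool Standard Output or Tool Standard Error text has been copied out of the job information page, the useful part is usually near the end. The following is only a minimal sketch of how one might skim a saved log. The function name `summarize_log` and the keyword list are hypothetical, not part of Galaxy or Trinity, and the sample text is made up for illustration.

```python
import re

# Hypothetical keyword list (an assumption); adjust it to the tool being debugged.
ERROR_PATTERN = re.compile(
    r"error|exception|killed|failed|cannot|no such file", re.IGNORECASE
)


def summarize_log(text, tail=10):
    """Return (last `tail` non-empty lines, all lines matching ERROR_PATTERN)."""
    lines = [line.rstrip() for line in text.splitlines() if line.strip()]
    suspicious = [line for line in lines if ERROR_PATTERN.search(line)]
    return lines[-tail:], suspicious


if __name__ == "__main__":
    # Made-up log text, not real output from any tool.
    example_log = "step 1 finished\n\nstep 2 finished\nERROR: made-up failure message\n"
    last_lines, suspicious_lines = summarize_log(example_log, tail=2)
    print("Last lines:", last_lines)
    print("Suspicious lines:", suspicious_lines)
```

Posting the full log text is still preferable; a filter like this only helps you spot where to look first.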

What to do next

  1. Please post back what is in the stderr logs.
  2. You could also copy/paste back that entire page, with each of the sections fully expanded. This will speed up getting help, since any of those details could be the clue to solving the input problem.

Next time, for the fastest help, go ahead and post back that entire job information page content when asking a question. Don’t worry about it being long. You’ll get faster and better help from many more people. If you decide not to post the full content for some reason, then feedback is mostly an educated guess or just more requests for that same information.

When working in Galaxy, most of the technical details are in that view, instead of having to list things out. Meaning, most of these questions are answered in a single view: What information should I include when reporting a problem?

Writing bug reports is a good skill for bioinformaticians to have, and a key point is that you should include enough information in the first message to make the process of resolving your issue more efficient and a better experience for everyone.

What to include

  1. Which commands did you run, precisely? We want details. Which flags did you set?
  2. Which server(s) did you run those commands on?
  3. What account/username did you use?
  4. Where did it go wrong?
  5. What were the stdout/stderr of the tool that failed? Include the text.
  6. Did you try any workarounds? What results did those produce?
  7. (If relevant) screenshot(s) that show exactly the problem, if it cannot be described in text. Is there a details panel you could include too?
  8. If there are job IDs, please include them as text so administrators don’t have to manually transcribe the job ID in your picture.
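
The checklist above can be kept as a small reusable template. This is just a sketch under assumptions: the field keys and the `format_report` and `missing_fields` helpers are hypothetical names chosen for illustration, not anything provided by Galaxy.

```python
# Hypothetical field keys paired with labels taken from the checklist above.
REPORT_FIELDS = [
    ("tool_and_settings", "Tool and settings used"),
    ("server", "Server"),
    ("account", "Account/username"),
    ("where_it_failed", "Where it went wrong"),
    ("stdout_stderr", "stdout/stderr (as text)"),
    ("workarounds", "Workarounds tried and results"),
    ("job_ids", "Job IDs (as text)"),
]


def missing_fields(answers):
    """Return the labels of checklist items that have no answer yet."""
    return [label for key, label in REPORT_FIELDS
            if not answers.get(key, "").strip()]


def format_report(answers):
    """Build a plain-text report; unanswered items are flagged, not dropped."""
    lines = []
    for key, label in REPORT_FIELDS:
        value = answers.get(key, "").strip()
        lines.append(f"{label}: {value if value else 'NOT PROVIDED'}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder answers for illustration only.
    draft = {"server": "example server", "where_it_failed": "assembly step"}
    print(format_report(draft))
    print("Still missing:", missing_fields(draft))
```

Flagging empty items instead of omitting them makes it obvious, before posting, which questions a helper would otherwise have to ask again.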

It makes the process of answering ‘bug reports’ much smoother for us, as we will have to ask you these questions anyway. If you provide this information from the start, we can get straight to answering your question!

Please post back the content, and we can try to help solve the problem :slight_smile:

Dear jennaj
Thank you for your kind reply. To illustrate the issue, I will briefly describe the details of the entire process:
Using an Illumina instrument and paired-end sequencing, we measured the transcription of fungal mycelium under abiotic stress. The raw data files were in fq.gz format. We uploaded our data from my computer to the Galaxy website. In the pop-up window for uploading files, we selected “regular” and changed the default “Auto-detect” to “fq.gz”, leaving the other parameters at their defaults. The upload was successful, and the final status was displayed in green. Then we used the fastp and trimmomatic tools separately to filter the raw data in order to obtain clean data (quality control with Fastqc had already been completed offline on a local computer). The filtering was also successful, and the outputs of both tools were displayed in green. Afterwards, we used the fastp output (just the two files of one sample, sample1_read1 output and sample1_read2 output) as the input files for transcript assembly. The result turned red. In addition, we used the trimmomatic output (just the two files of one sample, sample1_read1_paired and sample1_read2_paired) as the input files for transcript assembly, and the result also turned red.
1. When I click on the red result from the fastp data, the following error information is displayed:


CernVM-FS: loading Fuse module… done
CernVM-FS: mounted cvmfs on /scratch/03166/xcgalaxy/main/staging/50920122/.cvmfsexec/dist/cvmfs/data.galaxyproject.org
CernVM-FS: loading Fuse module… done
CernVM-FS: mounted cvmfs on /scratch/03166/xcgalaxy/main/

2. When I click on the red result from the trimmomatic data, the following error information is displayed:


|      ||    \ |    ||    \ |    ||      ||  |  |
|      ||  D  ) |  | |  _  | |  | |      ||  |  |
|_|  |_||    /  |  | |  |  | |  | |_|  |_||  ~  |
  |  |  |    \  |  | |  |  | |  |   |  |

(We don’t understand what the garbled characters above mean either.)

Please post that exact information back, or we cannot help more here.

Find those here, thanks!

OK, jennaj, I got it. The screenshots are attached below.

Then I expanded the black box of the Tool Standard Output for 41:

continued…

Then, I expanded the black box of the Tool Standard Error for 41. It just contained one line:

The error information for data 39 was different from that for data 41. Its Tool Standard Output is attached below:

continued…

Hi @luweidong
A couple of comments on the procedure. You can do all the data transformation with fastp; there is no need for Trimmomatic. Add a filter on read length, say 30 nt or longer. Note that some tools, such as Trimmomatic, apply filters in the specified order, so the length filter must be the last option.
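
On Galaxy these settings are chosen in the tool forms, but the same idea can be sketched for a command-line run. The snippet below only builds argument lists and runs nothing. The helper names and file names are hypothetical; `--length_required` is fastp's minimum-length option and `MINLEN` is Trimmomatic's length step, and the example `SLIDINGWINDOW:4:20` setting is an assumption picked for illustration.

```python
def fastp_command(in1, in2, out1, out2, min_length=30):
    """Build a fastp argument list that discards reads shorter than min_length."""
    return [
        "fastp",
        "--in1", in1, "--in2", in2,
        "--out1", out1, "--out2", out2,
        "--length_required", str(min_length),
    ]


def order_trimmomatic_steps(steps):
    """Move any MINLEN step to the end, since steps run in the order given."""
    others = [s for s in steps if not s.upper().startswith("MINLEN:")]
    minlen = [s for s in steps if s.upper().startswith("MINLEN:")]
    return others + minlen


if __name__ == "__main__":
    # Hypothetical file names for one paired-end sample.
    print(" ".join(fastp_command(
        "sample1_1.fq.gz", "sample1_2.fq.gz",
        "sample1_1.clean.fq.gz", "sample1_2.clean.fq.gz",
    )))
    # Example step settings are assumptions, not a recommendation.
    print(order_trimmomatic_steps(["MINLEN:30", "SLIDINGWINDOW:4:20"]))
```

Putting the length step last matters because trimming steps that run after it could shorten reads below the threshold again.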

Can you share the history so I can have a look? The procedure: History menu > Share or publish > Make history accessible > copy the URL and paste it in a reply to this email. If you have many files in the history, copy the relevant files into a new history and share that.

Kind regards,

Igor

The fastp output was used as the input for Trinity, but the job still failed and the result is displayed in red. The corresponding error message is as follows:

Hi @luweidong
The links are for datasets. With these links I can download the data, but I need access to the history. You can delete the message with the links. Please try the following: History menu in the top right corner of the history panel > Share or Publish > Make history accessible > copy the URL and paste it here.

Thank you!

Kind regards,
Igor

Dear Igor, this is the URL of the history:

Best wishes,
Weidong Lu