goseq exceeding max run time

Has anybody had issues running goseq on Galaxy? The first time I ran goseq, it failed after 3 days (the error message said it exceeded the max. run time), so I fixed an issue with my dataset and hit execute again. It is still running and has been for 20 hours. Is there an issue I am missing, or something I can do to speed it up? Thanks!

I am using the public Galaxy Main server.

Hi @Niamh

Thanks for letting us know which server you are working at: Galaxy Main https://usegalaxy.org.

There were some busier server days over the last few weeks but it doesn’t seem like that is your issue. Instead, the problem is most likely with the inputs (format or content).

Resolving input issues

There are three core inputs to this tool. Each is described in the tool form. It is very important that the format of each matches the examples. In particular, the first column (gene identifier) must be in the same format for all three of the inputs.

That is, if any input contains extra content, such as a trailing .N after the gene name (where N is a number, specifically the version), the “dot” and “version-number” should either be present in all inputs or, better in most cases, be removed from all inputs.
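As a rough illustration of the identifier check described above, here is a minimal Python sketch. It assumes tab-separated inputs with the gene identifier in the first column. The function names (`strip_version`, `first_column_ids`) and the sample rows, including the version suffixes, are made up for this example and are not part of Galaxy or goseq.

```python
import re

def strip_version(gene_id: str) -> str:
    """Drop a trailing '.N' version suffix from a gene identifier, if present."""
    return re.sub(r"\.\d+$", "", gene_id.strip())

def first_column_ids(lines, has_header=False):
    """Collect version-stripped IDs from the first whitespace-separated column."""
    rows = [ln for ln in lines if ln.strip()]  # ignore blank lines
    if has_header:
        rows = rows[1:]  # skip a header row such as 'GENEID Length'
    return {strip_version(ln.split()[0]) for ln in rows}

# Hypothetical sample rows, shaped like the inputs discussed in this thread.
de_lines = ["ENSSSCG00000024911.2\tTrue", "ENSSSCG00000023305.1\tTrue"]
length_lines = ["GENEID\tLength", "ENSSSCG00000024911\t3130", "ENSSSCG00000037372\t1717"]

de_ids = first_column_ids(de_lines)
length_ids = first_column_ids(length_lines, has_header=True)

# Genes in the differential expression file with no matching length entry.
missing = sorted(de_ids - length_ids)
print(missing)
```

Running the same comparison against the category file would show whether all three inputs share one identifier format before the job is resubmitted.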

Please review this in your data. If everything looks correct, allow the job to finish processing. If any inputs are mismatched, correct them and rerun. If the job still ends with an error once the inputs seem OK, there are three options:

  1. Post back a few lines (2-3) from each of the three inputs. Please try to preserve formatting by quoting the text.
  2. Generate a share link to the history and post that back. Make sure that all inputs and outputs are in an undeleted state, and that “objects” in the history are shared.
  3. Send in a bug report from one of the failed datasets (“bug icon” in the red error dataset), then post back that you sent one in so we know when to look for it. Be sure to include a link to this Galaxy Help post in the comments so we can find it quickly and associate it with your question here. If you already sent one in, go ahead and send it in again with the link, then let us know.

For any choice, you can post the information publicly in this thread or post back to our direct message thread. Either is fine; it just depends on your comfort level with making a portion or all of your data public (for expanded community help with troubleshooting).

If you choose to keep this private, we won’t post any of your private data publicly. We will only summarize the usage issue and how to address it, in general terms, if that will help others. This tool and the tutorial that includes it do not have any current known issues, but if one is uncovered, we can ticket and address it with anonymized data as well.

Let’s start there. Thanks!

Gene differential expression file:

1 2
ENSSSCG00000024911 True
ENSSSCG00000023305 True
ENSSSCG00000023684 True

Gene length file:

1 2
GENEID Length
ENSSSCG00000048769 3130
ENSSSCG00000037372 1717
ENSSSCG00000027257 899

I currently have no gene category file, as every annotation I have downloaded has given me errors. If anyone knows which GO annotation file might work properly, that might fix the issue.

Here is a link to my data. Earlier steps (mapping, alignment, and QC) are in a different history; this is the history for DESeq2 and goseq:
https://usegalaxy.org/u/niamh98/h/deseq2-research-project

thanks :slight_smile:

Hi @Niamh

Try BioMart. The query will look something like the screenshot below. Export/download the results in TSV (tabular) format from that site, then upload the file to Galaxy.

  • Only these two columns (Gene and GO term) should be input to the Goseq tool for the “Categories” reference file.
  • Do not input any lines (genes) that do not have a GO term. Remove those first. There are a few ways to filter out genes without a GO assignment; this is one:
    • Tool Select lines that match an expression
    • Option: Matching
    • Regular expression: ^E.+\sG.+$
  • More related data can be exported if you want, for later use.
    • Tools like Cut and Join and others in the Data Manipulation tool group can be used.
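For anyone who wants to preview what the regular expression above keeps before running the Galaxy tool, here is a small Python sketch of the same filter applied outside Galaxy. It assumes a two-column tab-separated export. The header text, the gene-to-GO pairings, and the `keep_annotated` helper are invented for illustration only.

```python
import re

# Same expression as suggested for "Select lines that match an expression":
# a line starting with 'E' (the gene ID), then whitespace, then a field starting with 'G' (the GO term).
PATTERN = re.compile(r"^E.+\sG.+$")

def keep_annotated(lines):
    """Return only the lines that carry both a gene ID and a GO term."""
    return [ln for ln in lines if PATTERN.search(ln.rstrip("\n"))]

# Hypothetical export rows; the GO assignments here are not real annotations.
rows = [
    "Gene stable ID\tGO term accession",  # assumed header row, dropped because it does not start with 'E'
    "ENSSSCG00000024911\tGO:0005515",
    "ENSSSCG00000023305\t",               # gene with no GO term, dropped
    "ENSSSCG00000023684\tGO:0003677",
]

kept = keep_annotated(rows)
for line in kept:
    print(line)
```

Note that the expression relies on gene identifiers starting with "E" (as Ensembl IDs do) and category terms starting with "G", so it would need adjusting for other identifier schemes.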

Thanks!

Hi, I tried this and it finally worked. Thank you so much for all your help ! :slight_smile:
