FastTree returns error (run on Roary data of 4 E. coli isolates)

Hi there,
I am trying to use FastTree to generate a tree to be visualized in Phandango.
I have sequence data for 4 E. coli isolates. Three of them are from our own collection (natural strains) and the fourth is the NCBI reference genome of E. coli K12 MG1655. (This strain was also part of the experiments, so I wanted to include it in the tree.)
Originally I had my own Prokka annotations (run on Linux) for the three strains. With the GFF3 files, I ran Roary followed by FastTree (nucleotide) on these three isolates and it worked (I used the core alignment and Newick file from Roary).

As I wanted to expand the analysis and include more sequences (for a better understanding of the core genome), I started simple by adding the 4th sequence (MG1655). Running Prokka on Linux for this sequence was no longer an option for me (I changed institutions), so I decided to rerun Prokka on all four isolates, including MG1655.
Then I ran Roary on those 4 strains and it worked.
When running FastTree (nucleotide) on them, I got the following error: “This job was terminated because it used more memory than it was allocated”.

Since there are only 4 isolates, I wonder whether this is really a memory problem (I run as a guest on Galaxy, not on a cloud server).

Maybe relevant… or not:

  • The first three strains were annotated from scaffolds, while MG1655 was annotated from the chromosome sequence.
  • The Newick files from Roary for the 3 strains and the 4 strains look like this (P.S. I shortened the names of the three sequences in the last one)

3 strains: (38.27_PROKKA_09102020.gff:1.843208323,39.62_PROKKA_09102020.gff:0.806511634,9.54_PROKKA_09082020.gff:0.000000005);

4 strains:
(39.62.gff:0.417292218,9.54.gff:0.138018002,(38.27.gff:0.253699967,MG1655.gff:0.218665394)1.000:5.990038221);

From the Newick files, I see a big change, yet I don’t know if it is caused by real differences in the sample or by the processing. Trying different possibilities is tricky, as Roary takes around 12h to complete every combination of strains that I am using.
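
One way to put numbers on that change is to pull the branch lengths out of both Newick strings. The following is only a minimal sketch using the Python standard library; the function name `branch_lengths` is illustrative, and the regular expression assumes simple Newick like the two strings above (no quoted labels or comments).

```python
import re

# Newick strings copied from the two Roary outputs quoted above
TREE_3 = ("(38.27_PROKKA_09102020.gff:1.843208323,"
          "39.62_PROKKA_09102020.gff:0.806511634,"
          "9.54_PROKKA_09082020.gff:0.000000005);")
TREE_4 = ("(39.62.gff:0.417292218,9.54.gff:0.138018002,"
          "(38.27.gff:0.253699967,MG1655.gff:0.218665394)1.000:5.990038221);")


def branch_lengths(newick):
    """Return every branch length (leaf and internal) in a simple Newick string."""
    # Each branch length is the number that directly follows a ':'
    return [float(x) for x in re.findall(r":([0-9.eE+-]+)", newick)]


for label, tree in (("3 strains", TREE_3), ("4 strains", TREE_4)):
    lengths = branch_lengths(tree)
    print(f"{label}: {len(lengths)} branches, "
          f"total {sum(lengths):.6f}, longest {max(lengths):.6f}")
```

On these two strings, the longest branch in the 4-strain tree is the internal one (5.990038221) joining 38.27 and MG1655, so most of the difference in total length sits on that single branch rather than on the tips.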

P.S. I am quite new to WGS analysis, so I apologize if I have done something stupid.

Thank you in advance.

Hi @Rebeca_Pallares,
Which Galaxy instance are you using? It would be helpful if you could send an error report (bug icon), since it can give us additional information about the error.

Regards

Hi Alba, thanks for reaching out.
I was running the online tool (from Europe, so I think it is the European server).
As an update, I tried to reproduce the Roary + FastTree run that worked (rerunning the same dataset) and FastTree is giving the same error. I think the problem is in the Roary output data, though.
Before, Roary took really long to run and now it finishes the analysis in a matter of minutes (yet the results, i.e. the nhx file, are really similar, although not exactly the same).

First run:
(38.27_PROKKA_09102020.gff:1.843208323,39.62_PROKKA_09102020.gff:0.806511634,9.54_PROKKA_09082020.gff:0.000000005);

New run:
(38.27_PROKKA_09102020.gff:1.868025368,39.62_PROKKA_09102020.gff:0.804283884,9.54_PROKKA_09082020.gff:0.000000005);
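
To see exactly how much the two runs differ, the per-leaf branch lengths can be compared side by side. This is a rough sketch using only the Python standard library; the helper name `leaf_lengths` is made up for illustration, and the regular expression assumes simple Newick without quoted labels.

```python
import re

# Newick strings copied from the two Roary runs quoted above
FIRST_RUN = ("(38.27_PROKKA_09102020.gff:1.843208323,"
             "39.62_PROKKA_09102020.gff:0.806511634,"
             "9.54_PROKKA_09082020.gff:0.000000005);")
NEW_RUN = ("(38.27_PROKKA_09102020.gff:1.868025368,"
           "39.62_PROKKA_09102020.gff:0.804283884,"
           "9.54_PROKKA_09082020.gff:0.000000005);")


def leaf_lengths(newick):
    """Map each leaf name to its branch length in a simple Newick string."""
    # A leaf label directly follows '(' or ','; internal labels follow ')'
    pairs = re.findall(r"(?<=[(,])([^(),:;]+):([0-9.eE+-]+)", newick)
    return {name: float(length) for name, length in pairs}


first = leaf_lengths(FIRST_RUN)
new = leaf_lengths(NEW_RUN)

# Print old length, new length, and signed difference for each leaf
for name in sorted(first):
    diff = new[name] - first[name]
    print(f"{name}: {first[name]:.9f} -> {new[name]:.9f} (diff {diff:+.9f})")
```

For these two strings, the 9.54 branch is identical in both runs and the other two differ only in the second or third decimal place.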

When I re-run FastTree on the output that worked, it still works, but it fails on the new output.
I am really puzzled about where the error could be, since I am not tuning any settings when running Roary.

Thanks for your assistance

Could you share your history with me? I’ll send you my email address by DM.

Regards

Hi Cristobal,
Thank you for your message.

I was first working on https://usegalaxy.org/ and I have now tried again on https://usegalaxy.eu/
On the European site FastTree worked, both with the new analysis and with the files produced on https://usegalaxy.org/.

Thanks for your time and for the service Galaxy provides.
Best, Rebeca.
