GTDB-Tk use and binning contigs

Jon_Colman · February 7, 2025, 7:52pm

I would like to take my contigs from metaSPAdes and run them through GTDB-Tk. What binning program should I use to prepare the input for GTDB-Tk? I have WGS reads from which I’m trying to identify Mycobacterium species. After pre-processing the paired reads, I mapped them against a large Mycobacterium database with Minimap and then assembled with metaSPAdes.

Thanks

jennaj · February 11, 2025, 8:32pm

Hi @Jon_Colman

The authors didn’t specify a preferred method in the primary publication (or I missed it!).

https://academic.oup.com/bioinformatics/article/36/6/1925/5626182

We have a Galaxy tutorial with an example, plus a short description of what the other binners do. Maybe try a few and compare the results? You might notice that one works better than the others for your species/data.

Hands-on: Binning of metagenomic sequencing data (Microbiome)

And finally, maybe try to find a publication that focuses on your species domain to learn what challenges others have identified and how those were accommodated. You could also reach out at the Microbiome chat at the top of the tutorial listing to ask the Galaxy scientists working in this domain if they know of a preference (someone may have a specialization, or can refer you to the best scientific forum). Actually, I’m going to cross-post this topic over there to get this started, but feel free to join that chat directly, too!

You're invited to talk on Matrix

Hope this helps!
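
For anyone wondering why binning comes before GTDB-Tk at all: GTDB-Tk classifies genomes, so each bin goes in as its own FASTA file rather than as one file of mixed contigs. On the command line, one input mode of `gtdbtk classify_wf` is a `--batchfile`, a tab-separated file with a FASTA path and a genome ID per line. Below is a minimal Python sketch that writes such a file from a folder of bins. The folder layout, the `fa` extension, and the function name `write_batchfile` are assumptions for illustration only; in Galaxy the bins are passed to the tool directly and no batchfile is needed.

```python
from pathlib import Path


def write_batchfile(bin_dir, batchfile, extension="fa"):
    """Write a two-column, tab-separated batchfile: FASTA path, genome ID.

    Assumes one FASTA file per bin in bin_dir (hypothetical layout).
    Returns the number of bins listed.
    """
    fasta_paths = sorted(Path(bin_dir).glob(f"*.{extension}"))
    with open(batchfile, "w") as handle:
        for path in fasta_paths:
            # Use the file name without its final extension as the genome ID.
            handle.write(f"{path}\t{path.stem}\n")
    return len(fasta_paths)
```

The resulting file could then be given to something like `gtdbtk classify_wf --batchfile batch.tsv --out_dir gtdbtk_out`. Depending on the GTDB-Tk version, further options may be required, so check the help text of the installed release.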

paulzierep · February 12, 2025, 12:31pm

Dear @Jon_Colman,
we are currently working on a MAGs workflow (mags-individual-workflow) that uses 4 different binners and DAS Tool for bin refinement. Since we use a consensus approach, we do not rely on finding the best binner for a specific target, but instead retrieve the best MAGs from all binners.
This workflow also includes GTDB-Tk for MAG taxonomy assignment. Maybe you want to try that for your data. In this workflow MEGAHIT is used for assembly, but you could remove this step and use your metaSPAdes assembly as input.
Feel free to reach out if you need any help with the workflow. If it works for you, we would be happy to add your analysis as a use case for the project where this workflow is developed. Best, Paul
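
To make the consensus idea a bit more concrete, here is a toy Python sketch of picking bins across several binners. It is not DAS Tool's scoring function or implementation, only an illustration of the general principle: score every candidate bin by its single-copy marker genes, keep the best one, remove its contigs from the remaining candidates, and repeat. The scoring formula, the `min_score` cutoff, and all bin and contig names are assumptions made up for the example.

```python
from collections import Counter


def score_bin(contigs, contig_markers, duplicate_penalty=1.0):
    """Toy score: distinct marker genes found, minus a penalty per extra copy."""
    counts = Counter(m for c in contigs for m in contig_markers.get(c, ()))
    distinct = len(counts)
    extra_copies = sum(counts.values()) - distinct
    return distinct - duplicate_penalty * extra_copies


def pick_consensus_bins(candidate_bins, contig_markers, min_score=1.0):
    """Greedily keep the best-scoring bin, then drop its contigs from the rest."""
    remaining = {name: set(contigs) for name, contigs in candidate_bins.items()}
    selected = []
    while remaining:
        best = max(remaining, key=lambda n: score_bin(remaining[n], contig_markers))
        if score_bin(remaining[best], contig_markers) < min_score:
            break
        chosen = remaining.pop(best)
        selected.append((best, chosen))
        # A contig may end up in only one final bin; empty leftovers are dropped.
        remaining = {n: c - chosen for n, c in remaining.items() if c - chosen}
    return selected


# Made-up toy data: marker genes carried by each contig.
contig_markers = {"c1": ["rpoB"], "c2": ["gyrA"], "c3": ["rpoB"], "c4": ["recA"]}
# Made-up candidate bins from two hypothetical binners.
candidate_bins = {
    "binnerA_1": {"c1", "c2"},
    "binnerB_1": {"c1", "c2", "c3"},  # carries rpoB twice, so it is penalised
    "binnerB_2": {"c4"},
    "binnerA_2": {"c3", "c4"},
}
consensus = pick_consensus_bins(candidate_bins, contig_markers)
```

In this toy input the over-merged `binnerB_1` loses out because it carries a duplicated marker, which is the kind of case where combining binners helps over trusting a single one.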

Jon_Colman · February 12, 2025, 9:30pm

This looks interesting! So it looks like I’m taking a paired set of reads after trimming and adapter removal, and using that as input? Which should eventually give me the GTDB-Tk taxonomy?

This is what I’m currently doing; let me know if this makes sense.

I have some whole blood samples I’m working with that include numerous bacteria, including Mycobacterium, as well as Plasmodium. From what I can tell, I may have a couple of species of Mycobacterium that aren’t in the Core-nt database (marked as minor contamination), and the Plasmodium species has a newer reference, done last year, that hasn’t been added to the NCBI references yet. So what I’m doing is taking my cleaned reads, mapping them with bowtie against what “appears” to be in my samples, and then using BBtools: Tadpole to error-correct the mapped reads. From what I have read, I’m assuming Tadpole works well before assembly?
Is it recommended to remove the host reads first, or just run them all together?

paulzierep · February 21, 2025, 1:01pm

The pipeline I shared is based on trimmed reads. We usually use fastp for trimming and adapter removal, but this is usually a user choice.
However, I would definitely remove host reads, e.g. using bowtie2, since host reads can affect the assembly significantly.
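
As a rough illustration of the host-removal step outside Galaxy, the Python sketch below only builds a bowtie2 command line; it does not run anything. The index name, file names, and thread count are placeholders, and the function name `host_removal_command` is made up for the example. It relies on bowtie2's `--un-conc-gz` option, which writes the read pairs that did not align concordantly to the host reference.

```python
import shlex


def host_removal_command(host_index, reads_1, reads_2, out_pattern, threads=4):
    """Build a bowtie2 call that keeps read pairs not aligning concordantly to the host."""
    return [
        "bowtie2",
        "-x", host_index,             # prefix of a prebuilt host index (placeholder)
        "-1", reads_1,                # trimmed forward reads
        "-2", reads_2,                # trimmed reverse reads
        "-p", str(threads),
        "--un-conc-gz", out_pattern,  # '%' becomes 1 and 2 for the two mates
        "-S", "/dev/null",            # the host alignments themselves are discarded
    ]


# Placeholder names, not taken from the thread.
cmd = host_removal_command(
    "host_index", "sample_R1.fastq.gz", "sample_R2.fastq.gz",
    "sample_nonhost_R%.fastq.gz",
)
print(shlex.join(cmd))
```

Note that pairs kept this way can still contain a single host-derived mate. A stricter filter would keep only pairs where both mates are unaligned, which is a design choice rather than something prescribed here.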
