Workflow specialized for eukaryotes using metagenomes

Hello everyone,

I’m really sorry to have to ask, as I don’t want to just offload my problems, but I’m currently stuck.

I’m writing my master’s thesis and wanted to prepare some data for it.

I’m actually writing a purely literature-based paper on the influence of tire abrasion on eukaryotic microbes. I have now found a dataset in NCBI that uses metagenomics to study precisely this topic, but it only evaluates prokaryotes. I have started trying to evaluate the eukaryotic microbes myself. However, I have several problems:

1. I have doubts about my workflow and cannot find a suitable Kraken2 database that delivers good results. My workflow so far has been: Download and Extract Reads in FASTQ, fastp, Filter with SortMeRNA, and Kraken2. I am unsure whether this is suitable for a statistical evaluation of diversity.

2. I have been looking for a better way, but I cannot find a tool that removes all prokaryotic reads up front, which would reduce the amount of data, or another tool that is well suited to eukaryotes.

3. I am unsure whether the samples should first be subsampled to equal size for a scientifically sound evaluation of alpha and beta diversity in R.

I apologize for my novice questions, but I am only studying to become a teacher and am still unfamiliar with working with the tools and data.

EDIT: Or is it perhaps even possible to integrate a tool such as EukDetect?

Hi @Marius_Sanders,

Can you filter out prokaryotic reads using Kraken2 with the standard database or mini-d? Enable the "Split classified and unclassified outputs" option. You are after the unclassified reads.
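To illustrate what that split does, here is a minimal Python sketch, assuming the default tab-separated per-read Kraken2 output (first column `C` or `U`, second column the read ID) and single-end reads. The function names and the demo reads are hypothetical. In Galaxy the option above does this for you, and on the command line `kraken2 --unclassified-out` writes the same reads directly, so this is only meant to show the logic.

```python
def unclassified_ids(kraken_output_lines):
    """Collect IDs of reads that Kraken2 marked 'U' (unclassified)."""
    ids = set()
    for line in kraken_output_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and fields[0] == "U":
            ids.add(fields[1])
    return ids


def filter_fastq(fastq_lines, keep_ids):
    """Yield 4-line FASTQ records whose read ID is in keep_ids."""
    record = []
    for line in fastq_lines:
        record.append(line)
        if len(record) == 4:
            # Header looks like "@read_id optional description"
            read_id = record[0][1:].split()[0]
            if read_id in keep_ids:
                yield "".join(record)
            record = []


# Made-up demo data: one classified read, two unclassified reads.
kraken_demo = [
    "C\tread1\t562\t150\t562:116\n",
    "U\tread2\t0\t150\t0:116\n",
    "U\tread3\t0\t150\t0:116\n",
]
fastq_demo = [
    "@read1\n", "ACGT\n", "+\n", "IIII\n",
    "@read2 extra\n", "TTGA\n", "+\n", "IIII\n",
    "@read3\n", "GGCC\n", "+\n", "IIII\n",
]

keep = unclassified_ids(kraken_demo)
kept_records = list(filter_fastq(fastq_demo, keep))
print(f"kept {len(kept_records)} of {len(fastq_demo) // 4} reads")
```

Keep in mind that unclassified reads are simply everything the database did not recognize, so they will include eukaryotes but also any other unknown sequences.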

Description of Kraken2 databases: Index zone by BenLangmead

Kraken2’s PlusPF database contains Standard plus protozoa and fungi. It is available in Europe, including two mini-versions, PlusPF-8 and PlusPF-16. Maybe this one can work for you directly, without read filtering. Try the mini-versions of PlusPF: they cover the same species but keep fewer k-mers, and the results should be consistent with the corresponding full-sized database.
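If you want a quick check of how many reads a PlusPF run places under Eukaryota, the Kraken2 report can be parsed in a few lines. This is a rough sketch, assuming the default six-column tab-separated report (percentage, clade reads, direct reads, rank code, NCBI taxid, name). The function name and all numbers in the demo report are made up for illustration.

```python
def clade_reads(report_lines, taxid):
    """Return the clade read count for `taxid` in a Kraken2 report, or 0 if absent."""
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 6 and fields[4].strip() == str(taxid):
            return int(fields[1])
    return 0


# Made-up report lines in the default Kraken2 report layout.
report_demo = [
    " 40.00\t400\t400\tU\t0\tunclassified\n",
    " 60.00\t600\t5\tR\t1\troot\n",
    " 45.00\t450\t10\tD\t2\t    Bacteria\n",
    " 12.00\t120\t3\tD\t2759\t    Eukaryota\n",
    "  8.00\t80\t80\tK\t4751\t      Fungi\n",
]

EUKARYOTA_TAXID = 2759  # NCBI taxonomy ID for Eukaryota

euk = clade_reads(report_demo, EUKARYOTA_TAXID)
# Total reads = unclassified (taxid 0) + everything under root (taxid 1)
total = clade_reads(report_demo, 0) + clade_reads(report_demo, 1)
print(f"{euk} of {total} reads ({euk / total:.1%}) fall under Eukaryota")
```

Comparing this fraction between a mini-version and the full database on one sample is one way to see whether the smaller index is good enough for your data.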

It is OK to have an account on several public Galaxy servers, as long as it is one account per user per server.

Maybe someone else will comment on other questions.

Kind regards,

Igor

Hi @Marius_Sanders

Have you seen our Metagenomics tutorials at the GTN training site?

Start here → 🎓 Microbiome / Tutorial List

From there, I wonder if this pathway might be of interest. It explores diversity with a bit of a wider net. → Learning Pathway: Introduction to Galaxy and Ecological data analysis
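On question 3 about subsamples of equal size: this step is usually called rarefying, and whether to do it at all is debated, so treat the following only as a sketch of what the step does, not as a recommendation. It is plain Python with hypothetical taxon names and made-up counts. It draws every sample down to the depth of the smallest one, then computes Shannon diversity (alpha) and Bray-Curtis dissimilarity (beta).

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Randomly draw `depth` reads without replacement from a taxon -> count dict."""
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    return dict(Counter(rng.sample(pool, depth)))


def shannon(counts):
    """Shannon index H' (natural log) of a taxon -> count dict."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two taxon -> count dicts."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0), b.get(t, 0)) for t in taxa)
    return 1 - 2 * shared / (sum(a.values()) + sum(b.values()))


# Made-up counts: sample_a was sequenced ten times deeper than sample_b.
sample_a = {"Fungi_sp1": 500, "Ciliate_sp1": 300, "Diatom_sp1": 200}
sample_b = {"Fungi_sp1": 50, "Ciliate_sp1": 20, "Amoeba_sp1": 30}

depth = min(sum(sample_a.values()), sum(sample_b.values()))
rar_a = rarefy(sample_a, depth)
rar_b = rarefy(sample_b, depth)

print("Shannon A:", round(shannon(rar_a), 3))
print("Shannon B:", round(shannon(rar_b), 3))
print("Bray-Curtis A vs B:", round(bray_curtis(rar_a, rar_b), 3))
```

In R, the vegan package provides `rrarefy`, `diversity`, and `vegdist` for the same three steps, so you would not need to write this yourself.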

Hope this helps! 🙂