I would like to retrieve all bacterial sequences from a whole-genome dataset. All I have is an assembled FASTA file (contigs, not yet a single long genome sequence). How can I retrieve the bacterial sequences from the contigs I have?
I’m not sure I have understood your question completely. Could you explain a bit more about your goals and what you would like to do?
Meanwhile, I can share some analysis protocols through tutorials that may help to frame the kinds of questions we can answer here.
If you are completely new to Galaxy, this is a good place to start.
I would like to know the bacterial community of my sample. However, my data (shotgun metagenomic reads) has only been assembled into contigs so far (more than 50 contigs per sample, in a FASTA file). Do I need to assemble these contigs further into scaffolds before running the microbiome analysis, or can I retrieve all the bacteria from these contigs directly?
There are a few tools you can use for metagenomic profiling. See that same training site for examples and the different tools you can try.
But in short, Kraken2 is usually a good choice for WGS reads. For amplicon data, there is Mothur. Both are covered in the examples, along with all the other little steps: data preparation, assembly (if any), then result interpretation with graph generation.
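If you end up with a Kraken2 report and want to pull out just the bacterial part of it outside Galaxy, here is a minimal Python sketch. It assumes the standard six-column, tab-separated report (percent, clade reads, direct reads, rank code, NCBI taxid, indented name) and that the bacterial domain row has rank code "D" and the name "Bacteria", which can vary with the database you classified against. The function name `bacterial_taxa`, its parameters, and the sample rows are made up for illustration (real reports have more intermediate ranks).

```python
def bacterial_taxa(report_lines, rank="G", min_percent=1.0):
    """Collect (name, percent, clade_reads) for rows of one rank under Bacteria.

    Assumes the standard six-column Kraken2 report, where the name column
    is indented with spaces to show depth in the taxonomy.
    """
    results = []
    bacteria_depth = None  # indentation of the "Bacteria" row while inside it
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        percent, clade_reads, _direct, rank_code, _taxid, raw_name = fields[:6]
        name = raw_name.strip()
        depth = len(raw_name) - len(raw_name.lstrip(" "))
        # A row at the same or shallower depth means we left the Bacteria subtree.
        if bacteria_depth is not None and depth <= bacteria_depth:
            bacteria_depth = None
        # Assumption: the domain row is labelled rank "D" and named "Bacteria".
        if rank_code == "D" and name == "Bacteria":
            bacteria_depth = depth
            continue
        if (bacteria_depth is not None and rank_code == rank
                and float(percent) >= min_percent):
            results.append((name, float(percent), int(clade_reads)))
    return results


# Made-up, simplified report rows for illustration only.
SAMPLE = (
    " 10.00\t100\t100\tU\t0\tunclassified\n"
    " 90.00\t900\t0\tR\t1\troot\n"
    " 80.00\t800\t10\tD\t2\t  Bacteria\n"
    " 50.00\t500\t500\tG\t561\t    Escherichia\n"
    " 30.00\t290\t290\tG\t1279\t    Staphylococcus\n"
    " 10.00\t100\t0\tD\t2759\t  Eukaryota\n"
    " 10.00\t100\t100\tG\t9605\t    Homo\n"
)

genera = bacterial_taxa(SAMPLE.splitlines())
for name, percent, reads in genera:
    print(f"{name}\t{percent:.2f}%\t{reads} reads")
```

For a real report, open the file and pass the file object in place of `SAMPLE.splitlines()`. Setting `rank="S"` should give species-level rows instead of genera, and `min_percent` is only a crude noise filter that you may want to adjust.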