VGP decontamination

Hello VGP mentors,

I have been following the VGP assembly pipeline to assemble my newly sequenced genome, which is a big convenience for someone like me who doesn’t have a lot of bioinformatics experience. However, I noticed that the VGP workflow is supposed to include a decontamination step, but it’s not available in the tutorial. Will the tutorial be updated to cover the decontamination step? It would be very helpful for my genome, which naturally contains microbial contamination. Thank you very much!

Best,
Zi Ye

Hello,

Thank you for your feedback!
We don’t have a tutorial for the decontamination step, but you can find the workflow here: Dockstore.
The inputs are simply your assembled genome and the database you want to use to decontaminate your assembly.

Best

Delphine


Thank you very much for your reply and workflow!
Best regards,
Zi Ye

I’m interested in running this decontamination step and want to know if the database needs to be downloaded to your local machine or if it can be remote.

Also, what databases do you use for eukaryotic decon and where might I find them?

Thanks!

Cheers,
Emily

Hi @foreignsand

Good questions!

All of the inputs need to be inside the Galaxy environment where the tools/workflow run. You can add the reference data to your history before runtime, or supply a URL for a deferred download, in which case the file is fetched when the workflow executes.

For practical reasons, it is convenient to download the file once (using either method) and then copy that dataset into other histories for subsequent runs. This saves space (copies of a dataset do not consume extra disk space) and helps everything run a bit faster.

Alternatively, you can download the file on every run and set the workflow to delete intermediate files, so that the space is only used temporarily during each run.
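If you prefer to script the "download once, then copy into other histories" approach rather than use the web interface, here is a minimal sketch using the BioBlend Python library. Treat it as an illustration under assumptions, not the VGP team's method: the server URL, API key, database URL, and history names are placeholders, and the helper function names are made up for this example. Note that `put_url` uploads the file right away; it does not set up a deferred download.

```python
def fetch_database_once(gi, db_url, history_name="decontamination-db"):
    """Create a history and upload the database into it from a URL.

    `gi` is expected to be a bioblend.galaxy.GalaxyInstance.
    Returns (history_id, dataset_id) of the uploaded database.
    """
    history = gi.histories.create_history(name=history_name)
    upload = gi.tools.put_url(db_url, history["id"])
    return history["id"], upload["outputs"][0]["id"]


def reuse_database(gi, dataset_id, run_name):
    """Create a fresh history for one run and copy the existing database into it.

    Returns (run_history_id, copied_dataset_id).
    """
    run_history = gi.histories.create_history(name=run_name)
    copied = gi.histories.copy_dataset(run_history["id"], dataset_id, source="hda")
    return run_history["id"], copied["id"]


def connect(galaxy_url, api_key):
    """Open a connection to a Galaxy server (requires `pip install bioblend`)."""
    # Imported here so the helpers above can be read without BioBlend installed.
    from bioblend.galaxy import GalaxyInstance
    return GalaxyInstance(url=galaxy_url, key=api_key)


# Example usage (all values below are placeholders, replace with your own):
# gi = connect("https://your.galaxy.server", "YOUR_API_KEY")
# _, db_id = fetch_database_once(gi, "https://example.org/your_database.tar.gz")
# run_history_id, db_copy_id = reuse_database(gi, db_id, "decontamination-run-2")
```

The copied dataset can then be selected as the database input when you run the decontamination workflow in the new history.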

There is a guide from the Galaxy Australia team that can help with the database choice: Galaxy Australia Media

Hope this helps! 🙂