I’m new to Galaxy. Is it possible to use a Galaxy tool to edit individual VCF files in order to insert variants of interest WITHOUT having command line knowledge?
This is for educational purposes. We want to artificially create interesting cases but make them look like real data, by taking publicly available whole genome/exome datasets and inserting specific variants into them. We can then process the files through a commercial platform for analysis by students.
Other than working with a text editor (messy for big data) or tools that need the command line, I haven’t yet found a way of doing this.
If you are completely new to Galaxy, this is a tutorial that will be worth your time. It covers some of the basic functions, plus some more data manipulation examples.
Using one of the “find and replace” functions is probably the best choice for what you want to do. The Olympic tutorials have examples, and if you search with the keyword “replace” in the tool panel, you will find more options. Each tool works a bit differently, so testing them on a small file is the best way to learn the differences.
Galaxy also has a regular text editor. How well that works depends on how large your file is, and it would be very easy to introduce a problem! But you can experiment.
To find the text editor: click on the visualize icon within an expanded dataset, then search for “editor” in the listing.
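Whichever route you take (find and replace, or the text editor), the edit has to produce a well-formed, tab-separated record in the right place in the file. In case a small script ever becomes an option, here is a minimal Python sketch of that same edit. It is not a Galaxy tool: the function name `insert_variant` and the example data are made up for illustration, it assumes the file is sorted by position within each chromosome and that `GT` is already defined in the header, and it does not check that REF matches the reference genome.

```python
def insert_variant(vcf_text, chrom, pos, ref, alt, genotype="0/1"):
    """Return the VCF text with one new record added (hypothetical helper)."""
    lines = vcf_text.splitlines()

    # The "#CHROM" header line tells us how many columns every record needs.
    header_idx = next(i for i, line in enumerate(lines) if line.startswith("#CHROM"))
    n_cols = len(lines[header_idx].split("\t"))

    # Fixed VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO
    record = [chrom, str(pos), ".", ref, alt, ".", "PASS", "."]
    if n_cols > 8:
        # Files with sample columns also need FORMAT plus one value per sample.
        record.append("GT")
        record.extend([genotype] * (n_cols - 9))
    new_line = "\t".join(record)

    # Place the record before the first same-chromosome record with a larger
    # position; otherwise after the last record on that chromosome; otherwise
    # at the end of the file.
    insert_at = len(lines)
    last_same_chrom = None
    for i in range(header_idx + 1, len(lines)):
        fields = lines[i].split("\t")
        if fields[0] != chrom:
            continue
        last_same_chrom = i
        if int(fields[1]) > pos:
            insert_at = i
            break
    else:
        if last_same_chrom is not None:
            insert_at = last_same_chrom + 1

    lines.insert(insert_at, new_line)
    return "\n".join(lines) + "\n"


# Invented two-record example, for illustration only.
example = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE1\n"
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\n"
    "chr1\t300\t.\tC\tT\t50\tPASS\t.\tGT\t1/1\n"
)
edited = insert_variant(example, "chr1", 200, "G", "A")
```

The point of the sketch is mostly to show what a hand edit has to get right: tabs rather than spaces, the same number of columns as the `#CHROM` line, and a position that keeps the file sorted.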
The idea from @gbbio is another alternative: create a VCF using data from a different species. For example, cross-species mappings will probably result in new variants you can use as test/example data. These would have more scientific relevance and full data provenance, since the read data will back up the VCF data. You can do all of this in Galaxy, too. Please see our Variant tutorials for examples.