Does anyone have any pointers for indexing a reference sequence for use with GATK tools? I have indexed reference sequences that I use for BWA and minimap2, but I can’t get the reference sequences to appear with GATK. I receive an error when I use the “GATK-sorted picard indexes builder”.
For native genome FASTA files to be accessible to this tool (option: Choose the source for the reference list), adding the genome with the fetch Data Manager (DM) is probably enough. To make new genomes really useful, though, there is a short list of core recommended indexes, and all of them have Data Managers. You can certainly run more DMs after those (GATK4 Mutect2 won’t need more, but other tools can). See the topic below for help. The same process applies to every genome you plan to index, not just the one referenced in that particular Q&A.
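If you also have the FASTA on disk outside Galaxy, a quick sanity check is to see whether the two companion files GATK conventionally looks for sit next to it: a `.fai` index and a `.dict` sequence dictionary. This is only a rough sketch. The function names and the `genome.fa` path are made up for illustration, and it assumes an uncompressed FASTA with a single extension such as `.fa` or `.fasta`.

```python
from pathlib import Path


def gatk_companion_files(fasta):
    """Return the (.fai, .dict) paths conventionally expected beside a FASTA.

    Hypothetical helper, assuming the usual naming convention:
    genome.fa -> genome.fa.fai and genome.dict
    """
    fasta = Path(fasta)
    # The FASTA index keeps the full file name and appends ".fai".
    fai = fasta.with_name(fasta.name + ".fai")
    # The sequence dictionary replaces the FASTA extension with ".dict".
    seq_dict = fasta.with_suffix(".dict")
    return fai, seq_dict


def missing_companions(fasta):
    """List the companion files that do not exist yet, as strings."""
    return [str(p) for p in gatk_companion_files(fasta) if not p.is_file()]


if __name__ == "__main__":
    # "genome.fa" is a placeholder path, not a file from this thread.
    for path in missing_companions("genome.fa"):
        print("missing:", path)
```

Outside Galaxy, those two files are usually made with `samtools faidx` and `CreateSequenceDictionary` (Picard or GATK). Inside Galaxy, the Data Managers build the equivalent indexes for you, so this check only helps when you are working with the files directly.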
I am using GATK4 Mutect2 with a cached reference selected, and every time I get the error “A USER ERROR has occurred: Argument reference was missing: Argument ‘reference’ is required.” The command line shows no reference passed to the -R argument.
Everyone does some version of this kind of mixup. Mismatched inputs are one of the first things to check, along with format, whenever errors come up. They are often much easier to spot in other people’s work than in your own.