In the Galaxy version of the MAKER genome annotation tool, the Repeat Masking section lets you choose a repeat library source.
The listed options are:
DFam (full version)
Custom library of repeats
On the command line, MAKER's control file (maker_opts.ctl) has these Repeat Masking options:
model_org=all #select a model organism for RepBase masking in RepeatMasker
rmlib= #provide an organism specific repeat library in fasta format for RepeatMasker
repeat_protein= #provide a fasta file of transposable element proteins for RepeatRunner
rm_gff= #pre-identified repeat elements from an external GFF3 file
prok_rm=0 #forces MAKER to repeatmask prokaryotes (no reason to change this), 1 = yes, 0 = no
softmask=1 #use soft-masking rather than hard-masking in BLAST (i.e. seg and dust filtering)
How would I use a Dfam database with the command-line version?
For example, on Galaxy I use the Dfam curated database for Arthropoda, which I select by typing "arthropoda" as the repeat source species.
To reproduce this on the command line, do I need to run RepeatMasker separately with the database I want and then feed the result into MAKER?
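To make the two alternatives concrete, here is a sketch of what I imagine the relevant maker_opts.ctl lines might look like. Both variants are guesses on my part. The first assumes that model_org is handed to RepeatMasker as its species name and that the RepeatMasker installation MAKER points to was configured with Dfam libraries. The second assumes a separate RepeatMasker run whose output has been converted to GFF3; the file name arthropoda_repeats.gff3 is hypothetical.

```
# Variant A (assumption): MAKER calls RepeatMasker itself with a clade name
model_org=arthropoda
rmlib=
rm_gff=

# Variant B (assumption): repeats were identified in a separate run
# and are passed in as GFF3 (hypothetical file name)
model_org=
rmlib=
rm_gff=arthropoda_repeats.gff3
```

Only one of the two variants would go into the control file at a time. Which of them, if either, matches what the Galaxy tool does is exactly what I am unsure about.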