I wonder where I can find the version, the latest update time, and the source of the built-in genomes. For instance, is the TAIR10 genome the same as the Ensembl TAIR10 (05-Mar-2021), or is it the older version of TAIR10 (2019-07-11) deposited on the TAIR FTP? Many thanks.
Indexes that were created manually often have README files. The others record their origin in other ways, since the database/dbkey is usually enough to identify the source and the genome assembly/build.
The following files contain the fasta-formatted complete sequences of the 5 Arabidopsis chromosomes:
TAIR10_chr1.fas
TAIR10_chr2.fas
TAIR10_chr3.fas
TAIR10_chr4.fas
TAIR10_chr5.fas
Chloroplast chromosome:
TAIR10_ChrC.fas
Mitochondria chromosome:
TAIR10_ChrM.fas
These files provide details of the genome assembly updates:
TAIR8_Assembly_updates.xls
TAIR9_Assembly_updates.xls
Please note that assembly changes in TAIR8 only consisted of substitutions while TAIR9 assembly changes also included insertions and deletions. Therefore, coordinates of most genes changed from TAIR8 to TAIR9.
In TAIR10, no assembly updates were made.
Hope that helps. If you want to use a different or more current assembly than what is natively indexed, a custom genome → build can be used. Give the custom data a distinct database/dbkey, or expect problems with tools due to conflicts with the built-in indexes.
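If you want to check for yourself whether two TAIR10 downloads hold the same sequence, one option is to compare per-sequence checksums. This is only a minimal sketch, not something Galaxy provides: the function names `fasta_checksums` and `same_sequences` are made up for illustration. It assumes plain (uncompressed) FASTA input, ignores line wrapping and soft-masking (everything is uppercased), and compares sequences by content rather than by name, since chromosome naming can differ between sources.

```python
import hashlib


def fasta_checksums(lines):
    """Map each sequence name to (length, MD5 of the uppercased sequence)."""
    sums = {}
    name, digest, length = None, None, 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                sums[name] = (length, digest.hexdigest())
            name = line[1:].split()[0]  # first word of the header line
            digest, length = hashlib.md5(), 0
        elif name is not None:
            seq = line.upper()  # treat soft-masked bases as identical
            digest.update(seq.encode("ascii"))
            length += len(seq)
    if name is not None:
        sums[name] = (length, digest.hexdigest())
    return sums


def same_sequences(sums_a, sums_b):
    """True if both inputs hold the same set of sequences, ignoring names."""
    return sorted(sums_a.values()) == sorted(sums_b.values())
```

To use it, open each FASTA file and pass the file object to `fasta_checksums`, then hand the two results to `same_sequences`. Printing the two dictionaries side by side also shows which chromosome differs if the result is `False`.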