I am trying to map some paired, trimmed reads to the Brachypodium distachyon Bd21-3 genome. I’ve downloaded two genomes and their associated annotation files, from Ensembl Plants and Phytozome. When I try to map my reads to either genome, I get errors on both attempts.
For the first attempt I think the genome and annotation files are not in a matching format? I’m not sure about the error for the second attempt.
The error message from one set of RNA_STAR jobs is the following:
Fatal INPUT FILE error, no exon lines in the GTF file
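As far as I know, STAR by default only uses annotation lines whose third column is exactly "exon", so a quick tally of the feature types in your annotation file can show whether any exist. Below is a minimal Python sketch of that check; the function name and the demo records are made up for illustration, and with a real file you would pass `open("your_annotation.gtf")` instead of the demo text.

```python
from collections import Counter
import io

def count_feature_types(lines):
    """Count the values in column 3 (feature type) of GTF/GFF lines."""
    counts = Counter()
    for line in lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed 9-column annotation line
        counts[fields[2]] += 1
    return counts

# Tiny made-up example, not real Brachypodium annotation.
demo = io.StringIO(
    "#!genome-build demo\n"
    "Bd1\tsrc\tgene\t1\t100\t.\t+\t.\tgene_id \"g1\";\n"
    "Bd1\tsrc\tCDS\t10\t90\t.\t+\t0\tgene_id \"g1\";\n"
)
counts = count_feature_types(demo)
print(dict(counts))
print("has exon lines:", counts["exon"] > 0)
```

If the tally shows no "exon" entries, or the file turns out to be GFF3 rather than GTF, that would be consistent with the error above.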
Log files from the second set of jobs point to truncated read files:
EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length
@LH00406:315:233MVCLT3:4:2298:28005:18406
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
SOLUTION: fix your fastq file
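To find which records trigger this, you can scan the FASTQ file and report every read whose sequence and quality lines differ in length. This is a rough Python sketch, assuming plain four-line FASTQ records; the function name and the demo reads are hypothetical. For a compressed file, `gzip.open(path, "rt")` can be used in place of `open(path)`.

```python
import io

def find_bad_records(handle):
    """Yield (record_number, read_id, seq_len, qual_len) for records
    whose sequence and quality strings differ in length."""
    record = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        seq = handle.readline().rstrip("\r\n")
        handle.readline()  # the '+' separator line
        qual = handle.readline().rstrip("\r\n")
        record += 1
        if len(seq) != len(qual):
            yield record, header.strip(), len(seq), len(qual)

# Made-up demo: the second record has a truncated quality string.
demo = io.StringIO(
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nACGT\n+\nII\n"
)
bad = list(find_bad_records(demo))
for rec in bad:
    print(rec)
```

An empty result on the untrimmed files and hits on the trimmed ones would suggest the trimming step introduced the problem.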
Try it on a single sample first. I also recommend HISAT2 over RNA_STAR, but that is up to you. Both HISAT2 and RNA_STAR can map reads to a genome without a gene annotation file, so providing one is optional as well.
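It seems you trimmed the paired-end data as single-end. This most likely destroyed the pairing: when each file is trimmed on its own, a read can be discarded from one file while its mate stays in the other. Trim the reads in paired-end mode and, again, try the whole process on a single sample.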
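One way to confirm whether the pairing is broken is to walk through the forward and reverse files together and compare read names in order. The sketch below assumes four-line FASTQ records and read names that match up to the first space (with an optional trailing /1 or /2); the function names and demo reads are made up for illustration.

```python
import io
from itertools import zip_longest

def read_ids(handle):
    """Yield the read name from every header line (every 4th line)."""
    for i, line in enumerate(handle):
        if i % 4 == 0:
            parts = line.split()
            name = parts[0] if parts else ""
            # Drop an old-style /1 or /2 mate suffix if present.
            if name.endswith("/1") or name.endswith("/2"):
                name = name[:-2]
            yield name

def first_mismatch(r1, r2):
    """Return (record_number, id_in_r1, id_in_r2) for the first pair
    that is out of sync, or None if both files match throughout."""
    pairs = zip_longest(read_ids(r1), read_ids(r2))
    for n, (a, b) in enumerate(pairs, start=1):
        if a != b:
            return n, a, b
    return None

# Made-up demo: read "b" was dropped from the reverse file only.
r1 = io.StringIO("@a 1:N:0:1\nACGT\n+\nIIII\n"
                 "@b 1:N:0:1\nACGT\n+\nIIII\n"
                 "@c 1:N:0:1\nACGT\n+\nIIII\n")
r2 = io.StringIO("@a 2:N:0:1\nACGT\n+\nIIII\n"
                 "@c 2:N:0:1\nACGT\n+\nIIII\n")
result = first_mismatch(r1, r2)
print(result)
```

A result of None means the two files are in sync; anything else gives the first record where they diverge.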