Hello,
I’m getting the following error when I run RNA-STAR on my paired-end data: “Metadata generation failed”. The R1 and R2 fastq.gz files are quite large (33 GB each). RNA-STAR runs for about 4 hours but doesn’t produce the mapped.bam files. Please let me know if you have any suggestions to troubleshoot this.
Thank you!
Hi @nakheay
This is a Python error.
The size of the read files could be a contributing factor.
Where to start:
- Try at least one rerun for any failure.
- Have you run any QA on the reads before mapping? If not, that would be a good place to start. Once the QA steps are done, consider running this tool to confirm the final, pre-mapping read content/format QC: FASTQ info (validates single or paired fastq files).
- Are you incorporating a custom reference genome (fasta from the history)? If so, the fasta might need format standardization. faqs/galaxy/#working-with-very-large-fasta-datasets
- Are you incorporating reference annotation (GTF, GFF3)? This might also need format standardization, plus confirmation that the identifiers in the annotation match those in the reference genome (whether server indexed or custom genome). faqs/galaxy/#working-with-gff-gft-gtf2-gff3-reference-annotation
- If all of the above checks out, you could consider downsampling (subsampling) the reads. Tools in the Seqtk group could be used to create smaller yet representative read inputs.
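If you have local copies of a custom genome and its annotation, the identifier check mentioned above can be roughly sketched in plain Python. This is only an illustration, not a Galaxy tool: the function names are made up for this example, and it assumes the annotation is tab-delimited with the sequence name in the first column, and that the fasta identifier is the first whitespace-delimited word after ">".

```python
import gzip
from pathlib import Path


def _open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def fasta_ids(fasta_path):
    """Collect sequence identifiers: the first word after '>' on each title line."""
    ids = set()
    with _open_text(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                tokens = line[1:].split()
                if tokens:
                    ids.add(tokens[0])
    return ids


def annotation_ids(annotation_path):
    """Collect the distinct values found in column 1 of a GTF/GFF3 file."""
    ids = set()
    with _open_text(annotation_path) as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # GFF3 files may embed sequences after this directive
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines and comment/header lines
            # Lines without tabs (e.g. stray "track" lines) are kept whole,
            # so they show up as unmatched and can be cleaned out.
            ids.add(line.split("\t", 1)[0].strip())
    return ids


def unmatched_annotation_ids(fasta_path, annotation_path):
    """Return annotation sequence names that are absent from the fasta."""
    return sorted(annotation_ids(annotation_path) - fasta_ids(fasta_path))
```

For example, calling `unmatched_annotation_ids("genome.fa", "annotation.gtf")` (hypothetical file names) returns an empty list when every annotation line refers to a sequence present in the fasta. A result such as `["1", "2", "MT"]` against a genome that uses `chr1`, `chr2`, `chrM` would point to a naming mismatch between the two files.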
Please give those a try. If you need more help, create a share link to the working history and post it back here. Leave all inputs and outputs undeleted, for both mapping and QA steps. faqs/galaxy/#sharing-your-history