Hi, I’m new to Galaxy. I was trying to follow the tutorial using my own data, but I ran into an issue at the very beginning. I downloaded these two files from Ensembl: Mus_musculus.GRCm39.dna.toplevel.fa and Mus_musculus.GRCm39.111.chr.gff3. But when I ran the GTF2GeneList tool as described in the tutorial, I received this error message:
Fatal error: Exit code 1 ()
Warning message:
In .local(con, format, text, …) :
gff-version directive indicates version is 3, not 2
Error in eval(quote(list(…)), env) : object ‘first_field’ not found
Calls: die … cat → paste → standardGeneric → eval → eval → eval
Execution halted
Hi @nahiznan
The error message indicates the file might be in GFF3 format. What datatype (format) do you see for this dataset in Galaxy? Click the dataset name and check the value next to ‘format’. Does it say GFF, GFF3 or GTF? If it says GTF (=GFF2), change it to GFF3 via Edit attributes (pencil icon).
Alternatively, upload a GTF file to Galaxy instead of a GFF file.
Kind regards,
Igor
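For anyone who wants to check a downloaded annotation file before uploading it, here is a minimal Python sketch. It reads the `##gff-version` directive if there is one and otherwise looks at the style of the ninth column. The function name `guess_annotation_format` is made up for this example, and the check is only a heuristic, not what Galaxy itself does when it assigns a datatype.

```python
import io


def guess_annotation_format(handle):
    """Return 'gff3', 'gtf' or 'unknown' for an open text handle.

    Heuristic only: trust a ##gff-version directive when present,
    otherwise inspect the attribute column of the first feature line.
    """
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith("##gff-version"):
            parts = line.split()
            version = parts[1] if len(parts) > 1 else ""
            if version.startswith("3"):
                return "gff3"
            if version.startswith("2"):
                # Assumption: a version 2 file is treated as GTF here.
                return "gtf"
            continue
        if line.startswith("#"):
            continue  # other comment or directive lines
        fields = line.split("\t")
        if len(fields) != 9:
            return "unknown"
        attributes = fields[8]
        if 'gene_id "' in attributes:
            return "gtf"   # GTF style: key "value";
        if "=" in attributes:
            return "gff3"  # GFF3 style: key=value
        return "unknown"
    return "unknown"


# Small in-memory example; replace with open("your_file.gff3") for a real file.
example = io.StringIO("##gff-version 3\n1\tensembl\tgene\t1\t10\t.\t+\t.\tID=gene:X\n")
print(guess_annotation_format(example))
```

If the result disagrees with the format shown in Galaxy, the datatype in Edit attributes is the thing to look at first.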
Hello, @igor,
Thank you for your prompt reply. I checked the format, and it was GFF3. However, I noticed that in the tutorial, the FASTA file was a cDNA file, and mine is a genomic DNA FASTA. Would that be a problem?
Should I use a different tool to generate the gene map?
Hi @nahiznan
It seems the issue is the data format. The tool expects GFF2 (=GTF). You could convert the GFF3 file to GTF using gffread or another appropriate tool. However, the tutorial requires a file with transcripts, not genomic DNA.
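If you do the conversion on your own computer rather than in Galaxy, it could look roughly like the sketch below. It assumes gffread is installed and on your PATH; `-T` asks gffread for GTF output and `-o` names the output file. The helper names and the output filename are made up for this example.

```python
import shutil
import subprocess


def build_gffread_command(gff3_path, gtf_path):
    """Build the gffread call that writes GTF (-T) to gtf_path (-o)."""
    return ["gffread", gff3_path, "-T", "-o", gtf_path]


def convert_gff3_to_gtf(gff3_path, gtf_path):
    """Run gffread; raises if it is missing or exits with an error."""
    if shutil.which("gffread") is None:
        raise RuntimeError("gffread was not found on PATH")
    subprocess.run(build_gffread_command(gff3_path, gtf_path), check=True)


# Example call (hypothetical output name), uncomment to run:
# convert_gff3_to_gtf("Mus_musculus.GRCm39.111.chr.gff3", "Mus_musculus.GRCm39.111.chr.gtf")
print(" ".join(build_gffread_command("input.gff3", "output.gtf")))
```

Many Galaxy servers also offer gffread as a tool, in which case nothing needs to be run locally.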
Out of memory error: the amount of memory requested by the job exceeded the allocation. Admins can increase the memory allocation. Which Galaxy server do you use? However, I suspect the out-of-memory error was caused by using a genome assembly instead of a transcriptome (the FASTA file you mentioned in your reply). The best source of mouse data is the UCSC Genome Browser (download section). Try a transcriptome file before contacting the server admins.
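A quick way to tell whether a FASTA file is a genome assembly or a transcriptome is to count its records and look at their lengths. A genome file usually has relatively few, very long records (chromosomes and scaffolds), while a cDNA file has many much shorter ones. This is a rough sketch with a made-up function name, and it deliberately sets no threshold; judging the numbers is left to you.

```python
import io


def summarize_fasta(handle):
    """Count records and bases in a FASTA text handle."""
    records = 0
    total_bases = 0
    longest = 0
    current = 0
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if records:
                longest = max(longest, current)
            records += 1
            current = 0
        elif line and records:
            current += len(line)
            total_bases += len(line)
    longest = max(longest, current)  # account for the last record
    return {"records": records, "total_bases": total_bases, "longest": longest}


# Small in-memory example; replace with open("your_file.fa") for a real file.
print(summarize_fasta(io.StringIO(">a\nACGT\nAC\n>b\nGG\n")))
```

Reading line by line keeps memory use low, so this should also work on a full genome file.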