Running MiRDeep2 with pooled reads from MiRDeep2 Mapper

I’m trying to run MiRDeep2 to identify novel miRNAs in a non-model species. I have successfully mapped my cleaned small RNA reads to my genome with MiRDeep2 Mapper and run each resulting collapsed read file and mapping ARF file, along with my genome, through MiRDeep2. I didn’t supply any known miRNAs or precursors and used the default settings. I have four samples, so I tried pooling them in MiRDeep2 Mapper, which seemed to work fine. However, when I ran the pooled collapsed read file and mapping ARF file through MiRDeep2, it aborted after less than 2 minutes with an error (Fatal error: Exit code 2 ()). This is the run log:

            #Starting miRDeep2
/usr/local/bin/miRDeep2.pl /mnt/user-data-volD/data31/d/9/9/dataset_d997c8d9-e39c-4884-8797-add72e81c3b1.dat /mnt/user-data-volD/data31/7/6/0/dataset_760419bc-8dbb-4a9f-a6b9-437ef1224a2b.dat /mnt/user-data-volD/data31/1/2/1/dataset_1217e5ba-a4fb-47a8-9089-444c32e2c82b.dat none none none -g 50000 -b 0

miRDeep2 started at 9:43:16


mkdir mirdeep_runs/run_18_08_2025_t_09_43_16

#testing input files
sanity_check_reads_ready_file.pl /mnt/user-data-volD/data31/d/9/9/dataset_d997c8d9-e39c-4884-8797-add72e81c3b1.dat

started: 9:43:26

ended: 9:43:37
total:0h:0m:11s

started: 9:43:37
sanity_check_genome.pl /mnt/user-data-volD/data31/7/6/0/dataset_760419bc-8dbb-4a9f-a6b9-437ef1224a2b.dat


ended: 9:43:38
total:0h:0m:1s

started: 9:43:38
sanity_check_mapping_file.pl /mnt/user-data-volD/data31/1/2/1/dataset_1217e5ba-a4fb-47a8-9089-444c32e2c82b.dat


ended: 9:43:41
total:0h:0m:3s

Pre-quantitation is skipped caused by missing file with known precursor miRNAs


#parsing genome mappings
parse_mappings.pl /mnt/user-data-volD/data31/1/2/1/dataset_1217e5ba-a4fb-47a8-9089-444c32e2c82b.dat -a 0 -b 18 -c 25 -i 5 > mirdeep_runs/run_18_08_2025_t_09_43_16/tmp/444c32e2c82b.dat_parsed.arf

started: 9:43:41

ended: 9:43:53
total:0h:0m:12s

#excising precursors
started: 9:43:53
excise_precursors_iterative_final.pl /mnt/user-data-volD/data31/7/6/0/dataset_760419bc-8dbb-4a9f-a6b9-437ef1224a2b.dat mirdeep_runs/run_18_08_2025_t_09_43_16/tmp/444c32e2c82b.dat_parsed.arf mirdeep_runs/run_18_08_2025_t_09_43_16/tmp/precursors.fa mirdeep_runs/run_18_08_2025_t_09_43_16/tmp/precursors.coords 50000
1	160804
2	133722
3	120368
4	111278
5	105234
6	100632
7	97412
8	94936
9	93082
10	91614
11	90340
12	89106
13	88418
14	87812
15	87366
16	86986
17	86680
18	86478
19	86264
20	86090
21	85962
22	85856
23	85768
24	85706
25	85588
26	85524
27	85446
28	85380
29	85278
30	85232
31	85158
32	85100
33	85064
34	85028
35	84982
36	84948
37	84862
38	84820
39	84786
40	84734
41	84646
42	84608
43	84550
44	84538
45	84502
46	84454
47	84420
48	84358
49	84294
No file mirdeep_runs/run_18_08_2025_t_09_43_16/tmp/precursors.fa_stack found

I’ve run this several different times, including with only 2 of my samples pooled, but it keeps aborting here. Any ideas? Thanks in advance!

Hi @jrw025

As a cross check, maybe compare your usage against a run on a public Galaxy server?

The second tool test in that shared history makes use of two pooled samples. Notice how each sample was given an identifier and input in a separate block. Click the pencil icon, then the Details tab, to see the command string.

Shared history → https://usegalaxy.eu/u/jenj/h/example-mirdeep2-mapper

In short, either you are not getting any hits at all, or the tool cannot correctly match up each sample’s query reads to that sample’s mapped hits, and you’ll need to determine which.
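One quick way to test that matching hypothesis is to count how many collapsed-read headers carry each sample prefix, and then check that the same prefixes appear in the ARF file. A minimal sketch in Python, assuming the usual miRDeep2 collapsed-header layout `>SSS_N_xCOUNT`, where `SSS` is the sample code from the config file (the sample codes and sequences below are invented):

```python
from collections import Counter

def sample_prefix_counts(fasta_text):
    """Count collapsed-read headers per sample prefix.

    Assumes miRDeep2-style headers like '>sd1_0_x52', where 'sd1' is
    the sample code assigned in config.txt. If one sample's prefix is
    missing (or all reads share a single prefix), the pooled run cannot
    tie hits back to their samples.
    """
    counts = Counter()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            prefix = line[1:].split("_", 1)[0]
            counts[prefix] += 1
    return dict(counts)

# Invented example data, not from the actual run:
example = """>sd1_0_x52
TACCCTGTAGATCCGAATTTGT
>sd1_1_x10
TACCCTGTAGATCCGAATTTG
>sd2_0_x31
TGAGGTAGTAGGTTGTATAGTT
"""
print(sample_prefix_counts(example))  # {'sd1': 2, 'sd2': 1}
```

If every header in the pooled file starts with the same `seq` prefix instead of per-sample codes, that points to the reads having been concatenated without a config file.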

For command-line help, this forum isn’t the best place to get advice, but to get you started, check the usage manual here; see example 6. The config.txt file is used to organize the pooled samples. This is what the Galaxy tool form is doing for you by breaking out the inputs: creating that sample sheet directly. Maybe the extra help in Galaxy can flesh out how to use the tool in other ways?
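For reference, the config.txt from that example is just a two-column sample sheet. A hedged sketch of what a pooled Mapper run can look like on the command line — the file names, sample codes, and index name are placeholders, so double-check the flags in the manual against your own input type (e.g. FASTA vs. FASTQ):

```shell
# config.txt: one line per sample — the read file, a tab, then a
# unique three-letter sample code:
#
#   reads_sample1.fa<TAB>sd1
#   reads_sample2.fa<TAB>sd2

# Pool the samples via the config file (-d) instead of concatenating
# the reads yourself; -c marks FASTA input, -m collapses reads,
# -p points at the bowtie index of the genome:
mapper.pl config.txt -d -c -j -l 18 -m -p genome_index \
    -s reads_collapsed.fa -t reads_vs_genome.arf -v
```

The sample codes from config.txt end up as the prefixes of the collapsed read IDs, which is how the downstream miRDeep2.pl run keeps the pooled samples apart.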

Hope this helps! :slight_smile:

Thanks for your reply! I get hits when I run the samples individually, so it’s likely a matching problem. The usage manual looks very informative, thanks :smiley:

1 Like

Hi all,

I ran different samples pooled and didn’t come across the above issue at all, but the bottom of my tabular output is a bit strange. After the novel predictions, it has a section for the known miRNAs I supplied. It looks a bit odd though: the section “#miRBase miRNAs not detected by miRDeep2” is followed by many empty rows, and then a list of 4 of the known miRNAs I supplied. However, the columns the data fall in don’t seem to match up, and I’m not sure how to interpret it. Has anyone come across this before?

Hi @jrw025

I haven’t but maybe someone else will recognize it and comment.

My first instinct when I see a file like this is to open it in an editor where I can review the whitespace (vim, for example).

Then, as a guess, content from one or more of the inputs is getting pushed out into this file. If you are running pooled samples without the sample designation (config.txt), it seems possible that some data points are not failing the tool outright but are still not being interpreted correctly. I’m not sure I would trust this result!

All that said, this looks like a screenshot from Galaxy. Did you upload the result from the command-line tool to view it? The whitespace in the file, if it isn’t standard “tabular” or another supported report-file flavor of datatype, could have been compromised. Some of this depends on how you loaded it. The advanced upload option for converting whitespace to tabs can be very useful, or it can produce a view like yours, among other variations. Choosing “txt” is usually the safer choice to preserve everything without changes.

But this loops back to reviewing the original file in an editor, then comparing the command string against the manual. Tools can very happily do all sorts of odd things and still not fail outright. Galaxy has a Visualization → Editor function, but I’m not sure whether it will be helpful in this case from what you have shared so far. I guess you could try? But maybe start with a clean txt upload first.
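In the same spirit as opening the file in vim, a few lines of Python can flag rows whose tab-separated column count differs from the rest of the table — one quick way to locate a misaligned section like the one described above. The sample text here is invented, not taken from the actual output:

```python
from collections import Counter

def ragged_rows(tsv_text):
    """Find rows whose tab-separated field count differs from the
    most common count in the file. Returns (expected_fields,
    [(line_number, n_fields), ...]) for the odd rows; blank lines
    are ignored."""
    rows = [line.split("\t") for line in tsv_text.splitlines()]
    non_blank = [(i + 1, r) for i, r in enumerate(rows)
                 if any(f.strip() for f in r)]
    widths = Counter(len(r) for _, r in non_blank)
    expected = widths.most_common(1)[0][0]
    odd = [(n, len(r)) for n, r in non_blank if len(r) != expected]
    return expected, odd

# Invented 3-column table with one 1-field comment row on line 3:
example = "a\tb\tc\nd\te\tf\n#miRBase miRNAs not detected by miRDeep2\ng\th\ti\n"
print(ragged_rows(example))  # (3, [(3, 1)])
```

Rows reported here are the ones a viewer will render with shifted columns; comparing their line numbers against the raw file shows whether the tool wrote them that way or the upload mangled the whitespace.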