Featurecounts tool update request

Hi Galaxy Developers,

Note, meta-feature = gene and feature = exon.

I need your help with adding an option to the Galaxy wrapper for a particular tool and updating it in the Tool Shed, so that Galaxy administrators can install the new version.

Specifically, an option is missing from the featureCounts tool (all available versions are the same). Is it possible to update the tool?

The option I am referring to is assigning a fraction under “allow reads to contribute to multiple features (-O)”. In the featureCounts manual (section 6.2.6), found on the WEHI website (http://bioinf.wehi.edu.au/featureCounts/), you can assign a fractional count to reads that contribute to multiple features, but this is not an option in the Galaxy version of the tool. Without this fraction option, a “multioverlapping” read that spans an exon-exon junction, or maps to an exon shared between genes, is counted in full for each feature it overlaps (twice for two features).

I can only choose whether to count or discard these reads (yes or no, respectively), but it would be better to be able to divide the read count by the number of features the read overlaps.

Ideally, this option would be set up the same way as the “count multi-mapping reads/fragments” option: a drop-down menu where choosing “enabled” reveals a further option to assign fractional counts.

The distinction between multioverlapping and multimapping reads is important: multimapping reads are ambiguously mapped, whereas multioverlapping reads are mapped confidently. Adding a fraction option would greatly improve exon-level expression estimation. Counting a read twice just because it overlaps two features over-estimates exon-level expression relative to gene-level expression, where a read is counted only once for the gene regardless of whether it crosses an exon-exon border.
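To make the over-counting concrete, here is a minimal Python sketch. It is a toy model of the three behaviours described above, not featureCounts code, and the exon names and reads are made up for illustration.

```python
from collections import Counter

def count_exons(reads, mode):
    """Tally per-exon counts for confidently mapped reads.

    reads: list of tuples, each holding the exons one read overlaps.
    mode:  "discard"  - skip reads that overlap more than one exon
           "full"     - add 1 to every overlapped exon
           "fraction" - add 1/y to each of the y overlapped exons
    """
    counts = Counter()
    for exons in reads:
        y = len(exons)
        if y > 1 and mode == "discard":
            continue
        weight = 1 / y if mode == "fraction" else 1
        for exon in exons:
            counts[exon] += weight
    return counts

# Hypothetical gene with two exons: one read inside each exon,
# plus one read spanning the exon-exon junction.
reads = [("exon1",), ("exon2",), ("exon1", "exon2")]

gene_total = len(reads)  # at gene level every read is counted once
for mode in ("discard", "full", "fraction"):
    exon_counts = count_exons(reads, mode)
    print(mode, dict(exon_counts),
          "exon total:", sum(exon_counts.values()),
          "gene total:", gene_total)
```

In this toy case the "full" mode gives an exon total of 4 against a gene total of 3, the "discard" mode gives 2, and the "fraction" mode gives 3, matching the gene-level count.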

I hope this all made sense,
Thank you for your time,
Kind regards,
Jacqui

Are you familiar with tool wrappers and editing those? There is documentation available on how to do it here if you are interested in learning: https://galaxyproject.org/tools/#writing-tools

You could issue a pull request to this repository with the desired change and it is likely that it will be incorporated after a review: https://github.com/galaxyproject/tools-iuc/tree/master/tools/featurecounts

Hi, thanks for your prompt reply!

I am not familiar with tool wrappers or any sort of editing/coding… I was really hoping one of the Galaxy team could do this for me. That’s not to say that I am not interested in learning, just that I don’t consider myself well versed enough to dare attempt this, since this tool update will be used by the wider Galaxy community and not just by me.

I have never contributed to GitHub before, but I found some instructions online, so I will try to issue a pull request as you say.

Just because I am curious about the process: would the moderators on the GitHub page review my request and incorporate the changes if they are satisfied, and would I then need to find someone else to update the tool wrapper and Tool Shed?

Thank you very much for your time,
Kind regards,
Jacqui

No, it seems pull requests on GitHub are beyond me.

@astrov will take a look and, depending on how much work it is, will hopefully update it soon. Thanks @astrov!

Hi Jacqui,

Looking at the documentation, am I correct in understanding that the --fraction flag is used in conjunction with both multimapping and multioverlap? And if so, would it make sense to add in the overlap flag to the multi-mapping drop down to make it:

Count Multi-aligned reads/fragments:
---- Disabled
---- Enable Multi-map (-M)
---- Enable Multi-overlap (-O)
---- Enable both multi-map and multi-overlap (-M -O)

And then keep the --fraction flag as a boolean based on the selection?
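As a rough sketch of what I mean, the selection could map to command-line flags like this. It is written in plain Python for illustration only; the option values ("disabled", "multimap", and so on) and the function name are hypothetical and are not the wrapper's real parameter names. Only `-M`, `-O` and `--fraction` come from featureCounts itself.

```python
# Hypothetical values for the proposed drop-down, mapped to featureCounts flags.
MULTI_READ_FLAGS = {
    "disabled": [],
    "multimap": ["-M"],
    "multioverlap": ["-O"],
    "both": ["-M", "-O"],
}

def multi_read_args(selection, fraction=False):
    """Return the featureCounts flags for a drop-down selection."""
    flags = list(MULTI_READ_FLAGS[selection])
    if fraction:
        # --fraction only makes sense together with -M, -O, or both
        if not flags:
            raise ValueError("--fraction requires -M, -O, or both")
        flags.append("--fraction")
    return flags

print(multi_read_args("both", fraction=True))  # ['-M', '-O', '--fraction']
```

Keeping the fraction choice as a single boolean works here because the flag is the same whichever multi-read option is selected.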

Hi astrov, thanks for your reply!

Yes, the drop-down options as you have listed them sound good to me, but we will need to be sure that the fraction can apply to both -M and -O; otherwise they may need to be kept separate.

What I mean is, according to the featureCounts manual, the fraction for multi-overlapping reads (-O) is defined as 1/y and the fraction for multi-mapping reads (-M) is 1/x. Since these two denominators are different terms in the original code and are set up independently of each other, it might not be possible to name one variable that works with both -M and -O reads, so it might be easier to keep them in separate drop-down menus. But if it is possible to list them as you have said, then your suggestions are perfect.
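To show how I read the two denominators, here is a small Python sketch. It is an illustration under my own assumptions, not the tool's code: the function and argument names are made up, and the combined case of 1/(x*y) when both -M and -O are set is my understanding of the manual and should be checked against it.

```python
from fractions import Fraction

def fractional_count(x=1, y=1, multi_map=False, multi_overlap=False):
    """Count carried by one alignment for one overlapping feature.

    x: number of alignments reported for the read (relevant with -M)
    y: number of features the alignment overlaps (relevant with -O)
    """
    denominator = 1
    if multi_map:
        denominator *= x  # -M --fraction: 1/x
    if multi_overlap:
        denominator *= y  # -O --fraction: 1/y
    return Fraction(1, denominator)

print(fractional_count(x=2, multi_map=True))                         # 1/2
print(fractional_count(y=3, multi_overlap=True))                     # 1/3
print(fractional_count(x=2, y=3, multi_map=True, multi_overlap=True))  # 1/6
```

If that reading is right, a single fraction switch could serve both options, because each denominator only comes into play when its own flag is enabled.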

Thank you for your time,
Kind regards,
Jacqui

Excellent, I should have that ready soon.

I am out next week, but will have it done asap once I’m back

That is greatly appreciated, thank you so much!

I have a PR in; it should be ready very soon.

Thank you

Hi @astrov, just checking in on the progress of this. When it is done, I’ll need to pass this information on to my Galaxy Aus administrator contact so they can update the tool. Thanks!

It has been approved, waiting on a merge:

OK thank you.

Hi @astrov,

Is it possible to add another update to featureCounts? If I am too late, or too demanding, then I understand if the answer is no. Specifically, I was thinking of the option to specify the minimum Phred quality score for alignment (-s minBaseQuality). The default is 13; most publications I’ve seen use 20 or more.

@j.heighway I see the flag you’re looking at in the documentation, but I think you’re in the exactSNP section?
In featureCounts, -s refers to strand specificity.

Hi @astrov,

The initial request for change is the most important to get the analysis run, so if adding the extra change slows it down, then I’d prefer not to wait.

I see what you’re saying: I am in the exactSNP section. I couldn’t see where else in the documentation the Phred score was defined, so I assumed that this was the place to set the minimum Phred score before reads are discarded. Is that not the case? How can I find out what Phred score cutoff the tool uses?

I don’t actually see anything about phred scores in featurecounts, but you can use the BAM file filter tool to remove reads that fall below a given quality, then pipe that output into featurecounts.
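Outside Galaxy, the same two-step idea could look roughly like the sketch below. It only builds and prints the commands. Note the assumptions: it uses samtools as a stand-in for the Galaxy BAM filter tool, it filters on mapping quality (MAPQ) rather than per-base Phred scores, and the file names and the cutoff of 20 are placeholders.

```python
import shlex

def filter_then_count_commands(bam, annotation, min_mapq=20,
                               filtered_bam="filtered.bam",
                               counts="counts.txt"):
    """Build a samtools filter command followed by a featureCounts command."""
    # samtools view -q skips alignments with MAPQ below the cutoff; -b writes BAM
    filter_cmd = ["samtools", "view", "-b", "-q", str(min_mapq),
                  "-o", filtered_bam, bam]
    # featureCounts then counts only the alignments that passed the filter
    count_cmd = ["featureCounts", "-a", annotation, "-o", counts, filtered_bam]
    return filter_cmd, count_cmd

# Placeholder file names, for illustration only.
for cmd in filter_then_count_commands("aligned.bam", "genes.gtf"):
    print(shlex.join(cmd))
```

The filtered BAM from the first command is what gets counted in the second, so the quality cutoff is applied before featureCounts ever sees the reads.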

Thanks for clarifying. I was unsure whether these reads were removed at the BAM generation step or at the counting step, and then got further confused when I saw the mention of Phred scores in the featureCounts documentation. You’ve cleared that up for me, thanks a lot!

Hi @astrov,
The update has been installed in the Galaxy server, thank you so much for your work getting this feature included!
