I am re-analyzing published data (deposited on SRA) to get a feel for the RNA-seq workflow. I had no problems until I decided to compare meta-feature (gene-level) counts with individual feature (exon-level) counts in featureCounts.
I aligned trimmed, paired-end reads to the human reference genome 38 (latest version, with the latest GTF) using HISAT2. I then ran featureCounts with these settings: unstranded, paired-end, count fragments instead of reads, GFF feature: exon, GFF gene identifier: gene_id. The counts below for two genes of interest are at the meta-feature level (gene). These counts mirrored the raw counts included in the supplementary materials of the published study (many exon-spanning pairs are visible in IGV):
When I set the "on feature level" option to "yes" (so that counting is done per exon) for the same file, I got very, very different numbers (abbreviated example below).
Shouldn’t the exon-level counts add up to as many reads as the meta-feature count, just binned separately by exon?
I cannot understand how these counts can be off by 1-2 orders of magnitude. All other featureCounts settings are at their defaults.
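To make the comparison concrete, here is a minimal sketch of the check described above: summing the exon-level counts per gene so the totals can be set beside the gene-level counts. It assumes the usual featureCounts tab-separated layout (a leading `#` comment line, a header starting with `Geneid`, and the counts for a single BAM in the last column). The function name `sum_counts_per_gene` and the example rows are hypothetical, made up for illustration; they are not real data from this analysis.

```python
import csv
from collections import defaultdict


def sum_counts_per_gene(lines):
    """Sum the count column per Geneid in featureCounts-style tab-separated text."""
    # Drop comment lines (the output normally starts with one '#' line).
    data_lines = (ln for ln in lines if not ln.startswith("#"))
    rows = csv.reader(data_lines, delimiter="\t")
    header = next(rows)
    gene_col = header.index("Geneid")
    # Assumption: one BAM file, so its counts are in the last column.
    count_col = len(header) - 1
    totals = defaultdict(int)
    for row in rows:
        if not row:
            continue  # skip blank lines
        totals[row[gene_col]] += int(row[count_col])
    return dict(totals)


# Hypothetical exon-level rows, invented for illustration only.
example = [
    "# Program:featureCounts (made-up example)",
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tsample.bam",
    "GENE_A\tchr1\t100\t200\t+\t101\t5",
    "GENE_A\tchr1\t300\t400\t+\t101\t7",
    "GENE_B\tchr2\t50\t150\t-\t101\t3",
]
exon_totals = sum_counts_per_gene(example)
print(exon_totals)  # {'GENE_A': 12, 'GENE_B': 3}
```

With a real output file, the same function could be called as `sum_counts_per_gene(open(path))`, and the per-gene totals compared against the count column of the gene-level run.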