truncated reports from metilene?

SusanD · August 17, 2023, 10:13am

Dear everyone,

I am perplexed by the outputs of metilene, and would like to know whether any of you also encountered a similar situation.

I could run metilene successfully to obtain the DMRs for my two samples, and the results look fine in general. However, I noticed that only 451 regions (lines) are called in my metilene output (before any filtering by q-value), and these are limited to chromosome 1 plus a few on chromosome 10. I have the feeling that the report is truncated, but I could not find a suitable parameter to adjust.

Has any of you experienced the same? Could any of the experienced specialists give me some advice?

Thank you very much for reading my post.

jennaj · August 17, 2023, 4:20pm

Hi @SusanD

The first thing I would check is the content of the two inputs. Maybe extract the chromosomes of each? There could be a mismatch in the identifier formats, or files truncated upstream.
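For anyone who wants to run this check outside of Galaxy, here is a minimal sketch of the comparison. It assumes both inputs are tab-separated with the chromosome name in the first column (as in BED/bedGraph), and the function names and sample rows are made up for illustration:

```python
from typing import Dict, Iterable, List, Set


def chromosomes(lines: Iterable[str]) -> Set[str]:
    """Collect the distinct values of the first tab-separated column."""
    names = set()
    for line in lines:
        line = line.strip()
        # Skip blank lines and typical header/comment lines
        # (bedGraph files often begin with a "track" line).
        if not line or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split("\t")[0])
    return names


def compare(a: Iterable[str], b: Iterable[str]) -> Dict[str, List[str]]:
    """Report which chromosome names are shared and which are unique to each input."""
    chrom_a, chrom_b = chromosomes(a), chromosomes(b)
    return {
        "shared": sorted(chrom_a & chrom_b),
        "only_in_a": sorted(chrom_a - chrom_b),
        "only_in_b": sorted(chrom_b - chrom_a),
    }


# Hypothetical rows: one file uses the "chr" prefix, the other does not.
sample_a = ["chr1\t100\t101\t0.8", "chr10\t50\t51\t0.2", "chr2\t10\t11\t0.5"]
sample_b = ["1\t100\t101\t0.7", "10\t50\t51\t0.3", "2\t10\t11\t0.4"]

report = compare(sample_a, sample_b)
print(report)
```

An empty "shared" list, as in this made-up case, would point to an identifier format mismatch. With real data you could pass open file handles instead of lists, e.g. `compare(open(path_a), open(path_b))`, where the paths are your own files.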

This resource should help. If you are still stuck after trying it, you can share the history back for more help. Please leave the chromosome extraction outputs in the history so we don’t have to do that over again.

https://training.galaxyproject.org/training-material/faqs/galaxy/datasets_chromosome_identifiers.html

And, you can manipulate/compare with a bunch of tools. The most commonly used are covered here: Data Manipulation Olympics. I would also consider adding up the number of rows per chromosome in each file (tool: Group), since that can be very informative.
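Outside of Galaxy, counting rows per chromosome can be sketched roughly as below. This is not what the Group tool runs internally; it is just an equivalent tally, assuming tab-separated input with the chromosome in the first column. The function name and example rows are hypothetical:

```python
from collections import Counter
from typing import Iterable


def rows_per_chromosome(lines: Iterable[str]) -> Counter:
    """Count data rows for each value of the first tab-separated column."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Ignore blank lines and header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        counts[line.split("\t")[0]] += 1
    return counts


# Hypothetical bedGraph-like content.
example = [
    "track type=bedGraph",
    "chr1\t100\t101\t80",
    "chr1\t200\t201\t60",
    "chr10\t50\t51\t20",
    "chr2\t10\t11\t50",
]

counts = rows_per_chromosome(example)
for name, n in sorted(counts.items()):
    print(name, n)
```

If an input file were truncated upstream, the later chromosomes would show up here with zero or unexpectedly few rows.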

Let’s start there, thanks!

SusanD · August 18, 2023, 2:13pm

Thank you for the very quick response, @jennaj.

I inspected my inputs again and tried two versions of the chromosome names (with and without the chr prefix); both gave the same output. I also pre-sorted my BED files, and the result was the same.

I solved the problem myself by exporting the bedGraph files extracted by MethylDackel and running metilene in a terminal. With this approach, I got a report that includes all chromosomes.

When I inspected the metilene run in Galaxy again, I noticed that the tool only tested regions on chromosome 1 and chromosome 10. I guess this is why the final report was heavily truncated and limited to these two chromosomes.

Hope this helps.

Kind regards,

Susan

jennaj · August 18, 2023, 5:20pm

Hi @SusanD

Thanks for posting the details back. Very odd, and we don’t have other reports about it!

I’m guessing there is a sort order problem between the data and a built-in database index. But that is a guess based on the chromosomes involved.
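To show why those two chromosomes hint at sort order, here is a small sketch. It does not confirm the cause; it only illustrates that in a plain text (lexicographic) sort, chromosome 10 comes directly after chromosome 1, while a numeric-aware sort puts it after 9. The `natural_key` helper is made up for this example:

```python
from typing import List, Tuple


def natural_key(name: str) -> Tuple[int, int, str]:
    """Sort key that compares numeric chromosome names as numbers."""
    core = name[3:] if name.startswith("chr") else name
    if core.isdigit():
        return (0, int(core), "")
    # Non-numeric names (X, Y, MT, ...) sort after the numbered ones.
    return (1, 0, core)


names: List[str] = ["1", "2", "3", "10", "11", "X"]

lexicographic = sorted(names)
natural = sorted(names, key=natural_key)

print(lexicographic)
print(natural)
```

If one side of a comparison is ordered the first way and the other side the second way, a tool that walks both in parallel could plausibly stop matching after the first chromosomes.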

If you want to share back a history with the use case, I’d like to review it and be able to share it publicly with developers for a resolution. This is the easiest path.

Or, if you want to do that privately, I can send you a direct message here. Then I can abstract/down sample the data to share with developers.

If there is a bug, especially a corner-case bug, we want to fix it, whether it is related to sort order or not.

Thanks!

SusanD · August 21, 2023, 6:26am

Hi @jennaj,

Thank you for the very kind offer. I am more than glad to share my history.
Perhaps we could do that privately, in whichever way you prefer?
For example, you could send me your Galaxy account so that I can share my history with you.

Kind regards.

jennaj · August 21, 2023, 4:52pm

Hi @SusanD, I sent you a direct message through the forum so you can share the link there. Thanks!