Attempting to use the Stitch MAF blocks tool, but documentation is lacking

Hello, I am trying to provide a list of input coordinates so that I can use this tool to stitch together existing MAF blocks, but it is surprisingly complicated. I don’t understand why Galaxy insists that the “choosing intervals” input be a file (shouldn’t it just be intervals, such as (1,6)?).

I’m also not sure what to put for MAF type. Any help would be immensely appreciated, thank you!

Welcome, @danielrebib

This publication I helped to author from 2012 covers the usage of this tool suite in great detail. It is open source. You’ll notice that the UI has been updated but the underlying logic and data formats are still the same, and the original example data is still available in a shared history. Please give that a review, as I think it will address all of your questions, but you can ask if anything is not clear.
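
On the interval question specifically: the tool reads its intervals from a dataset in your history rather than from typed values, which is why it asks for a file. Below is a minimal sketch of producing one, assuming a three-column BED file (chromosome, 0-based start, exclusive end) is acceptable as the interval input and that your own coordinates are 1-based and inclusive. The function names, the output filename, and the chr22 example are hypothetical, for illustration only.

```python
from typing import Iterable, List, Tuple

# (chromosome, start, end) with 1-based, inclusive coordinates (an assumption
# about how the input coordinates are written, e.g. "(1,6)").
Interval = Tuple[str, int, int]


def to_bed_lines(intervals: Iterable[Interval]) -> List[str]:
    """Convert 1-based inclusive intervals to tab-separated BED lines.

    BED uses a 0-based start and an exclusive end, so only the start shifts.
    """
    lines = []
    for chrom, start, end in intervals:
        if start < 1 or end < start:
            raise ValueError(f"invalid interval: {chrom}:{start}-{end}")
        lines.append(f"{chrom}\t{start - 1}\t{end}")
    return lines


def write_bed(intervals: Iterable[Interval], path: str) -> None:
    """Write the intervals to a BED file that can be uploaded to a history."""
    with open(path, "w") as handle:
        handle.write("\n".join(to_bed_lines(intervals)) + "\n")


if __name__ == "__main__":
    # Hypothetical example: positions 1 through 6 on chr22.
    write_bed([("chr22", 1, 6)], "intervals.bed")
```

After uploading the resulting file, check that its datatype is set to an interval format and that its database (genome build) matches the one assigned to the MAF.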

Thanks so much, Jennifer!

I had actually started writing a script to stitch the blocks together before I saw your reply. I’m about halfway through, but your link will be super helpful when I benchmark my results to check whether the stitching works.


Hi Jennifer,

Writing an algorithm to stitch blocks together was much more difficult than I anticipated! I am considering using the tool described in your publication, but I read on a discussion forum somewhere that it is only suitable for hg19, and that making it work for hg38 would require a significant overhaul. Would you be able to confirm that, or is using Galaxy an option for me?

Thanks :slight_smile:

Things have changed since 2012 … These tools will work for any species (defined by the database metadata) that has a reference MAF provided. UCSC is a good source. Since you are working on your own server, you could index that reference MAF, but it isn’t required.

Like most Galaxy tools, reference data can be supplied from the history and indexed at runtime, at the expense of longer/larger running jobs. Public servers have practical job limits, but Galaxy itself doesn’t. If a tool can be run on the command-line, it can run in Galaxy given the same resources. Galaxy is running the job on a cluster somewhere. The command-string is exposed where the job logs are located. Some servers even provide cloud costs for that exact job, and a carbon footprint is coming soon. The EU server will have more details than most since they use a unified cluster provided by an EU international consortium (easier to process those details when all clusters are the same-ish).

This is a test history with some data loading into it – query and target.

It will have a reference MAF and a set of RefSeq coding exon coordinates, both based on hg38’s chr22 to keep the test smaller but still representative. Once done, any of the MAF tools should technically work fine. Do you want to get a copy of it to play around in? If you run into a problem, share a history back. That makes it easier to troubleshoot with a real example :slight_smile:

Hi Jennifer,

I would love a copy of the history, thank you!
What information should I provide so that it can be shared with my Galaxy account?

Hi @danielrebib This is the link again. When you click on it, the option to import the history will become available.

That copy will be your own independent version to work in on that same server. To move your history to another server, create a history archive. That will be a zip file that you can download anywhere, and it comes with a link you can use to transfer it directly to another Galaxy server (including your own) without an intermediate download step.

To share your work back, reproducing the problem on a public server tends to work best. But you can also try doing everything in reverse.

Hi Jennifer,

I see, thank you. I tried copying the history, which requires me to log in. It seems my email isn’t associated with any account (which is puzzling, since I can see that email in my logged-in Galaxy account). Another thread showed me that this happens because I made my account using Galaxy over the network, which isn’t “connected” to the main Galaxy.

Independently, from my logged-in Galaxy-over-network account, it seems I can’t import any local files as a dataset without getting “[Errno 1] Operation not permitted:”. This is probably a different issue from the first.

By chance, are there resources you could point me towards that solve these issues? I couldn’t find any. Sorry for the endless questions…

Hi @danielrebib

If you don’t have an account at that server yet, you can create one :slight_smile:

Accounts are server-specific. One account at each public service is expected and encouraged, to maximize access to all the different resources each one hosts. :slight_smile: