Could you make KMC from the Tool Shed (available on Galaxy Australia) also available on usegalaxy.eu?

igor · October 11, 2024, 3:24am

Hi @bejo
Are you after analysis of F and R reads in a single KMC Counter job? If so, consider concatenating the read files head to tail. This works on gzipped files as well, at least with the latest versions of Concatenate.
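
For anyone wondering why head-to-tail concatenation is safe for gzipped reads: a gzip file made of several compressed members joined back to back is still a valid gzip stream. Below is a minimal Python sketch of that idea, run outside Galaxy. The file names and the tiny reads are made up for illustration, and `concatenate_head_to_tail` is a hypothetical helper, not the Galaxy Concatenate tool itself.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def concatenate_head_to_tail(inputs, output):
    """Copy each input file into `output` byte for byte, in the order given."""
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


# Tiny made-up forward and reverse reads, only to show the round trip.
workdir = Path(tempfile.mkdtemp())
forward = workdir / "sample_1.fastq.gz"
reverse = workdir / "sample_2.fastq.gz"
with gzip.open(forward, "wt") as fh:
    fh.write("@read1/1\nACGT\n+\nIIII\n")
with gzip.open(reverse, "wt") as fh:
    fh.write("@read1/2\nTTGA\n+\nIIII\n")

# Join the two compressed files without decompressing them.
merged = workdir / "sample_merged.fastq.gz"
concatenate_head_to_tail([forward, reverse], merged)

# The gzip module reads the two members back as one continuous stream.
with gzip.open(merged, "rt") as fh:
    merged_text = fh.read()
print(merged_text)
```

The merged file holds the forward reads followed by the reverse reads, so a single counting job sees both.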
Hope that helps.
Kind regards,
Igor

bejo · October 11, 2024, 7:59am

Hi @jennaj @igor
I know how to upload FASTA files and merge them into a collection.

However, I am using the download function in Galaxy to retrieve sequence libraries from NCBI with the “fasterq dump” tool. This gives you paired forward and reverse libraries, and it would be handy if these could be parsed in parallel. (I tested this with two random FASTA files: uploaded as a “collection”, they are indeed parsed together, resulting in a single output file.)
QUESTION: How can the paired FWD and REV sequence libraries be handled as a “collection”, so that you avoid the need to start two jobs manually?

Alternatively, I took two random FASTA files. In KMC you can select both FASTA files as individual datasets (I assumed KMC would handle both files separately, in one go, producing separate output files), but this runs into an error.

Finally, I would like to construct a workflow, but first I need to understand how to parse paired sequence libraries in one go in KMC. Could it be a minor issue in KMC?

jennaj · October 11, 2024, 5:30pm

Hi @bejo

You can change the shape of your collections, and merge, split, or concatenate the data inside of them.

Would this tool help?

Concatenate multiple datasets tail-to-head while specifying how

igor · October 14, 2024, 12:58am

Hi @bejo,

As @jennaj suggested, you can manipulate the data using tools from the Collection Operations section.

I noticed an issue with KMC Counter on multiple inputs and notified the owner of the wrapper. I hope the issue will be resolved in a reasonable time. Currently it works only on a single file, so for now Concatenate is the way to go if you need to process multiple files.

On Galaxy Australia, the dedicated data import tools are very slow, imho: roughly ten times slower than importing by URL. Because of this, I do not use fasterq_dump etc. I import reads using URLs from ENA. In this scenario I end up with individual files, not a collection.

Kind regards,
Igor

bejo · October 14, 2024, 7:05am

Dear all, great to hear your feedback.

@jennaj I was not aware that collections could be reshaped. I appreciate you pointing this out to me. I am going to try it.

@igor indeed, importing data seems slow, which is remarkable for a tool called “fasterq_dump”. I will try the URL option too.

bejo · November 26, 2024, 10:51am

KMC has been available for some time now, and I have already used it frequently. According to GitHub, KMC has another operation called “complex”. It would give the user more filtering options than just comparing two datasets with each other.

I would like to find shared k-mers in a set of samples. I could perform a chain of intersections, but that is not very handy. According to GitHub, the KMC “complex” operation would do the trick as well:
"complex operation allows to define operations for more than 2 input kmer sets."

I am wondering if this option in KMC could be made available.
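
To make the goal concrete, here is a small Python sketch of what "shared k-mers in a set of samples" means: one intersection over all samples at once rather than a manual chain of pairwise jobs. It is only an illustration of the set logic, not KMC's implementation. The sample names and sequences are made up, `kmers` and `shared_kmers` are hypothetical helpers, and the sketch ignores k-mer counts and reverse complements.

```python
from functools import reduce


def kmers(sequence, k):
    """Return the set of all k-mers (substrings of length k) in `sequence`."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shared_kmers(samples, k):
    """Intersect the k-mer sets of every sample in a single call."""
    kmer_sets = [
        set().union(*(kmers(seq, k) for seq in reads))
        for reads in samples.values()
    ]
    return reduce(set.intersection, kmer_sets)


# Made-up reads for three hypothetical samples.
samples = {
    "sample_a": ["ACGTAC", "GGTACC"],
    "sample_b": ["TTACGT", "GTACGG"],
    "sample_c": ["CACGTA"],
}
shared = sorted(shared_kmers(samples, 3))
print(shared)  # ['ACG', 'CGT', 'GTA']
```

A chain of pairwise intersections gives the same result, but needs one job per additional sample.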

bejo · November 28, 2024, 8:25am

@igor have you seen my message about the “complex” function for KMC?
