total bases in all sequences

humaira · January 1, 2019, 5:27pm

Hi,
I am a Windows user working with FASTA files. I want the correct number of total bases in all sequences.
Thx

bjoern.gruening · January 1, 2019, 5:36pm

What do you mean by "correct number of bases"? Do you mean counting? Or do you want to remove bases?

humaira · January 1, 2019, 6:03pm

I mean counting.

bjoern.gruening · January 1, 2019, 6:17pm

Hi,

you can use this tool, for example: https://usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/devteam/fasta_compute_length/fasta_compute_length/1.0.1
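
For anyone who wants to check the numbers outside Galaxy, here is a minimal sketch in plain Python. It is not the Galaxy tool's implementation; the function name `fasta_lengths` and the example records are made up for illustration. It assumes a standard FASTA layout: ">" title lines followed by one or more sequence lines.

```python
def fasta_lengths(lines):
    """Yield (identifier, length) for each record in an iterable of FASTA lines.

    Hypothetical helper: the identifier is taken as the first "word"
    after ">", and the length is the number of characters on the
    sequence lines (surrounding whitespace stripped).
    """
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, length  # emit the previous record
            parts = line[1:].split()
            name = parts[0] if parts else ""
            length = 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length  # emit the last record


# Made-up example data, not taken from the thread
example = """>seq1 len=10 some description
ACGTAC
GT
>seq2
ACG
""".splitlines()

lengths = dict(fasta_lengths(example))
total = sum(lengths.values())
print(lengths, total)  # {'seq1': 8, 'seq2': 3} 11
```

For a real file, something like `with open("input.fasta") as fh: total = sum(n for _, n in fasta_lengths(fh))` should give the total number of bases (the file name here is a placeholder).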

Cheers,
Bjoern

humaira · August 14, 2019, 6:18pm

Hi
I used https://usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/devteam/fasta_compute_length/fasta_compute_length to compute the length of the sequences. The output file looks like this:

c3386_g1_i2 len=1028 path=[159:0-115 79:116-1027]	984
c3389_g1_i1 len=1737 path=[1:0-1736]	1650
c3389_g2_i1 len=280 path=[3686:0-279]	268
c3391_g1_i1 len=473 path=[1:0-160 162:161-234 518:235-398 162:399-472]	455
c3391_g1_i2 len=270 path=[1:0-160 162:161-234 236:235-269]	258
c3392_g1_i1 len=397 path=[349:0-248 224:249-396]	385
c3393_g1_i1 len=223 path=[201:0-159 361:160-179 381:180-222]	209
c3397_g1_i1 len=617 path=[595:0-616]	581

Now I want to correct the count number in the FASTA file.

jennaj · September 23, 2019, 9:42pm

@humaira

For most use cases, description line content in fasta datasets will cause problems with tools and should be removed. Only the identifier is used (the first “word” in the “>” title line).

FAQ: https://galaxyproject.org/support/

Common datatypes explained
Specifically here: https://galaxyproject.org/learn/datatypes/#fasta

If for some reason you really do need the lengths in the fasta headers, the “len=NNN” portion of the identifier could be recreated from this data, but not the “path=[coordinates]” portion.

Use tools from the GENERAL TEXT TOOLS groups. The manipulation would probably involve a workflow such as: Fasta-to-Tabular > Add column to an existing dataset > Merge Columns together > Tabular-to-Fasta. Or, if you are able to construct substitution expressions, use the tool Text transformation with sed.
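
Outside of Galaxy, the same idea (recompute each sequence length and rewrite only the “len=NNN” part of the title line) can be sketched in plain Python. This is an illustration under assumptions, not one of the Galaxy tools named above: the function name `fix_header_lengths` is hypothetical, and it assumes every title line carries a single `len=` field. The “path=[...]” portion is left untouched, since it cannot be recreated from the lengths.

```python
import re


def fix_header_lengths(lines):
    """Return FASTA lines with 'len=NNN' in each title line replaced
    by the actual length of that record's sequence.

    Hypothetical helper; assumes at most one len= field per title line.
    """
    records = []  # each entry: [title line, list of sequence lines]
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            records.append([line, []])
        elif records and line.strip():
            records[-1][1].append(line)

    out = []
    for header, seq_lines in records:
        true_len = sum(len(s.strip()) for s in seq_lines)
        # Replace only the first len=NNN; everything else stays as it was
        header = re.sub(r"\blen=\d+", f"len={true_len}", header, count=1)
        out.append(header)
        out.extend(seq_lines)
    return out


# Made-up record in the same style as the title lines shown earlier
fixed = fix_header_lengths([
    ">c1 len=10 path=[1:0-9]\n",
    "ACGTAC\n",
    "GT\n",
])
print("\n".join(fixed))
```

Title lines without a `len=` field pass through unchanged, so the sketch should be safe to run on mixed files, but check a few records by hand before relying on it.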

Thanks!
