How to get gene list from bed file containing chromosome number, start and end positions in 1st, 2nd, and 3rd columns?

pallabi · April 30, 2024, 11:28pm

jennaj · April 30, 2024, 11:32pm

This tutorial includes all of the manipulations. It focuses on a different, specific use case, but the steps you will need are about the same.

Hope that helps!

pallabi · May 9, 2024, 7:54am

Hi, thanks for the reply. I tried following the tutorial and found that the intersection changed the chromosome coordinates. My input file contains chromosome number, start, end, sequence, and score columns, in that order. After the intersection, the sequence and score columns were removed, and the start and end coordinates in the output had also changed, so I can't find any similarity between my input and output files.

jennaj · May 9, 2024, 4:44pm

Hi @pallabi

This tutorial has more manipulations, and the tool panel has more tools. Most are wrapped versions of command line utilities – either the actual utility or a reimplementation of it. → GTN Materials Search

But let’s get your use case clarified a bit more and come up with a solution.

These are the coordinates that you want to associate with a “gene”, correct?

I put that in quotes since what you will be mapping to is a transcript footprint, and that transcript is then associated with a gene's bounds. More than one transcript might match, and those might all map to the same gene or might not. It depends on the coordinates: how much of the genome they cover, and related factors.
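To make the transcript-versus-gene point concrete, here is a minimal Python sketch that groups overlapping transcript IDs by gene. The function name, the transcript IDs, and the gene names are all hypothetical, and the mapping table is assumed to come from whatever annotation you extracted.

```python
from collections import defaultdict

def genes_for_hits(transcript_hits, transcript_to_gene):
    """Group overlapping transcript IDs by gene; unmapped IDs go under None."""
    by_gene = defaultdict(list)
    for tx in transcript_hits:
        by_gene[transcript_to_gene.get(tx)].append(tx)
    return dict(by_gene)

# Hypothetical identifiers, for illustration only.
tx2gene = {"TX1.1": "GENE_A", "TX1.2": "GENE_A", "TX2.1": "GENE_B"}
hits = ["TX1.1", "TX1.2", "TX2.1"]

grouped = genes_for_hits(hits, tx2gene)
for gene, transcripts in grouped.items():
    print(gene, ",".join(transcripts))
```

One region can therefore yield several transcript hits that collapse to a single gene, or to several genes, depending on what it overlaps.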

So you are using the sequence as the name column, and have score, but do not have the strand. Not having the strand might matter for scientific reasons.

And only the first three columns will be preserved with some tools. Others can preserve all columns.
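As an illustration of the "preserve all columns" behavior, here is a small Python sketch that keeps every column of the input regions, leaves the coordinates untouched, and appends the names of overlapping genes. It is similar in spirit to what `bedtools intersect` reports with `-wa -wb`, but it is not how any Galaxy tool is implemented. The example data and function names are hypothetical, and both files are assumed to be tab-delimited with BED-style 0-based, half-open coordinates and the gene name in column 4.

```python
def parse_bed(text):
    """Split tab-delimited BED lines into lists of fields, skipping blank lines."""
    return [line.split("\t") for line in text.splitlines() if line.strip()]

def overlaps(a_start, a_end, b_start, b_end):
    """True when two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end

def annotate(regions, genes):
    """Return each region unchanged, plus one extra column of overlapping gene names."""
    out = []
    for reg in regions:
        chrom, start, end = reg[0], int(reg[1]), int(reg[2])
        hits = sorted({g[3] for g in genes
                       if g[0] == chrom
                       and overlaps(start, end, int(g[1]), int(g[2]))})
        # "." marks a region with no overlapping gene.
        out.append(reg + [",".join(hits) if hits else "."])
    return out

# Hypothetical example data (not taken from this thread).
regions = parse_bed(
    "chr1\t100\t200\tACGT\t5\n"
    "chr1\t900\t950\tTTGA\t7\n"
    "chr2\t10\t50\tGGCC\t3"
)
genes = parse_bed(
    "chr1\t150\t400\tGENE_A\n"
    "chr1\t180\t300\tGENE_B\n"
    "chr2\t50\t90\tGENE_C"
)

annotated = annotate(regions, genes)
for row in annotated:
    print("\t".join(row))
```

A tool that reports only the overlapping pieces would instead output the trimmed coordinates (for example 150 to 200 for the first region), which is one way start and end values can differ from the input.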

As a reference, this is the BED datatype specification: Genome Browser FAQ

I’m curious now about what you did. What is the content of the file with the genes? The file that was extracted from UCSC?

And, if you care about stranded results, your starting BED should have the 6th column. Do you have that information? Or are you just looking for genomic overlaps that are not stranded?
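To show why the 6th column matters, here is a hedged Python sketch of a strand-aware overlap check. The function names are hypothetical, and falling back to an unstranded comparison when either row lacks a strand is an assumption of this sketch, not a rule from any tool.

```python
def strand_of(fields):
    """Return '+' or '-' from column 6 of a BED row, or None if absent or '.'."""
    if len(fields) >= 6 and fields[5] in ("+", "-"):
        return fields[5]
    return None

def overlaps_stranded(region, gene, require_same_strand=True):
    """Overlap test on BED rows (0-based, half-open), optionally strand-aware."""
    if region[0] != gene[0]:
        return False
    if not (int(region[1]) < int(gene[2]) and int(gene[1]) < int(region[2])):
        return False
    if not require_same_strand:
        return True
    r, g = strand_of(region), strand_of(gene)
    # Assumption: with no strand on either side, treat the overlap as unstranded.
    if r is None or g is None:
        return True
    return r == g

# Hypothetical rows: a 5-column region, the same region with a strand, and a gene.
region5 = ["chr1", "100", "200", "ACGT", "5"]
region6 = ["chr1", "100", "200", "ACGT", "5", "+"]
gene_minus = ["chr1", "150", "400", "GENE_A", "0", "-"]

print(overlaps_stranded(region5, gene_minus))
print(overlaps_stranded(region6, gene_minus))
print(overlaps_stranded(region6, gene_minus, require_same_strand=False))
```

With only five columns, every genomic overlap counts as a hit, so a region on the opposite strand from a gene would still be reported.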

You can post back screenshots or copy/paste. Thanks!
