Sequence to gene name

Nothing · July 22, 2019, 1:13pm

Hi all!

I have a list of ~5000 sequences of length 80, and I want to obtain the gene symbols for these sequences. Initially I used R and retrieved gene symbols for more than half of them. As that process was very time consuming, I thought I could use methods from RNA-seq, which I am not familiar with. I tried Galaxy, and the results I got are correct (I checked them against my partial R results, parts of which I had manually verified using the UCSC Genome Browser).
This is what I do:
1- Upload the sequences in FASTA format.
2- Run Hisat2 with the following options:
   Source for the reference genome: use a genome from history
   Select the reference genome: hg38 ncRNA+CDS
   Is this a single or paired library: single end
   Specify strand information: F
3- Pass the results to HTSeqCount with the following options:
   GFF = hg38.gtf
   Stranded = NO

I also tried StringTie instead of HTSeqCount, and Salmon instead of Hisat2, but without success.

jennaj · July 22, 2019, 8:51pm

Hi @Nothing

Option A:

1- Map the sequences directly to the genome with BLASTN to get the mapping coordinates.
2- Filter the results so that each sequence has a unique hit.
3- Compare those coordinates with the coordinates in a gene/transcript annotation dataset (BED, GTF, etc.).
4- Rename the transcript identifiers with gene identifiers/symbols.
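
For anyone who wants to see the filtering and coordinate-comparison steps outside Galaxy, here is a minimal Python sketch. It is only an illustration under some assumptions: the BLASTN output is in the standard 12-column tabular format (`-outfmt 6`), "unique" means the query has exactly one reported hit, and the GTF carries a `gene_name` attribute on its `exon` lines (some GTFs only have `gene_id`). The function names `unique_hits`, `gene_intervals` and `annotate` are made up for this example.

```python
import re
from collections import defaultdict


def unique_hits(blast_lines):
    """Keep only queries that have exactly one hit (BLAST tabular, -outfmt 6)."""
    hits = defaultdict(list)
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, chrom = fields[0], fields[1]
        # Columns 9 and 10 are sstart/send; sstart > send on the minus strand,
        # so sort them to get a normal (start, end) interval.
        start, end = sorted((int(fields[8]), int(fields[9])))
        hits[query].append((chrom, start, end))
    return {q: h[0] for q, h in hits.items() if len(h) == 1}


def gene_intervals(gtf_lines, feature="exon"):
    """Collect (start, end, gene_name) per chromosome from GTF lines."""
    genes = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if fields[2] != feature:
            continue
        # Assumption: the attribute column contains gene_name "SYMBOL";
        match = re.search(r'gene_name "([^"]+)"', fields[8])
        if match:
            genes[fields[0]].append((int(fields[3]), int(fields[4]), match.group(1)))
    return genes


def annotate(hits, genes):
    """Map each query to the sorted gene names whose intervals overlap its hit."""
    result = {}
    for query, (chrom, start, end) in hits.items():
        names = {
            name
            for g_start, g_end, name in genes.get(chrom, [])
            if start <= g_end and g_start <= end  # closed-interval overlap
        }
        result[query] = sorted(names)
    return result
```

Calling `annotate(unique_hits(open("hits.tsv")), gene_intervals(open("hg38.gtf")))` (both file names are placeholders) would return a dictionary from sequence ID to a list of gene names. The scan over all exons is linear per hit, which should be tolerable for a few thousand sequences but is not optimized. Hits that fall entirely within introns get an empty list when only `exon` lines are used.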

Option B:

Use the Jupyter Interactive Environment to run R (and other packages) directly in Galaxy. Launch it from an expanded dataset by clicking the corresponding icon.

Nothing · July 23, 2019, 5:23am

Thank you very much for your response.
