Hi all,
I have analyzed my RNA-Seq data. I used these tools:
Downloaded the sequences (SRA) from the NCBI database.
FastQC (to check the sequencing quality).
Trimmomatic (the quality of each raw library is analyzed, and sequencing adapters
and low-quality reads are removed).
I used the paired-end data as input to HISAT.
I got counts with htseq-count.
I used the DESeq2 package in Galaxy to get the up- and down-regulated genes.
Now I don't know how to get novel lncRNAs. How can I do that?
Is the goal to find out which lncRNAs are present and differentially expressed in your RNA-seq data? Or to also perform your own lncRNA discovery to use with DE (or other) analysis?
Known non-coding genomic annotation can be incorporated into the count/differential expression analysis. It may already have been included but then filtered out; that depends on which annotation features were present in your GTF and used to generate counts.
For lncRNA discovery (and subsequent DE with your RNA-seq data), you’ll need to create genomic annotation data that contains non-coding genomic feature predictions.
Note that genome annotation tools won’t work for larger eukaryotic genomes on any public Galaxy server (the analysis will be too large). That may not even be necessary, as non-coding annotation resources are already available for many model organisms. For example, Gencode includes lncRNAs for human and mouse in the complete annotation GTFs, and as distinct GTFs: https://www.gencodegenes.org/
Check your annotation GTF input – does it include non-coding RNA annotation? If so, that information could be used to filter your differentially expressed genes.
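If you want to do that check outside Galaxy, here is a minimal Python sketch of the idea, not a definitive implementation. It makes several assumptions: the gene biotype is stored in a `gene_type` (GENCODE-style) or `gene_biotype` (Ensembl-style) attribute, the DESeq2 result is a tab-separated table with the gene ID in the first column and the adjusted p-value in the seventh, and the gene IDs in the table match those in the GTF exactly (including any version suffix). The function names and the default biotype labels are hypothetical; adjust them to what your GTF actually contains.

```python
import re
from collections import Counter

# Matches GTF attributes of the form: key "value";
_ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_gene_biotypes(gtf_lines):
    """Return {gene_id: biotype} from an iterable of GTF lines.

    Assumes the biotype is in 'gene_type' or 'gene_biotype'.
    Lines without either attribute are ignored.
    """
    biotypes = {}
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = dict(_ATTR_RE.findall(fields[8]))
        gene_id = attrs.get("gene_id")
        biotype = attrs.get("gene_type") or attrs.get("gene_biotype")
        if gene_id and biotype:
            biotypes.setdefault(gene_id, biotype)
    return biotypes


def summarize_biotypes(biotypes):
    """Count genes per biotype, to see whether non-coding genes are annotated."""
    return Counter(biotypes.values())


def filter_lncrna_rows(de_rows, biotypes, wanted=("lncRNA", "lincRNA"),
                       id_col=0, padj_col=6, alpha=0.05):
    """Keep DE rows whose gene has a wanted biotype and padj < alpha.

    Column positions are assumptions about the DESeq2 table layout.
    Header rows and rows with a non-numeric padj (e.g. 'NA') are dropped.
    """
    kept = []
    for row in de_rows:
        if biotypes.get(row[id_col]) not in wanted:
            continue
        try:
            padj = float(row[padj_col])
        except ValueError:
            continue
        if padj < alpha:
            kept.append(row)
    return kept
```

To use it, pass an open GTF file handle to `parse_gene_biotypes`, look at `summarize_biotypes` to see which labels are present, then pass the rows of your DESeq2 table (for example read with `csv.reader(handle, delimiter="\t")`) to `filter_lncrna_rows`. If the summary shows no non-coding biotypes at all, the GTF used for counting did not include them.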
Predicting long non-coding RNA is a non-trivial analysis, but there are several domain-specific Galaxy servers that focus on this area. Many have tutorials, examples, novel tools, and linked publications. Some analyses work with RNA-seq data directly (assembly, annotation). To review the options, go to the Galaxy Platform Directory, switch to the Public Galaxy servers tab, and search by keyword (example: “rna”).