What are the RNA-seq data processing steps in Galaxy for a de novo approach?

Hi @f_kurt

Please see this prior Q&A. It covers most of what you are asking about and is worth reviewing first: What to do with trinity output files

For tools that appear to be missing, the Galaxy server you are working on may not have them installed. Or the tool panel search may not be finding them for some reason (a typo in the search term?). You can also directly check.

  • Trimmomatic is often under the tool group “FASTQ Quality Control”. And yes, performing QA is probably a good idea. FastQC only produces a quality report; it makes no changes to the data. It is important that both ends of paired-end data are input to the assembly, and Trimmomatic outputs reads that are still properly paired after QA.
  • Assemblystats is available in the ToolShed but is not hosted at all public Galaxy servers.
  • Fasta Statistics is hosted and is usually under the tool group “FASTA/FASTQ”. Or, you may be more interested in Compute contig Ex90N50 statistic and Ex90 transcript count from a Trinity assembly in the tool group “RNA Analysis”.
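
If it helps to see what a contig statistic such as N50 actually measures, here is a minimal Python sketch. It is only an illustration, not the code behind any Galaxy tool, and the function names `n50` and `fasta_lengths` are made up for this example. It also does not compute Ex90N50: as I understand it, that statistic first restricts the assembly to the most highly expressed transcripts accounting for 90% of total expression, so it needs expression values as well as the FASTA.

```python
def fasta_lengths(lines):
    """Yield the sequence length of each record in FASTA-formatted lines."""
    length = None  # None until the first header line is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Return the N50 of a collection of contig lengths (0 if empty).

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0


if __name__ == "__main__":
    # Toy assembly (hypothetical data): two contigs of 6 and 2 bases.
    toy = [">contig_a", "ACGT", "AC", ">contig_b", "GG"]
    print(n50(fasta_lengths(toy)))
```

To try it on a real assembly downloaded from Galaxy, pass an open file handle instead of the toy list, for example `n50(fasta_lengths(open("Trinity.fasta")))`, where the file name is just a placeholder.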

The “Transcriptomics” topic of the GTN tutorials covers RNA-seq analysis, though not every step of this exact analysis. That material is still a work in progress, and the remaining steps are explained in the Q&A linked above.

Thanks!