Trying to Create Transcriptomes as FASTA (current workflow HISAT2-Samtools-Cufflinks)

mhlee · May 24, 2022, 7:26pm

Hello,
I am trying to generate FASTA files of the transcriptome of a species from RNA-seq data I currently have, for use in ExUTR. I ran HISAT2 with the option for downstream Cufflinks analysis, then sorted and indexed the BAM files with Samtools. However, my workflow (both outside of Galaxy and in Galaxy) has stalled. I have my sorted BAM files, but attempting to run Cufflinks on them generates this set of errors.


Fatal error: Exit code 1 ()
Fatal error: Matched on Error
Tool generated the following standard error:

Traceback (most recent call last):
  File "/opt/galaxy/tools/ngs_rna/cuffsuite2/cufflinks_wrapper.py", line 9, in <module>
    from galaxy.datatypes.util.gff_util import parse_gff_attributes, gff_attributes_to_str
ImportError: No module named galaxy.datatypes.util.gff_util
Galaxy job runner generated the following standard error:

Traceback (most recent call last):
  File "metadata/set.py", line 1, in <module>
    from galaxy_ext.metadata.set_metadata import set_metadata; set_metadata()
  File "/opt/galaxy/lib/galaxy_ext/metadata/set_metadata.py", line 20, in <module>
    from galaxy.metadata.set_metadata import set_metadata
  File "/opt/galaxy/lib/galaxy/metadata/__init__.py", line 14, in <module>
    from galaxy.model import store
  File "/opt/galaxy/lib/galaxy/model/store/__init__.py", line 15, in <module>
    from bdbag import bdbag_api as bdb
  File "/opt/galaxy/.venv/lib/python3.6/site-packages/bdbag/__init__.py", line 23, in <module>
    from distutils.util import strtobool
  File "/opt/galaxy/.venv/lib/python3.6/distutils/__init__.py", line 44, in <module>
    from distutils import dist, sysconfig  # isort:skip
ImportError: cannot import name 'dist'

I am looking for a way to resolve this.

The options I ran are as follows:

|Option|Value|
|---|---|
|SAM or BAM file of aligned RNA-Seq reads|* 1:Gbfemale-mated-bac2.sorted.bam|
|Max Intron Length|300000|
|Min Isoform Fraction|0.1|
|Pre MRNA Fraction|0.15|
|Use Reference Annotation|Use reference annotation|
|Reference Annotation|* 5: VectorBase-57_GbrevipalpisIAEA.gff|
|Count hits compatible with reference RNAs only|No|
|Perform Bias Correction|Yes|
|Reference sequence data|cached|
|Using reference genome|VectorBase-49_GbrevipalpisIAEA_Genome|
|Use multi-read correct|No|
|Apply length correction|Cufflinks Effective Length Correction|
|Global model (for use in Trackster)|No dataset.|
|Set advanced Cufflinks options|Yes|
|Library prep used for input reads|Auto Detect|
|Mask File||
|Inner mean distance|45|
|Inner distance standard deviation|20|
|Max MLE iterations|5000|
|Alpha value for the binomial test used during false positive spliced alignment filtration|0.001|
|percent read overhang taken as suspiciously small|0.09|
|Intronic overhang tolerance|8|
|Maximum genomic length of a given bundle|3500000|
|Maximum number of fragments per locus|1000000|
|Minimal allowed intron size|50|
|Minimum average coverage required to attempt 3prime trimming.|10|
|The fraction of average coverage below which to trim the 3prime end of an assembled transcript.|0.1|

igor · May 25, 2022, 12:57am

BAMs created in Galaxy are coordinate-sorted and indexed. Tools such as HISAT2 are in fact pipelines and include a samtools sort step, among others.
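
For BAM files produced outside Galaxy, the sort order can be checked from the header rather than assumed. The following is a minimal sketch, assuming the header text has already been obtained (for example with `samtools view -H`); the function name `sam_sort_order` and the example header are hypothetical and only illustrate reading the `SO` tag of the `@HD` line.

```python
def sam_sort_order(header_text):
    """Return the SO (sort order) value from the @HD header line, or None if absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs after the record type.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
            return None
    return None


# Hypothetical header text, made up for illustration only.
example_header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:scaffold_1\tLN:1000\n"
print(sam_sort_order(example_header))
```

A value of `coordinate` indicates a coordinate-sorted file; a missing `@HD` line or `SO` tag means the sort order is simply not recorded in the header.
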
Cufflinks tools are marked as deprecated on (some) Galaxy servers. Have a look at the tutorial "De novo transcriptome reconstruction with RNA-Seq".
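
Whichever assembler produces the transcript annotation, turning it into a transcriptome FASTA amounts to splicing each transcript's exon intervals out of the genome sequence. The sketch below illustrates that idea only, assuming a GTF-style annotation with `transcript_id "..."` attributes and a genome already loaded into a dictionary; the function names and the toy data are hypothetical. A GFF3 file such as the VectorBase annotation uses a different attribute syntax and would need its own parsing, and in practice a dedicated tool (for example `gffread` with its `-w` and `-g` options) is the usual choice.

```python
import re
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def transcripts_from_gtf(gtf_lines, genome):
    """Build {transcript_id: spliced sequence} from GTF exon records.

    genome is a dict mapping sequence name -> sequence string.
    GTF coordinates are 1-based and inclusive.
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', cols[8])
        if match is None:
            continue
        exons[match.group(1)].append((cols[0], int(cols[3]), int(cols[4]), cols[6]))

    transcripts = {}
    for tid, parts in exons.items():
        parts.sort(key=lambda exon: exon[1])  # order exons by genomic start
        seq = "".join(genome[name][start - 1:end] for name, start, end, _ in parts)
        if parts[0][3] == "-":
            seq = reverse_complement(seq)  # minus-strand transcripts read the other way
        transcripts[tid] = seq
    return transcripts


# Toy data, made up for illustration only.
genome = {"chr_demo": "AAACCCGGGTTTAAACCC"}
gtf = [
    'chr_demo\tdemo\texon\t1\t3\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr_demo\tdemo\texon\t7\t9\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr_demo\tdemo\texon\t4\t6\t.\t-\t.\tgene_id "g2"; transcript_id "t2";',
]
result = transcripts_from_gtf(gtf, genome)
for tid, seq in result.items():
    print(f">{tid}\n{seq}")
```

The sketch keeps the whole genome in memory and does no validation of coordinates, so it is meant to show the logic rather than to replace an established tool on real data.
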
