GWAS troubleshooting

Dongqi_Xie · December 3, 2025, 10:14pm

Hi,

Have you solved this issue? I also came across the same error saying “no CDS” in my gtf file.

Thanks!

jennaj · December 3, 2025, 10:43pm

The original topic has an example history with all of the steps completed, so I think this is working unless there is a new problem. We can investigate!

XRef

The important part for the “no CDS” error is that those coding regions originally come from a known annotation source (Gencode, UCSC, RefSeq) and are later mapped onto any predicted annotation created (from Stringtie) during the discovery steps of the protocol.

If you didn’t include any known annotation yet, then that is where to start. If you did include known annotation but it didn’t seem to “map over” onto your final combined annotation from the discovery steps, we can probably help to resolve this. Most of the topics at this forum with the reference-annotation tag are about resolving similar data issues if you want to peek at what those look like first.

Then, would you like to clarify more about your situation? You could include: Are you following the tutorial methods exactly? Or using your own data? With the human genome assembly or a different species’ assembly? With a custom genome or using a server index?

You could also share back your history so far and we’ll be able to see those details, and maybe what is going wrong, plus learn the server you are working at and see the full error logs. How to share your work is here. → How to get faster help with your question

Let’s start there!

Dongqi_Xie · December 4, 2025, 3:40pm

Hi Jennaj,

Thanks for your reply. I did follow the tutorial. I was analyzing my own data generated from human cells and used the genome build and annotation file downloaded from Gencode. I checked my gtf file and it shows the “CDS” feature in column 3, while the annotation file generated by Stringtie Merge doesn’t seem to contain any “CDS” features. I am not sure whether this could be a problem, but I will first check the history you shared in the post.

Thanks!

jennaj · December 4, 2025, 5:56pm

Hi @Dongqi_Xie

It sounds like everything is set up well so far!

Known: Gencode annotation will have CDS regions annotated.
Discovery: Stringtie annotation will NOT have CDS regions annotated (it does not predict coding region footprints).

These two are then compared to put all of the annotations together into a reconciled “master annotation” file, which is what is used for the differential exon expression analysis.
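If you want to confirm which feature types each annotation file actually contains, a minimal sketch along these lines could help. It assumes a standard tab-separated GTF where column 3 holds the feature type. The function name `count_feature_types` and the example lines are made up for illustration and are not part of the tutorial or of any Galaxy tool.

```python
from collections import Counter


def count_feature_types(gtf_lines):
    """Count the values found in column 3 (feature type) of GTF lines.

    Hypothetical helper for illustration: accepts any iterable of lines,
    such as an open file handle or a list of strings.
    """
    counts = Counter()
    for line in gtf_lines:
        line = line.rstrip("\n")
        # Skip blank lines and header/comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A GTF record has 9 tab-separated columns; ignore anything else.
        if len(fields) < 9:
            continue
        counts[fields[2]] += 1
    return counts


# Made-up lines shaped like a known annotation (includes a CDS record).
known_example = [
    "##description: made-up example\n",
    'chr1\tHAVANA\tgene\t100\t900\t.\t+\t.\tgene_id "G1";\n',
    'chr1\tHAVANA\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tHAVANA\texon\t100\t400\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tHAVANA\tCDS\t150\t400\t.\t+\t0\tgene_id "G1"; transcript_id "T1";\n',
]

# Made-up lines shaped like a discovery annotation (transcript and exon only).
discovery_example = [
    'chr1\tStringTie\ttranscript\t100\t900\t.\t+\t.\tgene_id "MSTRG.1"; transcript_id "MSTRG.1.1";\n',
    'chr1\tStringTie\texon\t100\t400\t.\t+\t.\tgene_id "MSTRG.1"; transcript_id "MSTRG.1.1";\n',
]

known_counts = count_feature_types(known_example)
discovery_counts = count_feature_types(discovery_example)
print("known:", dict(known_counts))
print("discovery:", dict(discovery_counts))
print("discovery has CDS:", discovery_counts["CDS"] > 0)
```

For a real file downloaded from your history, you could pass an open file handle instead of a list, for example `count_feature_types(open("annotation.gtf"))`, where the file name is a placeholder.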

The exact step is what was discussed in the original topic.

Did you run your data through this way? Or am I misunderstanding where the problem is coming up?

If this was the step, and the annotation didn’t actually merge together for some reason, there is likely a mismatch somewhere between reference data assembly versions (the actual chromosome names and strings of bases). We can help to troubleshoot here if you would like to share back your history for review. (You can unshare after!) Screenshots may also work but you’ll need to capture a lot of details for the jobs, parameters, and the input data content.
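For the chromosome name part of that check, here is a rough sketch of one way to compare the sequence names in a genome fasta against column 1 of a gtf. It only compares names, not the bases themselves, and the function names and example lines are hypothetical, chosen just for illustration.

```python
def fasta_names(fasta_lines):
    """Collect sequence names from fasta header lines (text after '>' up to the first space)."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def gtf_names(gtf_lines):
    """Collect the sequence names found in column 1 of GTF lines."""
    names = set()
    for line in gtf_lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.rstrip("\n").split("\t", 1)[0])
    return names


def compare_names(fasta_lines, gtf_lines):
    """Report which sequence names are shared and which appear in only one file."""
    fa = fasta_names(fasta_lines)
    gtf = gtf_names(gtf_lines)
    return {
        "shared": fa & gtf,
        "only_in_gtf": gtf - fa,
        "only_in_fasta": fa - gtf,
    }


# Made-up example: the mitochondrial sequence is named differently in each file.
example_fasta = [">chr1 made-up description\n", "ACGTACGT\n", ">chrM\n", "ACGT\n"]
example_gtf = [
    "##made-up header\n",
    'chr1\tHAVANA\texon\t100\t400\t.\t+\t.\tgene_id "G1";\n',
    'MT\tHAVANA\texon\t10\t40\t.\t+\t.\tgene_id "G2";\n',
]

report = compare_names(example_fasta, example_gtf)
for key in ("shared", "only_in_gtf", "only_in_fasta"):
    print(key, sorted(report[key]))
```

Any name listed under `only_in_gtf` points at annotation lines that have no matching sequence in the genome, which is the kind of mismatch worth looking at first.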

FAQ: Sharing your History

Another example for getting Gencode data synched up was in this topic. The user decided to use the Gencode human assembly – both the fasta and the gtf – since this avoided issues that can come up between patch assembly alt/haplotype fragments that they cared about. UCSC and Gencode have slightly different versions – we have the UCSC version indexed on the server but you can use a custom genome from Gencode if you want to. The first files in this shared history are the data they selected and each have the original file names as hosted at Gencode → https://www.gencodegenes.org/human/.

These concerns about reference data would happen anywhere a large analysis like this would be done, especially when the comparisons are so detailed – differential exons or variants in particular. The tools are reviewing the reference very closely and will complain over even minor unexpected file format/content variations.

Ok, so that’s a lot of information! But the correction is probably small; we just need to review to learn where to make the adjustment. I’ll watch for your reply.