RNAseq of mouse with AAV synthetic transgene - how to do STAR alignment?

Hi everyone! I’m trying to do RNAseq from mice that have been treated with an AAV vector containing an artificial transgene. I want to look at transgene expression and changes to endogenous mouse gene expression. How do I do the alignment for gene expression counts for a genetically modified mouse?

So far I have been merging the FASTA file of GRCm39.11 with the FASTA of my AAV transgene plasmid using the “FASTA Merge Files and Filter Unique Sequences” tool.

I’m very inexperienced in bioinformatics… What do I do next? I don’t know how to edit the GTF file, since the features exported from the SnapGene plasmid map don’t match the columns of the mouse GTF file, and I’m not sure what to use as start and stop sites.

I appreciate your help!
R

Welcome, @RebeccaW

This is correct for the reference sequence (fasta) portion.

For the reference annotation, you would normally just concatenate the extra line (in GTF format) onto the full GTF file.

But – do you know where the insertion is yet? You can share the data you have now if you want.

Thank you! The transgene is delivered via AAV, so it shouldn’t (in most cases) integrate, although a very small percentage still will integrate randomly.

Hi @RebeccaW

Thanks for explaining. :slight_smile:

If the annotation is expected to be on its own “chromosome”, then you can create a line for the plasmid “chromosome” with appropriate coordinates, then add that line to your GTF file. How to format → Genome Browser FAQ (GTF format specification)
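
In case it helps to see what such a line looks like, here is a minimal Python sketch that builds one GTF line for the plasmid "chromosome". Everything specific in it is a placeholder I made up for illustration: the contig name `AAV_plasmid`, the gene ID `my_transgene`, and the coordinates 1201 to 2700. For start and end, one reasonable choice (my assumption, not a rule) is the span of the transcribed region or the coding sequence as shown on your plasmid map, counted from base 1 of the plasmid FASTA. As far as I know, STAR and most counting tools read `exon` features and rely on the `gene_id` and `transcript_id` attributes, so the sketch writes an `exon` line.

```python
def make_gtf_line(seqname, start, end, gene_id,
                  feature="exon", strand="+", source="custom"):
    """Return one GTF line: 9 tab-separated columns, 1-based inclusive coordinates."""
    if start < 1 or end < start:
        raise ValueError("GTF coordinates are 1-based and need start <= end")
    # Column 9: attributes; gene_id and transcript_id are required by the GTF spec.
    # The transcript ID here is simply derived from the gene ID (an assumption).
    attributes = (
        f'gene_id "{gene_id}"; '
        f'transcript_id "{gene_id}_t1"; '
        f'gene_name "{gene_id}";'
    )
    # Columns 6 (score) and 8 (frame) are left as "." because they are unknown here.
    fields = [seqname, source, feature, str(start), str(end),
              ".", strand, ".", attributes]
    return "\t".join(fields)


# Hypothetical values: replace with your own contig name, coordinates, and gene ID.
line = make_gtf_line("AAV_plasmid", 1201, 2700, "my_transgene")
print(line)
```

The printed line can then be appended to the end of the mouse GTF (in a text editor, or with a concatenate tool), keeping the tabs between columns intact.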

And as a reminder: Make sure the plasmid label has the same identifier in both the fasta and the GTF, and that it doesn’t conflict with any of the standard chromosome name identifiers already in use. This avoids other types of technical problems.
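
A quick way to check this before running the alignment is sketched below in Python. It is only an illustration under my own assumptions: the function names are made up, the toy FASTA and GTF text stand in for your real files, and the sequence identifier is taken to be the first word after `>` on each FASTA header line.

```python
def fasta_ids(fasta_text):
    """Collect sequence identifiers: the first word after '>' on each header line."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                ids.append(parts[0])
    return ids


def gtf_seqnames(gtf_text):
    """Collect the distinct values of GTF column 1, skipping comments and blank lines."""
    names = set()
    for line in gtf_text.splitlines():
        if line.strip() and not line.startswith("#"):
            names.add(line.split("\t")[0])
    return names


def check_names(fasta_text, gtf_text):
    """Return (identifiers duplicated in the FASTA, GTF seqnames absent from the FASTA)."""
    ids = fasta_ids(fasta_text)
    duplicated = sorted({i for i in ids if ids.count(i) > 1})
    missing = sorted(gtf_seqnames(gtf_text) - set(ids))
    return duplicated, missing


# Toy data with hypothetical names; the GTF deliberately uses a mismatched plasmid label.
toy_fasta = ">chr1 mouse\nACGT\n>AAV_plasmid my vector\nGGCC\n"
toy_gtf = (
    'chr1\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n'
    'AAV-plasmid\tcustom\texon\t1\t4\t.\t+\t.\tgene_id "tg"; transcript_id "tg_t1";\n'
)
duplicated, missing = check_names(toy_fasta, toy_gtf)
print("duplicated in FASTA:", duplicated)
print("in GTF but not in FASTA:", missing)
```

Both lists should come back empty for a consistent reference; in the toy data the second list flags `AAV-plasmid`, because the FASTA header spells it `AAV_plasmid`.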

Thanks!

Thank you!
