Hi,
I want to call a variant in a gene from a publicly available data set.
These are transgenic mice expressing a human transgene not present in rodents.
My plan is to map the reads to a human reference genome using STAR and then call variants with FreeBayes. My question is whether, instead of using the whole reference genome, I can use only chromosome 22, which is where the gene is.
I’m not exactly sure what you are trying to do, but you can explain more if needed.
STAR will not create a reference assembly.
You can use an existing assembly from your target genome, in whole or in part, as a Custom Genome/Transcriptome to call variants that exist in the RNA-seq dataset. You can also assemble a reference transcriptome.
That said, I would suggest reviewing the publication related to the work you are trying to replicate: what reference did they use? You should use the same reference in your work, or expect different results/statistics.
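For the "in part" option, here is a minimal Python sketch of pulling a single chromosome out of a multi-sequence FASTA so it can be supplied as a custom genome. The function name `extract_sequence` and the toy input are made up for illustration, and the sequence name `chr22` is an assumption: some references name it `22` instead, so check the header lines of your own file first.

```python
import io


def extract_sequence(fasta_in, fasta_out, target="chr22"):
    """Copy the record whose ID equals `target` from fasta_in to fasta_out.

    The ID is taken as the first whitespace-delimited token after '>'.
    Returns the number of sequence lines written.
    """
    keep = False
    written = 0
    for line in fasta_in:
        if line.startswith(">"):
            tokens = line[1:].split()
            # Switch copying on only for the requested sequence.
            keep = bool(tokens) and tokens[0] == target
            if keep:
                fasta_out.write(line)
        elif keep:
            fasta_out.write(line)
            written += 1
    return written


# Toy input standing in for a real reference file (hypothetical data).
toy_reference = io.StringIO(
    ">chr21 toy\nACGT\n>chr22 toy\nGGCC\nTTAA\n>chrX toy\nNNNN\n"
)
subset = io.StringIO()
extract_sequence(toy_reference, subset, target="chr22")
print(subset.getvalue(), end="")
```

With real data you would pass open file handles instead of the `io.StringIO` objects. The function streams line by line, so it does not need to hold the whole genome in memory.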
Thanks for the answer, my previous post was unclear. I rewrote it for clarity.
From your answer, though, I think that what I want to use is called a “custom genome”. I am exploring the tool NormalizeFASTA from the link.
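As far as I understand it, this kind of normalization mostly means rewrapping the sequence lines to one fixed width and cutting each sequence name at the first whitespace. Below is a rough Python sketch of that idea only. It is not the tool's implementation, and the function name `normalize_fasta` and the width of 60 are my own assumptions.

```python
def normalize_fasta(lines, width=60):
    """Return FASTA lines with short names and sequence wrapped to `width`.

    Headers are truncated at the first whitespace; blank lines are dropped.
    """
    out = []
    pending = []  # sequence fragments of the record currently being read

    def flush():
        # Join the fragments of one record and re-split at a fixed width.
        joined = "".join(pending)
        for start in range(0, len(joined), width):
            out.append(joined[start:start + width])
        pending.clear()

    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            flush()
            tokens = line[1:].split()
            out.append(">" + (tokens[0] if tokens else ""))
        else:
            pending.append(line)
    flush()
    return out


# Hypothetical toy record with a long header and uneven line lengths.
toy = [">chr22 Homo sapiens chromosome 22\n", "ACG\n", "TACGT\n"]
for normalized_line in normalize_fasta(toy, width=4):
    print(normalized_line)
```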