Hi
I want to extract sequence alignments between human and macaque, for which I have the human (hg19) genome coordinates in BED format. I uploaded a file containing the entry below:
chr1 840486 841186 XLOC_000010 0 + 840486 841186 0,0,0 2 573,5, 0,695,
However, sampling the outputs, I get the following:
Extract genomic DNA:
chr1:840486-841186(+) CTGTTGGGTCCCCTCCCCGCCCCACCGCGTCCCAGGGAACCCCGGCAGGG CACCCAGTGAGGGGGGCCCGGGCGTCCGCCCATTCCTCACTGCTGTCCCC GCCTGTGCCCGAAACCCCCGTTCACGTTCACCGAGAAAACAGACATAAAC CCAGCCAGGcacatccactagaatggctgtgatttcagaaaagcggacgt aagtgctgccgaggagatggaggcgttggaccccctcgcgcattgtcggg gcgggtgcagccgcggtgcaaaaggaccttcctcagaaagttgagcacga agttcccacaggcccggaagttcccctcccgggcgctccccagagagctg aagacTGGGCGCGTGCCGGCGGCAAATGTTCACAGCAAAGGGCGGCCCAG TGCGCAGCTGGAACATCAGCCCCAGGCGGTCGCTGCGACAGGGACGAGCC TGGGAAACGTGAaaatgtccagaacagggaatccacagataaagaaaaga catcgtggttgccagagctcgcgggagggggcaacagggaccgactgctt aacgtgtatggttccccttcagggtgaggacgtgttctgggacgaggtcg aggtcagggttgcaggacaagatgaaagcgctaaatgccactgaattgtt tgctttaacgtcattaattttgttatgtgaatttcatctcaatAGAATAA
Stitch gene blocks:
hg19.XLOC_000010 CTGTTGGGTCCCCTCCCCGCCCCACCGCGTCCCAGGGAACCCCGGCAGGGCACCCAGTGAGGGGGGCCCGGGCGTCCGCCCATTCCTCACTGCTGTCCCCGCCTGTGCCCGAAACCCCCGTTCACGTTCACCGAGAAAACAGACATAAACCCAGCCAGGcacatccactagaatggctgtgatttcagaaaagcggacgtaagtgctgccgaggagatggaggcgttggaccccctcgcgcattgtcggggcgggtgcagccgcggtgcaaaaggaccttcctcagaaagttgagcacgaagttcccacaggcccggaagttcccctcccgggcgctccccagagagctgaagacTGGGCGCGTGCCGGCGGCAAATGTTCACAGCAAAGGGCGGCCCAGTGCGCAGCTGGAACATCAGCCCCAGGCGGTCGCTGCGACAGGGACGAGCCTGGGAAACGTGAaaatgtccagaacagggaatccacagataaagaaaagacatcgtggttgccagagctcgcgggagggggcaacagggaccgactgcttaacgtgtatggttccccttcaggAATAA
Am I interpreting this wrongly, or are the exons wrongly annotated as a mix of uppercase and lowercase? The motif ‘TTCAGG’ should be the end of the first exon, but it is lowercase in both outputs (I’ve marked it in bold in the first sequence).
The sequence itself seems to be what I would expect, but the uppercase/lowercase pattern across exons and introns is not. I need both exon and intron sequence for further analysis, so this is an issue, unless I am missing something obvious here.
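For reference, this is how I am reading the blocks in the BED entry above. It is only a minimal sketch, assuming the standard BED12 convention (0-based, half-open coordinates, with blockStarts given as offsets from chromStart); the function name `bed12_blocks` is my own and not part of any tool.

```python
def bed12_blocks(line):
    """Return (chrom, start, end) for each block of one BED12 line.

    Assumes 0-based, half-open coordinates and blockStarts relative
    to chromStart (the standard BED12 convention).
    """
    fields = line.split()
    chrom = fields[0]
    chrom_start = int(fields[1])
    block_count = int(fields[9])
    # BED12 lists may carry a trailing comma, so skip empty items.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]
    if not (len(sizes) == len(starts) == block_count):
        raise ValueError("blockCount does not match blockSizes/blockStarts")
    return [(chrom, chrom_start + s, chrom_start + s + n)
            for s, n in zip(starts, sizes)]


entry = "chr1 840486 841186 XLOC_000010 0 + 840486 841186 0,0,0 2 573,5, 0,695,"
blocks = bed12_blocks(entry)
for chrom, start, end in blocks:
    print(chrom, start, end, end - start)
```

Read this way, the first block is 573 bases and the second is the final 5 bases, which matches the stitched output ending in "ttcaggAATAA". So the block boundaries look right to me; it is only the letter case that I cannot explain.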
Can anybody help?
Thanks
Liam