We can finally report the 5' end of the
unc-22 message and the N-terminus of twitchin. We sequenced an additional 7825 bp to the right of our previous sequence and now have a total of 54,963 bp of genomic sequence. Previously, we believed that a likely initiator methionine for twitchin resided at position 21,952, but this is located 12,914 bp downstream of the Tc1 insertion site for the
unc-22 allele
st139, suggesting that coding sequence extends further 5'. Computer inspection for likely coding sequence and for motifs I and II was not successful. During a search for cDNA clones for
spe-17, one clone, pSL12, turned out to hybridize only to the
unc-22 message, and its sequence is distributed between positions 3958 and 9439. The remaining gap in coding sequence was filled by PCR of first-strand cDNA using one primer in coding sequence just 3' of the previous ATG and one primer in coding sequence defined by pSL12. The 5' end of the message was determined by primer extension and by its being trans-spliced to SL1 RNA (indicated by PCR with first-strand cDNA and an SL1 primer, and by the fact that pSL12 begins with the last 9 bp of SL1). Including a 5' untranslated sequence of 149 bp, we need to add 2522 bp of message, yielding a final
unc-22 message of 21,614 bp! This additional coding sequence is interrupted by 17 introns, the largest being 7402 bp (final total of 30 introns), giving 17 new exons, each no longer than 370 bp and the smallest being 54 bp. Most importantly, this adds 791 amino acids (total of 6840) to twitchin, resulting in a single polypeptide of 753,570 Da. There are 4 additional copies of motif II (total of 30) and 2 segments, of 167 and 189 amino acids, that have no homology to proteins in the databases. The Tc1 insertion site for
st139 resides in coding sequence for the third copy of motif II, 7 bp downstream of the 7th intron. In the following figure, motif I copies are denoted 1-31 and motif II copies are denoted 1'-30'. [See Figure 1]
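
The figures quoted above can be cross-checked with a little arithmetic. The Python sketch below is not part of the original analysis. It assumes that the 2522 bp of added message consists of the 149 bp 5' untranslated sequence plus three bases per added amino acid, and that all quoted positions refer to one coordinate system. The variable names are illustrative only.

```python
# Figures quoted in the text.
UTR_5_BP = 149            # 5' untranslated sequence
ADDED_MESSAGE_BP = 2522   # message added by this work
FINAL_MESSAGE_BP = 21_614
ADDED_AA = 791
TOTAL_AA = 6840
MASS_DA = 753_570
OLD_ATG_POS = 21_952      # previously assumed initiator methionine
ATG_TO_TC1_BP = 12_914    # distance from that ATG back to the st139 Tc1 site
PSL12_SPAN = (3958, 9439) # genomic span over which pSL12 is distributed

# Assumption: added message = 5' UTR + 3 bases per added amino acid.
added_coding_bp = 3 * ADDED_AA
added_from_parts = UTR_5_BP + added_coding_bp

# Values implied for the previously reported sequence.
previous_message_bp = FINAL_MESSAGE_BP - ADDED_MESSAGE_BP
previous_aa = TOTAL_AA - ADDED_AA

# Assumption: the ATG position and the pSL12 span share one coordinate system.
tc1_pos = OLD_ATG_POS - ATG_TO_TC1_BP
tc1_in_psl12_span = PSL12_SPAN[0] <= tc1_pos <= PSL12_SPAN[1]

# Average mass per residue implied by the quoted polypeptide mass.
mean_residue_mass_da = MASS_DA / TOTAL_AA

print("added message from parts:", added_from_parts, "bp")
print("previous message:", previous_message_bp, "bp")
print("previous protein:", previous_aa, "amino acids")
print("Tc1 site position:", tc1_pos, "within pSL12 span:", tc1_in_psl12_span)
print("mean residue mass:", round(mean_residue_mass_da, 1), "Da")
```

Under these assumptions the added message length reconciles exactly (149 + 3 x 791 = 2522). The earlier totals would have been 19,092 bp of message and 6049 amino acids, and the st139 insertion falls at position 9038, inside the span covered by pSL12.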