lin-4 encodes a small RNA that acts as an antisense repressor of
lin-14 and
lin-28 translation. Based on Genefinder analysis of the genomic sequence,
lin-4 was predicted to be located in an intron of a host gene, which we have tentatively named
lho-1 (
lin-4 host gene).
lho-1 was predicted to have 10 exons and shows weak similarity to a small region of the vasotocin and mammalian vasopressin precursors. The function of
lho-1 is not yet known. The original Genefinder prediction placed
lin-4 in a 1.29 kb intron between exons 1 and 2 of
lho-1. To clarify the structural and regulatory relationship between
lin-4 and
lho-1, we performed RT-PCR to determine the actual intron/exon structure of
lho-1 in the
lin-4 region, and to identify its 5' terminal sequence. We also used host gene:GFP fusions to identify
lho-1 promoter sequences. cDNA complementary to the 5' sequence of
lho-1 mRNA was synthesized from total C. elegans RNA using Tth DNA polymerase at 67°C (high temperature was required owing to apparent secondary structure in
lho-1 5' sequences). cDNA sequences were amplified by PCR using an SL1 primer and a second primer specific to
lho-1 exon 3. Sequence analysis of the resulting PCR product showed that, contrary to Genefinder predictions, exon 1 of
lho-1 lies 4.11 kb upstream of exon 2, between bp 2229 and 2403 of cosmid F59G1.
lin-4 is located at bp 6168-6189 of F59G1, between exons 1 and 2. Exons 2 and 3 are as predicted by Genefinder. There is no in-frame ATG in exon 1 or exon 2, whereas exon 3 contains three in-frame ATG codons, suggesting that the translation start site of
lho-1 is in exon 3. To identify the location of
lho-1 promoter sequences with respect to the
lin-4 promoter, three different GFP fusion protein constructs were made with various lengths of
lho-1 5' sequences upstream of exon 2. (GFP was inserted into a BamHI site in exon 3, in frame with the
lho-1 predicted protein coding sequence.) These constructs were injected into
dpy-20 worms along with
dpy-20 cosmid, and GFP expression was assayed in transformed lines. The shortest fusion construct, PFL105, containing only 750 bp upstream of exon 2 (essentially just
lin-4 and its promoter sequence--this construct is sufficient to rescue
lin-4 null mutants), showed no GFP expression. A longer construct, PFL103, which includes 3.5 kb upstream of exon 2, showed weak GFP expression in HSN neurons, intestine, and a few neurons in the head and tail of adults. Animals transformed with the longest construct, PFL107, which includes 4.48 kb upstream of exon 2 (encompassing exon 1 and 370 bp upstream of exon 1), displayed strong GFP expression in HSNs, vulva, intestine, neurons in the tail, and numerous neurons in the head along the pharynx. Expression was also seen in adults in a cell tentatively identified as ALA. From these results, we conclude that
lin-4 is located within a 4.11 kb intron of
lho-1. The expression of
lho-1::GFP requires more than 3.5 kb of sequence upstream of
lho-1 exon 2, while
lin-4 expression requires only about 700 bp of sequence upstream of exon 2, indicating that expression of
lho-1 and
lin-4 is regulated separately. The expression pattern of
lho-1::GFP suggests that
lho-1 may encode a protein that functions in the C. elegans nervous system.
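
The coordinate relationships described above can be checked with a short sketch. The exon 1 and lin-4 positions on F59G1 and the three construct lengths are taken from the text; the start of exon 2 is not given and is estimated here as an assumption, by reading "4.11 kb upstream" as the distance from the start of exon 1 to the start of exon 2. All variable and function names are hypothetical.

```python
# Coordinates on cosmid F59G1 as stated in the text (bp, inclusive).
EXON1 = (2229, 2403)
LIN4 = (6168, 6189)

# ASSUMPTION (not stated in the text): exon 2 starts 4.11 kb downstream
# of the start of exon 1. This is an estimate, not a measured value.
EXON2_START_EST = EXON1[0] + 4110

# Length of sequence upstream of exon 2 carried by each GFP fusion construct.
CONSTRUCT_UPSTREAM_BP = {"PFL105": 750, "PFL103": 3500, "PFL107": 4480}


def contains(region_start, region_end, feature):
    """Return True if the feature lies entirely within the region."""
    return region_start <= feature[0] and feature[1] <= region_end


def summarize_constructs():
    """For each construct, report its estimated 5' end and which features it spans."""
    summary = {}
    for name, upstream in CONSTRUCT_UPSTREAM_BP.items():
        start = EXON2_START_EST - upstream
        end = EXON2_START_EST - 1  # last base before the estimated exon 2 start
        summary[name] = {
            "start": start,
            "has_exon1": contains(start, end, EXON1),
            "has_lin4": contains(start, end, LIN4),
        }
    return summary


if __name__ == "__main__":
    for name, info in summarize_constructs().items():
        print(name, info)
```

Under this assumed reading, all three constructs span lin-4, only PFL107 reaches exon 1, and the estimated 5' end of PFL107 falls 370 bp upstream of exon 1, which matches the description of that construct.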