Throughout the C. elegans sequencing project, Genefinder was the primary protein-coding gene prediction program. These initial predictions were manually reviewed by curators as part of a "first-pass annotation" and continue to be actively curated by WormBase staff using a variety of data and information. In the WormBase data release WS133 there are 22,227 protein-coding genes, including 2,575 alternatively spliced forms. Twenty-eight percent of these have every base of every exon confirmed by transcription evidence, while an additional 51% have some bases confirmed. Most of the genes are relatively small, covering a genomic region of about 3 kb. The average gene contains 6.4 coding exons, and coding exons together account for about 26% of the genome. Most exons are small and separated by small introns. The median size of exons is 123 bases, while the most common size for introns is 47 bases. Protein-coding genes are denser on the autosomes than on chromosome X, and denser in the central region of the autosomes than on the arms. There are only 561 annotated pseudogenes, but several estimates put the true number much higher.
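As a rough illustration of the confirmation figures quoted above, the percentages can be turned into approximate gene counts. This is only a sketch: it assumes that the 28% and 51% both apply to the full set of 22,227 protein-coding genes, which the text implies but does not state outright, and the function and variable names are hypothetical.

```python
# Figures quoted in the text for WormBase release WS133.
TOTAL_GENES = 22_227        # protein-coding genes, including alternatively spliced forms
PCT_FULLY_CONFIRMED = 28    # every base of every exon confirmed by transcript evidence
PCT_PARTLY_CONFIRMED = 51   # some bases confirmed


def confirmation_counts(total=TOTAL_GENES,
                        pct_full=PCT_FULLY_CONFIRMED,
                        pct_partial=PCT_PARTLY_CONFIRMED):
    """Convert the quoted percentages into approximate gene counts.

    Assumption (not stated explicitly in the source): both percentages
    are taken over the same total. Counts are rounded, so they may not
    sum exactly to the total.
    """
    pct_none = 100 - pct_full - pct_partial  # remainder with no confirmed bases
    return {
        "fully_confirmed": round(total * pct_full / 100),
        "partly_confirmed": round(total * pct_partial / 100),
        "unconfirmed": round(total * pct_none / 100),
    }


if __name__ == "__main__":
    for label, count in confirmation_counts().items():
        print(f"{label}: ~{count}")
```

Under that assumption, roughly 6,200 genes are fully confirmed, about 11,300 are partly confirmed, and the remaining 21% (about 4,700) have no transcript support at all.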
The normal karyotype of Caenorhabditis elegans, with its five pairs of autosomes and single pair of X chromosomes, is described. General features of chromosomes and global differences between different chromosomal regions are discussed. Abnormal karyotypes, including duplications, deficiencies, inversions, translocations and chromosome fusions, are reviewed. The effects of varying ploidy and of varying gene dosage are summarized. Dosage-sensitive genes seem to be rare in C. elegans, and the organism is able to tolerate substantial levels of aneuploidy. However, autosomal hemizygosity for more than about 3% of the total genome may be incompatible with viability.
Methods Cell Biol,
Caenorhabditis elegans is in all likelihood the first metazoan animal whose entire genome will be determined. In addition, a very detailed description of the animal's morphology, development, and physiology is available (see elsewhere in this book, and Wood, 1988). Thus, the complete phenotype and genotype of an animal will be known. What is not known is how genotype determines phenotype; to study this, one needs to establish connections between genome sequence and phenotypes. Much has been done by classic or forward genetics: mutagenesis experiments have identified loci involved in a specific trait. Many of these loci have already been defined at the molecular level, and the genome sequence will certainly aid in the identification of many more. The opposite approach, reverse genetics, naturally becomes more important as more of the genome sequence is determined: given the sequence of a gene of which nothing else is known, how can the function of that gene be determined? Reverse genetics is more than targeted inactivation. One can study a gene's function by several approaches...
The switching on or off of specific genes is a fundamental aspect of cellular differentiation during metazoan development. The molecular events involved in this switching are not yet understood, but they are now subject to analysis with the current technology available in molecular biology. Much of the work directed toward the understanding of developmental gene regulation has focused on the genes encoding the protein actin. Actin is the major thin-filament protein in both muscle and nonmuscle cells. The protein sequences of actins from a variety of tissues in several organisms have been determined, and the actin genes from a number of organisms have been isolated and are currently being studied. This work has revealed that actins are evolutionarily conserved, are encoded in most species by multigene families, and are differentially regulated, both spatially and temporally, during