Most Eukaryotic Genes Are Mosaics of Introns and Exons
Note: This table identifies the amino acid encoded by each triplet. For example, the codon 5′-AUG-3′ on mRNA specifies methionine, whereas CAU specifies histidine. UAA, UAG, and UGA are termination signals. AUG is part of the initiation signal, in addition to coding for internal methionine residues.

I. The Molecular Design of Life 5. DNA, RNA, and the Flow of Genetic Information 5.5. Amino Acids Are Encoded by Groups of Three Bases Starting from a Fixed Point

Figure 5.32. Initiation of Protein Synthesis. Start signals are required for the initiation of protein synthesis in (A) prokaryotes and (B) eukaryotes.

Table 5.5. Distinctive codons of human mitochondria

Codon    Standard code    Mitochondrial code
UGA      Stop             Trp
UGG      Trp              Trp
AUA      Ile              Met
AUG      Met              Met
AGA      Arg              Stop
AGG      Arg              Stop

5.6. Most Eukaryotic Genes Are Mosaics of Introns and Exons

In bacteria, polypeptide chains are encoded by a continuous array of triplet codons in DNA. For many years, genes in higher organisms also were assumed to be continuous. This view was unexpectedly shattered in 1977, when investigators in several laboratories discovered that several genes are discontinuous. The mosaic nature of eukaryotic genes was revealed by electron microscopic studies of hybrids formed between mRNA and a segment of DNA containing the corresponding gene (Figure 5.33). For example, the gene for the β chain of hemoglobin is interrupted within its amino acid-coding sequence by a long intervening sequence of 550 base pairs and a short one of 120 base pairs. Thus, the β-globin gene is split into three coding sequences.

5.6.1. RNA Processing Generates Mature RNA

At what stage in gene expression are intervening sequences removed?
Newly synthesized RNA chains (pre-mRNA) isolated from nuclei are much larger than the mRNA molecules derived from them: in the case of β-globin RNA, the former sediment at 15S in zonal centrifugation experiments (Section 4.1.6) and the latter at 9S. In fact, the primary transcript of the β-globin gene contains two regions that are not present in the mRNA. These intervening sequences in the 15S primary transcript are excised, and the coding sequences are simultaneously linked by a precise splicing enzyme to form the mature 9S mRNA (Figure 5.34). Regions that are removed from the primary transcript are called introns (for intervening sequences), whereas those that are retained in the mature RNA are called exons (for expressed regions). A common feature in the expression of split genes is that their exons are ordered in the same sequence in mRNA as in DNA. Thus, split genes, like continuous genes, are colinear with their polypeptide products.

Splicing is a complex operation that is carried out by spliceosomes, which are assemblies of proteins and small RNA molecules (Section 28.3.4). This enzymatic machinery recognizes signals in the nascent RNA that specify the splice sites. Introns nearly always begin with GU and end with an AG that is preceded by a pyrimidine-rich tract (Figure 5.35). This consensus sequence is part of the signal for splicing.

5.6.2. Many Exons Encode Protein Domains

Most genes of higher eukaryotes, such as birds and mammals, are split. Lower eukaryotes, such as yeast, have a much higher proportion of continuous genes. In prokaryotes, split genes are extremely rare. Have introns been inserted into genes in the evolution of higher organisms? Or have introns been removed from genes to form the streamlined genomes of prokaryotes and simple eukaryotes?
Comparisons of the DNA sequences of genes encoding proteins that are highly conserved in evolution suggest that introns were present in ancestral genes and were lost in the evolution of organisms that have become optimized for very rapid growth, such as prokaryotes. The positions of introns in some genes are at least 1 billion years old. Furthermore, a common mechanism of splicing developed before the divergence of fungi, plants, and vertebrates, as shown by the finding that mammalian cell extracts can splice yeast RNA.

Many exons encode discrete structural and functional units of proteins. An attractive hypothesis is that new proteins arose in evolution by the rearrangement of exons encoding discrete structural elements, binding sites, and catalytic sites, a process called exon shuffling. Because it preserves functional units but allows them to interact in new ways, exon shuffling is a rapid and efficient means of generating novel genes (Figure 5.36). Introns are extensive regions in which DNA can break and recombine with no deleterious effect on encoded proteins. In contrast, the exchange of sequences between different exons usually leads to loss of function.

Another advantage conferred by split genes is the potential for generating a series of related proteins by splicing a nascent RNA transcript in different ways. For example, a precursor of an antibody-producing cell forms an antibody that is anchored in the cell's plasma membrane (Figure 5.37). Stimulation of such a cell by a specific foreign antigen that is recognized by the attached antibody leads to cell differentiation and proliferation. The activated antibody-producing cells then splice their nascent RNA transcript in an alternative manner to form soluble antibody molecules that are secreted rather than retained on the cell surface. We see here a clear-cut example of a benefit conferred by the complex arrangement of introns and exons in higher organisms.
Alternative splicing is a facile means of forming a set of proteins that are variations of a basic motif according to a developmental program without requiring a gene for each protein.

Figure 5.33. Detection of Intervening Sequences by Electron Microscopy. An mRNA molecule (shown in red) is hybridized to genomic DNA containing the corresponding gene. (A) A single loop of single-stranded DNA (shown in blue) is seen if the gene is continuous. (B) Two loops of single-stranded DNA (blue) and a loop of double-stranded DNA (blue and green) are seen if the gene contains an intervening sequence. Additional loops are evident if more than one intervening sequence is present.

Figure 5.34. Transcription and Processing of the β-Globin Gene. The gene is transcribed to yield the primary transcript, which is modified by cap and poly(A) addition. The intervening sequences in the primary RNA transcript are removed to form the mRNA.

Figure 5.35. Consensus Sequence for the Splicing of mRNA Precursors.

Figure 5.36. Exon Shuffling. Exons can be readily shuffled by recombination of DNA to expand the genetic repertoire.

Figure 5.37. Alternative Splicing.
Alternative splicing generates mRNAs that are templates for different forms of a protein: (A) a membrane-bound antibody on the surface of a lymphocyte, and (B) its soluble counterpart, exported from the cell. The membrane-bound antibody is anchored to the plasma membrane by a helical segment (highlighted in yellow) that is encoded by its own exon.
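
The bookkeeping behind splicing and alternative splicing can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not a description of how spliceosomes find splice sites: the transcript is supplied already divided into exons and introns, the sequences and the names Segment, splice, and PRE_MRNA are invented for illustration, and the only rule checked is the one given in the text, that introns begin with GU and end with AG. Alternative splicing is modeled here as simply leaving out a named exon; the antibody case in Figure 5.37 is more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    kind: str  # "exon" or "intron"
    name: str  # label used to select exons (hypothetical)
    seq: str   # RNA sequence, written 5' to 3'


def follows_gu_ag_rule(intron: Segment) -> bool:
    """Return True if the intron begins with GU and ends with AG."""
    return intron.seq.startswith("GU") and intron.seq.endswith("AG")


def splice(transcript, skip=()):
    """Join exons in transcript order, dropping introns and any exon named in `skip`.

    Exon order is never changed, so the product stays colinear with the gene.
    """
    for seg in transcript:
        if seg.kind == "intron" and not follows_gu_ag_rule(seg):
            raise ValueError(f"{seg.name} lacks GU...AG boundaries")
    return "".join(
        seg.seq for seg in transcript
        if seg.kind == "exon" and seg.name not in skip
    )


# Invented toy transcript: three exons interrupted by two introns.
PRE_MRNA = [
    Segment("exon", "E1", "AUGGCC"),
    Segment("intron", "I1", "GUAAGUCUUUCCUUCAG"),
    Segment("exon", "E2", "CAUGGA"),
    Segment("intron", "I2", "GUGAGUUUCUCUCCAG"),
    Segment("exon", "E3", "UGGUAA"),
]

primary = "".join(seg.seq for seg in PRE_MRNA)  # unspliced primary transcript
full = splice(PRE_MRNA)                         # all three exons joined
alt = splice(PRE_MRNA, skip={"E2"})             # alternative product without E2

print(len(primary), len(full), len(alt))
print(full)
print(alt)
```

Both products are shorter than the primary transcript, and both keep their exons in the order in which they occur in the gene. A single gene yields two related mRNAs only because a different set of exons is joined.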