Prokaryotic DNA-Binding Proteins Bind Specifically to Regulatory Sites in Operons
by taratuta
III. Synthesizing the Molecules of Life
31. The Control of Gene Expression
31.1. Prokaryotic DNA-Binding Proteins Bind Specifically to Regulatory Sites in Operons

Bacteria such as E. coli usually rely on glucose as their source of carbon and energy. However, when glucose is scarce, E. coli can use lactose as their carbon source even though this disaccharide does not lie on any major metabolic pathway. An essential enzyme in the metabolism of lactose is β-galactosidase, which hydrolyzes lactose into galactose and glucose. These products are then metabolized by pathways discussed in Chapter 16. This reaction can be conveniently followed in the laboratory through the use of alternative galactoside substrates that form colored products, such as X-Gal (Figure 31.1).

An E. coli cell growing on a carbon source such as glucose or glycerol contains fewer than 10 molecules of β-galactosidase. In contrast, the same cell will contain several thousand molecules of the enzyme when grown on lactose (Figure 31.2). The presence of lactose in the culture medium induces a large increase in the amount of β-galactosidase by eliciting the synthesis of new enzyme molecules rather than by activating a preexisting but inactive precursor.

A crucial clue to the mechanism of gene regulation was the observation that two other proteins are synthesized in concert with β-galactosidase, namely, galactoside permease and thiogalactoside transacetylase. The permease is required for the transport of lactose across the bacterial cell membrane. The transacetylase is not essential for lactose metabolism but appears to play a role in the detoxification of compounds that also may be transported by the permease. Thus, the expression levels of a set of enzymes that all contribute to the adaptation to a given change in the environment change together. Such a coordinated unit of gene expression is called an operon.

31.1.1. An Operon Consists of Regulatory Elements and Protein-Encoding Genes

The parallel regulation of β-galactosidase, the permease, and the transacetylase suggested that the expression of genes encoding these enzymes is controlled by a common mechanism. François Jacob and Jacques Monod proposed the operon model to account for this parallel regulation as well as the results of other genetic experiments (Figure 31.3). The genetic elements of the model are a regulator gene, a regulatory DNA sequence called an operator site, and a set of structural genes.

The regulator gene encodes a repressor protein that binds to the operator site. The binding of the repressor to the operator prevents transcription of the structural genes. The operator and its associated structural genes constitute the operon. For the lactose (lac) operon, the i gene encodes the repressor, o is the operator site, and the z, y, and a genes are the structural genes for β-galactosidase, the permease, and the transacetylase, respectively. The operon also contains a promoter site (denoted by p), which directs the RNA polymerase to the correct transcription initiation site. The z, y, and a genes are transcribed to give a single mRNA molecule that encodes all three proteins. An mRNA molecule encoding more than one protein is known as a polygenic or polycistronic transcript.

31.1.2. The lac Operator Has a Symmetric Base Sequence

The operator site of the lac operon has been extensively studied (Figure 31.4). The nucleotide sequence of the operator site shows a nearly perfect inverted repeat, indicating that the DNA in this region has an approximate twofold axis of symmetry. Recall that cleavage sites for restriction enzymes such as EcoRV have similar symmetry properties (Section 9.3.3). Symmetry in the operator site usually corresponds to symmetry in the repressor protein that binds the operator site. Symmetry matching is a recurring theme in protein-DNA interactions.

31.1.3. The lac Repressor Protein in the Absence of Lactose Binds to the Operator and Blocks Transcription

How does the lac repressor inhibit the expression of the lac operon? The lac repressor can exist as a dimer of 37-kd subunits, and two dimers often come together to form a tetramer. In the absence of lactose, the repressor binds very tightly and rapidly to the operator. When the lac repressor is bound to DNA, it prevents bound RNA polymerase from locally unwinding the DNA to expose the bases that will act as the template for the synthesis of the RNA strand (Section 28.1.3).

How does the lac repressor locate the operator site in the E. coli chromosome? The lac repressor binds 4 × 10⁶ times as strongly to operator DNA as it does to random sites in the genome. This high degree of selectivity allows the repressor to find the operator efficiently even with a large excess (4.6 × 10⁶) of other sites within the E. coli genome. The dissociation constant for the repressor-operator complex is approximately 0.1 pM (10⁻¹³ M). The rate constant for association (10¹⁰ M⁻¹ s⁻¹) is strikingly high, indicating that the repressor finds the operator by diffusing along a DNA molecule (a one-dimensional search) rather than encountering it from the aqueous medium (a three-dimensional search).

Inspection of the complete E. coli genome sequence reveals two sites within 500 bp of the primary operator site that approximate the sequence of the operator. Other lac repressor dimers can bind to these sites, particularly when aided by cooperative interactions with the lac repressor dimer at the primary operator site. No other sites that closely match the sequence of the lac operator site are present in the rest of the E. coli genome sequence. Thus, the DNA-binding specificity of the lac repressor is sufficient to specify a nearly unique site within the E. coli genome.

The three-dimensional structure of the lac repressor has been determined in various forms.
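
As a rough check on the selectivity figures quoted in Section 31.1.3, the sketch below treats the operator and the nonspecific genomic sites as competing for a single repressor at equilibrium. The two input numbers come from the text; the function name, the equal weighting of all nonspecific sites, and the neglect of the two auxiliary operator-like sites are simplifying assumptions made only for illustration.

```python
def operator_occupancy(specificity_ratio: float, nonspecific_sites: float) -> float:
    """Fraction of time one repressor spends on the operator (toy model).

    Assumption: every nonspecific site has relative binding weight 1,
    and the single operator has relative weight `specificity_ratio`.
    """
    return specificity_ratio / (specificity_ratio + nonspecific_sites)


SPECIFICITY = 4e6    # operator vs. random-site binding strength (from the text)
OTHER_SITES = 4.6e6  # competing sites in the E. coli genome (from the text)

fraction = operator_occupancy(SPECIFICITY, OTHER_SITES)
print(f"Fraction of repressor on the operator: {fraction:.2f}")
```

Under these assumptions the repressor spends a little under half of its time on the operator, even though the operator is outnumbered several million to one by other sites.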
Each monomer of the repressor consists of a small amino-terminal domain that binds DNA and a larger domain that mediates the formation of the dimer and the tetramer (Figure 31.5). A pair of the amino-terminal domains come together to form the functional DNA-binding unit. Complexes between the lac repressor and oligonucleotides that contain the lac operator sequence have been structurally characterized. The lac repressor binds DNA by inserting an α helix into the major groove of DNA and making a series of contacts with the edges of the base pairs as well as with the phosphodiester backbone (Figure 31.6). For example, an arginine residue in the α helix forms a pair of hydrogen bonds with a guanine residue within the operator. Other bases are not directly contacted but may still be important for binding by virtue of their effects on local DNA structure. As expected, the twofold axis of the operator coincides with a twofold axis that relates the two DNA-binding domains.

31.1.4. Ligand Binding Can Induce Structural Changes in Regulatory Proteins

How does the presence of lactose trigger expression from the lac operon? Interestingly, lactose itself does not have this effect; rather, allolactose, a combination of galactose and glucose with a β-1,6 rather than a β-1,4 linkage, does. Allolactose is thus referred to as the inducer of the lac operon. Allolactose is a side product of the β-galactosidase reaction produced at low levels by the few molecules of β-galactosidase that are present before induction. Some other β-galactosides such as isopropylthiogalactoside (IPTG) are potent inducers of β-galactosidase expression, although they are not substrates of the enzyme. IPTG is useful in the laboratory as a tool for inducing gene expression.

How does the presence of the inducer modulate gene expression? When the lac repressor is bound to the inducer, the repressor's affinity for operator DNA is greatly reduced. The inducer binds in the center of the large domain within each monomer.
This binding leads to local conformational changes that are transmitted to the interface with the DNA-binding domains (Figure 31.7). The relation between the two small DNA-binding domains is modified so that they cannot easily contact DNA simultaneously, leading to a dramatic reduction in DNA-binding affinity.

Let us recapitulate the processes that regulate gene expression in the lactose operon (Figure 31.8). In the absence of inducer, the lac repressor is bound to DNA in a manner that blocks RNA polymerase from transcribing the z, y, and a genes. Thus, very little β-galactosidase, permease, or transacetylase is produced. The addition of lactose to the environment leads to the formation of allolactose. This inducer binds to the lac repressor, leading to conformational changes and the release of DNA by the lac repressor. With the operator site unoccupied, RNA polymerase can then transcribe the other lac genes, and the bacterium will produce the proteins necessary for the efficient utilization of lactose.

The structure of the large domain of the lac repressor is similar to those of a large class of proteins that are present in E. coli and other bacteria. This family of homologous proteins binds ligands such as sugars and amino acids at their centers. Remarkably, domains of this family are utilized by eukaryotes in taste proteins and in neurotransmitter receptors, as will be discussed in Chapter 32.

31.1.5. The Operon Is a Common Regulatory Unit in Prokaryotes

Many other gene-regulatory networks function in ways analogous to those of the lac operon. For example, genes taking part in purine and, to a lesser degree, pyrimidine biosynthesis are repressed by the pur repressor. This dimeric protein is 31% identical in sequence with the lac repressor and has a similar three-dimensional structure.
However, the behavior of the pur repressor is opposite that of the lac repressor: whereas the lac repressor is released from DNA by binding to a small molecule, the pur repressor binds DNA specifically only when bound to a small molecule. Such a small molecule is called a corepressor. For the pur repressor, the corepressor can be either guanine or hypoxanthine. The dimeric pur repressor binds to inverted repeat DNA sites of the form 5′-ANGCAANCGNTTNCNT-3′, in which the bases shown in boldface type are particularly important. Examination of the E. coli genome sequence reveals the presence of more than 20 such sites, regulating 19 operons that include more than 25 genes (Figure 31.9).

Because the DNA-binding sites for these regulatory proteins are relatively short, it is likely that they evolved independently from one another and are not related by divergence from an ancestral regulatory site. Once a ligand-regulated DNA-binding protein is present in a cell, binding sites may evolve adjacent to additional genes, allowing them to become regulated in a physiologically appropriate manner. Binding sites for the pur repressor have evolved in the regulatory regions of a wide range of genes taking part in nucleotide biosynthesis. All such genes can then be regulated in a concerted manner.

31.1.6. Transcription Can Be Stimulated by Proteins That Contact RNA Polymerase

All the DNA-binding proteins discussed thus far function by inhibiting transcription until some environmental condition, such as the presence of lactose, is met. There are also DNA-binding proteins that stimulate transcription. One particularly well studied example is the catabolite activator protein (CAP), which is also known as the cAMP response protein (CRP). When bound to cAMP, CAP, which also is a sequence-specific DNA-binding protein, stimulates the transcription of lactose- and arabinose-catabolizing genes.
Within the lac operon, CAP binds to an inverted repeat that is centered near position -61 relative to the start site for transcription (Figure 31.10). CAP functions as a dimer of identical subunits.

The CAP-cAMP complex stimulates the initiation of transcription by approximately a factor of 50. A major factor in this stimulation is the recruitment of RNA polymerase to promoters to which CAP is bound. Studies have been undertaken to localize the surfaces on CAP and on the α subunit of RNA polymerase that participate in these interactions (Figure 31.11). These energetically favorable protein-protein contacts increase the likelihood that transcription will be initiated at sites to which the CAP-cAMP complex is bound. Thus, in regard to the lac operon, gene expression is maximal when the binding of allolactose relieves the inhibition by the lac repressor, and the CAP-cAMP complex stimulates the binding of RNA polymerase.

The E. coli genome contains many CAP-binding sites in positions appropriate for interactions with RNA polymerase. Thus, an increase in the cAMP level inside an E. coli bacterium results in the formation of CAP-cAMP complexes that bind to many promoters and stimulate the transcription of genes encoding a variety of catabolic enzymes. When grown on glucose, E. coli have a very low level of catabolic enzymes such as β-galactosidase. Clearly, it would be wasteful to synthesize these enzymes when glucose is abundant. The inhibitory effect of glucose, called catabolite repression, is due to the ability of glucose to lower the intracellular concentration of cyclic AMP.

31.1.7. The Helix-Turn-Helix Motif Is Common to Many Prokaryotic DNA-Binding Proteins

The structures of many prokaryotic DNA-binding proteins have now been determined, and amino acid sequences are known for many more. Strikingly, the DNA-binding surfaces of many (but not all) of these proteins consist of a pair of α helices separated by a tight turn (Figure 31.12).
This helix-turn-helix motif is present in the lac repressor family, CAP, and many other gene-regulatory proteins. In complexes with DNA, the second of these two helices (often called the recognition helix) lies in the major groove, where amino acid side chains make contact with the edges of base pairs, whereas residues of the first helix participate primarily in contacts with the DNA backbone. Helix-turn-helix motifs are very frequently present in proteins that bind DNA as dimers, and thus two of the units will be present, one on each monomer. In this case, the two helix-turn-helix units are related by twofold symmetry along the DNA double helix.

Although the helix-turn-helix motif is the most commonly observed DNA-binding unit in prokaryotes, not all regulatory proteins bind DNA through such units. A striking example is provided by the E. coli methionine repressor (Figure 31.13). This protein binds DNA through the insertion of a pair of β strands into the major groove. We shall shortly encounter a variety of other DNA-binding motifs found in eukaryotic cells.

Figure 31.1. Following the β-Galactosidase Reaction. The galactoside substrate X-Gal produces a colored product on cleavage by β-galactosidase. The appearance of this colored product provides a convenient means for monitoring the amount of the enzyme both in vitro and in vivo.

Figure 31.2. β-Galactosidase Induction. The addition of lactose to an E. coli culture causes the production of β-galactosidase to increase from very low amounts to much larger amounts. The increase in the amount of enzyme parallels the increase in the number of cells in the growing culture.
β-Galactosidase constitutes 6.6% of the total protein synthesized in the presence of lactose.

Figure 31.3. Operons. (A) The general structure of an operon as conceived by Jacob and Monod. (B) The structure of the lactose operon. In addition to the promoter (p) in the operon, a second promoter is present in front of the regulator gene (i) to drive the synthesis of the regulator.

Figure 31.4. The lac Operator. The nucleotide sequence of the lac operator shows a nearly perfect inverted repeat, corresponding to twofold rotational symmetry in the DNA. Parts of the sequences that are related by this symmetry are shown in the same color.

Figure 31.5. Structure of the lac Repressor. A lac repressor dimer is shown bound to DNA. A part of the structure that mediates the formation of lac repressor tetramers is not shown.

Figure 31.6. lac Repressor-DNA Interactions. The lac repressor DNA-binding domain inserts an α helix into the major groove of operator DNA. A specific contact between an arginine residue of the repressor and a G-C base pair is shown at the right.

Figure 31.7. Effects of IPTG on lac Repressor Structure.
The structure of the lac repressor bound to the inducer isopropylthiogalactoside (IPTG), shown in orange, is superimposed on the structure of the lac repressor bound to DNA, shown in purple. The binding of IPTG induces structural changes that alter the relation between the two DNA-binding domains so that they cannot interact effectively with DNA. The DNA-binding domains of the lac repressor bound to IPTG are not shown, because these regions are not well ordered in the crystals studied.

Figure 31.8. Induction of the lac Operon. (A) In the absence of lactose, the lac repressor binds DNA and represses transcription from the lac operon. (B) Allolactose or another inducer binds to the lac repressor, leading to its dissociation from DNA and to the production of lac mRNA.

Figure 31.9. Binding-Site Distributions. The E. coli genome contains only a single region that closely matches the sequence of the lac operator (shown in blue). In contrast, 20 sites match the sequence of the pur operator (shown in red). Thus, the pur repressor regulates the expression of many more genes than does the lac repressor.

Figure 31.10. Binding Site for Catabolite Activator Protein (CAP). This protein binds as a dimer to an inverted repeat that is at the position -61 relative to the start site of transcription. The CAP binding site on DNA is adjacent to the position at which RNA polymerase binds.

Figure 31.11. Structure of a Dimer of CAP Bound to DNA. The residues in each CAP monomer that have been implicated in direct interactions with RNA polymerase are shown in yellow.
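
The combined regulatory logic of the lac operon (induction as in Figure 31.8, together with the CAP-cAMP stimulation illustrated in Figures 31.10 and 31.11) can be summarized as a small qualitative model. The sketch below is an illustration only: the function name, the three expression labels, and the reduction of each signal to a Boolean are assumptions and not quantities taken from the text.

```python
def lac_expression(lactose_present: bool, glucose_present: bool) -> str:
    """Qualitative lac operon output for a given growth medium (toy model)."""
    # Allolactose, formed from lactose, releases the repressor from the operator.
    repressor_bound = not lactose_present
    # Glucose lowers cAMP, so the CAP-cAMP complex forms only when glucose is scarce.
    cap_camp_bound = not glucose_present

    if repressor_bound:
        return "repressed"      # repressor on the operator blocks transcription
    if cap_camp_bound:
        return "maximal"        # CAP-cAMP recruits RNA polymerase
    return "unstimulated"       # operator free, but no CAP-cAMP stimulation


for lactose in (False, True):
    for glucose in (False, True):
        level = lac_expression(lactose, glucose)
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {level}")
```

Only the combination of lactose present and glucose absent gives the "maximal" label, matching the statement that expression is greatest when allolactose relieves repression and the CAP-cAMP complex stimulates the binding of RNA polymerase.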