90 233 Rearrangement of Immunoglobulin Genes
Chapter 23 / Transposition

P elements are now commonly used as mutagenic agents in genetic experiments with Drosophila. One advantage of this approach is that the mutations are easy to locate; we just look for the P element and it leads us to the interrupted gene. Molecular biologists also use P elements to transform flies, that is, to carry manipulated genes into flies.

SUMMARY The P-M system of hybrid dysgenesis in Drosophila is caused by the conjunction of two factors: (1) a transposable element (P) contributed by the male, and (2) M cytoplasm contributed by the female, which allows transposition of the P element. Hybrid offspring of P males and M females therefore suffer multiple transpositions of the P element. This causes damaging chromosomal mutations that render the hybrids sterile. On the other hand, P females contain a suppressor of transposition (a group of piRNAs targeting the P element), so offspring of either P or M males and P females are fertile. P elements have practical value as mutagenic and transforming agents in genetic experiments with Drosophila.

23.3 Rearrangement of Immunoglobulin Genes

Rearrangements of the mammalian genes in B cells that produce antibodies, or immunoglobulins, and in T cells that produce T-cell receptors, use a process that closely resembles transposition. Even the recombinases involved in antibody and T-cell receptor gene rearrangements resemble transposases. Because of these similarities, we include these rearrangements in this chapter.

As mentioned in Chapter 3, an antibody is composed of four polypeptides: two heavy chains and two light chains. (Similarly, T-cell receptors contain one large β-chain and one smaller α-chain.) Figure 23.11 illustrates an antibody schematically and shows the sites that combine with an invading antigen.
These sites, called variable regions, vary from one antibody to the next and give these proteins their specificities; the rest of the protein (the constant region) does not vary from one antibody to another within an antibody class, though some variation occurs between the few classes of antibodies. Any given immune cell can make antibody with only one kind of specificity. Remarkably enough, humans have immune cells capable of producing antibodies to react with virtually any foreign substance we would ever encounter. That means we can make many millions of different antibodies. Does this imply that we have millions of different antibody genes? That is an untenable hypothesis; it would place an impossible burden on our genomes to carry all the necessary genes.

Figure 23.11 Structure of an antibody. The antibody is composed of two light chains (blue) bound through disulfide bridges to two heavy chains (pink), which are themselves held together by a disulfide bridge. The antigen-binding sites are at the amino termini of the protein chains, where the variable regions lie.

So how do we solve the antibody diversity problem? As unlikely as it may seem, a maturing B cell, a cell that is destined to make an antibody, rearranges its genome to bring together separate parts of its antibody genes. The machinery that puts together the gene selects these parts at random from heterogeneous groups of parts, rather like ordering from a luncheon menu (“Choose one from column A and one from column B”). This arrangement greatly increases the variability of the genes. For instance, if 41 possibilities are present in “column A” and 5 in “column B,” the total number of combinations of A + B is 41 × 5, or 205. Thus, from 46 gene fragments, we can assemble 205 genes. And this is just for one of the antibody polypeptides. If a similar situation exists for the other, the total number of antibodies will be the product of the numbers of the two polypeptides.
This description, though correct in principle, is actually an oversimplification of the situation in the antibody genes; as we will see, they have somewhat more complex mechanisms for introducing diversity, which lead to an even greater number of possible antibody products.

Studies on mammalian antibodies have revealed two families of antibody light chains called kappa (κ) and lambda (λ). Figure 23.12 illustrates the arrangement of the gene parts for a human κ light chain. “Column A” of this “menu” contains 41 variable region parts (V); “Column B” contains 5 joining region parts (J). The J segments actually encode the last 12 amino acids of the variable region, but they are located far away from the rest of the V region and close to a single constant region part. This is the situation in the germ cells, before the antibody-producing cells differentiate and before rearrangement brings the two unlinked regions together. The rearrangement and expression events are depicted in Figure 23.12.

Figure 23.12 Rearrangement of an antibody light chain gene. (a) The human κ-antibody light chain is encoded in 41 variable gene segments (V; light green), five joining segments (J; red), and one constant segment (C; blue). (b) During maturation of an antibody-producing cell, a DNA segment is deleted, bringing a V segment (V3, in this case) together with a J segment (J2 in this case). The gene can now be transcribed to produce the mRNA precursor shown here, with extra J segments and intervening sequences.
The material between J2 and C is then spliced out, yielding the mature mRNA, which is translated to the antibody protein shown at the bottom. The J segment of the mRNA is translated into part of the variable region of the antibody.

First, a recombination event brings one of the V regions together with one of the J regions. In this case, V3 and J2 fuse together, but it could just as easily have been V1 and J4; the selection is random. After the two parts of the gene assemble, transcription occurs, starting at the beginning of V3 and continuing until the end of C. Next, the splicing machinery joins the J2 region of the transcript to C, removing the extra J regions and the intervening sequence between the J regions and C. It is important to remember that the rearrangement step takes place at the DNA level, but this splicing step occurs at the RNA level by mechanisms we studied in Chapter 14. The messenger RNA thus assembled moves into the cytoplasm to be translated into an antibody light chain with a variable region (encoded in both V and J) and a constant region (encoded in C).

Why does transcription begin at the beginning of V3 and not farther upstream? The answer seems to be that an enhancer in the intron between the J regions and the C region activates the promoter closest to it: the V3 promoter in this case. This also provides a convenient way of activating the gene after it rearranges; only then is the enhancer close enough to turn on the promoter.

The rearrangement of the heavy chain gene is even more complex, because there is an extra set of gene parts in between the V’s and J’s. These gene fragments are called D, for “diversity,” and they represent a third column on our menu. Figure 23.13 shows that the heavy chain is assembled from 48 V regions, 23 D regions, and 6 J regions. On this basis alone, the cell can put together 48 × 23 × 6, or 6624 different heavy chain genes.
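The "menu" arithmetic used so far can be checked with a few lines of Python. This is only a sketch of the counting argument: it assumes one segment is chosen independently from each pool and ignores imprecise joining and hypermutation. The names `count_combinations`, `KAPPA_SEGMENTS`, and `HEAVY_SEGMENTS` are ours, introduced for illustration; the segment counts are the ones quoted in the text.

```python
from math import prod

# Segment counts quoted in the text for the human kappa light chain
# and the human heavy chain (illustrative names, not standard ones).
KAPPA_SEGMENTS = {"V": 41, "J": 5}
HEAVY_SEGMENTS = {"V": 48, "D": 23, "J": 6}


def count_combinations(segment_pools):
    """Number of ways to pick exactly one segment from each pool."""
    return prod(segment_pools.values())


kappa_genes = count_combinations(KAPPA_SEGMENTS)
heavy_genes = count_combinations(HEAVY_SEGMENTS)

# Pairing every heavy chain with every kappa light chain multiplies
# the two counts, as the "product of the two polypeptides" argument says.
heavy_kappa_pairs = heavy_genes * kappa_genes

print(kappa_genes)        # 205
print(heavy_genes)        # 6624
print(heavy_kappa_pairs)  # 1357920
```

Adding a pool to a dictionary multiplies the total by the size of that pool, which is why the extra D "column" raises the heavy chain count so sharply.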
Furthermore, 6624 different heavy chains combined with 205 κ light chains and 170 λ light chains yield almost 2.5 million different antibodies or, strictly speaking, 2.5 million different combinations of variable regions.

But there are even more sources of diversity. The first derives from the fact that the mechanism joining V, D, and J segments, which we call V(D)J joining, is not precise. It can add or delete bases on either side of the joining site. This leads to extra differences in antibodies’ amino acid sequences.

Figure 23.13 Structure of antibody heavy chain coding regions. The human heavy chain is encoded in 48 variable segments (V; light green), 23 diversity segments (D; purple), 6 joining segments (J; red), and 1 constant segment (C; blue).

Another source of antibody diversity is somatic hypermutation, or rapid mutation in an organism’s somatic (nonsex) cells. In this case, the mutations occur in antibody genes, probably at the time that a clone of antibody-producing B cells proliferates to meet the challenge of an invader. Genetic and biochemical analysis has shown that somatic hypermutation occurs in two steps. First, a cytidine deaminase that is induced during B cell activation deaminates cytosines to uracils during DNA replication. Next, the uracils attract either the mismatch repair process or uracil-N-glycosylase, which removes the uracils, leaving abasic sites. In either case, a single-strand break occurs, and the cell then “repairs” the break with the same auxiliary DNA polymerases used in translesion bypass (Chapter 20): DNA polymerases ζ, η, θ, and possibly ι. These polymerases are error prone, so many mutations are created.
Together, imprecise joining of gene segments and somatic hypermutation magnify the number of possible antibodies tremendously. In fact, it has been estimated that the total number of antibodies one can make in a lifetime is as high as 100 billion. This surely seems enough to match any attacker.

SUMMARY The immune systems of vertebrates can produce billions of different antibodies to react with virtually any foreign substance. These immune systems generate such enormous diversity by three basic mechanisms: (1) assembling genes for antibody light chains and heavy chains from two or three component parts, respectively, each part selected from heterogeneous pools of parts; (2) joining the gene parts by an imprecise mechanism that can delete bases or even add extra bases, thus changing the gene; and (3) causing a high rate of somatic mutations, probably during proliferation of a clone of immune cells, thus creating slightly different genes.

Recombination Signals

How does the recombination machinery determine where to cut and paste to bring together the disparate parts of an immunoglobulin gene? Susumu Tonegawa examined the sequences of many mouse immunoglobulin genes (encoding κ and λ light chains, and heavy chains) and noticed a consistent pattern (Figure 23.14a): Adjacent to each coding region lies a conserved palindromic heptamer (7-mer), with the consensus sequence 5′-CACAGTG-3′. This heptamer is accompanied by a conserved nonamer (9-mer) whose consensus sequence is 5′-ACAAAAACC-3′. The heptamer and nonamer are separated by a nonconserved spacer containing either 12 bp (a 12 signal) or 23 (±1) bp (a 23 signal). The arrangement of these recombination signal sequences (RSSs, Figure 23.14b) is such that recombination always joins a 12 signal to a 23 signal. This 12/23 rule stipulates that 12 signals are never joined to each other, nor are 23 signals joined to each other, and thus ensures that one, and only one, of each coding region is incorporated into the mature immunoglobulin gene.

Figure 23.14 Signals for V(D)J joining. (a) Arrangement of signals around coding regions for immunoglobulin κ and λ light chain genes and heavy chain gene. Boxes labeled “7” or “9” are conserved heptamers or nonamers, respectively. Their consensus sequences are given at top. The 12-mer and 23-mer spacers are also labeled. Notice the arrangement of the 12 signals and 23 signals such that joining one kind to the other naturally allows assembly of a complete gene. (b) Schematic illustration of the arrangement of the 12 and 23 signals in an immunoglobulin heavy chain gene. The yellow symbols represent 12 signals, and the orange triangles represent 23 signals. Notice again how the 12/23 rule guarantees inclusion of one of each coding region (V, D, and J) in the rearranged gene. (Source: (a) Adapted from Tonegawa, S., Somatic generation of antibody diversity. Nature 302:577, 1983.)

Aside from the existence of consensus RSSs, what is the evidence for their importance? Martin Gellert and colleagues have systematically mutated the heptamer and nonamer by substituting bases, and the spacer regions by adding or subtracting bases, and observed the effects of these alterations on recombination. They measured recombination efficiency in the following way: They built a recombinant plasmid with the construct shown in Figure 23.15. The first element in this construct is a lac promoter. This is followed by a 12 signal, then a prokaryotic transcription terminator, then a 23 signal, and finally a cat reporter gene. They made mutations throughout these RSSs, then introduced the altered plasmids into a pre-B cell line. Finally, they purified the plasmids from the pre-B cells and introduced them into chloramphenicol-sensitive E.
coli cells and tested them for chloramphenicol resistance. If no recombination took place, the transcription terminator prevented cat expression, and therefore chloramphenicol resistance was almost nonexistent. On the other hand, if recombination between the 12 signal and the 23 signal occurred, the terminator was either inverted or deleted, and therefore inactivated. In that case, cat expression occurred under control of the lac promoter, and many chloramphenicol-resistant colonies formed. This experiment showed that many alterations in bases in the heptamer or nonamer reduced recombination efficiency to background level. The same was true of insertions and deletions of bases in the spacer regions. Thus, all these elements of the RSSs are important in V(D)J recombination.

SUMMARY The recombination signal sequences (RSSs) in V(D)J recombination consist of a heptamer and a nonamer separated by either 12-bp or 23-bp spacers. Recombination occurs only between a 12 signal and a 23 signal, which guarantees that only one of each coding region is incorporated into the rearranged gene.

[Figure 23.15 labels: Plac; 12 signal with heptamer H12 (CACAGTG), spacer S12 (CTACAGACTGGA), and nonamer N12 (ACAAAAACC); transcription terminator; 23 signal with heptamer H23 (CACAGTG), spacer S23 (GTAGTACTCCACTGTCTGGCTGT), and nonamer N23 (ACAAAAACC); cat.]

Figure 23.15 Structure of reporter construct used to measure effects of mutations in RSSs on recombination efficiency. Gellert and coworkers made a recombination reporter plasmid containing a lac promoter and cat gene separated by an insert containing a transcription terminator flanked by a 12 signal and a 23 signal. Recombination between the two RSSs either inverts or deletes the terminator, allowing expression of cat. Transformation of bacterial cells with the rearranged plasmid yields many CAT-producing colonies that are chloramphenicol-resistant.
On the other hand, transformation of bacteria with the unrearranged plasmid yields almost no chloramphenicol-resistant colonies. (Source: Adapted from Hesse, J., M. R. Lieber, K. Mizuuchi, and M. Gellert, V(D)J recombination: a functional definition of the joining signals. Genes and Development 3:1053–61, 1989.)

The Recombinase

David Baltimore and his colleagues searched for the gene(s) encoding the V(D)J recombinase using a recombination reporter plasmid similar to the one we just discussed, but designed to operate in eukaryotic cells by conferring resistance to the drug mycophenolic acid. They introduced this plasmid, along with fragments of mouse genomic DNA, into NIH 3T3 cells, which lack V(D)J recombination activity, and tested for recombination by assaying for drug-resistant 3T3 cells. This led to the identification of a recombination-activating gene (RAG-1) that stimulated V(D)J joining activity in vivo. However, the degree of stimulation by a genomic clone containing most of RAG-1 was modest, no more than that obtained with whole genomic DNA. Furthermore, cDNA clones containing the whole RAG-1 sequence did no better, so something seemed to be missing. Baltimore’s group sequenced the whole genomic fragment containing most of RAG-1 and found another whole gene tightly linked to it. They wondered whether this other gene might also have something to do with V(D)J joining, so they tested this genomic fragment plus a RAG-1 cDNA in the same transfection experiment. When they introduced the two DNAs together into the same cell, they found many more drug-resistant cells. In this way, they discovered that two genes are responsible for V(D)J recombination, and they named the second RAG-2.

RAG-1 and RAG-2 are expressed only in pre-B and pre-T cells, where V(D)J joining of immunoglobulin and T-cell receptor gene segments, respectively, is occurring. The T-cell receptors are membrane-bound antigen-binding proteins with an architecture similar to that of the immunoglobulins. The genes encoding the T-cell receptors rearrange according to the same rules that apply to the immunoglobulin genes, complete with RSSs containing 12 signals and 23 signals. Thus, RAG-1 and RAG-2 are apparently involved in both immunoglobulin and T-cell receptor V(D)J joining.

Mechanism of V(D)J Recombination

V(D)J joining is imprecise, which contributes to the diversity of products from the process. Both loss of bases and addition of extra bases at the joints are frequently observed. This is good for immunoglobulin and T-cell receptor production, because it adds to the variety of proteins that can be made from a limited repertoire of gene segments. How do we explain this imprecision? Figure 23.16 illustrates the mechanism of cleavage at the RSSs that flank an intervening segment between two coding segments. We see that the products of the RAG-1 and RAG-2 genes, Rag-1 and Rag-2, respectively, first nick the DNAs at the joints. Then the new 3′-hydroxyl groups attack phosphodiester bonds on the complementary strands, liberating the intervening segment and forming hairpins at the ends of the coding segments. These hairpins are the key to the imprecision of joining; they can open up on either side of the apex of the hairpin, and bases can then be added or subtracted to make the DNA ends blunt for joining. The Rag-1 and Rag-2 proteins hold both hairpins together in a complex so they can join covalently with each other.

Figure 23.16 Mechanism of cleavage at RSSs. Nicking of opposite strands (vertical arrows) occurs at RSSs at the junctions between coding regions (red) and the intervening region (yellow). The new 3′-hydroxyl groups (blue) attack and break the opposite strands, forming hairpins and releasing the intervening segment, which is lost. Finally, the hairpins open, and the two coding regions are joined by an imprecise mechanism. (Source: Adapted from Craig, N.L., V(D)J recombination and transposition: closer than expected. Science 271:1512, 1996.)

How do we know hairpins form? They were first found in vivo, but in very low concentration. Gellert and his colleagues later developed an in vitro system in which they could be readily observed. Figure 23.17a illustrates one of the labeled substrates these workers used. It was a 50-mer labeled at one 5′-end with 32P. It contained a 12 signal, represented by a yellow symbol, flanked by a 16-bp segment on the left; the right-hand end of the fragment was, therefore, a 34-bp segment, which included the 12 signal. A similar substrate contained the same flanking segments, but had a 23 signal instead of a 12 signal. Thus, it was 61 bp long. Gellert and colleagues incubated these substrates with RAG1 and RAG2, the human homologs of mouse Rag1 and Rag2, respectively, then electrophoresed the products under nondenaturing conditions to see if any DNA cleavages had occurred (Figure 23.17b). They found a 16-mer, demonstrating that a double-stranded cleavage had occurred. However, nondenaturing gel electrophoresis could not distinguish between a true double-stranded 16-mer and a 16-mer with a hairpin end, so these workers subjected the same products to denaturing polyacrylamide gel electrophoresis in the presence of urea and at an elevated temperature (Figure 23.17c). Under these conditions, a double-stranded 16-mer would give rise to two single-stranded 16-mers. On the other hand, a 16-mer with a hairpin at the end would give rise to a single-stranded 32-mer. This is what Gellert and coworkers observed whenever the DNA contained either a 12 signal or a 23 signal and both RAG1 and RAG2 proteins were present.
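The size bookkeeping behind this experiment can be written out as a short Python sketch. It is an illustration of the arithmetic only, not a model of the gel. The function names are ours, and the 6-bp tail to the right of the signal is inferred from the numbers in the text (a 34-bp right-hand segment minus a 28-bp 12 signal) rather than stated there.

```python
FLANK_BP = 16     # labeled coding flank to the left of the signal (from the text)
HEPTAMER_BP = 7
NONAMER_BP = 9
# Inferred, not stated in the text: 34 bp right-hand segment - 28 bp 12 signal.
RIGHT_TAIL_BP = 34 - (HEPTAMER_BP + 12 + NONAMER_BP)


def substrate_bp(spacer_bp):
    """Total substrate length for a signal with the given spacer length."""
    signal_bp = HEPTAMER_BP + spacer_bp + NONAMER_BP
    return FLANK_BP + signal_bp + RIGHT_TAIL_BP


def labeled_product_nt(end_type, flank_bp=FLANK_BP):
    """Length of the labeled strand seen after denaturation.

    end_type is "duplex" for an ordinary double-stranded cut, or
    "hairpin" when the two strands are covalently joined at the cut end.
    """
    if end_type == "duplex":
        # The strands separate; the labeled strand is as long as the flank.
        return flank_bp
    if end_type == "hairpin":
        # Both strands form one continuous molecule, twice the flank length.
        return 2 * flank_bp
    raise ValueError(f"unknown end type: {end_type!r}")


print(substrate_bp(12), substrate_bp(23))   # 50 61
print(labeled_product_nt("duplex"))         # 16
print(labeled_product_nt("hairpin"))        # 32
```

The two substrate lengths reproduce the 50-mer and 61-mer described above, and the 16 versus 32 outcome is the distinction the denaturing gel was designed to reveal.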
A DNA with no 12 or 23 signal gave no product, hairpin or otherwise, and reactions lacking either RAG1 or RAG2 protein gave no product (Figure 23.17d). Thus, RAG1 and RAG2 recognize both the 12 signal and the 23 signal and cleave the DNA adjacent to the signal, forming a hairpin at the end of the coding segment. Moreover, the 16-mer product from the nondenaturing gel yielded only hairpin product on the denaturing gel, demonstrating that no simple double-stranded 16-mer formed. But labeled DNA migrating with the substrate in the nondenaturing gel yielded a small amount of 16-mer in the denaturing gel. This cannot have come from a double-stranded break, or it would not have remained with the substrate in the nondenaturing gel. Thus, it must have come from a nick in the labeled strand. The 16-mer created by the nick would have remained base-paired to its partner during nondenaturing electrophoresis, but would have migrated independently as a 16-mer during denaturing electrophoresis. Thus, single-stranded nicking is apparently also part of the action of RAG1 and RAG2 proteins.

To investigate further the relationship between nicking and hairpin formation, Gellert and colleagues ran a time-course study in which they incubated the substrate for increasing lengths of time with RAG1 and RAG2 proteins and then subjected the products to denaturing gel electrophoresis. They found that the nicked species appeared first, followed by the hairpin species. This suggested that the nicked species is a precursor of the hairpin species. To test this hypothesis, they created nicked intermediates and incubated them with RAG1 and RAG2. Sure enough, RAG1 and RAG2 converted the nicked DNAs to hairpins.
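Because RAG1 and RAG2 act only on DNA carrying a 12 signal or a 23 signal, it may help to see what such a signal amounts to at the sequence level. The Python sketch below classifies a candidate signal by its spacer length and applies the 12/23 rule to a pair of signals. It assumes exact matches to the consensus heptamer and nonamer, which is a simplification, since natural signals vary. The function names are ours, and the two spacers are the ones listed with the Figure 23.15 reporter construct.

```python
HEPTAMER = "CACAGTG"     # consensus heptamer from the text
NONAMER = "ACAAAAACC"    # consensus nonamer from the text

# Signals built from the spacers listed with the Figure 23.15 construct.
SIGNAL_12 = HEPTAMER + "CTACAGACTGGA" + NONAMER
SIGNAL_23 = HEPTAMER + "GTAGTACTCCACTGTCTGGCTGT" + NONAMER


def classify_rss(seq):
    """Return 12 or 23 for heptamer + spacer + nonamer, otherwise None.

    Assumes exact consensus elements (a simplification for illustration).
    """
    seq = seq.upper()
    if not (seq.startswith(HEPTAMER) and seq.endswith(NONAMER)):
        return None
    spacer_len = len(seq) - len(HEPTAMER) - len(NONAMER)
    if spacer_len == 12:
        return 12
    if 22 <= spacer_len <= 24:      # the text gives 23 (plus or minus 1) bp
        return 23
    return None


def obeys_12_23_rule(rss_a, rss_b):
    """True only when one signal is a 12 signal and the other a 23 signal."""
    return {classify_rss(rss_a), classify_rss(rss_b)} == {12, 23}


print(classify_rss(SIGNAL_12), classify_rss(SIGNAL_23))   # 12 23
print(obeys_12_23_rule(SIGNAL_12, SIGNAL_23))             # True
print(obeys_12_23_rule(SIGNAL_12, SIGNAL_12))             # False

# Deleting one spacer base leaves an 11-bp spacer, which is no longer a valid signal.
print(classify_rss(SIGNAL_12[:8] + SIGNAL_12[9:]))        # None
```

The last line mirrors the kind of spacer deletion Gellert and colleagues tested, though the real experiments measured a drop in recombination efficiency rather than an all-or-none outcome.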
Subsequent work by Gellert’s group has shown that the sequence of events seems to be: RAG1 and RAG2 nick one DNA strand adjacent to a 12 signal or a 23 signal; then the newly formed hydroxyl group attacks the other strand in a transesterification reaction, forming the hairpin, as was illustrated in Figure 23.16.

What enzyme opens up the hairpins created by RAG1 and RAG2? Michael Lieber and colleagues demonstrated in 2002 that an enzyme called Artemis carries out this function. On its own, Artemis has exonuclease activity. However, in conjunction with DNA-PKcs, Artemis gains endonuclease activity that can cleave hairpins. You may recognize DNA-PKcs from our discussion in Chapter 20 of nonhomologous DNA end-joining (NHEJ) for repair of double-strand DNA breaks. In fact, joining of the opened hairpins resembles NHEJ and relies on the NHEJ machinery. Artemis is also required to cleave the hairpins created during the rearrangement of T-cell receptor genes, which