Examination of Three-Dimensional Structure Enhances Our Understanding of Evolutionary Relationships
by taratuta
I. The Molecular Design of Life 7. Exploring Evolution 7.2. Statistical Analysis of Sequence Alignments Can Detect Homology

Figure 7.11. Alignment of Identities Only Versus the Blosum-62 Matrix. Repeated shuffling and scoring reveal the significance of sequence alignment for human myoglobin versus lupine leghemoglobin with the use of either (A) the simple, identity-based scoring system or (B) the Blosum-62 matrix. The scores for the alignment of the authentic sequences are shown in red. The Blosum-62 matrix provides greater statistical power.

Figure 7.12. Alignment of Human Myoglobin and Lupine Leghemoglobin. The use of the Blosum-62 substitution matrix yields the alignment shown between human myoglobin and lupine leghemoglobin, illustrating identities (orange) and conservative substitutions (yellow). These sequences are 23% identical.

I. The Molecular Design of Life 7. Exploring Evolution 7.3. Examination of Three-Dimensional Structure Enhances Our Understanding of Evolutionary Relationships

Sequence comparison is a powerful tool for extending our knowledge of protein function and kinship. However, biomolecules generally function as intricate three-dimensional structures rather than as linear polymers. Mutations occur at the level of sequence, but the effects of the mutations are at the level of function, and function is directly related to tertiary structure. Consequently, to gain a deeper understanding of evolutionary relationships between proteins, we must examine three-dimensional structures, especially in conjunction with sequence information. The techniques of structural determination are presented in Chapter 4.

7.3.1. Tertiary Structure Is More Conserved Than Primary Structure

Because three-dimensional structure is much more closely associated with function than is sequence, tertiary structure is more evolutionarily conserved than is primary structure. This conservation is apparent in the tertiary structures of the globins (Figure 7.13), which are extremely similar even though the similarity between human myoglobin and lupine leghemoglobin is just barely detectable at the sequence level and that between human hemoglobin (α chain) and lupine leghemoglobin is not statistically significant (15.6% identity). This structural similarity firmly establishes that the framework that binds the heme group and facilitates the reversible binding of oxygen has been conserved over a long evolutionary period.

Anyone aware of the similar biochemical functions of hemoglobin, myoglobin, and leghemoglobin could expect the structural similarities. In a growing number of other cases, however, a comparison of three-dimensional structures has revealed striking similarities between proteins that were not expected to be related. A case in point is the protein actin, a major component of the cytoskeleton, and heat shock protein 70 (Hsp-70), which assists protein folding inside cells. These two proteins were found to be noticeably similar in structure despite only 15.6% sequence identity (Figure 7.14). On the basis of their three-dimensional structures, actin and Hsp-70 are paralogs. The level of structural similarity strongly suggests that, despite their different biological roles in modern organisms, these proteins descended from a common ancestor. As the three-dimensional structures of more proteins are determined, such unexpected kinships are being discovered with increasing frequency. The search for such kinships relies ever more frequently on computer-based search procedures that allow the three-dimensional structure of any protein to be compared with all other known structures.

7.3.2. Knowledge of Three-Dimensional Structures Can Aid in the Evaluation of Sequence Alignments

The sequence-comparison methods described thus far treat all positions within a sequence equally. However, examination of families of homologous proteins for which at least one three-dimensional structure is known has revealed that regions and residues critical to protein function are more strongly conserved than are other residues. For example, each type of globin contains a bound heme group with an iron atom at its center. A histidine residue that interacts directly with this iron (residue 64 in human myoglobin) is conserved in all globins. After we have identified key residues or highly conserved sequences within a family of proteins, we can sometimes identify other family members even when the overall level of sequence similarity is below statistical significance. Thus, the generation of sequence templates (conserved residues that are structurally and functionally important and are characteristic of particular families of proteins) can be useful for recognizing new family members that might be undetectable by other means. A variety of other methods for sequence classification that take advantage of known three-dimensional structures also are being developed. Still other methods are able to identify relatively conserved residues within a family of homologous proteins, even without a known three-dimensional structure. These methods are proving to be powerful in identifying distant evolutionary relationships.

7.3.3. Repeated Motifs Can Be Detected by Aligning Sequences with Themselves

More than 10% of all proteins contain sets of two or more domains that are similar to one another. The sequence search methods described above can often detect internally repeated sequences that have been characterized in other proteins. Where repeated units do not correspond to previously identified domains, their presence can be detected by attempting to align a given sequence with itself.
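As a rough illustration of aligning a sequence with itself, the Python sketch below records a dot wherever a short run of residues at one position matches the run at a later position, then reports the off-diagonal offset that collects the most dots. The function names and the toy sequence are assumptions invented for this example; the two-residue window follows the caption of Figure 7.15. It is a minimal sketch of the idea, not the procedure used to prepare that figure.

```python
def self_dot_plot(seq, window=2):
    """Compare a sequence with itself.

    Returns the set of (i, j) pairs with i < j at which the `window`
    residues starting at i are identical to those starting at j.
    The central diagonal (i == j) is omitted because it always matches.
    """
    n = len(seq)
    dots = set()
    for i in range(n - window + 1):
        for j in range(i + 1, n - window + 1):
            if seq[i:i + window] == seq[j:j + window]:
                dots.add((i, j))
    return dots


def best_offset(dots):
    """Return the offset (j - i) that holds the most dots, or None if empty.

    Dots sharing one offset lie on a single line parallel to the central
    diagonal, which is the signature of an internal repeat.
    """
    counts = {}
    for i, j in dots:
        counts[j - i] = counts.get(j - i, 0) + 1
    if not counts:
        return None
    return max(counts, key=counts.get)


# Hypothetical toy sequence: an invented 8-residue unit written twice.
toy = "MKVLAGHT" + "MKVLAGHT"
dots = self_dot_plot(toy, window=2)
print(best_offset(dots))  # prints 8, the length of the repeated unit
```

In this toy case every dot falls on the line at offset 8. A real protein gives a noisier picture, so a candidate repeat found this way still needs a separate test of statistical significance.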
This alignment is most easily visualized with the use of a self-diagonal plot. Here, the protein sequence is displayed on both the vertical and the horizontal axes, running from amino to carboxyl terminus; a dot is placed at each point in the space defined by the axes at which the amino acid directly below along the horizontal axis is the same as that directly across along the vertical axis. The central diagonal represents the sequence aligned with itself. Internal repeats are manifested as lines of dots parallel to the central diagonal, illustrated by the plot in Figure 7.15 prepared for the TATA-box-binding protein, a key protein in the initiation of gene transcription (Section 28.2.3). The statistical significance of such repeats can be tested by aligning the regions in question as if these regions were sequences from separate proteins. For the TATA-box-binding protein, the alignment is highly significant: 30% of the amino acids are identical over 90 residues (Figure 7.16A). The estimated probability of such an alignment occurring by chance is 1 in 10^13. The determination of the three-dimensional structure of the TATA-box-binding protein confirmed the presence of repeated structures; the protein is formed of two nearly identical domains (Figure 7.16B). The evidence is convincing that the gene encoding this protein evolved by duplication of a gene encoding a single domain.

7.3.4. Convergent Evolution: Common Solutions to Biochemical Challenges

Thus far, we have been exploring proteins derived from common ancestors, that is, through divergent evolution. In other cases, clear examples have been found of proteins that are structurally similar in important ways but are not descended from a common ancestor. How might two unrelated proteins come to resemble each other structurally? Two proteins evolving independently may have converged on a similar structure in order to perform a similar biochemical activity. Perhaps that structure was an especially effective solution to a biochemical problem that organisms face. The process by which very different evolutionary pathways lead to the same solution is called convergent evolution.

One example of convergent evolution is found among the serine proteases. These enzymes, to be discussed in more detail in Chapter 9, cleave peptide bonds by hydrolysis. Figure 7.17 shows for two such enzymes, chymotrypsin and subtilisin, the structure of the active sites, that is, the sites on the proteins at which the hydrolysis reaction takes place. These active-site structures are remarkably similar. In each case, a serine residue, a histidine residue, and an aspartic acid residue are positioned in space in nearly identical arrangements. As we will see, this is the case because chymotrypsin and subtilisin use the same mechanistic solution to the problem of peptide hydrolysis. At first glance, this similarity might suggest that these proteins are homologous. However, striking differences in the overall structures of these proteins make an evolutionary relationship extremely unlikely (Figure 7.18). Whereas chymotrypsin consists almost entirely of β sheets, subtilisin contains extensive α-helical structure. Moreover, the key serine, histidine, and aspartic acid residues do not occupy similar positions or even appear in the same order within the two sequences. It is extremely unlikely that two proteins evolving from a common ancestor could have retained similar active-site structures while other aspects of the structure changed so dramatically.

7.3.5. Comparison of RNA Sequences Can Be a Source of Insight into Secondary Structures

A comparison of homologous RNA sequences can be a source of important insights into evolutionary relationships in a manner similar to that already described. In addition, such comparisons provide clues to the three-dimensional structure of the RNA itself.
As noted in Chapter 5, single-stranded nucleic acid molecules fold back on themselves to form elaborate structures held together by Watson-Crick base-pairing and other interactions. In a family of sequences that form such base-paired structures, base sequences may vary, but base-pairing ability is conserved. Consider, for example, a region from a large RNA molecule present in the ribosomes of all organisms (Figure 7.19). In the region shown, the E. coli sequence has a guanine (G) residue in position 9 and a cytosine (C) residue in position 22, whereas the human sequence has uracil (U) in position 9 and adenine (A) in position 22. Examination of the six sequences shown in Figure 7.20 (and many others) reveals that the bases in positions 9 and 22 retain the ability to form a Watson-Crick base pair even though the identities of the bases in these positions vary. Base-pairing ability is also conserved in neighboring positions; we can deduce that two segments with such compensating mutations are likely to form a double helix. Where sequences are known for several homologous RNA molecules, this type of sequence analysis can often suggest complete secondary structures as well as some additional interactions.

Figure 7.13. Conservation of Three-Dimensional Structure. The tertiary structures of human hemoglobin (α chain), human myoglobin, and lupine leghemoglobin are conserved. Each heme group contains an iron atom to which oxygen binds.

Figure 7.14. Structures of Actin and the Large Fragment of Heat Shock Protein 70 (Hsp-70). A comparison of the identically colored elements of secondary structure reveals the overall similarity in structure despite the difference in biochemical activities.

Figure 7.15. A Self-Diagonal Plot For the TATA-Box-Binding Protein From the Plant Arabidopsis. Self-diagonal plots are used to search for amino acid sequence repeats within a protein. The central diagonal is the sequence aligned with itself. Red dots indicating a correspondence of amino acids appear where two or more amino acids in a row match. Lines of dots, highlighted in pink, parallel to the central diagonal suggest an internal repeat.

Figure 7.16. Sequence Alignment of Internal Repeats. (A) An alignment of the sequences of the two repeats of the TATA-box-binding protein. The amino-terminal repeat is shown in green and the carboxyl-terminal repeat in blue. (B) Structure of the TATA-box-binding protein. The amino-terminal domain is shown in green and the carboxyl-terminal domain in blue.

Figure 7.17. Convergent Evolution of Protease Active Sites. The relative positions of the three key residues shown are nearly identical in the active sites of the serine proteases chymotrypsin and subtilisin.

Figure 7.18. Structures of Chymotrypsin and Subtilisin. The β strands are shown in yellow and α helices in blue. The overall structures are quite dissimilar, in stark contrast with the active sites, shown at the top of each structure.

Figure 7.19. Comparison of RNA Sequences. (A) A comparison of sequences in a part of ribosomal RNA taken from a variety of species. (B) The implied secondary structure. Bars indicate positions at which Watson-Crick base-pairing is completely conserved in the sequences shown, whereas dots indicate positions at which Watson-Crick base-pairing is conserved in most cases.
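The compensating-mutation reasoning behind Figure 7.19 can be sketched in a few lines of Python. The example checks, for a pair of columns in a set of aligned RNA sequences, how often the two bases can form a Watson-Crick pair and whether the pair changes identity while staying complementary. The function names and the three short sequences are hypothetical and invented for this illustration; they are not the ribosomal RNA sequences of the figure, column indices are 0-based, and G-U wobble pairs are deliberately ignored as a simplification.

```python
# The four ordered Watson-Crick pairs in RNA.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def pairing_conservation(aligned, i, j):
    """Fraction of sequences whose bases at columns i and j can pair."""
    paired = sum((s[i], s[j]) in WATSON_CRICK for s in aligned)
    return paired / len(aligned)


def is_compensated(aligned, i, j):
    """True if columns i and j pair in every sequence AND the pair varies.

    Varying bases with conserved pairing is the pattern that suggests
    the two positions face each other in a double-helical segment.
    """
    pairs = {(s[i], s[j]) for s in aligned}
    return pairs <= WATSON_CRICK and len(pairs) > 1


# Hypothetical toy alignment of a small hairpin (not real rRNA data).
aligned = [
    "GGCAAAGCC",  # columns 1 and 7 hold G and C
    "GACAAAGUC",  # columns 1 and 7 hold A and U
    "GUCAAAGAC",  # columns 1 and 7 hold U and A
]

print(pairing_conservation(aligned, 1, 7))  # 1.0: pairing kept in all three
print(is_compensated(aligned, 1, 7))        # True: bases vary, pairing kept
print(is_compensated(aligned, 0, 8))        # False: always G and C, no variation
```

Columns that pair in every sequence but never vary, such as 0 and 8 in the toy alignment, are consistent with a helix yet give no covariation evidence for it, which is why comparing several homologous sequences is more informative than inspecting one.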