THE INTEGRITY PAPERS | Genre Group | ceptualinstitute.com |
Presented at NECSI's second International
Conference on Complexity Studies
Nashua, New Hampshire October, 1998
NECSI InterJournal |
Evolution of Efficient
strategies for evolution:
Chance Favors the Prepared Genome
Lynn Helena Caporale
Sherman Square, New York, NY 10023
caporale@usa.net
Abstract
Efficient strategies for genome evolution emerge under the pressure of natural selection. These strategies provide a route through the space of possible genomes, and thus a selective advantage, as evolutionary innovation need not await the results of purely random mutation. Natural selection acts not only on individual gene products, but, at a higher level, on the mechanisms that generate diversity, and the sequence specificity of their actions. DNA sequences encode information that modulates the rate of genetic alteration; thus, genetic variation can become less probable at sites at which variation disturbs the active scaffold or other essential residues. The fittest strategies survive, along with the genomes that encode them.
Introduction
Adaptation is a hard problem. Evidence does not support the assumption that mutation is a purely random process (1). Rejecting purely random mechanisms of genetic variation does not refute the theory of Natural Selection of Darwin and Wallace; rather, it deepens our understanding of that theory. Genomes that evolve efficient biochemical systems to navigate through the space of possible future genomes would be favored by natural selection, because their descendants can adapt more quickly when confronted by environmental challenges.
Genomes evolve a balance between fidelity and exploration. Fidelity of replication and repair is necessary to maintain a genome; exploration of new sequences helps a genome's descendants respond to evolving challenges, and take advantage of opportunities, in the environment. Jumps in genome efficiency could fuel rapid expansion of species into novel niches as each innovation evolves.
The replication of DNA requires catalysis by proteins (enzymes). The fidelity of the enzyme complexes that copy, move, and repair nucleic acid sequences differs from one complex to another and varies along the sequence of each molecule of DNA (1,2,3,4). Where the fidelity of an enzyme copying a particular sequence is high, that sequence will tend to be preserved; where the fidelity is low, that sequence will tend to change. Changes at certain sequences are more risky than changes at others. Because enzymes that repeatedly copy, move and repair a DNA sequence affect the evolution of that DNA sequence, they fall under natural selection.
DNA can encode information that modulates its own rate of evolution (5). This information can emerge in sequences that are not translated into protein (i.e. intergenic regions and introns). In addition, because the genetic code is degenerate, information that modulates the rate and type of genetic change also can evolve underneath protein-coding information. Figure 1 illustrates how, in DNA that encodes a surface protein of the spirochete that causes Lyme disease, sequences that target genetic change underlie the protein-coding sequence (6). The spirochete has evolved a very focused mechanism for generating genetic variation to handle its major predictable challenge, the host immune response. Similarly, our immune system has evolved a very focused strategy of searching sequence space, which expedites its response to pathogens.
Regulated, Targeted, Genetic Variation: Immune System Genes as a Paradigm
Many of the strategies for efficient exploration of sequence space that are available to genomes are exemplified by the vertebrate immune system (7) (Fig. 2). Most researchers were surprised to learn that immunoglobulin genes are assembled during each individual's lifetime, using a palette of potential pathogen binding (v) regions. A potential binding site is placed in a sequence context that increases its rate of sequence variation by several orders of magnitude. By focusing sequence exploration on the site that binds each pathogen uniquely, the immune system can evolve increased binding potency, while conserving effector functions. DNA encoding the newly-evolved binding site can be moved next to an effector region appropriate to the challenge (e.g. crossing the placenta, release of histamine). Thus, a single ligand binding (variable) domain can be harnessed for a series of distinct effector responses by attachment to distinct effector domains (8). The result, in the immune system, is a combinatorial exploration of binding and effector pairs, which facilitates a rapid, effective response to environmental challenges (pathogens).
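The combinatorial pairing of binding and effector domains can be sketched in a few lines of Python. The domain labels below are placeholders invented for this illustration, not real gene segments. The sketch shows only the counting argument: m binding domains and n effector domains give m x n pairings without any new sequence having to be invented.

```python
from itertools import product

# Hypothetical labels, chosen only for illustration.
binding_domains = ["V_a", "V_b", "V_c"]
effector_domains = ["cross_placenta", "release_histamine"]

def pairings(binding, effectors):
    """Return every (binding domain, effector domain) combination."""
    return list(product(binding, effectors))

pairs = pairings(binding_domains, effector_domains)
for binding, effector in pairs:
    print(binding, "+", effector)
```

With three binding domains and two effector domains the loop prints six pairings; adding one more domain of either kind enlarges the explored set multiplicatively rather than by one.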
Evolution as a learned skill
The ability to adapt and evolve can be viewed as a skill, which a genome learns as it moves through time and generations. The Lyme disease spirochete provides an example of biochemical mechanisms that have evolved the capacity to be prepared for very predictable challenges with stereotyped genomic alterations. While most challenges are not predictable, for each genome, certain genetic challenges, and certain genetic changes, are orders of magnitude more likely (1,3,4). Which genomes are likely to survive? Those for which the most probable changes provide effective responses to the most probable challenges.
Genomic Strategies
Generate diversity by diverse mechanisms
Diversity among a genome's descendants improves the chance that some will survive, or even flourish, as the environment presents challenges, and opportunities. Barbara McClintock observed that environmental stress could trigger genome reorganization (11). A subpopulation that hypermutates may adapt more rapidly, even under conditions when the population is not dividing, and is, indeed, starving (1,12). Thus, mechanisms that generate genome diversity emerge through natural selection (13). Mechanisms that generate multiple sequence changes in a single step bypass unselected neutral, and negatively selected, sequences that may lie on the point mutation pathway between the current sequence and a more optimal sequence. Natural selection works with a far more colorful palette than monotonous nucleotide (base) change (14).
Harness available information through horizontal transfer and duplication
Rather than "reinvent the wheel", genomes have evolved efficient mechanisms for the reuse and adaptation of functional code. This use of preexisting information can be broken into two categories: information from the environment, and information already in use elsewhere within the genome.
Information from the environment. Organisms have evolved mechanisms to harness information available in their environment, by taking up and incorporating DNA (15, 16). Integrons, genes and groups of genes that move in and out of bacterial genomes like "cassettes", spread antibiotic resistance and pathogenicity (17).
Gene Duplication: Genomes that evolve mechanisms to duplicate a gene can explore variation around the functional framework in the duplicate, while maintaining the function of the original copy. This provides a strong selective advantage over populations that begin with random sequences and/or test every mutation and every insertion site. The efficiencies of duplication have led to the emergence of large gene families (18), as well as homologous regulatory pathways and cascades.
Build upon Functional modules: Natural selection appears to have driven repetitive sequences, and mobile elements (transposons), into introns (19) and untranslated regions (21). Thus, repetitive sequences avoid disrupting functional modules by facilitating recombination between, not within, the modules (such as exons and regulatory regions).
Combinatorial Exploration: Modularizing the genome (22) facilitates combinatorial exploration of properties, including structural and functional motifs; transport, binding, catalytic, regulatory, and signaling domains; and regulatory and signaling pathways. Recombination can link a protein to domains that affect its interaction with other proteins, that regulate its activity, that direct it to the nucleus, cytoplasm, or cell membrane, or that affect its half-life. Through combinations of domains or motifs into multidomain proteins, a genome can create and explore logic gates, integrating signals from different signaling pathways, learning what types of connections are most likely to be useful (1) (Figure 3).
Exploring regulatory networks and tuning the level of gene activity: Often, duplicated mobile elements insert into regulatory sites (24, 25). The duplication and movement of regulatory regions provides the ability to explore the phenotypic results of linking, through regulation, sets of structurally unlinked loci (11). Tandem repeats can modulate gene expression, acting as "tuning knobs", dialing up and down the level of expression as the number of repeats in a tandem array increases and decreases when the polymerase (enzyme that copies DNA) slips (26).
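The "tuning knob" behavior of tandem repeats can be illustrated with a toy model. Everything quantitative here is an assumption made for the sketch: each slippage event adds or removes exactly one repeat unit with equal probability, and expression is taken to rise linearly with repeat count. Real repeat arrays need not behave this simply.

```python
import random

def slip(repeat_count, rng, min_repeats=0):
    """One slippage event: the array gains or loses a single repeat unit."""
    step = rng.choice([-1, 1])
    return max(min_repeats, repeat_count + step)

def expression_level(repeat_count, per_repeat=0.1):
    """Toy assumption: expression rises linearly with the repeat count."""
    return per_repeat * repeat_count

rng = random.Random(0)   # seeded so the run is repeatable
count = 10               # hypothetical starting array length
history = [count]
for _ in range(5):
    count = slip(count, rng)
    history.append(count)

for n in history:
    print(n, expression_level(n))
```

Because each step is small and reversible, descendants of one genome sample a range of nearby expression levels, which is the sense in which the array acts as a dial rather than a switch.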
Focus Variation
The location and timing of variation can be focused to facilitate exploration of new selective binding sites while conserving essential scaffold and active site residues. Information in a gene that focuses variation can be duplicated with the gene and thus be conserved throughout a gene family. Gene families should be viewed as systems, rather than lists of related genes (27).
Implications
Implications for medicine.
As entire genomes are sequenced, our blindfolds are removed, and we can target the strategies encoded by genomes that threaten us. Examples include the following:
High mutation rate. One of the strategies used by the HIV genome to avoid destruction by the immune system is a very high mutation rate (28). A drug that decreases the HIV mutation rate would be expected to improve the immune system's chances of combating the infection.
Antigenic Variation: For pathogens that have evolved a directed system of surface antigen variation, such as the Lyme disease spirochete, malaria parasites, and trypanosomes, strategies that interfere with the biochemical components of this system should help the immune system stay ahead of the parasite.
Integrons: Blocking the movement of plasmids and insertion of bacterial integrons would decrease the spread of antibiotic resistance.
Gene amplification: Many tumors and pathogens amplify genes that encode pumps, which protect these organisms by removing drugs from the cells. Possible therapeutic approaches include blocking these pumps (30), or, more generally, blocking mechanisms that amplify genes (31).
Tumors: Radiation and chemotherapy cause DNA damage. DNA damage triggers the death of tumor cells through programmed cell death pathways. However, tumor cells develop defects in cell death pathways (32). Once DNA damage no longer selectively kills a tumor cell, might it not be a mistake to continue to dose the tumor with DNA-altering agents? Aren't we running the risk that this might lead to the emergence of more aggressive subpopulations of tumor cells?
Implications for coding systems:
Strategies employed by genomes that may prove useful in evolving other coding systems include the following.
Generate diversity: To maximize the chance of identifying a good solution, generate a diverse population of progeny by a variety of mechanisms. Genomes experience deletions, point mutations, duplication, recombination, etc.
Tune the probability of variation: The type and rate of variation at each position can be tuned through selection, according to how likely variation at that position is to provide a useful path to a solution.
Underlay location-specific information about the rate/type of change: Information that affects the point-by-point fate of the genome can evolve underneath the (protein) coding sequence. Repetitive sequences and transposable elements encourage recombination between, rather than within, functional modules.
Duplicate/focused hypermutation cycles: Duplication and recombination of functional modules provide a much more efficient path to the evolution of functionally useful structures than random mutation of random sequences. Through this process, families of related genes emerge with conserved scaffolds and active sites, and varied binding sites (targets of action). Genetic change can be focused to potential binding sites.
Combinatorial Exploration: By dividing a coding system up into functional modules, it is possible to create an efficient path through sequence space and to explore the effect of linking, through regulation, physically separated functions.
Place Generators of Diversity under selective pressure: The most important lesson to be learned from looking at the evolution of genomes is the importance of having several different types of generators of variation (or editors that can be turned off by external events), with sequence-dependent rates of variation, and of placing the generators of variation under natural selection. This allows the site-specific variation in the rate of change to be focused in the areas where experience indicates it is most likely to generate a new function, and to move it away from areas where changes typically have done more harm than good.
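For readers who build evolutionary algorithms, the strategies listed above can be sketched as a toy model in which each individual carries a vector of per-site variation rates alongside its sequence, and that vector is inherited and varied too. This is one possible reading of the strategies, not a method taken from the paper. The function names, the scaffold string, the fitness rule, and every numeric parameter are assumptions made for illustration.

```python
import random

ALPHABET = "ACGT"

def copy_with_variation(genome, rates, rng):
    """Copy a genome; site i changes to a different base with probability rates[i]."""
    copied = []
    for base, rate in zip(genome, rates):
        if rng.random() < rate:
            copied.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            copied.append(base)
    return "".join(copied)

def vary_rates(rates, rng, factor=2.0, low=0.001, high=0.5):
    """The rates are heritable and vary too: each is doubled or halved, then clamped."""
    varied = []
    for rate in rates:
        rate = rate * factor if rng.random() < 0.5 else rate / factor
        varied.append(min(high, max(low, rate)))
    return varied

def fitness(genome, scaffold, challenge):
    """Toy score: a broken scaffold scores 0; otherwise 1 plus matches to the challenge."""
    n = len(scaffold)
    if genome[:n] != scaffold:
        return 0
    return 1 + sum(a == b for a, b in zip(genome[n:], challenge))

def evolve(scaffold, site_len, generations, pop_size, rng, shift_every=10):
    """Select on genomes; each survivor's rate vector is inherited along with it."""
    start = scaffold + "A" * site_len
    population = [(start, [0.05] * len(start)) for _ in range(pop_size)]
    challenge = "A" * site_len
    for generation in range(generations):
        if generation % shift_every == 0:
            # The environment shifts: a new target for the variable site.
            challenge = "".join(rng.choice(ALPHABET) for _ in range(site_len))
        offspring = []
        for genome, rates in population:
            for _ in range(2):
                child_rates = vary_rates(rates, rng)
                child = copy_with_variation(genome, child_rates, rng)
                offspring.append((child, child_rates))
        offspring.sort(key=lambda ind: fitness(ind[0], scaffold, challenge),
                       reverse=True)
        population = offspring[:pop_size]
    return population

population = evolve("GATTACA", site_len=4, generations=30, pop_size=20,
                    rng=random.Random(1))
best_genome, best_rates = population[0]
print(best_genome)
print([round(r, 3) for r in best_rates])
```

Selection here never looks at the rates directly. It acts only on the genomes they produce, so a rate vector persists only if the lineage carrying it persists. Whether, and how quickly, rates at scaffold sites and at the variable site diverge depends on the parameters chosen; the sketch is meant to show the structure of the idea rather than to demonstrate a result.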
Summary: Chance Favors the Prepared Genome
Selective pressure on diversity generators allows the fine tuning of the probability and type of changes throughout the genome based upon the success of the genome's ancestors with that type of change. Natural selection favors the emergence of different types of mechanisms for exploring diversity, efficient strategies for exploration of the space of possible genomes, and tunable connections within pathways in genomes. As illustrated by such diverse systems as immunoglobulins and the Lyme disease spirochete, the genome can evolve a world view of which types of changes are most likely to yield a new function and less likely to destroy an essential amino acid, function, or scaffold. This world view is expressed through the uneven probability of different types of genetic change. In that way, chance favors the prepared genome.
Figure 1: Conserved 17 base pair repeat in Lyme Disease
alternative sequence      AGAAGGCGCAATCAAAG
actual coding sequence    TGAGGGGGCTATTAAGG
amino acids               E G A I K
A direct repeat of five amino acids, EGAIK, is conserved in a surface lipoprotein of the spirochete that causes Lyme disease (6). Due to the degeneracy of the genetic code, >15 alternative DNA sequences would encode this string of amino acids (one of which is illustrated), yet the amino acid repeat is encoded by identical 17 base repeats. Thus, pressure to conserve these five amino acids does not in itself explain conservation of the DNA sequence. The authors propose that the conserved repeats are recognition sites for a cassette mechanism by which the spirochete inserts variant sequences into its surface protein. This surface antigen sequence variation forms part of a strategy that the spirochete has evolved for protection from its most predictable challenge, the host immune system.
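The degeneracy argument in Figure 1 can be checked directly. The sketch below assumes the standard genetic code and assumes that the five codons begin at the second base of each 17-base sequence, which is the frame in which both sequences read EGAIK. It translates the two sequences and counts how many different 15-base codon strings would encode the same five amino acids. The count covers only the five codons, not the flanking base at each end of the repeat.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna, offset=0):
    """Translate complete codons of dna, starting at the given base offset."""
    codons = [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]
    return "".join(CODON_TABLE[c] for c in codons)

def synonymous_count(peptide):
    """Number of distinct codon strings that encode the peptide."""
    total = 1
    for aa in peptide:
        total *= sum(1 for coded in CODON_TABLE.values() if coded == aa)
    return total

actual = "TGAGGGGGCTATTAAGG"
alternative = "AGAAGGCGCAATCAAAG"

print(translate(actual, offset=1))       # EGAIK
print(translate(alternative, offset=1))  # EGAIK
print(synonymous_count("EGAIK"))         # 2 * 4 * 4 * 3 * 2 = 192
```

Under these assumptions 192 codon strings encode EGAIK, which is consistent with the caption's ">15" and underlines its point: conservation of the amino acids alone would leave the DNA free to vary.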
Figure 2:
The strategies of the vertebrate immune system include
a palette of stored functional modules, representing potential binding (v) regions
sequences (recombination signal sequences: small boxes to right of Vs/left of Js) that target recombination of binding and effector domains to locations where recombination will not disrupt function
variation is focused to the antigen binding site itself, increasing the range of ligands it can bind, while preserving the antibody protein framework. At least two mechanisms focus genetic exploration within the binding site. One results in point changes, the other, patches of new sequence.
Figure 3: Modules facilitate integration of signaling pathways
Domain C triggers a downstream signaling pathway. However, it is inactive unless activated by Domain A. Domain A can activate Domain C if domain A is activated by a signaling pathway, but this activation is blocked by Domain B. If domain B is activated by a separate signaling pathway, the block is released (23). Thus, if the signaling pathways that regulate A and B are active, C will trigger its downstream pathway. Thus, the combination of three protein modules can achieve a logic gate: if this and this, then this.
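The three-domain arrangement in Figure 3 can be written as a Boolean function. This is a minimal sketch of the logic only: the function and argument names are invented here, and, as reference 23 notes for the figure itself, it is not meant to represent any particular protein accurately.

```python
def domain_c_active(pathway_a_on: bool, pathway_b_on: bool) -> bool:
    """Return True when Domain C triggers its downstream pathway."""
    a_activated = pathway_a_on      # Domain A is switched on by its own pathway
    block_released = pathway_b_on   # Domain B's block is lifted by a second pathway
    # A can activate C only when A is active and B no longer blocks it.
    return a_activated and block_released

for a_on in (False, True):
    for b_on in (False, True):
        print(a_on, b_on, domain_c_active(a_on, b_on))
```

The truth table has a single True row, the one in which both upstream pathways are active, which is the "if this and this, then this" gate described in the caption.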
References
1. Caporale, L.H. (1999) Molecular Strategies in Biological Evolution. Ann. NYAS. in press
2. Beard WA; Osheroff WP; Prasad R; Sawaya MR; Jaju M; Wood TG; Kraut J; Kunkel TA; Wilson SH J Biol Chem, 271: 12141-4 1996
3. Glickman, B.W. and Ripley, L.S. Structural intermediates of deletion mutagenesis: A role for palindromic DNA Proc. Natl. Acad. Sci. USA 81:512-516 (1984)
4. Rosche, W.A., Trinh, T.Q. and R.R. Sinden. (1997) Leading strand specific spontaneous mutation corrects a quasipalindrome by an intermolecular strand switch mechanism. J. Mol. Biol., 269, 176-187.
5. Caporale, L.H. Is there a higher level genetic code that directs evolution? Mol. Cell. Biochem. 64: 5-13 (1984)
6. Zhang, J-R, Hardham, J.M., Barbour, A.G. and Norris, S.J. (1997) Antigenic Variation in Lyme Disease Borreliae by Promiscuous Recombination of VMP-like Sequence Cassettes Cell 89, 275-285
7. Tonegawa, S. Somatic Generation of Antibody Diversity Nature 302:575-581 (1983)
8. Stavnezer, J., Immunoglobulin class switching Current Opinion in Immunology 1996, 8: 199-205.
9. Goyenechea, B. and Milstein, C. (1996) Modifying the sequence of an immunoglobulin V-gene alters the resulting pattern of hypermutation. Proc. Natl. Acad. Sci. USA 93: 13979-13984
10. McCormack, W.T., Hurley, E.A. and Thompson, C.B. Germline maintenance of the pseudogene donor pool for somatic immunoglobulin gene conversion in chickens. Mol. Cell. Biol. 13:821-830 (1993)
11. The Dynamic Genome ed. Fedoroff, N. and Botstein, D. (1992) CSHL Press, New York
12. Foster PL (1999) Ann. Rev. Genetics (in preparation)
13. Arber, W. Evolution of prokaryotic genomes Gene 135:49-56 (1993)
14. Shapiro, J.A. Natural Genetic Engineering in Evolution. Genetica 86:99-111 (1992)
15. Avery, O.T., MacLeod, C. and McCarty, M. (1944) Journal of Experimental Medicine, 79: 137-58 available online at http://www.profiles.nlm.nih.gov/CC/A/A/A/M/_/ccaaam.pdf
16. Smith HO; Tomb JF; Dougherty BA; Fleischmann RD; Venter JC (1995) Frequency and distribution of DNA uptake signal sequences in the Haemophilus influenzae Rd genome [see comments] Science, 269(5223):538-40
17. Recchia, G.D., Hall, R.M. Gene Cassettes: a new class of mobile element. Microbiology 141:3015- (1995)
18. see, e.g. http://speedy.mips.biochem.mpg.de/mips/programs/classification.html
19. Gilbert, W. (1978) Why genes in pieces? Nature 271 501
20. see e.g. Yulug, I.G., Yulug, A., and Risher, E.M.C. (1995) The Frequency and Position of Alu repeats in cDNAs as Determined by Database Searching Genomics 27: 544-548
21. Puget, N., Torchard, D., Serova-Sinilnikova, O.M., Lynch, H.T., Feunteun, J., Lenoir, G. M. and Mazoyer, S. (1997) A 1kb Alu-mediated Germ-Line Deletion Removing BRCA1 Exon 17. Cancer Research 57 828
22. See articles by Shapiro, Fedoroff and Iida in reference 1
23. This example was inspired by reading, but is not meant to be an accurate representation of, Nimnual et al. (98) Science 279 561
24. Britten RJ (1997) Mobile elements inserted in the distant past have taken on important functions. Gene, 205:177-82
25. McHaffie GS; Ralston SH (1995) Origin of a negative calcium response element in an ALU repeat: implications for regulation of gene expression by extracellular calcium. Bone, 17:11-4
26. Trifonov, E.N. (1999) Annals
27. Caporale, L.H. (1998) Genomic Strategies for Evolutionary Adaptation: The rate, location and extent of genetic variation is not monotonous Interjournal Genetics manuscript 178 submitted http://phys4.harvard.edu/~purcell/necsi/Caporale_IJ.html
28. Peliska JA; Benkovic SJ (1994) Fidelity of in vitro DNA strand transfer reactions catalyzed by HIV-1 reverse transcriptase. Biochemistry, 33:3890-5
29. Wainberg MA; Drosopoulos WC; Salomon H; Hsu M; Borkow G; Parniak M; Gu Z; Song Q; Manne J; Islam S; Castriota G; Prasad VR Enhanced fidelity of 3TC-selected mutant HIV-1 reverse transcriptase. Science, 271:1282-5, 1996
30. Sarkadi B; Muller M Semin Cancer Biol, 8(3):171-82 1997
31. Shimizu N; Itoh N; Utiyama H; Wahl GM (1998) J Cell Biol, 140:1307-20
32. Lowe SW; Bodis S; McClatchey A; Remington L; Ruley HE; Fisher DE; Housman DE; Jacks T p53 status and the efficacy of cancer therapy in vivo. Science, 266(5186):807-10 1994