Human Molecular Genetics 2


9. Instability of the human genome: mutation and DNA repair

9.6. DNA repair

DNA in cells suffers a wide range of damage:

All these lesions must be repaired if the cell is to survive. The importance of effective DNA repair systems is highlighted by the severe diseases affecting people with deficient repair systems (see below).

9.6.1. DNA repair usually involves cutting out and resynthesizing a whole area of DNA surrounding the damage

To cope with all these forms of damage, cells must be capable of several different types of DNA repair (for reviews, see the October 1995 issue of Trends in Biochemical Sciences). DNA repair seldom involves simply undoing the change that caused the damage. Almost always a stretch of DNA containing the damaged nucleotide(s) is excised and the gap filled by resynthesis. There are at least five main types of DNA repair in human cells:

All these systems, except for direct repair, require exo- and endonucleases, helicases, polymerases and ligases, usually acting in multiprotein complexes that have some components in common. Sorting out the individual pathways has been greatly aided by the very strong conservation of repair mechanisms across the whole spectrum of life. Not only the reaction mechanisms but also the protein structures and even gene sequences are often conserved from E. coli to man. A downside of the conservation is a confusing gene nomenclature, referring sometimes to human diseases (XPA etc.), sometimes to yeast mutants (RAD genes) and sometimes to mammalian cell complementation systems (ERCC = excision repair cross-complementing) - for example XPD, ERCC2 and RAD3 are the same gene in man, mouse and yeast. Generally eukaryotes have multiple proteins corresponding to each single protein in E. coli, so that, for example, nucleotide excision repair requires six proteins in E. coli but at least 30 in mammals.

9.6.2. DNA repair systems share components and processes with the transcription and recombination machinery

As well as sharing components with each other, many repair systems share components with the machinery for DNA replication, transcription and recombination. DNA polymerases and ligase are required for both DNA replication and resynthesis after excision of a defect. The recombination machinery is involved in double-strand break repair. The link with transcription is particularly intriguing (Lehmann, 1995). The general transcription factor TFIIH is a multiprotein complex that includes the XPB and XPD proteins. TFIIH exists in two forms. One form is concerned with general transcription and the other with repair, probably specifically repair of transcriptionally active DNA. This system is deficient in two rare diseases, Cockayne syndrome (CS; MIM 216400) and trichothiodystrophy (TTD; MIM 601675). Clinically and in cell biology, CS and TTD both overlap xeroderma pigmentosum (XP), and in some cases the same genes are responsible, but CS and TTD patients have developmental defects that presumably reflect defective transcription, and they do not have the cancer susceptibility of XP patients.

9.6.3. Hypersensitivity to agents that damage DNA is often the result of an impaired cellular response to DNA damage, rather than defective DNA repair

Many human diseases that involve hypersensitivity to DNA-damaging agents, or a high level of cellular DNA damage, are not caused by defects in the DNA repair systems themselves, but by a defective cellular response to DNA damage. Normal cells react to DNA damage by stalling progress through the cell cycle at a checkpoint until the damage has been repaired, or by triggering apoptosis if the damage is irreparable. Part of the machinery for doing this involves the ATM protein. The role of ATM is described in Section 18.7.3. Briefly, it senses DNA damage and relays the signal to the p53 protein, the "guardian of the genome". People with no functional ATM have ataxia telangiectasia (MIM 208900; Lambert et al., 1998). Their cells are hypersensitive to radiation, and they have chromosomal instability and a high risk of malignancy, but the DNA repair machinery itself is intact. Fanconi anemia (MIM 227650) is another heterogeneous group of diseases (at least five complementation groups) marked by defective responses to DNA damage, without specific defects in DNA repair.



Figure 9.21. A possible scheme for nucleotide excision repair in humans. (A) XPA protein recognizes damaged DNA and binds to it, directly or by binding to RPA, a single-strand binding protein. (B) The DNA-XPA-RPA complex recruits the TFIIH transcription factor. TFIIH is a multiprotein complex that includes the XPB and XPD proteins. These are helicases of opposite polarity, and they open up a single-stranded bubble in the DNA, about 30 nucleotides long. (C) Two cuts are made in the sugar-phosphate backbone of the damaged strand. XPF + ERCC1 cut at the 5′ end, and XPG cuts the 3′ end. (D) DNA polymerase ε together with replication factor C and the DPE2 subunit synthesize DNA to fill the gap. (E) DNA ligase seals the gap. Over 30 proteins are involved in mammalian nucleotide excision repair, and this simplified scheme does not include the likely requirement to remodel chromatin structure as part of the process (after Lehmann, 1995).



Figure 9.5. Slipped strand mispairing during DNA replication can cause insertions or deletions. Short tandem repeats are thought to be particularly prone to slipped strand mispairing, i.e. mispairing of the complementary DNA strands of a single DNA double helix. The examples show how slipped strand mispairing can occur during replication, with the lower strand representing a parental DNA strand and the upper blue strand representing the newly synthesized complementary strand. In such cases, slippage involves a region of nonpairing (shown as a bubble) containing one or more repeats of the newly synthesized strand (backward slippage) or of the parental strand (forward slippage), causing, respectively, an insertion or a deletion on the newly synthesized strand. Note that it is conceivable that slipped strand mispairing can also cause insertions/deletions in nonreplicating DNA. In such cases, two regions of nonpairing are required, one containing repeats from one DNA strand and the other containing repeats from the complementary strand (Levinson and Gutman, 1987).



Figure 18.17. The MutHLS error correction system in E. coli. A replication error introduces a mismatch (A). The MutS protein binds to mismatched base pairs (B). In an ATP-dependent reaction, a MutS-MutL-MutH complex is formed which probably brings any GATC sequence located within 1 kb either side of the mismatch into a loop (C). MutH makes a single-strand cut 5′ to the GATC sequence (D). The E. coli Dam methylation system methylates A in GATC, but in newly synthesized DNA only the template strand is methylated. MutH specifically cuts the unmethylated (newly synthesized) strand (D). Exonucleases, DNA polymerase and DNA ligase then strip back and repair the DNA (E). See Modrich (1995).


9.1. An overview of mutation, polymorphism, and DNA repair

As in other genomes, the DNA of the human genome is not a static entity. Instead, it is subject to a variety of different types of heritable change (mutation). Large-scale chromosome abnormalities involve loss or gain of chromosomes or breakage and rejoining of chromatids (see Section 2.6). Smaller scale mutations can be grouped into different mutation classes and can also be categorized on the basis of whether they involve a single DNA sequence (simple mutations - Section 9.2) or whether they involve exchanges between two allelic or nonallelic sequences (Section 9.3). Three classes of small-scale mutation can be distinguished (see also Table 9.1):

New mutations arise in single individuals, in somatic cells or in the germline. If a germline mutation does not seriously impair an individual's ability to have offspring who can transmit the mutation, it can spread to other members of a (sexual) population. Allelic sequence variation is traditionally described as a DNA polymorphism if more than one variant (allele) at a locus occurs in a human population with a frequency greater than 0.01 (a frequency high enough such that an origin as a result of chance recurrence is highly unlikely). The mean heterozygosity for human genomic DNA is thought to be of the order of 0.001-0.004 (i.e. approximately 1:250 to 1:1000 bases are different between allelic sequences; Cooper et al., 1985; Nickerson et al., 1998; Taillon-Miller et al., 1998). Certain genes, notably some HLA genes, are exceptionally polymorphic and alleles can show very substantial sequence divergence (see Figure 14.27). Because mutation rates are comparatively low, the vast majority of the differences between allelic sequences within an individual are inherited, rather than resulting from de novo mutations.
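
The frequency threshold in the traditional definition of a polymorphism, and the conversion of heterozygosity into "one difference in N bases", can be sketched in a few lines of Python. This is only an illustration: the function name `is_polymorphic` and the allele frequencies for the two loci are hypothetical and are not taken from the text.

```python
def is_polymorphic(allele_freqs, threshold=0.01):
    """Return True if more than one allele exceeds the frequency threshold."""
    common = [allele for allele, freq in allele_freqs.items() if freq > threshold]
    return len(common) > 1


# Hypothetical allele frequencies at two loci (illustrative values only).
locus_1 = {"A": 0.85, "G": 0.15}    # two alleles above 0.01
locus_2 = {"C": 0.998, "T": 0.002}  # the rarer variant is below 0.01

print(is_polymorphic(locus_1))  # True
print(is_polymorphic(locus_2))  # False

# Mean heterozygosity of 0.001-0.004 expressed as one difference per N bases.
for h in (0.001, 0.004):
    print(f"heterozygosity {h}: about 1 in {round(1 / h)} bases")
```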

Mutations are the raw fuel that drives evolution, but they can also be pathogenic (Sections 9.4 and 9.5). They can be the direct cause of a phenotypic abnormality or they can result in increased susceptibility to disease. The usually low level of mutation may therefore be viewed as a balance: it permits occasional evolutionary novelty at the expense of causing disease or death in a proportion of the members of a species. Normally, most mutations arise as copying errors during DNA replication because DNA polymerases, like all enzymes, are error-prone. The error rate of a DNA polymerase (that is, the frequency of incorporating a wrong base) is significantly reduced by having a subunit of the polymerase which has a proofreading function. Even then, however, the size of the human genome makes huge demands on the fidelity of any DNA polymerase: a sequence of 3 billion nucleotides needs to be replicated accurately every single time a human cell divides.

DNA is also subject to significant spontaneous chemical attack in the cell. For example, every day approximately 5000 adenines or guanines are lost from the DNA of each nucleated human cell by depurination (the N-glycosidic bond linking the purine residue to the carbon 1′ of the deoxyribose is hydrolyzed and the purine is replaced by a hydroxyl group at carbon 1′). DNA is also damaged by exposure to natural ionizing radiation and to reactive metabolites. In order to minimize the mutation rate, therefore, it is necessary to have effective DNA repair systems which identify and correct many abnormalities in the DNA sequence (Section 9.6). In addition, errors that arise in the mRNA sequence during gene expression are subject to RNA surveillance mechanisms which ensure removal of mRNAs which have inappropriate termination codons (Section 9.4.6).

| Mutation class | Type of mutation | Incidence |
| --- | --- | --- |
| Base substitutions | All types | Comparatively common type of mutation in coding DNA but also common in noncoding DNA |
| | Transitions and transversions | Unexpectedly, transitions are commoner than transversions, especially in mitochondrial DNA |
| | Synonymous and nonsynonymous substitutions | Synonymous substitutions are considerably more common than nonsynonymous substitutions in coding DNA; conservative substitutions are more common than nonconservative |
| | Gene conversion-like events (multiple base substitution) | Rare except at certain tandemly repeated loci or clustered repeats |
| Insertions | Of one or a few nucleotides | Very common in noncoding DNA but rare in coding DNA where they produce frameshifts |
| | Triplet repeat expansions | Rare but can contribute to several disorders, especially neurological disorders (see Box 16.7) |
| | Other large insertions | Rare; can occasionally get large-scale tandem duplications, and also insertions of transposable elements (Section 9.5.6) |
| Deletions | Of one or a few nucleotides | Very common in noncoding DNA but rare in coding DNA where they produce frameshifts |
| | Larger deletions | Rare, but often occur at regions containing tandem repeats (Section 9.5.3) or between interspersed repeats (see Section 9.5.4 and Figure 9.9) |
| Chromosomal abnormalities | Numerical and structural | Rare as constitutional mutations, but can often be pathogenic (see Section 2.6). Much more common as somatic mutations and often found in tumor cells |

© 1999 Garland Science



Figure 14.27. Some human alleles show greater sequence divergence from each other than from orthologous chimpanzee alleles. From a total of 270 amino acid positions, the HLA-DRB1*0302 and HLA-DRB1*0701 alleles show a total of 31 differences (13%). Comparison of either allele with alleles at the orthologous chimpanzee locus (Patr-DRB1) identifies more closely related human-chimpanzee pairs, such as HLA-DRB1*0701 and Patr-DRB1*0702 (only two amino acid differences out of 270). This suggests that some present-day HLA alleles pre-date the human-chimpanzee split. See Gibbons (1995) for further details. Redrawn from Klein et al. (1993) Scientific American, 269, pp. 675-680, with permission from Scientific American Inc.


9.2. Simple mutations

9.2.1. Mutations due to errors in DNA replication and repair are frequent

Mutations can be induced in our DNA by exposure to a variety of mutagens occurring in our external environment or to mutagens generated in the intracellular environment. In the case of radiation-induced mutation, for example, Dubrova et al. (1996) reported that the normal germline mutation rate for hypervariable minisatellite loci was doubled as a consequence of heavy exposure to the radioactive fallout from the Chernobyl accident. However, under normal circumstances by far the greatest source of mutations is endogenous, notably spontaneous errors in DNA replication and repair. During an average human lifetime there are an estimated 10^17 cell divisions: about 2 × 10^14 divisions are required to generate the approximately 10^14 cells in the adult, and additional mitoses are required to permit cell renewal in the case of certain cell types, notably epithelial cells (see Cairns, 1975). As each cell division requires the incorporation of 6 × 10^9 new nucleotides, error-free DNA replication in an average lifetime would require a DNA replication-repair process with an accuracy great enough so that the correct nucleotide was inserted on the growing DNA strands on each of about 6 × 10^26 occasions.

Such a level of DNA replication fidelity is impossible to sustain; indeed, the observed fidelity of replication of DNA polymerases is very much less than this and uncorrected replication errors occur with a frequency of about 10^-9 to 10^-11 per incorporated nucleotide (see Cooper et al., 1995). As the coding DNA of an average human gene is about 1.7 kb, coding DNA mutations will occur spontaneously with an average frequency of about 1.7 × 10^-6 to 1.7 × 10^-8 per gene per cell division. Thus, during the approximately 10^16 mitoses undergone in an average human lifetime, each gene will be a locus for about 10^8 to 10^10 mutations (but for any one gene, only a tiny minority of cells will carry a mutation). In many cases, a deleterious gene mutation in a somatic cell will be inconsequential: the mutation may cause lethality for that single cell, but will not have consequences for other cells. However, in some cases, the mutation may lead to an inappropriate continuation of cell division, causing cancer (see Chapter 18).
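
The arithmetic behind these estimates can be reproduced with a short Python sketch. All input figures are the estimates quoted in the text; the function and constant names are hypothetical, and the numbers of divisions are passed in exactly as each sentence states them.

```python
# Figures quoted in the text, treated here as given estimates.
NUCLEOTIDES_PER_DIVISION = 6e9  # new nucleotides incorporated per cell division
CODING_LENGTH_BP = 1.7e3        # coding DNA of an average human gene (1.7 kb)
ERROR_RATES = (1e-9, 1e-11)     # uncorrected errors per incorporated nucleotide


def nucleotide_incorporations(divisions):
    """Total nucleotide incorporations over a given number of cell divisions."""
    return NUCLEOTIDES_PER_DIVISION * divisions


def mutations_per_gene_per_division(error_rate):
    """Expected coding-DNA mutations in one average gene per cell division."""
    return CODING_LENGTH_BP * error_rate


def mutations_per_gene_per_lifetime(error_rate, mitoses):
    """Expected mutations at one average gene, summed over all mitoses."""
    return mutations_per_gene_per_division(error_rate) * mitoses


# 6 x 10^9 nucleotides per division over the quoted 10^17 divisions.
print(f"{nucleotide_incorporations(1e17):.1e}")

# Per-gene frequencies, and lifetime totals over the quoted 10^16 mitoses.
for rate in ERROR_RATES:
    per_division = mutations_per_gene_per_division(rate)
    per_lifetime = mutations_per_gene_per_lifetime(rate, 1e16)
    print(f"error rate {rate:.0e}: {per_division:.1e} per division, "
          f"{per_lifetime:.1e} per lifetime")
```

The lifetime totals come out at 1.7 × 10^8 to 1.7 × 10^10, which matches the range of about 10^8 to 10^10 given above.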

9.2.2. The frequency of individual base substitutions is nonrandom

Base substitutions are among the most common mutations and can be grouped into two classes:

When one base is substituted by another, there are always two possible choices for transversion, but only one choice for a transition. For example, the base adenine can undergo two possible transversions (to cytosine or to thymine) but only one transition (to guanine; see Figure 9.1). One might, therefore, expect transversions to be twice as frequent as transitions. Because the substitution of alleles in a population takes thousands or even millions of years to complete, nucleotide substitutions cannot be observed directly. Instead, they are always inferred from pairwise comparisons of DNA molecules that share a common origin, such as orthologs in different species. When this is done, the transition rate in mammalian genomes is found, contrary to this expectation, to be higher than the transversion rate. For example, Collins and Jukes (1994) compared 337 pairs of human and rodent orthologs and found that the transition rate consistently exceeded the transversion rate. The ratio was 1.4 to 1 for substitutions which did not lead to an altered amino acid, and more than 2 to 1 for those that did result in an amino acid change.
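
The 2:1 expectation follows from simply counting the possible changes. A minimal Python sketch (the function name `classify` is hypothetical) enumerates all twelve possible single-base substitutions and classifies each one:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(ref, alt):
    """Classify a base substitution as a transition or a transversion."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    # Transition: purine <-> purine or pyrimidine <-> pyrimidine.
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Enumerate all 12 possible single-base substitutions.
counts = {"transition": 0, "transversion": 0}
for ref in "ACGT":
    for alt in "ACGT":
        if ref != alt:
            counts[classify(ref, alt)] += 1

print(counts)  # {'transition': 4, 'transversion': 8}
```

If every change were equally likely, transversions would therefore outnumber transitions two to one, which is the reverse of the observed ratios.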

Transitions may be favored over transversions in coding DNA because they usually result in a more conserved polypeptide sequence (see below). In both coding and noncoding DNA the excess of transitions over transversions is at least partly due to the comparatively high frequency of C → T transitions, resulting from instability of cytosine residues occurring in the CpG dinucleotide. In such dinucleotides the cytosine is often methylated at the 5′ C atom and 5-methylcytosines are susceptible to spontaneous deamination to give thymine (Section 8.4.2). Presumably as a result of this, the CpG dinucleotide is a hotspot for mutation in vertebrate genomes: its mutation rate is about 8.5 times higher than that of the average dinucleotide (see Cooper et al., 1995). Other factors favoring transitions over transversions are likely to include differential repair of mispaired bases by the sequence-dependent proofreading activities of the relevant DNA polymerases.
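
As an illustration of the CpG effect, the following Python sketch locates CpG dinucleotides in a sequence and shows the product of a C → T transition at one of them. The sequence and the function names are hypothetical, and the sketch models only the strand shown; it does not represent methylation status or the complementary strand.

```python
def cpg_positions(seq):
    """Return the 0-based index of the C in every CpG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def deaminate(seq, pos):
    """Return seq with the C at index pos replaced by T (a C -> T transition)."""
    if seq[pos] != "C":
        raise ValueError("position does not hold a cytosine")
    return seq[:pos] + "T" + seq[pos + 1:]


seq = "ATCGGACGTTCA"  # hypothetical sequence containing two CpG sites

sites = cpg_positions(seq)
print(sites)                      # [2, 6]
print(deaminate(seq, sites[0]))   # ATTGGACGTTCA
```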

9.2.3. The frequency and spectrum of mutations in coding DNA differs from that in noncoding DNA

Many mutations are generated essentially randomly in the DNA of individuals. As a result, coding DNA and noncoding DNA are about equally susceptible to mutation. Clearly, however, the major consequences of mutation are largely restricted to the approximately 3% of the DNA in the human genome which is coding DNA. Mutations which occur in this component of the genome are of two types:

Silent mutations are thought to be effectively neutral mutations (conferring no advantage or disadvantage to the organism in whose genome they arise). In contrast, nonsynonymous mutations can be grouped into three classes, depending on their effect: those having a deleterious effect; those with no effect; and those with a beneficial effect (e.g. improved gene function or gene-gene interaction). Most new nonsynonymous mutations are likely to have a deleterious effect on gene expression and so can result in disease or lethality. However, the frequency of such mutation in the population is very much reduced because of natural selection (see Box 9.1). As a result, the overall mutation rate in coding DNA is much less than that in noncoding DNA. Consequently, the coding DNA component of a specific gene and the derived amino acid sequence show a relatively high degree of evolutionary conservation, as do important regulatory sequences such as the multiple elements of promoters and enhancers, and intronic sequences immediately flanking exons.

Selection pressure (the constraints imposed by natural selection) reduces both the overall frequency of surviving mutations in coding DNA and the spectrum of mutations seen. For example, deletions/insertions of one or several nucleotides are frequent in noncoding DNA but are conspicuously absent from coding DNA. This is so because often such mutations will cause a shift in the translational reading frame (frameshift mutation), introducing a premature termination codon and causing loss of gene expression. Even if insertions/deletions do not cause a frameshift mutation, they can often affect gene function, for example, as a result of removing a key coding sequence. Instead, coding DNA is marked by a comparatively high frequency of nonrandom base substitution occurring at locations which lead to minimal effects on gene expression (see next section).

9.2.4. The location of base substitutions in coding DNA is nonrandom

Nucleotide substitutions occurring in noncoding DNA usually have no net effect on gene expression. Exceptions include some changes in promoter elements or some other DNA sequence that regulates gene expression, and in important intronic sequence positions, such as at splice junctions or the splice branch site (see Figure 1.15). Substitutions occurring in coding DNA sequences which specify polypeptides show a very nonrandom pattern of substitutions because of the need to conserve polypeptide sequence and biological function. In principle, base substitutions can be grouped into three classes, depending on their effect on coding potential (see Box 9.2).

The different classes of base substitution listed in the box show differential tendencies to be located at the first, second or third base positions of codons. Because of the design of the genetic code, different degrees of degeneracy characterize different sites. Base positions in codons can be grouped into three classes:

The design of the genetic code and the degree to which one amino acid is functionally similar to another affect the relative mutabilities of individual amino acids. Certain amino acids may play key roles which cannot be substituted easily by others. For example, cysteine is often involved in disulfide bonding which can play a crucially important role in establishing the conformation of a polypeptide (see Figure 1.25). As no other amino acid has a side chain with a sulfhydryl group, there is strong selection pressure to conserve cysteine residues at many locations, and cysteine is among the least mutable of the amino acids (Collins and Jukes, 1994). In contrast, certain other amino acids such as serine and threonine have very similar side chains, and substitutions at both the first base position of codons (ACX ↔ UCX; where X = any nucleotide) and second base positions (ACPy ↔ AGPy; where Py = pyrimidine) can result in serine ↔ threonine substitutions. Presumably as a result, serine and threonine are among the most mutable of the amino acids (Collins and Jukes, 1994).
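
Both the position dependence of synonymous substitutions and the serine/threonine example can be checked directly against the standard genetic code. The Python sketch below is an illustration only: the function names are hypothetical, and codons are written in DNA notation (T in place of U), so UCX appears as TCX.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def single_base_mutants(codon):
    """Yield (mutant_codon, position) for every single-base change; position is 1-3."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:], pos + 1


def synonymous_counts():
    """Count synonymous single-base changes per codon position over the 61 sense codons."""
    counts = {1: 0, 2: 0, 3: 0}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # skip stop codons
        for mutant, pos in single_base_mutants(codon):
            if CODON_TABLE[mutant] == aa:
                counts[pos] += 1
    return counts


def thr_to_ser_changes():
    """List (thr_codon, ser_codon, position) for single-base Thr -> Ser changes."""
    return [(codon, mutant, pos)
            for codon, aa in CODON_TABLE.items() if aa == "T"
            for mutant, pos in single_base_mutants(codon)
            if CODON_TABLE[mutant] == "S"]


print(synonymous_counts())  # {1: 8, 2: 0, 3: 126}
for change in thr_to_ser_changes():
    print(change)
```

Almost all synonymous changes fall at the third codon position. Threonine can be converted to serine by four first-position changes (ACX to TCX) and two second-position changes (ACT to AGT and ACC to AGC).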

9.2.5. Protein-coding genes show enormous variation in the rate of nonsynonymous substitutions

The rate and type of substitution varies between different genes. At one extreme are proteins whose sequences are extremely highly conserved, such as ubiquitin, histones H3 and H4, calmodulin, ribosomal proteins, etc. For example, the ubiquitin proteins of humans, mouse and Drosophila show 100% sequence identity, and comparison with the yeast ubiquitin reveals 96.1% sequence identity. These genes are not especially protected from mutation, because the rate of synonymous codon substitution is typical of that for many protein-encoding genes. Instead, what distinguishes them is the extremely low rate of nonsynonymous codon substitution compared with other genes (see Table 9.2 for some examples). Presumably, ubiquitin and the other highly conserved proteins play such crucial roles that they are under huge selection pressure to conserve the sequence. At the other extreme, the fibrinopeptides are proteins which are evolving extremely rapidly and do not appear to be subject to any selective constraint. These proteins (only 20 amino acids long) are thought to be functionless - they are fragments which are generated as part of the protein fibrinogen and discarded when the protein is activated to form fibrin during blood clotting. Another extremely rapidly evolving sequence is the major sex-determining locus, SRY. This gene encodes a protein which contains a central "high mobility group" domain (HMG box) of about 78 amino acids. The HMG box is central to SRY function and is well conserved, but the flanking N- and C-terminal segments are evolving extremely rapidly, which may indicate that the majority of the SRY coding sequence is not functionally significant (Whitfield et al., 1993). In between the two extremes in the rate of nonsynonymous substitution are the vast majority of polypeptide-encoding genes (see Table 9.2).

9.2.6. The molecular clock can vary from gene to gene, and is different in different lineages

Synonymous substitutions have been considered to be effectively neutral from the point of view of selective constraints. As a result, the concept of a constant molecular clock (whereby a given gene or gene product undergoes a constant rate of molecular evolution) was suggested over 30 years ago. Since then, however, abundant evidence has been accumulated which does not support the concept (Ayala, 1999).

When substitution rates are compared for different genes, even closely related members of a gene family, there are considerable differences. For example, the genes listed in Table 9.2 show considerable differences not only in their rates of nonsynonymous codon substitutions, but also in the rate of synonymous codon substitutions. Such differences may be governed by a number of factors:

For a given gene, the molecular clock varies very considerably depending on the species lineage, and the clock runs at different rates for closely related members of a gene family (e.g. Gibbs et al., 1998). In order to estimate the relative rates of nucleotide substitutions in two lineages leading to present-day species A and B, a relative rate test is used. This involves using a third reference species C which is known to have branched off earlier in evolution, before the A-B split. Pairwise comparisons of orthologs in A and C, and in B and C are then used to calculate the K value, the number of synonymous substitutions per 100 sites. The K_AC and K_BC values then provide a measure of the relative rates of mutation in the lineages leading to species A and to species B. For example, when a variety of orthologs in mouse (species A) and rat (species B) are referenced against orthologs in humans (species C), the overall K_AC and K_BC values are nearly identical (Li and Graur, 1991, p. 82). This suggests that the base substitution rates in the lineages leading to present-day mouse and rat have been nearly equal. However, similar analyses suggest that the substitution rate appears to be lower in lineages leading to the primates and lower still in the lineage leading to modern day humans (Table 9.3).
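
The logic of the relative rate test can be sketched in Python as follows. This is a deliberately simplified illustration, not the published method: it counts raw differences over all aligned sites, whereas the K values in the text refer to synonymous substitutions and would normally be corrected for multiple substitutions at the same site. The three sequences and the function names are hypothetical.

```python
def differences_per_100_sites(seq1, seq2):
    """Raw number of differing sites per 100 aligned sites (no correction applied)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(a != b for a, b in zip(seq1, seq2))
    return 100 * diffs / len(seq1)


def relative_rate(seq_a, seq_b, seq_c):
    """Compare lineages A and B using C as the earlier-branching reference."""
    k_ac = differences_per_100_sites(seq_a, seq_c)
    k_bc = differences_per_100_sites(seq_b, seq_c)
    return k_ac, k_bc, k_ac - k_bc


# Hypothetical aligned orthologs (20 sites each).
seq_c = "ATGGCCAAGTTCGACCTGAA"  # reference species C
seq_a = "ATAGCCAGGTTCAACCTAAA"  # species A: 4 differences from C
seq_b = "ATGGTCAAGTTCGATCTGAA"  # species B: 2 differences from C

k_ac, k_bc, delta = relative_rate(seq_a, seq_b, seq_c)
print(k_ac, k_bc, delta)  # 20.0 10.0 10.0
```

A difference close to zero would suggest similar substitution rates in the two lineages since the A-B split. In this toy example the positive difference suggests more change in the lineage leading to A.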

The data in Table 9.3 may suggest that molecular evolution has effectively slowed down for organisms which have long generation times. With hindsight, perhaps this is not so surprising - most mutations arise when DNA is being replicated in gametogenesis (especially in males; see next section). Rodents and monkeys have comparatively shorter generation times than humans, and so will go through more generations per unit time. In addition, it has been suggested that longer-lived animals have a greater ability to repair their DNA than do short-lived species, thereby resulting in lower mutation rates (Britten, 1986).

9.2.7. Higher mutation rates in males are likely to be related to the greater number of germ cell divisions

Since Haldane first observed that most mutations resulting in hemophilia were generated in the male germline, it has been assumed that, at least in humans, mutations are preferentially paternally inherited. Two major approaches have been taken to estimate the relative mutation rates in the male and female germlines:

The comparatively high male mutation rate may be due to different factors (Hurst and Ellegren, 1998), but a major contributory factor is thought to be the large sex difference in the number of human germ cell divisions. In females, the number of cell divisions from zygote to fertilized oocyte is constant because all of the oocytes have been formed by the fifth month of development and only two further cell divisions are required to produce the zygote (Figure 9.4A). The number of successive female cell divisions from zygote to mature egg has been variously estimated as 24 (Vogel and Motulsky, 1996) and 31 (Li, 1997, p. 229) and is broadly similar to estimates of 30-31 male cell divisions required from zygote to stem spermatogonia at puberty. Five subsequent cell divisions are required for spermatogenesis but thereafter the spermatogenesis cycle occurs approximately every 16 days or 23 cycles per year (Figure 9.4B). This means that in males, the number of cell divisions required to produce sperm is age-dependent. If an average age of 13 is taken for onset of puberty and an average of 25 for male reproductive age, the total number of cell divisions is about 30 + 5 + [23 × (25 - 13)], or about 310 divisions (Figure 9.4). Given that errors in DNA replication/repair provide the great majority of mutations, one might then expect that the male mutation rate would be substantially greater than that of the female.
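
The age dependence of the male figure can be made explicit with a small Python sketch of the formula above. The default parameter values are the estimates quoted in the text; the function name is hypothetical, and the calculation for age 40 is an extrapolation of the same formula rather than a figure from the text.

```python
def male_germline_divisions(age, puberty_age=13, zygote_to_spermatogonia=30,
                            spermatogenesis_divisions=5, cycles_per_year=23):
    """Estimated cell divisions from zygote to sperm for a male of the given age."""
    if age < puberty_age:
        raise ValueError("age must be at or after the onset of puberty")
    return (zygote_to_spermatogonia
            + spermatogenesis_divisions
            + cycles_per_year * (age - puberty_age))


print(male_germline_divisions(25))  # 311, i.e. "about 310" in the text
print(male_germline_divisions(40))  # 656
```

Set against the constant female estimates of 24 or 31 divisions, the male total at age 25 is roughly ten times larger, and it continues to rise with age.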


9.3. Genetic mechanisms which result in sequence exchanges between repeats

In addition to very frequent simple mutations, there are several mutation classes which involve sequence exchange between allelic or nonallelic sequences, often involving repeated sequences. For example, tandemly repetitive DNA is prone to deletion/insertion polymorphism whereby different alleles vary in the number of integral copies of the tandem repeat. Such variable number of tandem repeat (VNTR) polymorphisms can occur in the case of repeated units that are very short (microsatellites), intermediate (minisatellites) or large. Different genetic mechanisms can account for VNTR polymorphism depending on the size of the repeating unit (see the following two sections). In addition, interspersed repeats can also predispose to deletions/duplications by a variety of different genetic mechanisms. These are discussed particularly in the context of disease mutations and are therefore presented in Section 9.4.

9.3.1. Slipped strand mispairing can cause VNTR polymorphism at short tandem repeats (microsatellites)

There is considerable variation in the germline mutation rates at microsatellite loci, ranging from an undetectable level up to about 8 × 10⁻³ (Mahtani and Willard, 1993; Weber and Wong, 1993). Novel length alleles at (CA)/(TG) microsatellites and at tetranucleotide marker loci are known to be formed without exchange of flanking markers. This means that they are not generated by unequal crossover (see below). Instead, as new mutant alleles have been observed to differ by a single repeat unit from the originating parental allele (Mahtani and Willard, 1993), the most likely mechanism to explain length variation is a form of exchange of sequence information which commences by slipped strand mispairing. This occurs when the normal pairing between the two complementary strands of a double helix is altered by staggering of the repeats on the two strands, leading to incorrect pairing of repeats. Although slipped strand mispairing can be envisaged to occur in nonreplicating DNA, replicating DNA may offer more opportunity for slippage and hence the mechanism is often also called replication slippage or polymerase slippage (see Figure 9.5). In addition to mispairing between tandem repeats, replication slippage has been envisaged to generate large deletions and duplications by mispairing between noncontiguous repeats and has been suggested to be a major mechanism for DNA sequence and genome evolution (Levinson and Gutman, 1987; see also Dover, 1995). The pathogenic potential of short tandem repeats is considerable (Sections 9.5.1 and 9.5.2).
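As a rough illustration of replication slippage, the following Python sketch models a repeat tract as a list of abstract repeat units and lets the copying position slip once during synthesis. The function name `copy_repeat_tract` and its parameters are hypothetical, strand complementarity is ignored, and only the repeat number is tracked; it is a toy model, not a description of the polymerase itself.

```python
def copy_repeat_tract(template_units, slip_after=None, direction=None):
    """Copy a tandem repeat tract, optionally with one slippage event.

    template_units : list of repeat units on the parental strand
    slip_after     : number of units copied before the slip occurs (assumed)
    direction      : "backward" -> the new strand loops out, so one template
                                   repeat is copied twice (insertion)
                     "forward"  -> the parental strand loops out, so one
                                   template repeat is skipped (deletion)
    """
    if direction not in (None, "backward", "forward"):
        raise ValueError("direction must be None, 'backward' or 'forward'")

    new_strand = []
    position = 0       # index of the next template repeat to copy
    slipped = False    # allow a single slippage event per copying round

    while position < len(template_units):
        new_strand.append(template_units[position])
        position += 1
        if not slipped and direction is not None and position == slip_after:
            slipped = True
            if direction == "backward":
                position -= 1   # re-copy the repeat that was just copied
            else:
                position += 1   # skip the next template repeat

    return new_strand


parental = ["CA"] * 5
print(len(copy_repeat_tract(parental)))                 # 5, faithful copy
print(len(copy_repeat_tract(parental, 3, "backward")))  # 6, gain of one repeat
print(len(copy_repeat_tract(parental, 3, "forward")))   # 4, loss of one repeat
```

In this sketch the new allele differs from the parental allele by a single repeat unit, which is the pattern reported for new microsatellite mutations.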

9.3.2. Large units of tandemly repeated DNA are prone to insertion/deletion as a result of unequal crossover or unequal sister chromatid exchanges

Homologous recombination describes recombination (crossover) occurring at meiosis or, rarely, mitosis between identical or very similar DNA sequences. It usually involves breakage of nonsister chromatids of a pair of homologs and rejoining of the fragments to generate new recombinant strands. Sister chromatid exchange is an analogous type of sequence exchange involving breakage of individual sister chromatids and rejoining fragments that initially were on different chromatids of the same chromosome. Both homologous recombination and sister chromatid exchange normally involve equal exchanges - cleavage and rejoining of the chromatids occurs at the same position on each chromatid. As a result, the exchanges occur between allelic sequences and at corresponding positions within alleles. In the case of intragenic equal crossover between two alleles, a new allele can result which is a fusion gene (or hybrid gene), comprising a terminal fragment from one allele and the remaining sequence of the second allele (Figure 9.6). However, equal sister chromatid exchanges cannot normally produce genetic variation because sister chromatids have identical DNA sequences.

Unequal crossover is a form of recombination in which the crossover takes place between nonallelic sequences on nonsister chromatids of a pair of homologs (Figure 9.7). Often the sequences at which crossover takes place show very considerable sequence homology which presumably stabilizes mispairing of the chromosomes. Because crossover occurs between mispaired nonsister chromatids, the exchange results in a deletion on one of the participating chromatids and an insertion on the other. The analogous exchange between sister chromatids is called unequal sister chromatid exchange (see Figure 9.7). Both mechanisms occur predominantly at locations where the tandemly repeated units are moderate to large in size. In such cases, the very high degree of sequence homology between the different repeats can facilitate pairing of nonallelic repeats on nonsister chromatids or sister chromatids. If chromosome breakage and rejoining occurs while the chromatids are mispaired in this way, an insertion or deletion of an integral number of repeat units will result. Note that such exchanges are reciprocal; both participating chromatids are modified, in one case resulting in an insertion, and in the other case in a complementary deletion.
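The reciprocal nature of such exchanges can be sketched in a few lines of Python. The function `crossover` and the labelled repeat units are hypothetical and serve only to show that breakage at different positions on two mispaired chromatids yields one longer and one shorter product; the same arithmetic applies to unequal sister chromatid exchange.

```python
def crossover(chromatid_a, chromatid_b, break_a, break_b):
    """Break each chromatid after the given number of repeat units and
    rejoin the fragments reciprocally.

    break_a == break_b models an equal exchange; different values model
    mispaired chromatids (unequal crossover or unequal sister chromatid
    exchange).
    """
    recombinant_1 = chromatid_a[:break_a] + chromatid_b[break_b:]
    recombinant_2 = chromatid_b[:break_b] + chromatid_a[break_a:]
    return recombinant_1, recombinant_2


# Two chromatids, each carrying five copies of a tandem repeat.
a = ["a1", "a2", "a3", "a4", "a5"]
b = ["b1", "b2", "b3", "b4", "b5"]

# Mispairing by one repeat unit: a is broken after unit 3, b after unit 2.
longer, shorter = crossover(a, b, 3, 2)
print(longer)   # ['a1', 'a2', 'a3', 'b3', 'b4', 'b5']  (insertion of one unit)
print(shorter)  # ['b1', 'b2', 'a4', 'a5']              (deletion of one unit)
```

The total number of repeat units is conserved across the two products, so the insertion on one chromatid is matched by a complementary deletion on the other.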

Unequal sister chromatid exchange is thought to be a major mechanism underlying VNTR polymorphism in the rDNA clusters. Unequal crossover is also expected to occur comparatively frequently in complex satellite DNA repeats and at tandemly repeated gene loci. In the latter case, unequal crossover is known to generate pathogenic deletions at some loci (see Section 9.5.3). Such exchanges can also lead to concerted evolution by causing a particular variant to spread through an array of tandem repeats, resulting in homogenization of the repeat units (see Figure 9.8).

Occasionally, unequal crossover and unequal sister chromatid exchanges can occur at regions where there is little homology. This is likely to be the case when such mechanisms first generate a tandemly duplicated locus following mispairing of nonallelic repeats such as two Alu repeats or even smaller elements (Figure 9.9).

9.3.3. Gene conversion events may be relatively frequent in tandemly repetitive DNA

Gene conversion describes a nonreciprocal transfer of sequence information between a pair of nonallelic DNA sequences (interlocus gene conversion) or allelic sequences (interallelic gene conversion). One of the pair of interacting sequences, the donor, remains unchanged. The other DNA sequence, the acceptor, is changed by having some or all of its sequence replaced by a sequence copied from the donor sequence (Figure 9.10). The sequence exchange is therefore a directional one; the acceptor sequence is modified by the donor sequence, but not the other way round.
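The directional, nonreciprocal character of gene conversion can be made concrete with a minimal Python sketch. The function `gene_conversion`, the toy sequences and the choice of segment are all hypothetical; the sketch shows only the outcome (acceptor changed, donor unchanged), not any molecular mechanism.

```python
def gene_conversion(donor, acceptor, start, end):
    """Replace acceptor[start:end] with the corresponding donor segment.

    Returns (donor, converted_acceptor). The donor is returned unchanged,
    which is what distinguishes conversion from a reciprocal exchange.
    Equal-length, aligned sequences are assumed for simplicity.
    """
    if len(donor) != len(acceptor):
        raise ValueError("this sketch assumes aligned sequences of equal length")
    converted = acceptor[:start] + donor[start:end] + acceptor[end:]
    return donor, converted


donor = "AAAAAAAA"
acceptor = "CCCCCCCC"
donor_after, acceptor_after = gene_conversion(donor, acceptor, 2, 5)
print(donor_after)     # AAAAAAAA  (unchanged)
print(acceptor_after)  # CCAAACCC  (segment copied from the donor)
```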

One possible mechanism for gene conversion envisages formation of a heteroduplex between a DNA strand from the donor gene and a complementary strand from the acceptor gene. Following heteroduplex formation, conversion of an acceptor gene segment may occur by mismatch repair - DNA repair enzymes recognize that the two strands of the heteroduplex are not perfectly matched and 'correct' the DNA sequence of the acceptor strand to make it perfectly complementary in the converted region to the sequence of the donor gene strand (see Figure 9.10C).

Gene conversion has been well described in fungi, where all four products of meiosis can be recovered and studied (tetrad analysis). In humans and other mammals this is not possible, and so gene conversion cannot be demonstrated unambiguously (it can never be distinguished from double crossover events, for example, although double crossovers occurring in very close proximity would normally be expected to be extremely unlikely). Despite the difficulty in identifying gene conversion in complex organisms, there are numerous instances in mammalian genomes where an allele at one locus shows a pattern of mutations which strongly resembles those found in alleles at another locus of the same species. Such evidence suggests gene conversion-like exchanges between loci.

Although simple comparisons of two sequences may be suggestive, the evidence for gene conversion is most compelling when a new mutant allele can be compared directly with its progenitor sequence. Certain highly mutable loci lend themselves to this type of analysis. In particular, some hypervariable minisatellite loci have high germline mutation rates (often 1% or more per gamete) and individual repeats often show nucleotide differences so that repeat subclasses can be recognized. Germline mutations can be studied by detecting and characterizing mutant minisatellite alleles in individual gametes. To do this, PCR analysis has been conducted on multiple dilute aliquots of DNA isolated from the sperm of an individual (small pool PCR), where each aliquot is calibrated to contain a few, perhaps 100, input molecules (Jeffreys et al., 1994). The PCR products recovered from individual pools can then be typed to identify any new mutations that result in a novel allele whose length is sufficiently different as to be distinguishable from the progenitor allele. Analyses of the patterns of germline mutation at three such loci have failed to identify exchanges of flanking markers and have shown that most mutations occurring at these loci are polar, involving the preferential gain of a few repeats at one end of a tandem repeat array. There is a bias towards gain of repeats and evidence was obtained for nonreciprocal sequence exchange between alleles, suggesting interallelic gene conversion (Jeffreys et al., 1994). Evidence for interlocus gene conversion has also been obtained in human genes, notably the steroid 21-hydroxylase gene (see Section 9.5.3).
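The figures quoted above for small pool PCR (about 100 input molecules per aliquot and a mutation rate of roughly 1% per gamete) can be combined in a short back-of-envelope calculation. The Python sketch below is only illustrative: the function names are hypothetical, and it assumes that each input molecule is independently mutant with the same probability, which is a simplification rather than a statement about the published analysis.

```python
def expected_mutants_per_pool(molecules_per_pool, mutation_rate):
    """Expected number of mutant molecules in one PCR pool."""
    return molecules_per_pool * mutation_rate


def prob_pool_has_mutant(molecules_per_pool, mutation_rate):
    """Probability that a pool contains at least one mutant molecule,
    assuming independent molecules (binomial model)."""
    return 1 - (1 - mutation_rate) ** molecules_per_pool


molecules = 100   # input molecules per aliquot, as quoted in the text
rate = 0.01       # mutation rate per gamete, as quoted in the text

print(expected_mutants_per_pool(molecules, rate))       # 1.0
print(round(prob_pool_has_mutant(molecules, rate), 3))  # 0.634
```

Under these assumptions a typical pool carries about one mutant molecule, so new length alleles can be recovered from a manageable number of pools.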

Human Molecular Genetics 2
9. Instability of the human genome: mutation and DNA repair
9.3. Genetic mechanisms which result in sequence exchanges between repeats



Figure 9.9. Tandem gene duplication can result from unequal crossover or unequal sister chromatid exchange, facilitated by short interspersed repeats. The double arrow indicates the extent of the tandem gene duplication of a segment containing gene A and flanking sequences. Original mispairing of chromatids could be facilitated by a high degree of sequence homology between nonallelic short repeats (R1, R2). Note that the same mechanism can result in large-scale deletions.



Figure 9.10. Gene conversion involves a nonreciprocal sequence exchange between allelic or nonallelic genes. (A) Interallelic gene conversion. Note the nonreciprocal nature of the sequence exchange - the donor sequence is not altered but the acceptor sequence is altered by incorporating sequence copied from the donor sequence. (B) Interlocus gene conversion. This is facilitated by a high degree of sequence homology between nonallelic sequences, as in the case of tandem repeats. (C) Mismatch repair of a heteroduplex. This is one of several possible models to explain gene conversion. The model envisages invasion by one strand of the donor sequence (-) to form a heteroduplex with the complementary (+) strand of the acceptor sequence, thereby displacing the other strand of the acceptor. Mismatch repair enzymes recognize the mispaired bases in the heteroduplex and 'correct' the mismatches so that the (+) acceptor sequence is 'converted' to be perfectly complementary in sequence to the (-) donor strand. Subsequent replication of the (-) acceptor strand and sealing of nicks results in completion of the conversion.



Figure 9.5. Slipped strand mispairing during DNA replication can cause insertions or deletions. Short tandem repeats are thought to be particularly prone to slipped strand mispairing, i.e. mispairing of the complementary DNA strands of a single DNA double helix. The examples show how slipped strand mispairing can occur during replication, with the lower strand representing a parental DNA strand and the upper blue strand representing the newly synthesized complementary strand. In such cases, slippage involves a region of nonpairing (shown as a bubble) containing one or more repeats of the newly synthesized strand (backward slippage) or of the parental strand (forward slippage), causing, respectively, an insertion or a deletion on the newly synthesized strand. Note that it is conceivable that slipped strand mispairing can also cause insertions/deletions in nonreplicating DNA. In such cases, two regions of nonpairing are required, one containing repeats from one DNA strand and the other containing repeats from the complementary strand (Levinson and Gutman, 1987).


Figure 9.6. Homologous equal crossover can result in fusion genes. The example shows how intragenic equal crossover occurring between alleles on nonsister chromatids can generate novel fusion genes composed of adjacent segments from the two alleles. Note that similar exchanges between genes on sister chromatids do not result in genetic novelty because the gene sequences on the interacting sister chromatids would be expected to be identical.



Figure 9.7. Unequal crossover and unequal sister chromatid exchange cause insertions and deletions. The examples illustrate unequal pairing of chromatids within a tandemly repeated array. Unequal crossover involves unequal pairing of nonsister chromatids followed by chromatid breakage and rejoining. Unequal sister chromatid exchange involves unequal pairing of sister chromatids followed by chromatid breakage and rejoining. For the sake of simplicity, the breakages of the chromatids are shown to occur between repeats, but of course breaks can occur within repeats. Note that both types of exchange are reciprocal - one of the participating chromatids loses some DNA, while the other gains some.



Figure 9.8. Unequal crossover in a tandem repeat array can result in sequence homogenization. Note that the initial spread of the novel sequence variant to the same position in the chromosomes of other members of a sexual population can result by random genetic drift (see Box 9.1). Once the mutation has achieved a reasonable population frequency (left panel) it can spread to other positions within the array (right panel). This can occur by successive gain of mutant repeats as a result of unequal crossover (or unequal sister chromatid exchanges) and occasional loss of normal repeats. Eventually the mutant repeat can replace the original repeat sequence at all positions within the array, leading to sequence homogenization for the mutant repeat. Such sequence homogenization is thought to result in species-specific concerted evolution for repetitive DNA sequences (see Section 14.4.2). UEC, unequal crossover.


