Protein Folding


PROTEIN FOLDING

Vol. 7

PROTEIN FOLDING

Introduction

Protein folding is one of the most important processes vital to the existence of every living system. The correctly folded state of proteins is essential for the tremendous array of protein functions at all levels of complexity, from the subcellular machinery to organs and macroscopic structural elements. A very incomplete list of protein functions includes their activity as enzymes (biocatalysts), sensors of chemical and physical signals, regulators of expression of the genetic information, structural building blocks (ranging from the microscopic cytoskeleton to the macroscopic fingernails and hair), and mediators of the immunological phenomenon (as antibodies, receptors, and antigen-presenting agents).

As correct protein folding is required for normal cell function, protein misfolding has grave consequences. An incorrectly folded state of proteins is the hallmark of a large number of diseases of unrelated origin, some of which are among the most common diseases in Western societies. Such diseases include cystic fibrosis, the most common genetic disease in the Caucasian population; amyloid diseases, including Alzheimer’s disease and Type II diabetes; and infectious prion diseases such as bovine spongiform encephalopathy (“mad cow disease”). Since amyloid diseases are age-related, and the average age of the population will increase significantly in the next decades, the prevalence of such diseases is predicted to increase accordingly.

The “protein-folding problem” is the question of how and why a protein of a given amino acid sequence adopts a certain three-dimensional structure. A recent extension to the problem includes the question of how and why a protein adopts misfolded or unfolded conformations. This article describes the different hierarchies of protein folding, the classical models and current understanding of the folding process through defined pathways or funnels, the assistance of molecular chaperons in the folding process, the experimental methodologies that are being used for the study of protein folding, and the attempts and advances in the prediction of secondary and tertiary structures of proteins. Finally, the article describes recent advancements in the understanding of protein unfolding and misfolding.

Encyclopedia of Polymer Science and Technology. Copyright John Wiley & Sons, Inc. All rights reserved.

Proteins as Polymers

Proteins are generally composed of a linear combination of the 20 naturally occurring L-α-amino acids (see PROTEINS), and are thereby polyamides. The polymerization of the amino acids occurs through the formation of an amide linkage between the carboxyl group of a given amino acid and the amino group of the next amino acid, with the resultant elimination of a molecule of water. This amide linkage is more commonly called a peptide bond, and proteins are polypeptides. The molecular basis for the enormously diverse functions of proteins is the significant diversity of their building blocks. The chemical nature of the 20 natural amino acids is extremely versatile. The side chain of an amino acid can be negatively or positively charged, polar, aliphatic (branched or unbranched), or aromatic (substituted or nonsubstituted). It may include various functional groups such as thiol, amine, hydroxyl, carboxyl, phenyl, and amide groups. Furthermore, the properties of this biopolymer are a result of the amino acid sequence, that is, the linear arrangement of the building blocks, rather than the composition of the protein. Two proteins with the very same amino acid composition may be completely different in their folded state and molecular properties.

Therefore, it is clear that the polymerization of a protein must be directed by a specific molecular plan, rather than by a random polymerization process. The information for the polymerization of the linear sequence of a given protein is encoded in the DNA, which is transcribed to a messenger RNA, which is in turn translated into the protein polymer by cellular ribosomes (see POLYNUCLEOTIDES; GENETIC METHODS OF POLYMER SYNTHESIS). The length of a protein can range from tens to thousands of building blocks. Shorter polymers composed of amino acids (from 2 to around 40 amino acids long) are usually termed peptides. Some peptides are products of cleavage of longer ribosome-translated precursors. Other peptides (usually those shorter than 10 amino acids) may be synthesized by cellular enzymes, each devoted to the synthesis of one specific peptide. In many cases, those short peptides may be composed of amino acids that are different from the 20 naturally occurring amino acids, such as nonnatural amino acids or D-α-amino acids. Moreover, ribosome-synthesized proteins may have building blocks that are different from the 20 natural amino acids. This is due to post-translational modifications (such as the enzymatic formation of hydroxyproline by the enzyme prolyl hydroxylase).

The incredible degree of diversity presented by proteins can be appreciated by calculating the number of linear combinations available even for very short proteins. For a very small protein of 50 amino acids, the number of linear combinations of the 20 natural amino acids is about 10⁶⁵. For a more typical-sized protein of 200 amino acids, the number of linear combinations is more than 10²⁶⁰. All these calculations are done without taking into account any post-translational modifications. This immense magnitude of structural diversity is the core of the central role of proteins in all living systems.
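The two estimates above follow directly from raising 20 to the power of the chain length. A minimal Python sketch of that arithmetic is given here; the function names are illustrative and not part of the original article.

```python
AMINO_ACIDS = 20  # the 20 natural amino acids referred to in the text


def sequence_count(length: int) -> int:
    """Number of distinct linear sequences of the given length."""
    return AMINO_ACIDS ** length


def order_of_magnitude(n: int) -> int:
    """Largest k such that 10**k <= n, for a positive integer n."""
    return len(str(n)) - 1


for length in (50, 200):
    exponent = order_of_magnitude(sequence_count(length))
    print(f"{length} residues: about 10^{exponent} sequences")
```

Python integers have arbitrary precision, so the counts are exact and only the printed exponent is rounded down.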

The Structures of Proteins

The structures of proteins are usually described in four levels of organization:
primary, secondary, tertiary, and quaternary. In the context of the protein-folding
question the secondary and tertiary structures are the most important.

Primary Structure.

This is the linear amino acid sequence of the polypeptide chain as described above. Primary structure is usually denoted by either one- or three-letter conventional codes. Protein sequencing was pioneered by Frederick Sanger, who received his first Nobel Prize in Chemistry in 1958 for this work. (His second Nobel Prize in 1980 was for the development of DNA-sequencing techniques.) For many years, the primary structure of proteins was determined by N-terminal sequencing based on the Edman degradation reaction, which removes the N-terminal amino acid of the protein chain, leaving a new N-terminus on the chain. The identity of the removed amino acid is then determined by HPLC analysis. In the last few years, there has been an increased tendency to use mass spectrometry techniques to determine the sequence of proteins. These sequencing procedures are based on tandem mass spectrometry (MS/MS), in which proteins, or more commonly protein fragments, are further fragmented in the mass spectrometer by collision with gas molecules. This results in an ensemble of many molecules with defined differences in mass. The high accuracy of the mass spectrometers and the ability to specifically select a precursor ion (through an ion trap or a quadrupole) allow the determination of the amino acid sequence of the chain. Only two amino acids cannot be distinguished using conventional MS techniques: leucine and isoleucine, which have the same molecular mass. However, recent techniques such as TOF mass spectrometry identify side-chain fragmentations that allow the selective identification of leucine and isoleucine. Mass spectrometry (qv) is also a very powerful method for the determination of post-translational modifications.

The Secondary Structure.

The arrangement of the amino acids in a protein into local structural elements is termed secondary structure. The two main structural elements that are seen in proteins are the α-helix and β-sheet structures. The α-helical structure, which was originally suggested by the Nobel Laureate Linus Pauling and Robert Corey (1), can be described as a spring coiled right-handed about an imaginary cylinder. The size of a helix can range from 5 to 50 amino acids. The helical structure has 3.6 residues per turn, and the main forces that stabilize the structure are hydrogen bonds between amide hydrogens of peptide bonds and carbonyl oxygens of residues at the next turn of the helix. The β-sheet structure is formed by the stacking of individual β-strands. The β-strands are usually from 5 to 15 residues long and are in a fully extended conformation. This secondary structure is also stabilized by hydrogen bonding between amide hydrogens and carbonyl oxygens of stacked chains. The β-strands can be arranged either in a parallel manner (the amino acid sequence of each strand runs in the same direction) or in an antiparallel manner, which is more stable energetically. Other secondary structure elements are β-turns, short turns stabilized by a specific pattern of hydrogen bonds, and loops, which are flexible linkers that connect secondary structure elements.
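As a rough illustration of the helix geometry described above, the following Python sketch counts turns from the figure of 3.6 residues per turn and lists backbone hydrogen-bond partners. The i to i+4 pairing used here (carbonyl oxygen of residue i with the amide hydrogen of residue i+4) is the standard textbook pattern for the α-helix and is an addition, not a value stated in this article; the function names are illustrative.

```python
RESIDUES_PER_TURN = 3.6  # value quoted in the text


def helix_turns(n_residues: int) -> float:
    """Number of helical turns spanned by n_residues."""
    return n_residues / RESIDUES_PER_TURN


def backbone_hbond_pairs(n_residues: int, offset: int = 4) -> list[tuple[int, int]]:
    """(acceptor, donor) residue numbers, 1-based.

    Pairs the carbonyl oxygen of residue i with the amide hydrogen of
    residue i + offset. offset=4 is an assumed textbook value.
    """
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]


print(helix_turns(18))
print(backbone_hbond_pairs(8))
```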

Tertiary Structure.

This is the three-dimensional structure of a protein. The tertiary structure is made up from the interaction of the secondary structure elements to form the overall folding pattern of the polypeptide chain. The tertiary structure of a protein is formally described by the coordinates in space of all (or most) atoms of the protein molecule [see description below of the protein databank (PDB)]. The tertiary structure is usually quite compact and often globular, but can also be elongated or possess other geometries. As with secondary structure elements, in spite of the fact that the number of possible folds (ie, the specific arrangement of secondary structure elements) is almost infinite, there is a basic set of a few thousand unique folds. This observation has a crucial role in our ability to predict the tertiary structure of proteins through fold-recognition methods as described below. The main driving force for formation of three-dimensional structures of proteins appears to be hydrophobic interactions (2–5). The arrangement of the protein in a way that minimizes the energetically unfavorable orientations (ie, hydrophobic moieties on the surface of the protein and hydrophilic moieties buried inside the core) appears to provide an overall structural pattern for the folded state. Interestingly, binary models of proteins, which present the protein as a linear arrangement of hydrophobic and hydrophilic elements, are quite successful in low resolution prediction of the three-dimensional structure of proteins and folding pathways (3–5). However, it is very clear that the fine structure of a protein is a result of much more complex interactions. Even the hydrophobic core cannot be regarded purely as an unstructured entity, and there is certainly a core-packing process that may involve more specific interactions (eg, stacking of aromatic residues in a specific orientation within the hydrophobic core). Other elements like salt bridges, disulfide bonds, and hydrogen bonds are also very important in the stabilization of the folded proteins and fine-tuning of the folded state.
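The binary (hydrophobic/hydrophilic) view mentioned above can be made concrete with a toy scoring function. The sketch below assumes a two-dimensional square lattice and scores a conformation by counting contacts between hydrophobic residues that are lattice neighbors but not adjacent in the chain. The lattice, the scoring rule, and all names are simplifying assumptions chosen for illustration and are not taken from the cited studies.

```python
def hh_contacts(sequence: str, coords: list[tuple[int, int]]) -> int:
    """Count hydrophobic-hydrophobic contacts of a chain on a square lattice.

    sequence: string of 'H' (hydrophobic) and 'P' (polar) characters.
    coords:   one (x, y) lattice site per residue, in chain order.
    Chain connectivity is not validated here; only overlaps are rejected.
    """
    if len(sequence) != len(coords):
        raise ValueError("one lattice site is needed per residue")
    if len(set(coords)) != len(coords):
        raise ValueError("two residues occupy the same lattice site")

    index_at = {site: i for i, site in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each pair is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbor)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts


compact = [(0, 0), (1, 0), (1, 1), (0, 1)]   # chain folded into a square
extended = [(0, 0), (1, 0), (2, 0), (3, 0)]  # fully stretched chain
print(hh_contacts("HPPH", compact), hh_contacts("HPPH", extended))
```

In such a model a lower energy corresponds to more contacts, so the compact conformation is favored over the extended one for this sequence.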

Figure 1 shows the primary, secondary, and tertiary structures of the bovine pancreatic trypsin inhibitor (BPTI). This small protein is widely used as a model for folding studies because of its small size (58 amino acids), the fact that it contains both α-helical and β-sheet structural elements, and its commercial availability.

Quaternary Structure.

This is the spatial arrangement of noncovalently linked protein subunits to form a functional protein assembly. One of the best-known examples of a quaternary structure is the assembly of functional hemoglobin, which is made of four noncovalently linked subunits. Extensive discussion of the quaternary structure is beyond the scope of this article.

Fig. 1. Levels of structural organization of the BPTI protein. (a) Primary structure, the amino acid sequence of the protein: NH2-RPDFCLEPPYTGPCKARIIRYFYNAKAGLCQTFVYGGCRAKRNNFKSAEDCMRTCGGA-COOH. (b) Secondary structure; α-helices are denoted by shaded boxes and β-strands are denoted by arrows. Secondary structure elements are presented according to their determination as they appear in the file containing the BPTI coordinates (1BPI) as deposited in the protein databank (PDB). (c) Tertiary structure; wireframe chemical structure (right) and schematic strand representation (left). The tertiary structures were prepared using RasMol software with BPTI coordinates.

The Thermodynamic Hypothesis

A guiding principle in the study of protein folding is the “thermodynamic hypothesis” (6), established by Christian B. Anfinsen in the 1950s and early 1960s. The thermodynamic hypothesis suggests that a particular three-dimensional structure of a protein occurs because this molecular arrangement is the most stable thermodynamically. The development of the hypothesis was based on Anfinsen’s experiments with the ribonuclease (RNase) protein. RNase is a relatively small protein of 124 amino acids that contains four disulfide bridges. Therefore, there are 105 (7 × 5 × 3 × 1) possible combinations for the arrangement of the disulfide bonds in the folded protein. However, only one arrangement is enzymatically active. Anfinsen reduced the disulfide bridges of RNase and unfolded the protein under extreme chemical conditions using high concentrations of urea. Under these conditions the protein was completely unfolded and could be regarded as a random polymer. However, when the urea was removed, followed by oxidation of the disulfide bridges, the protein spontaneously refolded back into its original form. On the other hand, when the protein was oxidized in the presence of a high concentration of urea (that was only later removed), only about 1% activity was gained. This is consistent with a random formation of the 105 possible “scrambled” RNase structures, only one of which has enzymatic activity. The Anfinsen hypothesis could be best summarized in his own words from his 1972 Nobel Prize acceptance speech, “The native conformation is determined by the totality of interatomic interactions and hence by the amino acid sequence, in a given environment.”
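The count of 105 can be checked by enumerating every way of pairing eight cysteines into four disulfide bridges. The following Python sketch does this by brute force; the cysteines are simply labelled 1 to 8, which are placeholder labels rather than residue positions in RNase.

```python
def pairings(items):
    """Yield every way to split an even-sized list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


cysteines = list(range(1, 9))  # placeholder labels for the eight cysteines
all_pairings = list(pairings(cysteines))

print(len(all_pairings))        # 7 * 5 * 3 * 1 arrangements
print(1 / len(all_pairings))    # chance that random pairing is the native one
```

The first cysteine has 7 possible partners, the next unpaired one has 5, and so on, which is where the product 7 × 5 × 3 × 1 comes from. The reciprocal, a little under 0.01, matches the roughly 1% activity recovered after random oxidation.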

Chaperon-Assisted Protein Folding.

The Anfinsen theory was a very clear turning point in our understanding of protein folding, and it appears to be valid in the typical time frame of the folding process (milliseconds to seconds) for most studied proteins. However, more recent studies have clearly demonstrated that in the cellular environment, the folding process may be assisted by, or may require the help of, so-called molecular chaperons (7–9). The chaperons, which are large protein assemblies, are associated with the target protein during part of the folding process. However, once folding is complete (or even before), the correctly folded protein is released from the chaperon assembly into the cellular environment. The physiological role of the molecular chaperons is not fully understood. Chaperons are found in all living organisms including bacteria, but they are mainly expressed under conditions of stress (such as heat shock). It is widely speculated that the main role of these molecular assemblies is to avoid protein aggregation. They also seem to serve as “quality control” machinery under normal growth conditions. Nevertheless, for most cellular proteins the “thermodynamic hypothesis” seems to hold during their initial process of folding upon synthesis by the ribosome.

Levinthal’s Paradox

The thermodynamic hypothesis describes the energetic end point of the folding process, but it does not deal with the pathway by which the molecule reaches its global energetic minimum. Protein folding usually takes on the order of milliseconds to seconds (10). However, the number of possible orientations of the dihedral angles of the various peptide bonds that compose the protein is astronomical. Assuming 10 possible conformations for each peptide bond, even a very short protein of 40 amino acids has about 10⁴⁰ possible conformations. The apparent contradiction between the number of possible conformations and the fast folding rate of proteins is known as “Levinthal’s paradox” (11). Levinthal argued that, given an average rotation frequency around each bond, a protein can sample about 10¹⁴ structures per second. Therefore, it would take a 40-amino-acid protein 10²⁶ seconds, or about 10¹⁸ years, to examine all the possible conformations. This time scale is much longer than the present age of the universe. The clear conclusion from Levinthal’s paradox is that folding cannot occur by sampling the entire conformational space and that there must be another way to reach the folded state of low energy. Indeed, Cyrus Levinthal suggested in the late 1960s, on the basis of his theoretical argument, that there are specific folding pathways.
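Levinthal’s estimate can be reproduced with a few lines of arithmetic. The Python sketch below uses the numbers quoted above (10 conformations per residue, 40 residues, 10¹⁴ structures sampled per second); the function name and the choice of 365.25 days per year are illustrative assumptions.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # roughly 3.16e7 seconds


def exhaustive_search_time(n_residues: int,
                           states_per_residue: int = 10,
                           samples_per_second: float = 1e14):
    """Return (seconds, years) needed to visit every conformation once."""
    conformations = states_per_residue ** n_residues
    seconds = conformations / samples_per_second
    return seconds, seconds / SECONDS_PER_YEAR


seconds, years = exhaustive_search_time(40)
print(f"{seconds:.1e} s, {years:.1e} years")
```

The result is about 10²⁶ seconds, a few times 10¹⁸ years, which is the order of magnitude given in the text.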

Folding Pathways

It took several years after Levinthal’s and others’ theoretical arguments until folding pathways could actually be demonstrated. The very speed of the folding reaction, which had been highly instrumental in the conceptual realization of the folding-pathway notion, made the observation of folding intermediates quite difficult. The first experimental proof for the existence of specific folding pathways came from experiments done by Thomas Creighton in the mid-1970s (12,13) on the folding of bovine pancreatic trypsin inhibitor (BPTI), the protein that is presented in Figure 1. BPTI has three disulfide bridges, and Creighton was able to trap folding intermediates of BPTI with one or two disulfide bridges formed, by irreversibly blocking all the free thiol groups at different folding stages. Analysis of the nature of the intermediates allowed the construction of a folding pathway of BPTI. Later experiments by Weissman and Kim (14), using different methodologies, suggested that the nature of the folding intermediates might be different from that originally described. However, there was agreement on the concept of intermediate species in the pathway from the unfolded to the folded state.

Fig. 2. Energy of folding reaction. A. Funnel view of the energetic landscape. The trajectory of the protein along the energy landscape is a probabilistic statistical mechanics function; B. A pathway view of the folding process. The folding process is represented as a chemical reaction. The reaction coordinate is deterministic and intermediates along the pathway are well-defined.

In more recent years, other groups have suggested that the pathway toward the minimum of free energy might be less deterministic than previously assumed (15–20). In this more modern point of view, the folding process is seen as a multiplicity of routes down a folding funnel rather than a distinct pathway of discrete intermediates. This view is based on statistical mechanics rather than a chemical reaction view of the folding process (Fig. 2). This approach suggests that partially folded species are distributed over the energetic landscape of the folding reaction according to their probability of occupation at a finite temperature, which is proportional to the Boltzmann factor. According to this view, the molecular dynamics within a folding funnel involves the progressive formation of an ensemble of partially ordered structures. The practical meaning of this approach is that the conformational state of an individual molecule during the folding process cannot be known with certainty. Figure 2 presents a schematic view of the energetic landscape funnel and the folding pathway models.
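The statement that occupation probabilities are proportional to the Boltzmann factor can be illustrated numerically. The sketch below normalizes exp(−E/RT) over a small set of states. The three energies are hypothetical values chosen only for illustration, and the function name and default temperature are assumptions.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


def boltzmann_populations(energies_kj_per_mol, temperature_k=298.15):
    """Fractional occupation of each state, proportional to exp(-E/RT)."""
    rt = R_KJ * temperature_k
    lowest = min(energies_kj_per_mol)
    # Subtracting the lowest energy avoids overflow and does not change ratios.
    weights = [math.exp(-(e - lowest) / rt) for e in energies_kj_per_mol]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical free energies (kJ/mol): folded, intermediate, unfolded.
populations = boltzmann_populations([0.0, 10.0, 20.0])
print([round(p, 4) for p in populations])
```

Raising the temperature flattens the distribution, so higher-energy, partially folded states become more populated.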

The funnel approach does not contradict Levinthal’s argument that a given protein does not sample the entire energy landscape of its various conformations. However, although only a small part of the energy landscape is sampled, the exact path is the result of a probability function rather than a deterministic function. Furthermore, since a typical protein-folding experiment involves a very large number of molecules, there is no way to predict the exact conformation of each molecule, and the molecules can only be represented as a probability function. Very recent studies that use single-molecule techniques should help further explore the energy landscape of protein folding (21).

Models for the Order of Events in the Folding Process

Regardless of whether the folding process is viewed as discrete steps with defined intermediates or as an energetic funnel, one of the interesting questions that still remains open is the order of events during folding. The three major hypotheses for the order of events are (1) the framework model (22–24), (2) the nucleation-growth mechanism (25), and (3) the hydrophobic collapse model (3,4).

The framework model suggests that the first stage in the folding reaction involves the formation of elements of secondary structure without any significant compaction. These secondary structure elements are then assembled to form the compact and folded final structure of the protein. The nucleation-growth mechanism suggests that the first stage in the formation of the folded state involves the formation of a small nucleus that is well-folded and compact. This nucleus involves only a small fraction of the protein polypeptide chain. The formation of the nucleus is followed by hierarchical assembly of further structural elements to form the well-ordered three-dimensional structure. The hydrophobic collapse model suggests that the first stage of the folding process involves the rapid collapse of the protein chain into a compact conformation that does not have well-ordered secondary structures. The secondary structure elements then grow in the general framework of the collapsed structure. There is no clear indication which of the models correctly describes the folding mechanism. It may be that different protein molecules fold in different ways and/or that the actual sequence of events may be different from, or a combination of, the three models. The statistical mechanics model may also suggest that all three models are actually different discrete trajectories on the energy landscape and that each one of them can occur even within the same folding reaction of an ensemble of protein molecules.

Experimental Methodologies

As described above, the folding reaction is a very rapid event (10). Therefore, in order to follow the kinetics of folding and the sequence of events, there is a need for rapid stopped-flow techniques. The properties that are usually monitored using these techniques are either the fluorescence of aromatic residues or circular dichroism spectra. The fluorescence of the aromatic residues (especially of tryptophan) is largely dependent on the dielectric constant of its environment. Furthermore, aromatic residues tend to be part of the hydrophobic core of the protein. Therefore, monitoring the change of fluorescence during folding and unfolding reactions allows insight into the reaction mechanism. Circular dichroism (CD) spectra of proteins provide direct information on the secondary structure of proteins. The CD spectrum of a protein in the far-UV region (180–250 nm) shows a clear indication of secondary structure (especially of α-helical structure). Thus it allows the direct observation of the formation of secondary structures. These two techniques can be used to study not only the kinetics of protein folding but also the thermodynamics. By steady-state determination of the fraction of folded protein upon titration with a denaturing agent such as urea, the change in free energy (ΔG) of the folding reaction can be calculated.
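One common way to carry out the calculation mentioned above is to assume a two-state equilibrium between unfolded and folded protein. Under that assumption, which is not stated in the article, the equilibrium constant is K = f/(1 − f) for a folded fraction f, and ΔG = −RT ln K. The sketch below applies this at a single denaturant concentration; the function name and default temperature are illustrative.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def folding_free_energy(fraction_folded: float,
                        temperature_k: float = 298.15) -> float:
    """Free energy of folding (J/mol) for an assumed two-state U <-> F equilibrium.

    Negative values mean the folded state is favored.
    """
    if not 0.0 < fraction_folded < 1.0:
        raise ValueError("fraction_folded must lie strictly between 0 and 1")
    k_fold = fraction_folded / (1.0 - fraction_folded)
    return -R * temperature_k * math.log(k_fold)


for f in (0.1, 0.5, 0.9):
    print(f, round(folding_free_energy(f) / 1000.0, 2), "kJ/mol")
```

At the midpoint of the titration, where half of the protein is folded, K equals 1 and ΔG is zero. The estimate is only meaningful in the transition region, where both states are measurably populated.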

Fourier transform infrared (FTIR) spectroscopy is another excellent method to study protein folding. Unlike the well-known use of FTIR as a method for the identification of functional groups, in terms of protein structure this method allows the determination of secondary structure. The frequency of vibration of the amide I band of the peptide chain (1600–1700 cm⁻¹) heavily depends on the structure of the protein. FTIR has the advantage of being more sensitive than CD for the study of proteins that contain β-sheet elements. Furthermore, since FTIR spectroscopy can also be applied to solids, it allows the structural analysis of aggregated protein deposits. The availability of the rapid step-scan method for FTIR is also very useful for the study of rapid folding reactions (see VIBRATIONAL SPECTROSCOPY).

The Protein Databank (PDB).

One of the most instructive tools available for researchers studying both basic and applied aspects of protein folding is the protein databank (PDB) (26,27). Databank data are freely available via the Internet (http://www.pdb.org) and contain the coordinates of protein structures as determined by X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. The rate of structure submission to the PDB has increased steadily over the years (Fig. 3). By the end of 2001 there were nearly 17,000 deposited structures in the PDB. The PDB was established at the Brookhaven National Laboratory (BNL) in the 1970s and maintained there for many years. Several years ago, the maintenance and development of the PDB was transferred to Rutgers, The State University of New Jersey; the San Diego Supercomputer Center at the University of California, San Diego; and the National Institute of Standards and Technology, three members of the Research Collaboratory for Structural Bioinformatics (RCSB).

Prediction of Protein Folding

The importance of the correct protein fold for the understanding of protein activity
and design of novel proteins has led to much interest in the possibility of predicting
correct protein folds. In spite of the great importance and interest in the prediction
of the three-dimensional structure of proteins from the primary structure there are
no such algorithms available to date. However, the attempts to predict secondary
structures of proteins have been much more (although not fully) successful.

Fig. 3. New structures versus new folds in the PDB, by year (panels: number of new structures at the PDB, 1972–2000; number of new folds at the PDB, 1980–2000). There is a constant increase in the number of new structures submitted to the PDB. The number of new folds shows significantly different behavior, with a sharp decrease in 2001. The data was taken from the official PDB statistics.

One of the earliest and most pivotal attempts to predict the secondary structure of proteins was made by Peter Chou and Gerald Fasman (28). The Chou–Fasman scale for secondary structure prediction is based on the statistical occurrences of amino acids in different secondary structure elements. Their classification was based on the structures of proteins as deposited in the PDB, taking into account the relative occurrence of the different amino acids in various proteins (which can be quite diverse). The various amino acids were classified according to their ability to form or break the two major secondary structures. The categories were strong former (H), former (h), weak former (I), indifferent former (i), breaker (b), and strong breaker (B) of the specific secondary structure elements. Proteins are scanned through a fixed-size window and each area of the polymer is scaled for its tendency to form the various secondary structures. Further developments of the Chou–Fasman method, such as the GOR method (29), improved the prediction potential by taking into account not only the identity of the specific amino acid but also their context (ie, specific secondary structure patterns). Another approach for the prediction of secondary structures is based on neural network analysis. According to this method, computational neural networks are trained on sequences of known secondary structure. One very successful example of the neural network method is the secondary structure prediction tool available through a mail server called PHD (30).
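The fixed-size window scan described for the Chou–Fasman approach can be sketched as a moving average of per-residue propensities. In the Python sketch below, the propensity table holds made-up placeholder numbers (greater than 1 for formers, less than 1 for breakers), not the published Chou–Fasman parameters. The window size, threshold, and function names are likewise assumptions for illustration, and the nucleation and extension rules of the real method are omitted.

```python
# Placeholder helix propensities for illustration only; these are NOT the
# published Chou-Fasman values.
PLACEHOLDER_HELIX_PROPENSITY = {"A": 1.5, "E": 1.5, "L": 1.5, "G": 0.5, "P": 0.5}
DEFAULT_PROPENSITY = 1.0  # assumed value for residues missing from the table


def window_scores(sequence: str, window: int = 6,
                  table=PLACEHOLDER_HELIX_PROPENSITY) -> list[float]:
    """Average propensity of every full window along the sequence."""
    values = [table.get(residue, DEFAULT_PROPENSITY) for residue in sequence]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def helix_prone_windows(sequence: str, window: int = 6,
                        threshold: float = 1.05) -> list[int]:
    """Start positions (0-based) of windows whose average exceeds the threshold."""
    return [i for i, score in enumerate(window_scores(sequence, window))
            if score > threshold]


print(helix_prone_windows("AAAAAAGGGGGG"))
```

With real parameters the same scan would be run separately for each secondary structure type and the competing assignments reconciled afterward.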

Prediction of the three-dimensional structures of proteins is much more complex. The success rate of this process is formally assessed by a contest known as the Critical Assessment of Techniques for Protein Structure Prediction (CASP). The CASP contest has shown a significant advance over the years in the ability to predict three-dimensional structures of proteins by homology modeling and fold recognition, as described below, but ab initio methods are still not very accurate. One of the major problems is finding a way to distinguish between the global minimum of free energy of a given protein molecule and local energetic minima. Most of the successful models of protein structures are based on homology modeling. Such methods start by forcing the target sequence into the fold of its closest relative with a known folded state, followed by energy minimization of the structure using force fields. SWISS-MODEL, an automated comparative protein modeling server, is freely available online (http://www.expasy.ch/swissmod/SWISS-MODEL.html). More advanced homology-modeling software is available commercially.

Another successful method, which is based on a reverse approach, relies on protein fold recognition by “threading” (31). This method takes advantage of the fact that the number of folds is limited. Instead of trying to predict the fold of the protein on the basis of free energy, the method determines whether a given sequence can be fitted to a known fold. As the number of folds (Fig. 3) appears to be finite, this provides a way to look for the structure of new proteins using known folds. Another method for the prediction of three-dimensional structures of proteins is based on the in silico assembly of protein structure from shorter structural elements with more defined structures (32).

Protein Unfolding and Misfolding

Interest in protein unfolding and misfolding has increased in recent years (33–37). For example, there is a clear realization that small but significant parts of protein molecules may be natively unfolded (38). Furthermore, the formation of misfolded protein aggregates is the hallmark of various unrelated diseases and therefore attracts much medical attention. Other research activities are directed toward the search for the physiological significance of the unfolding and misfolding phenomena. Indeed, recent studies have indicated that protein unfolding and misfolding might also have a physiological role. One instance is that of unstable and short-lived proteins, whose instability is an integral part of a regulatory mechanism, as in the case of toxin–antitoxin systems (38). Another recent example is the formation of curli amyloid fibrils in Escherichia coli, which is related to the formation of biofilms (39).

The formation of amyloid fibrils is probably one of the most important cases of
protein misfolding. In this self-assembly process, soluble cellular proteins
form large and ordered fibrillar structures. This process is also accompanied
by a structural transition of the aggregated proteins from their native fold
into a predominantly β-sheet secondary structure. The fibrillar structures are
well ordered in the long axis direction, and X-ray fiber diffraction shows a
clear 0.48-nm reflection on the meridian. This reflection corresponds to the
hydrogen bonding distance between β-strands. This is consistent with the
secondary-structure transition toward a β-sheet structure upon amyloid fibril
formation, as observed using CD and FTIR spectroscopy. There is also a
correlation between the unfolded and misfolded states of proteins. Two
amyloid-forming proteins, the diabetes-related islet amyloid polypeptide (IAPP)
and the Parkinson’s disease–related α-synuclein polypeptide, have clearly been
shown to be natively unfolded.
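
The 0.48-nm meridional reflection can be related to a scattering angle through
Bragg's law, nλ = 2d sin θ. The short calculation below is a sketch under one
stated assumption: the text does not give the X-ray wavelength, so Cu Kα
radiation (λ = 0.15418 nm), a common laboratory source, is assumed here. The
function name is likewise only illustrative.

```python
import math

# Assumption: Cu K-alpha radiation; the article does not state the wavelength.
CU_K_ALPHA_NM = 0.15418


def bragg_angle_deg(d_nm, wavelength_nm, order=1):
    """Return the Bragg angle theta (degrees) for a lattice spacing d."""
    s = order * wavelength_nm / (2.0 * d_nm)
    if not 0.0 < s <= 1.0:
        raise ValueError("no diffraction possible for this spacing and wavelength")
    return math.degrees(math.asin(s))


# Inter-strand hydrogen-bonding distance quoted in the text.
theta = bragg_angle_deg(0.48, CU_K_ALPHA_NM)
print(f"theta = {theta:.2f} deg, 2*theta = {2 * theta:.2f} deg")
```

Under this assumption the reflection falls at a Bragg angle of a little more
than 9 degrees. A different source wavelength shifts the angle but not the
0.48-nm spacing itself.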

The “Correctly Folded State” as a Metastable State

The folded state of a protein, as determined by X-ray crystallography or NMR and
deposited in the PDB, is considered the state of lowest free energy for a given
isolated protein molecule. However, recent studies have raised doubt about the
validity of this assumption for concentrated ensembles of proteins in aqueous
solution. Many (if not all) proteins may undergo aggregation and misfolding
given infinite time (35). For example, as described above, amyloid-related
proteins tend to aggregate spontaneously in solution. It is therefore assumed
that the nonaggregated state of polypeptides such as IAPP is a kinetically
trapped metastable state [“kinetic solubility,” as coined by Jarrett and
Lansbury (33)]. In recent years, however, it has been realized that
disease-related proteins are not the only ones to aggregate in solution.
Disease-unrelated proteins, such as the SH3 domain, myoglobin, and a bacterial
cold shock protein, have also been shown to form aggregated structures in
solution. Aggregation is also noted in many protein preparations after long
storage. This leads to the suggestion that the aggregated form, and most notably
the amyloid form, may represent a generic form of proteins. Therefore, if the
ensemble of proteins in aqueous solution is considered as one large
thermodynamic system, rather than each protein molecule being regarded as an
independent thermodynamic system, a solution of aggregation-prone but
“correctly folded” proteins represents a system in a transient state that will
eventually reach its global free-energy minimum, the aggregated state (35). It
appears that many (or perhaps all) proteins will sooner or later undergo
aggregation. For some proteins this may take minutes or hours, and for others
months or years. For example, the fibrillation time of the aggregation-prone
human calcitonin at physiological pH is 5 min at a concentration of 5 mg/mL,
whereas the fibrillation time of the non–aggregation-prone salmon calcitonin
under the same conditions is about 7 months (40); even so, after this long time
it does form aggregated fibrillar structures.
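
The difference between the two calcitonin fibrillation times can be put on an
energy scale with a rough estimate. The sketch below is not taken from
reference 40. It assumes that fibrillation is limited by a single activated
step with the same pre-exponential factor for both proteins, so that the ratio
of times equals exp(ΔΔG‡/RT). It also assumes a temperature of 310 K and a
30-day month, neither of which is stated in the text.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3          # gas constant in kcal/(mol K)
MINUTES_PER_MONTH = 30 * 24 * 60     # assumption: 30-day month


def barrier_difference_kcal(t_slow_min, t_fast_min, temperature_k=310.0):
    """Apparent difference in activation free energy implied by two time scales.

    Assumes one rate-limiting step and equal pre-exponential factors,
    so that t_slow / t_fast = exp(ddG / RT).
    """
    return R_KCAL_PER_MOL_K * temperature_k * math.log(t_slow_min / t_fast_min)


t_human_min = 5.0                        # human calcitonin, from the text
t_salmon_min = 7 * MINUTES_PER_MONTH     # salmon calcitonin, about 7 months

ratio = t_salmon_min / t_human_min
ddg = barrier_difference_kcal(t_salmon_min, t_human_min)
print(f"time ratio = {ratio:.0f}, apparent barrier difference = {ddg:.1f} kcal/mol")
```

Under these assumptions a slowdown of more than four orders of magnitude
corresponds to an apparent barrier difference of only a few kcal/mol, which
illustrates how a modest kinetic barrier can keep a solution in the metastable
nonaggregated state for months.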

BIBLIOGRAPHY

1. L. Pauling and R. B. Corey, Proc. Natl. Acad. Sci. U.S.A. 37, 252 (1951).
2. G. D. Rose and R. Wolfenden, Annu. Rev. Biophys. Biomol. Struct. 22, 381 (1993).
3. K. A. Dill, Biochemistry 24, 1501 (1985).
4. K. A. Dill, Biochemistry 29, 7133 (1990).
5. K. A. Dill, Science 250, 297 (1990).
6. C. B. Anfinsen, Science 181, 223 (1973).
7. F. U. Hartl and M. Hayer-Hartl, Science 295, 1852 (2002).
8. R. J. Ellis, Curr. Biol. 11, R1038 (2001).
9. T. K. Chaudhuri, G. W. Farr, F. A. Fenton, S. Rospert, and A. L. Horwich, Cell 107, 235 (2001).
10. J. K. Myers and T. G. Oas, Annu. Rev. Biochem. 71, 783 (2002).
11. C. Levinthal, J. Chim. Phys. 65, 44 (1968).
12. T. E. Creighton, J. Mol. Biol. 87, 563 (1974).
13. T. E. Creighton, J. Mol. Biol. 87, 579 (1974).
14. J. S. Weissman and P. S. Kim, Science 256, 112 (1992).
15. P. E. Leopold, M. Montal, and J. N. Onuchic, Proc. Natl. Acad. Sci. U.S.A. 89, 8721 (1992).
16. P. G. Wolynes, J. N. Onuchic, and D. Thirumalai, Science 267, 1619 (1995).
17. P. G. Wolynes, Proc. Natl. Acad. Sci. U.S.A. 93, 14249 (1996).

18. K. A. Dill, K. M. Fiebig, and H. S. Chan, Proc. Natl. Acad. Sci. U.S.A. 90, 1942 (1993).
19. E. M. Boczko and C. L. Brooks, Science 269, 393 (1995).
20. J. N. Onuchic, P. G. Wolynes, Z. Luthey-Schulten, and N. D. Socci, Proc. Natl. Acad. Sci. U.S.A. 92, 3626 (1995).
21. A. R. Fersht and V. Daggett, Cell 108, 573 (2002).
22. O. B. Ptitsyn and A. A. Rashin, Biophys. Chem. 3, 1 (1975).
23. P. S. Kim and R. L. Baldwin, Annu. Rev. Biochem. 51, 459 (1982).
24. P. S. Kim and R. L. Baldwin, Annu. Rev. Biochem. 59, 631 (1990).
25. A. R. Fersht, Curr. Opin. Struct. Biol. 7, 3 (1997).
26. F. C. Bernstein, T. F. Koetzle, G. J. B. Williams, E. F. Meyer Jr., M. D. Brice, J. R. Rodgers, O. Kennard, T. Shimanouchi, and M. Tasumi, J. Mol. Biol. 112, 535 (1977).
27. H. M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov, and P. E. Bourne, Nucleic Acids Res. 28, 235 (2000).
28. P. Y. Chou and G. Fasman, Biochemistry 13, 222 (1974).
29. J. Garnier, D. J. Osguthorpe, and B. Robson, J. Mol. Biol. 120, 97 (1978).
30. B. Rost and C. Sander, Proc. Natl. Acad. Sci. U.S.A. 90, 7558 (1993).
31. D. Fischer, D. Rice, J. U. Bowie, and D. Eisenberg, FASEB J. 10, 126 (1996).
32. C. D. Tsai, B. Ma, S. Kumar, H. Wolfson, and R. Nussinov, Crit. Rev. Biochem. Mol. Biol. 36, 399 (2001).
33. J. T. Jarrett and P. T. Lansbury Jr., Cell 73, 1055 (1993).
34. C. M. Dobson, Trends Biochem. Sci. 24, 329 (1999).
35. E. Gazit, Angew. Chem. Int. Ed. 41, 257 (2002).
36. E. Gazit, FASEB J. 16, 77 (2002).
37. V. N. Uversky, Protein Sci. 11, 739 (2002).
38. E. Gazit and R. T. Sauer, J. Biol. Chem. 274, 2652 (1999).
39. M. R. Chapman, L. S. Robinson, J. S. Pinkner, R. Roth, J. Heuser, M. Hammar, S. Normark, and S. J. Hultgren, Science 295, 851 (2002).
40. K. Kanaori and A. Y. Nosaka, Biochemistry 34, 12138 (1995).

EHUD GAZIT
VERONICA GLATTAUER
JEROME A. WERKMEISTER

Tel Aviv University

PSA. See PRESSURE SENSITIVE ADHESIVES.

PULTRUSION. See COMPOSITES, FABRICATION.

PVC. See VINYL CHLORIDE POLYMERS.

PVDC. See VINYLIDENE CHLORIDE POLYMERS.

PVF. See VINYLCARBAZOLE POLYMERS.

PVK. See VINYLCARBAZOLE POLYMERS.

PVP. See VINYL AMIDE POLYMERS.

