Introduction to the philosophy of science
by Lyle Zynda
Lecture 1
2/1/94
Introduction
Philosophy of science is part of a range of sub-disciplines known as "philosophy of X"
(where X may be filled in with art, history, law, literature, or the various special sciences
such as physics). Each of the activities for which there is a "philosophy of X" is an
investigation into a certain part of the world or a particular type of human activity. What
we will talk about today, in very general terms, is what distinguishes philosophy of X
from sociology, history, or psychology of X. These approaches to science cannot be
sharply demarcated, though many people have thought that they can be. However, there
are clear differences of emphasis and methods of investigation between these approaches
that we can outline here in a preliminary way.
Let us attempt to characterize the particular emphases of several approaches to studying
science:
• Sociology of Science - Sociology of science studies how scientists interact as
social groups to resolve differences in opinion, how they engage in research, and
how they determine which of many theories and research programs are worth
pursuing, among other things.
• Psychology of Science - Psychology of science studies how scientists reason, i.e.,
the thought processes that scientists follow when they are judging the merits of
certain kinds of research or theoretical speculations; how they reason about data,
experiment, theories, and the relations between these; and how they come up with
new theories or experimental procedures.
• History of Science - History of science studies how scientists have engaged in the
various activities noted above in the past, i.e., how social interactions among
scientists (and between scientists and the greater society) and styles of scientific
reasoning have changed over the centuries; and how particular scientific
achievements came to be accepted, both by individual scientists and by the
scientific community as a whole.
In each of these cases, the data on which the approach to studying science is based is
empirical or observational. What is at issue is how scientists in fact interact as social
groups, reason, or how scientific reasoning styles and scientific theories have in fact
changed over time. To adjudicate disputes within these approaches thus requires
engaging in something much like scientific activity itself: one must gather evidence for one's views
(or in the case of certain approaches to history, one's interpretation of the nature of
scientific activity or the causes of scientific revolutions).
What is philosophy of science? How does it differ from these other approaches to
studying science? Well, that's not an easy question to answer. The divisions among
philosophers of science are quite striking, even about fundamentals, as will become
apparent as the course proceeds. One reason for this is that philosophers of science often
find many of the things that sociologists, psychologists, and historians of
science study to be relevant to their own studies of science. Of course, the degree to
which philosophers of science are interested in and draw upon the achievements of these
other disciplines varies greatly among individuals--e.g., some philosophers of science
have been far more interested in the history of science, and have thought it more relevant
to their own endeavors, than others. However, there are some tendencies, none of them
completely universal, that would serve to mark a difference between philosophers of
science on the one hand and sociologists, historians, and psychologists of science on the
other.
The first difference is that philosophy of science is not primarily an empirical study of
science, although empirical studies of science are of relevance to the philosopher of
science. (Like everything else you might cite as a peculiarity of philosophy of science,
this point is a matter of dispute; some philosophers of science, for example, claim that
philosophy of science ought to be considered a special branch of epistemology, and
epistemology ought to be considered a special branch of empirical psychology.)
Philosophers of science do not generally engage in empirical research beyond learning
something about a few branches of science and their history. This type of study, however,
is simply a prerequisite for talking knowledgeably about science at all. Philosophers
primarily engage in an activity they call "conceptual clarification," a type of critical,
analytical "armchair" investigation of science. For example, a philosopher of science may
try to answer questions of the following sort.
What is scientific methodology, and how does it differ (if it does) from the
procedures we use for acquiring knowledge in everyday life?
How should we interpret the pronouncements of scientists that they have gained
knowledge about the invisible, underlying structure of the world through their
investigations?
Part of what is open to philosophy of science, insofar as it is critical, is to question the
methods that scientists use to guide their investigations. In other words, philosophers of
science often seek to answer the following question.
What reason is there to think that the procedures followed by the scientist are
good ones?
In a sense, philosophy of science is normative in that it asks whether the methods that
scientists use, and the conclusions that they draw using those methods, are proper or
justified. Normally, it is assumed that the methods and conclusions are proper or justified,
with it being the task of the philosopher of science to explain precisely how they can be
proper or justified. (In other words, the philosopher of science seeks to understand the
practice of science in such a way as to vindicate that practice.) This opens up the
possibility of revision: that is, if a philosopher of science concludes that it is impossible
to justify a certain feature of scientific practice or methodology, he or she might conclude
that that feature must be abandoned. (This would be rare: most philosophers would react
to such a situation by rejecting the view that did not vindicate that feature of scientific
practice.)
Let us take another approach to the question of what distinguishes philosophy of science
from other approaches to studying science. Philosophy has, since Plato, been concerned
with the question of what a particular kind of thing essentially is. In other words,
philosophers seek a certain sort of answer to questions of the following form.
What is X?
In asking a question of this form, philosophers seek to understand the nature of X, where
by "nature" they mean something like X's essence or meaning.
We will start the course by considering the question, "What is scientific explanation?"
We will also seek to answer the question, "What makes a scientific explanation a good
one?" Most people take the notion of explanation for granted; but as you will soon find
out, philosophers take a special interest in the concepts others take for granted.
Philosophers emphasize the difference between being able to identify something as an
explanation and being able to state in precise terms what an explanation is, i.e., what
makes something an explanation. Philosophers seek to do the latter, assuming that they
are able (like everyone else) to do the former.
None of this, of course, will mean very much until we have examined the philosophy of
science itself, i.e., until we start doing philosophy of science. To a large degree, each of
you will have to become a philosopher of science to understand what philosophy of
science is.
Lecture 2
2/3/94
The Inferential View Of Scientific Explanation
Last time we discussed philosophy of science very abstractly. I said that it was difficult to
separate completely from other studies of science, since it draws on these other
disciplines (sociology, history, psychology, as well as the sciences themselves) for its
"data." What distinguishes philosophy of science from other studies of science is that it
(1) takes a critical, evaluative approach--e.g., it aims at explaining why certain methods
of analyzing data, or as we shall see today, the notion of explanation--are good ones.
There is also an emphasis on conceptual analysis--e.g., explaining what explanation is,
or, in other words, what it means when we say that one thing "explains" another.
(Philosophers often discuss the meaning of many terms whose meaning other people take
for granted.) I also noted that the best way to see how philosophy of science is
differentiated from other studies of science is by examining special cases, i.e., by doing
philosophy of science. We will start our investigation by examining the notion of
explanation.
The Difference Between Explanation And Description
It is commonplace, a truism, to say that science aims at not only describing regularities in
the things that we observe around us--what is often called observable or empirical
phenomena--but also at explaining those phenomena. For example, there is the
phenomenon of redshift in the spectra of stars and distant galaxies. The physical
principles behind redshift are sometimes explained by analogy with the Doppler effect,
which pertains to sound: observed wavelengths are increased if the object is moving
away from us, shortened if the object is moving towards us. (Also, there is a derivation in
general relativity theory of the redshift phenomenon, due to gravitation.) In 1917, de
Sitter predicted that there would be a relationship between distance and redshift, though
this prediction was not generally recognized until Hubble (1929), who was inspired by de
Sitter's analysis. Another example: There's a periodic reversal in the apparent paths of the
outer planets in the sky. This can be predicted purely mathematically, based on past
observations, but prediction does not explain why the reversal occurs. What is needed is a
theory of the solar system, which details how the planets' real motions produce the
apparent motions that we observe. Final example: Hempel's thermometer example (an
initial dip in the mercury level precedes a larger increase when a thermometer is placed
into a hot liquid).
Three Approaches To Explanation
A philosopher of science asks: What is the difference between describing a phenomenon
and explaining it? In addition, what makes something an adequate explanation?
Philosophers have defended three basic answers to this question.
Inferential View (Hempel, Oppenheim) - An explanation is a type of argument, with
sentences expressing laws of nature occurring essentially in the premises, and the
phenomenon to be explained as the conclusion. Also included in the premises can
be sentences describing antecedent conditions.
Causal View (Salmon, Lewis) - An explanation is a description of the various causes
of the phenomenon: to explain is to give information about the causal history that
led to the phenomenon.
Pragmatic View (van Fraassen) - An explanation is a body of information that
implies that the phenomenon is more likely than its alternatives, where the
information is of the sort deemed "relevant" in that context, and the class of
alternatives to the phenomenon are also fixed by the context.
In the next few weeks, we will examine each of these accounts in turn, detailing their
strengths and weaknesses. Today we will start with Hempel's inferential view, which
goes by several other names that you might encounter.
The "received" view of explanation (reflecting the fact that philosophers
generally agreed with the inferential view until the early 1960s)
The "deductive-nomological" model of explanation (along with its probabilistic
siblings, the "inductive-statistical" and "deductive-statistical" models of
explanation)
The Inferential Theory of Explanation
The original Hempel-Oppenheim paper, which appeared in 1948, analyzed what has
come to be known as "deductive-nomological" (D-N) explanation. Hempel and
Oppenheim consider patterns of explanation of a particular sort and try to generalize that
pattern. As already noted, they view explanation as a certain sort of argument, i.e., a set
of premises (sentences) which collectively imply a conclusion. The arguments they
considered were deductive, i.e., arguments such that if the premises are true the
conclusion has to be true as well.
All human beings are mortal.
Socrates is a human being.
Socrates is mortal.
Not every argument that is deductive is an explanation, however. How can we separate
those that are from those that are not?
To accomplish this task, Hempel and Oppenheim describe their "General Conditions of
Adequacy" that define when a deductive argument counts as an adequate explanation.
An explanation must:
(a) be a valid deductive argument (hence "deductive")
(b) contain essentially at least one general law of nature as a premise (hence
"nomological")
(c) have empirical content (i.e., it must be logically possible for an observation-
statement to contradict it)
The first three conditions are "logical" conditions, i.e., formal, structural features that a
deductive argument must have to count as an explanation. To complete the conditions of
adequacy, Hempel and Oppenheim add a fourth, "empirical" condition.
(d) The premises (statements in the explanans) must all be true.
On the inferential view, explanations all have the following structure (where the
antecedent conditions and laws of nature make up the "explanans").
C1, ... , Cn [antecedent conditions (optional)]
L1, ... , Ln [laws of nature]
Therefore, E [explanandum]
Later, we will be looking at a statistical variant of this pattern, which allows the laws of
nature to be statistical, and the resulting inference to be inductive.
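To make the schema concrete, here is a minimal worked instance of a D-N explanation (using
the gas example that appears later in these notes):
C1: This sample of gas was heated under constant pressure.
L1: All gases that are heated under constant pressure expand.
Therefore, E: This sample of gas expanded.
Provided the premises are true, this argument satisfies the conditions of adequacy: it is
deductively valid, it contains a general law essentially, and it has empirical content.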
Laws Of Nature
Since Hempel and Oppenheim's analysis is stated in terms of laws of nature, it is
important to state what a law of nature is on their view. Hempel and Oppenheim take a
law to be a true "law-like" sentence. This means that a law is a linguistic entity, to be
distinguished by its peculiar linguistic features. According to Hempel and Oppenheim,
laws are to be distinguished from other sentences in a language in that they (1) are
universal, (2) have unlimited scope, (3) contain no designation of particular objects, and
(4) contain only "purely qualitative" predicates.
The problem Hempel and Oppenheim face is distinguishing laws from accidental
generalizations, i.e., general statements that happen to be true, though not as a
matter of physical law. For example, suppose that all the apples that ever get into my
refrigerator are yellow. Then the following is a true generalization: "All the apples in my
refrigerator are yellow." However, we do not deem this sentence to be a law of nature.
One reason that this might be so is that this statement only applies to one object in the
universe, i.e., my refrigerator. Laws of nature, by contrast, refer to whole classes of
objects (or phenomena). (Consider the sentence "All gases that are heated under constant
pressure expand.") It is for this reason that Hempel and Oppenheim include the
requirement that to be a law of nature a sentence must not designate any particular
objects.
Discuss: Can we get around this requirement by a bit of linguistic trickery? Consider if
we replace the designation "in my refrigerator" by "in any Lyle Zynda-owned
refrigerator," or, even better, by some physical description in terms of "purely
qualitative" predicates that single out my refrigerator from the class of all refrigerators in
the universe. Would the existence of such a sentence convince you that you have
discovered a new law of nature? Moreover, consider the following two sentences.
G: No gold sphere has a mass greater than 100,000 kilograms.
U: No enriched uranium sphere has a mass greater than 100,000 kilograms.
The former is not a law of nature, though the latter is (though it is of a relatively low-
level variety of law).
One reason that the statements about apples and gold might not be laws of nature, a
reason not adequately captured by the Hempel-Oppenheim analysis of laws, is that they
do not support inferences to counterfactual statements. For example, it cannot be inferred
from the fact that all the apples that ever get into my refrigerator are yellow that if a red
apple were to be placed into my refrigerator, it would turn yellow. Laws of nature, by
contrast, support counterfactual inferences. From the fact that all gases that are heated
under constant pressure expand we can infer that if a sample of gas in a particular
container were heated under constant pressure, it would expand. (We can reasonably infer
this even if we never heat that sample of gas under constant pressure.) Similarly, we can
infer that if we were to successfully gather 100,000 kilograms of uranium and try to fashion
it into a sphere, we would fail. (We could not infer a similar statement about gold from
the true generalization that no gold sphere has a mass greater than 100,000 kilograms.)
The difference between statements G and U has never been captured purely syntactically.
Thus, Hempel and Oppenheim's view that laws of nature are sentences of a certain sort
must be fundamentally mistaken. What laws of nature are is still a matter of dispute.
Counterexamples To The Inferential View Of Scientific Explanation: Asymmetry
and Irrelevance
The Hempel-Oppenheim analysis of scientific explanation has the following overlapping
features. (Review and explain each.)
(a) Inferential - Explanations are arguments: to explain why E occurred is to
provide information that would have been sufficient to predict beforehand that E
would occur.
(b) Covering Law - Explanations explain by showing that E could have been
predicted from the laws of nature, along with a complete specification of the
initial conditions.
(c) Explanation-Prediction Symmetry - The information (i.e., laws, antecedent
conditions) appearing in an adequate explanation of E could have been used to
predict E; conversely, any information that can be used to predict E can be used
after the fact to explain why E occurred.
(d) No Essential Role For Causality - Laws of nature do not have to describe
causal processes to be used legitimately in scientific explanations.
Many counterexamples have been given to the Hempel-Oppenheim analysis of scientific
explanation that target one or more of these features. The first group of counterexamples
reveals that the Hempel-Oppenheim analysis faces the Problem of Asymmetry: Hempel-
Oppenheim assert that explanation and prediction are symmetric, whereas that does not
seem to be the case, as the following examples show.
(1) Eclipse - You can predict when and where a solar eclipse will occur using the
laws governing the orbit of the Earth around the Sun, and the orbit of the Moon
around the Earth, as well as the initial configuration these three bodies were in at
an earlier time. You can also make the same prediction by extrapolating
backwards in time from the subsequent positions of these three bodies. However,
only the first would count as an explanation of why the eclipse occurred when and
where it did.
(2) Flagpole - Using the laws of trigonometry and the law that light travels in
straight lines, you can predict the length of the shadow that a flagpole of a certain
height will cast when the Sun is at a certain elevation. You can also predict what
the height of the flagpole is by measuring the length of its shadow and the
elevation of the Sun. However, only the first derivation would count as an
explanation.
(3) Barometer - Using the laws governing weather patterns, storm formation, and
the effect of air pressure on the behavior of barometers, you can predict that when
a barometer falls, a storm will soon follow. You can also predict that when a
storm is approaching, the barometer will fall. However, neither of these predictions is
explanatory, since both the barometer's falling and the storm are explained by antecedent
atmospheric conditions.
The second group of counterexamples reveals that the Hempel-Oppenheim analysis of
explanation faces the Problem of Irrelevance: the Hempel-Oppenheim analysis
sometimes endorses information as explanatory when it is irrelevant to the explanandum.
(4) Birth Control Pills - No man who takes birth control pills ever gets pregnant.
Thus, from the fact that John is taking birth control pills we can infer logically
that he won't get pregnant. However, this would hardly be an explanation of
John's failing to get pregnant since he couldn't have gotten pregnant whether or
not he took birth control pills.
(5) Hexed Salt - All salt that has had a hex placed on it by a witch will dissolve in water.
Hence, we can logically infer from the fact that a sample of salt had a hex placed on it by
a witch that it will dissolve in water. However, this wouldn't give us an explanation of the
dissolution since the salt would have dissolved even if it hadn't been hexed.
Lecture 3
2/8/94
The Causal Theory Of Explanation, Part I
Last time, we saw that the inferential view of explanation faced the asymmetry and
irrelevance problems. There is another problem, however, that comes out most clearly
when we consider the inductive-statistical (I-S) component of the inferential view. This
problem strikes most clearly at the thesis at the heart of the inferential view, namely, that
to explain a phenomenon is to provide information sufficient to predict that it will occur.
I-S explanation differs from D-N explanation only in that the laws that are cited in the
explanation can be statistical. For example, it is a law of nature that 90% of electrons in a
90-10 superposition of spin-up and spin-down will go up if passed through a vertically
oriented Stern-Gerlach magnet. This information provides us with the materials for
stating an argument that mirrors a D-N explanation.
Ninety percent of electrons in a 90-10 superposition of spin-up and spin-down
will go up if passed through a vertically oriented Stern-Gerlach magnet. (Law of
Nature)
This electron is in a 90-10 superposition of spin-up and spin-down and is passed
through a vertically oriented Stern-Gerlach magnet. (Statement of Initial
Conditions)
Therefore, this electron goes up. (Explanandum) [90%]
This argument pattern is obviously similar to that exhibited by D-N explanation, the only
difference being that the law in the inductive argument stated above is statistical rather
than a universal generalization. On the inferential view, this argument constitutes an
explanation since the initial conditions and laws confer a high probability on the
explanandum. If you knew that these laws and initial conditions held of a particular
electron, you could predict with high confidence that the electron would go up.
The problem with the inferential view is that you can't always use explanatory
information as the basis for a prediction. That is because we frequently offer explanations
of events with low probability. (Numbers in the examples below are for purposes of
illustration only.)
Atomic Blasts & Leukemia. We can explain why a person contracted leukemia
by pointing out the person was once only two miles away from an atomic blast,
and that exposure to an atomic blast from that distance increases one's chances of
contracting leukemia in later life. Only 1 in 1,000 persons exposed to an atomic
blast eventually contract leukemia. Nevertheless, exposure to an atomic blast
explains the leukemia since people who haven't been exposed to an atomic blast
have a much lower probability (say, 1 in 10,000) of contracting leukemia.
Smoking & Lung Cancer. We can explain why someone contracted lung cancer
by pointing out that the person had smoked two packs of cigarettes a day for forty
years. This is an explanation since people who smoke that much have a much
higher probability (say, 1 in 100) of contracting lung cancer than non-smokers
(say, 1 in 10,000). Still, the vast majority of smokers (99 percent) will never
contract lung cancer.
Syphilis & Paresis. We can explain why someone contracted paresis by pointing
out that the person had untreated latent syphilis. This is an explanation since the
probability of getting paresis is much higher (e.g., 1 in 100) if a person has
untreated latent syphilis than if he does not (e.g., 0). Still, the vast majority of
people with untreated latent syphilis will never contract paresis.
In each of these cases, you can't predict that the result will occur since the information
does not confer a high probability on the result. Nevertheless, the information offered
constitutes an explanation of that result, since it increases the probability that that result
will occur.
In the 1960s and 1970s, Wesley Salmon developed a view of statistical explanation that
postulated that, contrary to what Hempel claimed earlier, high probability was not
necessary for an explanation, but only positive statistical relevance.
Definition. A hypothesis h is positively relevant (correlated) to e if h makes e
more likely, i.e., pr(e|h) > pr(e).
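(A minimal numeric illustration, using the purely illustrative atomic-blast figures above:
with pr(leukemia | exposure) = 0.001 and pr(leukemia | no exposure) = 0.0001, exposure
raises the probability of leukemia and so is positively relevant to it, even though 0.001 is
nowhere near the high probability demanded by the inferential view.)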
The problem Salmon faced was distinguishing cases where the information could provide
a substantive explanation from cases where the information reported a mere correlation
and so could not. (For example, having nicotine stains on one's fingers is positively
correlated with lung cancer, but you could not explain why a person contracted lung
cancer by pointing out that the person had nicotine stains on their fingers.) Distinguishing
these cases proved to be impossible using purely formal (statistical) relations. Obviously,
some other type of information was needed to make the distinction. Rejecting the
received view of explanation, Salmon came to believe that to explain a phenomenon is
not to offer information sufficient for a person to predict that the phenomenon will occur,
but to give information about the causes of that phenomenon. On this view, an
explanation is not a type of argument containing laws of nature as premises but an
assembly of statistically relevant information about an event's causal history.
Salmon points out two reasons for thinking that causal information is what is needed to
mark off explanations. First, the initial conditions given in the explanatory information
have to precede the explanandum temporally to constitute an explanation of the
explanandum. Hempel's theory has no restriction of this sort. The eclipse example
illustrates this fact: you can just as well use information about the subsequent positions of
the Sun and Moon to derive that an eclipse occurred at an earlier time as use information
about the current positions of the Sun and Moon to derive that an eclipse will occur later.
The former is a case of retrodiction, whereas the latter is a (familiar) case of prediction.
This is an example of the prediction-explanation symmetry postulated by Hempel.
However, as we saw earlier when discussing the problem of asymmetry, only the
forward-looking derivation counts as an explanation. Interestingly, Salmon points out that
the temporal direction of explanation matches the temporal direction of causation, which
is forward-looking (i.e., causes must precede their effects in time).
Second, not all derivations from laws count as explanations. Salmon argues that some D-
N "explanations" (e.g., a derivation from the ideal gas law PV = nRT and a description of
initial conditions) are not explanations at all. The ideal gas law simply describes a set of
constraints on how various parameters (pressure, volume, and temperature) are related; it
does not explain why these parameters are related in that way. Why these constraints
exist is a substantive question that is answered by the kinetic theory of gases. (Another
example: People knew for centuries how the phases of the Moon were related to the
height of tides, but simply describing how these two things are related did not constitute
an explanation. An explanation was not provided until Newton developed his theory of
gravitation.) Salmon argues that the difference between explanatory and non-explanatory
laws is that the former describe causal processes, whereas non-explanatory laws (such as
the ideal gas law) only describe empirical regularities.
Lecture 4
(2/10/94)
The Causal Theory Of Explanation, Part II
As we discussed last time, Salmon sought to replace the inferential view of explanation,
which faces the asymmetry and irrelevance problems, with a causal theory, which
postulates that an explanation is a body of information about the causes of a particular
event. Today we will discuss Salmon's view in detail, as well as the related view of David
Lewis.
Salmon's theory of causal explanation has three elements.
(1) Statistical Relevance - the explanans (C) increases the probability of the
explanandum (E), i.e., pr(E|C) > pr(E).
(2) Causal Processes - the explanans and the explanandum are both parts of
different causal processes
(3) Causal Interaction - these causal processes interact in such a way as to bring
about the event (E) in question
This leaves us with the task of saying what a causal process is. Basically, Salmon's view
is that causal processes are characterized by two features. First, a causal process is a
sequence of events in a continuous region of spacetime. Second, a causal process can
transmit information (a "mark").
Let us discuss each of these in turn. There are various sequences of events that are
continuous in the required sense--e.g., a light beam, a projectile flying through space, a
shadow, or a moving spot of light projected on a wall. An object that is sitting still, e.g., a
billiard ball, is also deemed a causal process. Each of these is a continuous process in
some sense but not all of them are causal processes--e.g., the shadow and light spot. Let's
look at an example that makes this clearer. As some of you may know, relativity theory
says that nothing can travel faster than light. But what is the "thing" in nothing? Consider
a large circular room with a radius of 1 light year. If we have an incredibly focused laser
beam mounted on a swivel in the center of the room, we can rotate the laser beam so that
it rotates completely once per second. If the laser beam is on, it will project a spot on the
wall. This spot too will rotate around the wall completely once per second, which means
that it will travel at 2π light-years per second! Strangely enough, this is not prohibited by
relativity theory, since a spot of this sort cannot "transmit information." Only things that
can transmit information are limited in speed.
Salmon gives this notion an informal explication in "Why ask 'Why'?" He argues that the
difference between the two cases is that a process like a light beam is a causal process:
interfering with it at one point alters the process not only for that moment: the change
brought about by the interference is "transmitted" to the later parts of the process. If a
light beam consists of white light (or a suitably restricted set of frequencies), we can put a
filter in the beam's path, e.g., separating out only the red frequencies. The light beam
after it passes through the filter will bear the "mark" of having done so: it will now be red
in color. Contrast this with the case of the light spot on the wall: if we put a red filter at
one point in the process, the spot will turn red for just that moment and then carry on as if
nothing had happened. Interfering with the process will leave no "mark." (For another
example, consider an arrow on which a spot of paint is placed, as opposed to the shadow
of the arrow. The first mark is transmitted, but the second is not.)
Thus, Salmon concludes that a causal process is a spatiotemporally continuous process that
can transmit information (a "mark"). He emphasizes that the "transmission" referred to
here is not an extra, mysterious event that connects two parts of the process. In this
regard, he propounds his "at-at" theory of mark transmission: all that transmission of a
mark consists in is that the mark occurs "at" one point in the process and then remains in
place "at" all subsequent points unless another causal interaction occurs that erases the
mark. (Here he compares the theory with Zeno's famous arrow paradox. Explain. The
motion consists entirely of the arrow being at a certain point at a certain time; in other
words, the motion is a function from times to spatial points. This is necessary when we
are considering a continuous process. To treat the motion as the conjunction of the discrete
events of moving from A halfway to C, moving halfway from there, and so on, leads to the
paradoxical conclusion that the arrow will never reach its destination.) Transmission is not
a "link" between discrete stages of a process, but a feature of that process itself, which is
continuous.
Now we turn to explanation. According to Salmon, a powerful explanatory principle is
that whenever there is a coincidence (correlation) between the features of two processes,
the explanation is an event common to the two processes that accounts for the correlation.
This is a "common cause." To cite an example discussed earlier, there is a correlation
between lung cancer (C) and nicotine stains on a person's fingers (N). That is,
pr(C|N) > pr(C).
The common cause of these two events is a lifetime habit of smoking two packs of
cigarettes each day (S). Relative to S, C and N are independent, i.e.,
pr(C|N&S) = pr(C|S).
You'll sometimes see it said that S "screens C off from N" (i.e., once S is brought into
the picture, N becomes irrelevant). Screening off is part of a precise, probabilistically
constrained definition of "common cause." We start out with a correlation, pr(A|B) >
pr(A). C is a common cause of A and B if the following hold.
pr(A&B|C) = pr(A|C)pr(B|C)
pr(A&B|¬C) = pr(A|¬C)pr(B|¬C)
pr(A|C) > pr(A|¬C)
pr(B|C) > pr(B|¬C)
(The first condition is equivalent to the screening-off condition given earlier.) These
probabilistic conditions are not sufficient on their own: A, B, and C also have to be linked
suitably as parts of causal processes, forming what is known as a "conjunctive" fork.
(Consider the relation between this and the smoking-lung cancer, and leukemia-atomic
blast cases given earlier.)
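The conjunctive-fork conditions can be checked mechanically. Here is a minimal sketch in
Python (not part of the lecture; the joint distribution, the probability values, and the
helper functions pr and pr_given are all made up for illustration) that verifies the four
conditions, plus the unconditional correlation, for the smoking example:

# Made-up joint distribution over S (smoking), C (lung cancer), N (nicotine stains),
# constructed so that C and N are independent conditional on S and conditional on not-S.
# All numbers are illustrative only.
p_smoke = 0.3
p_cancer_given_s = {True: 0.05, False: 0.005}
p_stains_given_s = {True: 0.8, False: 0.05}

joint = {}
for s in (True, False):
    for c in (True, False):
        for n in (True, False):
            ps = p_smoke if s else 1 - p_smoke
            pc = p_cancer_given_s[s] if c else 1 - p_cancer_given_s[s]
            pn = p_stains_given_s[s] if n else 1 - p_stains_given_s[s]
            joint[(s, c, n)] = ps * pc * pn

def pr(event):
    """Probability that the predicate holds of an outcome (s, c, n)."""
    return sum(p for outcome, p in joint.items() if event(*outcome))

def pr_given(event, given):
    """Conditional probability pr(event | given)."""
    return pr(lambda s, c, n: event(s, c, n) and given(s, c, n)) / pr(given)

S = lambda s, c, n: s
notS = lambda s, c, n: not s
C = lambda s, c, n: c
N = lambda s, c, n: n
CandN = lambda s, c, n: c and n

# Unconditional correlation: pr(C | N) > pr(C).
print(pr_given(C, N), ">", pr(C))

# Screening off (first two fork conditions): conditional on S, and conditional on not-S,
# C and N are statistically independent.
print(pr_given(CandN, S), "==", pr_given(C, S) * pr_given(N, S))
print(pr_given(CandN, notS), "==", pr_given(C, notS) * pr_given(N, notS))

# The common cause raises the probability of each effect (last two fork conditions).
print(pr_given(C, S), ">", pr_given(C, notS))
print(pr_given(N, S), ">", pr_given(N, notS))

Running the sketch gives pr(C|N) ≈ 0.044 > pr(C) ≈ 0.019, while conditional on S (or on
not-S) the correlation between C and N disappears, which is exactly the screening-off
pattern described above.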
This does not complete the concept of causal explanation, however, since some common
causes do not "screen off" the events they correlate from one another. Salmon gives
Compton scattering as an example. Given that an electron e- absorbs a photon of a certain
energy E and is given a bit of kinetic energy E* in a certain direction as a result, a second
photon will be emitted with E** = E - E*. The energy levels of the emitted photon and of
the electron will be correlated, even given that the absorption occurred. That is,
pr(A&B|C) > pr(A|C)pr(B|C).
This is a causal interaction of a certain sort, between two processes (the electron and the
photon). We can use the probabilistic conditions here to analyze the concept: "when two
processes intersect, and both are modified in such ways that the changes in one are
correlated with changes in the other--in the manner of an interactive fork--we have causal
interaction." Thus, a second type of "common cause" is provided by the C in the
interactive fork.
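(A minimal numeric illustration, with made-up values: suppose that, given the set-up C, the
energy is split in one of two ways, each with probability 1/2, and let A be the event that
the electron gets the larger share and B the event that the emitted photon gets the smaller
share. Then pr(A|C) = pr(B|C) = 1/2, but conservation of energy forces A and B to occur
together, so pr(A&B|C) = 1/2 > 1/4 = pr(A|C)pr(B|C). Conditioning on C does not screen the
two effects off from each other; this is the signature of an interactive fork.)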
Salmon's attempt here is to analyze a type of explanation that is commonly used in
science, but the notion of causal explanation can be considered more broadly than he
does. For example, Lewis points out that the notion of a causal explanation is quite fluid.
In his essay on causal explanation, he points out that there is an extremely rich causal
history behind every event. (Consider the drunken driving accident case.) Like Salmon,
Lewis too argues that to explain an event is to provide some information about its causal
history. The question arises, what kind of information? Well, one might be to describe in
detail a common cause of the type discussed by Salmon. However, there might be many
situations in which we might only want a partial description of the causal history (e.g., we
are trying to assign blame according to the law, or we already know a fair chunk of the
causal history and are trying to find out something new about it, or we just want to know
something about the type of causal history that leads to events of that sort, and so on).
Lewis allows negative information about the causal history to count as an explanation
(there was nothing to prevent it from happening, there was no state for the collapsing star
to get into, there was no connection between the CIA agent being in the room and the
Shah's death, it being just a coincidence, and so on). To explain is to give information
about a causal history, but giving information about a causal history is not limited to
citing one or more causes of the event in question.
(Mention here that Lewis has his own analysis of causation, in terms of non-backtracking
counterfactuals.)
Now given this general picture of explanation, there should be no explanations that do
not cite information about the causal history of a particular event. Let us consider whether
this is so. Remember the pattern of D-N explanation that we looked at earlier, such as a
deduction of the volume of a gas that has been heated from the description of its initial
state, how certain things such as temperature changed (and others, such as pressure, did
not), and an application of the ideal gas law PV = nRT. On Hempel's view, this could
count as an explanation, even though it is non-causal. Salmon argues that (1) non-causal
laws allow for "backwards" explanations, and (2) such laws themselves cry out to be explained.
Regarding the latter point, he says that non-causal laws of this sort are simply
descriptions of empirical regularities that need to be explained. The same might occur in
the redshift case, if the law connecting the redshift with the velocity was simply an
empirical generalization. (Also, consider Newton's explanation of the tides.)
Let's consider another, harder example. A star collapses, and then stops. Why did it stop?
Well, we might cite the Pauli Exclusion Principle (PEP), and say that if it had collapsed
further, there would have been electrons sharing the same overall state, which is impossible
according to the PEP. Here the PEP is not causing the collapse to stop; it just predicts that it
will stop. Lewis claims that the reason this is explanatory is that it falls into the "negative
information" category. The reason that the star stopped collapsing is that there was no
physically permissible state for it to get into. This is information about its causal history,
in that it describes the terminal point of that history.
There are other examples, from quantum physics, especially, that seem to give the most
serious problems for the causal view of explanation, especially Salmon's view that
explanations in science are typically "common cause" explanations. One basic problem is
that Salmon's view relies on spatiotemporal continuity, which we cannot assume at the
particulate level. (Consider tunneling.) Also, consider the Bell-type phenomenon, where
we have correlated spin-states, and one particle is measured later than the other. Why did
the one have spin-up when it was measured? Because they started out in the correlated
state and the other one that was measured had spin-down. You can't always postulate a
conjunctive fork. This should be an interactive fork of the type cited by Salmon, since
given the set-up C there is a correlation between the two events. However, it is odd to say
the setup "caused" the two events when they are space-like separated from each other! We will
consider these problems in greater detail next week.
Lecture 5
2/15/94
The Causal Theory Of Explanation, Part III
Last time, we discussed Salmon's analysis of causal explanation. To review, Salmon said
that explanation involved (1) statistical relevance, (2) connection via a causal process,
and (3) change after a causal interaction. The change is the event to be explained. The
notion of a causal process was filled out in terms of (a) spatiotemporal continuity and (b)
the ability to transmit information (a "mark"). While we can sometimes speak simply of
cause and effect being parts of a single causal process, the final analysis will typically be
given in terms of more complex (and sometimes indirect) causal connections, of which
Salmon identifies two basic types: conjunctive and interactive forks.
Today, we are going to look at these notions a bit more critically. To anticipate the
discussion, we will discuss in turn (1) problems with formulating the relevant notion of
statistical relevance (as well as the problems associated with a purely S-R view of
explanation), (2) the range of causal information that can be offered in an explanation,
and (3) whether some explanations might be non-causal. If we have time, we will also
discuss whether (probabilistic or non-probabilistic) causal laws have to be true to play a
role in good explanations (Cartwright).
Statistical Relevance
The basic problem with statistical relevance is specifying what is relevant to what.
Hempel first noticed in his analysis of statistical explanation that I-S explanations, unlike
D-N explanations, could be weakened by adding further information. For example, given
that John has received penicillin, he is likely to recover from pneumonia; however, given
the further information that he has a penicillin-resistant strain of pneumonia, he is
unlikely to recover. Thus, we can use true information and statistical laws to explain
mutually contradictory things (recovery, non-recovery). On the other hand, note that we
can also strengthen a statistical explanation by adding more information (in the sense that
the amount of inductive support the explanans gives the explanandum can be increased).
This "ambiguity" of I-S explanation--relative to one thing, c explains e, relative to
another, it does not--distinguishes it in a fundamental way from D-N explanation.
As you know, the inferential view said that the explanans must confer a high probability
on the explanandum for the explanation to work; Salmon and other causal theorists
relaxed that requirement and only required that the explanans increase the probability of
the explanandum, i.e., that it be statistically relevant to the explanandum. Still, the
ambiguity remains. The probability that Jones will get leukemia is higher given that he
was located two miles away from an atomic blast when it occurred; but it is lowered
again when it is added that he had on lead shielding that completely blocked the effects of
any radiation that might be in the area. This led Hempel, and Salmon, too, to add that the
explanation in question must refer to statistical laws stated in terms of a "maximally"
specific reference class (i.e., the class named in the "given" clause) to be an explanation.
In other words, it is required that dividing the class C further into C1, C2, and so on
would not affect the statistics, in that pr(E|Ci) = pr(E|Cj). This can be understood in two
ways, either "epistemically," in terms of the information we have at our disposal, of
"objectively," in terms of the actual "objective" probabilities in the world. (Hempel only
recognized the former.) If our reference class can't be divided ("partitioned") into cells
that give different statistics for E, then we say that the class is "homogeneous" with
respect to E. The homogeneity in question can be either epistemic or objective: it must be
the latter if we are really talking about causes rather than what we know about causes.
The problem with this is that dividing up the class can result in trivialization. For
example, a subclass of the class of people who receive penicillin to treat their pneumonia
(P) is the class of those people who recover (R). Obviously, it is always the case that
pr(R|P&R) = 1. However, this type of statistical law would not be very illuminating to use
in a statistical explanation of why the person recovered from pneumonia.
There are various ways around this problem. For example, you can require that the
statistical law in question not be true simply because it is a theorem of the probability
calculus (which was the case with pr(R|P&R) = 1). Hempel used this clause in his
analysis of I-S explanation. Salmon adds that we should further restrict the clause by
noting that the statistical law in question not refer to a class of events that either (1)
follow the explanandum temporally, or (2) cannot be ascertained as true or false in
principle independently of ascertaining the truth or falsity of the explanandum. The first
requirement is used to block explanations of John's recovery that refer to the class of
people who are reported on the 6:00 news to have recovered from pneumonia (supposing
John is famous enough to merit such a report). This is the requirement of maximal
specificity (Hempel) or that the reference class be statistically homogeneous (Salmon).
Of course, as we mentioned earlier, there might be many correlations that exist in the
world between accidental events, such as that someone in Laos sneezes (S) whenever a
person here recovers from pneumonia (R), so that we have pr(R|P&S) > pr(R|P). (Here
the probabilities might simply be a matter of the actual, empirical frequencies.) If this
were the case, however, we would not want to allow the statistical law just cited to occur
in a causal explanation, since it may be true simply by "accident." We might also demand
that there be causal processes linking the two events. That's why Salmon was concerned
to add that the causal processes that link the two events must be specified in a causal
explanation. The moral of this story is two-fold. Statistical relevance is not enough, even
when you divide up the world in every way possible. Also, some ways of dividing up the
world to get statistical relevance are not permissible.
What Kinds Of Causal Information Can Be Cited In An Explanation?
Salmon said that there were two types of causal information that could be cited in a
causal explanation, which we described as conjunctive and interactive forks. Salmon's
purpose here is to analyze a type of explanation that is commonly used in science, but the
notion of causal explanation can be considered more broadly than he does. For example,
Lewis points out that the notion of a causal explanation is quite fluid. In his essay on
causal explanation, he points out that there is an extremely rich causal history behind
every event. (Consider the drunken driving accident case.) Like Salmon, Lewis too
argues that to explain an event is to provide some information about its causal history.
The question arises, what kind of information? Well, one might be to describe in detail a
common cause of the type discussed by Salmon. However, there might be many
situations in which we might only want a partial description of the causal history (e.g., we
are trying to assign blame according to the law, or we already know a fair chunk of the
causal history and are trying to find out something new about it, or we just want to know
something about the type of causal history that leads to events of that sort, and so on).
Question: How far back in time can we go to find statistically relevant events? Consider the
probability that the accident will occur. Relevant to this is whether gas was available for
the driver to drive his car, whether he received a driver's license when he was young, or even
whether he lived to the age that he did. All of these are part of the "causal history"
leading up to the person having an accident while drunk, but we would not want to cite
any of these as causing the accident. (See section, "Explaining Well vs. Badly.")
Something to avoid is trying to make the distinction by saying that the availability of gas
was not "the" cause of the person's accident. We can't really single out a given chunk of
the causal history leading up to the event to separate "the cause." Lewis separates the
causal history--any portion of which can in principle be cited in a given explanation--
from the portion of that causal history that we are interested in or find most salient at a
given time. We might not be interested in information about just any portion of the causal
history, Lewis says, but it remains the case that to explain an event is to give
information about the causal history leading up to that event.
In addition, Lewis points out that the range of ways of giving information about causal
histories is quite broad. For example, Lewis allows negative information about the causal
history to count as an explanation (there was nothing to prevent it from happening, there
was no state for the collapsing star to get into, there was no connection between the CIA
agent being in the room and the Shah's death, it being just a coincidence, and so on). To
explain is to give information about a causal history, but giving information about a
causal history is not limited to citing one or more causes of the event in question.
Lecture 6
2/17/94
Problems With The Causal Theory Of Explanation
Last time we finished our exploration of the content of causal theories of explanation,
including the kinds of caveats that have to be added to make the theory workable. Today,
we will examine whether the causal approach as a whole is plausible, and examine an
alternative view, namely van Fraassen's pragmatic theory of explanation. (Note
Kourany's use of the term "erotetic.") As I stated at the end of the period last time, there
are two basic challenges that can be given to the causal view, namely that sometimes
non-causal generalizations can explain, and that laws can be explained by other laws (a
relationship that does not seem to be causal, since laws are not events and do not cause
other laws).
(1) Non-causal generalizations - Suppose that someone was ignorant of the various gas
laws, or once learned the laws and has now forgotten them.
General Law: PV = nRT
Boyle's Law: At constant temperature, pressure is inversely proportional to
volume, i.e., PV = constant.
Charles' Law: At constant pressure, volume is directly proportional to
temperature, i.e., V/T = constant.
Pressure Law: At constant volume, pressure is directly proportional to
temperature, i.e., P/T = constant.
They wonder why it is that a certain container of gas expands when heated. You
could then give various answers, such as that the pressure was constant, and then cite
Charles's law to finish the explanation. Alternatively, you might say that there was a more
complex relationship, where the pressure increased along with the temperature, but that
the increase in pressure was not enough to compensate for the increase in temperature, so
that the volume had to rise too, according to the general gas law. Q: Is this an
explanation? We have to distinguish the question whether it is the "ultimate" or
"fundamental" explanation of the phenomenon in question from the question whether it is an
explanation of the phenomenon at all.
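A minimal numeric sketch in Python of the two derivations just described (not part of the
lecture; the values of n, R, and the pressures and temperatures are made up for illustration):

# Both derivations compute the new volume from the general gas law PV = nRT;
# the numeric values are illustrative only.
R = 8.314          # gas constant, J/(mol*K)
n = 1.0            # amount of gas, in moles (assumed)

P1, T1 = 100_000.0, 300.0          # initial pressure (Pa) and temperature (K)
V1 = n * R * T1 / P1               # initial volume

# Derivation 1 (Charles's law): pressure held constant, temperature raised.
T2 = 360.0
V2 = n * R * T2 / P1
print(V1, "->", V2)                # volume rises in proportion to temperature

# Derivation 2: pressure rises too, but not enough to offset the rise in temperature.
P3, T3 = 110_000.0, 360.0
V3 = n * R * T3 / P3
print(V1, "->", V3)                # volume still rises, by the general gas law

In both cases the new volume is simply read off from the general gas law; the computation
predicts the expansion without saying why the law holds.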
Example from statics: an awkwardly posed body is held in place by a transparent rod.
Why is it suspended in that odd way? Well, there's a rod here, and that compensates for
the force of gravity.
Example from Lewis, the collapsing star: Why did it stop? Well, there is no causal story
that we can tell, other than by giving "negative" information: there was no state for it to
get into if it collapsed further, because of the Pauli Exclusion Principle. (Here identical
fermions cannot have the same quantum numbers, n, l, m, and ms.) Here PEP is not
causing the collapse to stop; it just predicts that it will stop. Lewis claims that the reason
this is explanatory is that it falls into the "negative information" category. The reason that
the star stopped collapsing is that there was no physically permissible state for it to get
into. This is information about its causal history, in that it describes the terminal point of
that history.
(2) Explanation of Laws by Laws. Newton explained Kepler's laws (ellipses, equal areas
in equal times, p^2 = d^3) by deriving them from his laws of motion and gravitation
(inertia, F = ma, action-reaction, and F = Gm1m2/r^2). This is the kind of explanation to
which the
inferential view is especially well suited (as well as the pragmatic view that we will
consider next time), but it does not fit immediately into the causal view of explanation
(Newton's laws don't cause Kepler's laws to be true.) That is because the causal view of
explanation seems best-suited for explaining particular events, rather than general
regularities. Lewis responds to several objections to his theory by saying that his theory
does not intend to cover anything more than explanations of events. (Not a good answer
as is, since then we would not have a general theory of explanation, but only a description
of certain kinds of explanation.) However, he does have an answer at his disposal: one
way to give information about causal histories is to consider what is common to all causal
histories for events of a given type (e.g., a planet orbiting around a star according to
Kepler's laws). Q: Is this enough?
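(A sketch of how such a derivation goes, restricted for simplicity to circular orbits:
Newton's law of gravitation supplies the centripetal force, so GMm/r^2 = mv^2/r;
substituting v = 2πr/p gives p^2 = (4π^2/GM) r^3. That is, the square of the period is
proportional to the cube of the orbital distance, which is Kepler's third law, stated above
as p^2 = d^3, with the constant of proportionality now fixed by Newton's laws.)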
There are of course the other problems mentioned earlier: Salmon's view relies on the
notion of spatiotemporally continuous processes to explain the notion of causal processes.
(Lewis is not subject to this problem, since he has an alternative theory of causation:
linkage by chains of non-backtracking counterfactual dependence.)
Lecture 7
2/22/94
Van Fraassen's Pragmatic View Of Explanation
Van Fraassen's pragmatic view of explanation is that an explanation is a particular type of
answer to a why-question, i.e., an answer that provides relevant information that "favors"
the event to be explained over its alternatives. For van Fraassen, these features are
determined by the context in which the why-question is asked.
The Basic Elements Of The Pragmatic View Of Explanation
According to van Fraassen, a why-question consists of (1) a presupposition (Why X), (2)
a contrast class (Why X rather than Y, Z, and so on), and (3) an implicitly understood
criterion of relevance. Information given in response to a particular why-question
constitutes an explanation of the presupposition if the information is relevant and
"favors" the presupposition over the alternatives in its contrast class. (Explain and give
examples.)
Both the contrast class and the criterion of relevance are contextually determined, based
on interests of those involved. Subjective interests define what would count as an
explanation in that context, but then it's an objective matter whether that information
really favors the presupposition over the alternatives in its contrast class. (Explain and
give examples.)
Contrasts Between The Pragmatic And Causal Views Of Explanation
Any type of information can be counted as relevant (of course, it is a scientific
explanation if only information provided by science counts as relevant; still, there
might be different kinds of scientific explanation, and not just any information will do).
Context (interests) determines when something counts as an explanation, vs. when we
would find an explanation interesting or salient. (According to Lewis, what makes
it an explanation is that it gives information about the causal history leading up to
a given event; whether we find that explanatory information interesting or salient
is another matter.)
Distinction: Pragmatic theory of explanation vs. theory of the pragmatics of
explanation. On the pragmatic view, God could never have a "complete"
explanation of an event, unless he had interests. (A mere description of the causal
history leading up to an event--even a complete one--is not an explanation of any
sort according to the pragmatic view.)
On the pragmatic view, asymmetries only exist because of the context; thus, they can be
reversed with a change in context. That is what van Fraassen's Tower example is
supposed to illustrate. (Recount the Tower example.) Lewis' Objection to the Tower
example: What is really doing the explaining is the intention of the Prince, and that's a
cause of the flagpole being that particular height. Discuss: Can you think of a story in
which the redshift would explain the galaxies moving away? Where human intention is
not possible, it seems difficult; this would seem to confirm Lewis' diagnosis of the Tower
story.
Lecture 8
2/24/94
Carnap vs. Popper
The idea is to say how we "test" scientific theories, and how we then form opinions regarding
those theories based on such tests. What does scientific investigation consist in, and what
are the rules governing it?
Induction - Scientific investigation is a type of inductive process, where we increase the
evidential basis for or against a particular theory without the evidence conclusively
establishing the theory. "Induction" has had various meanings in the past: in the
Renaissance, it was thought that the way to develop scientific theories was to examine all
the evidence you could and then extrapolate from that to form theories. (This was a
method of developing theories as well as a method of justifying them.) This was in
contrast to positing "hypotheses" about unobservable entities to explain the phenomena.
(Indeed, Newton was criticized for formulating such "hypotheses.") This view did not
survive, however, since it became apparent that you can't form theories in this way. Thus,
we have to understand "induction" differently (supposing that it is a useful concept at all).
Carnap is an inductivist, and in this respect he differs from Popper. However, both agree
(taking inspiration from Hume) that there is a serious problem with the justification of
"inductive inference." Carnap discusses it in terms of a puzzle about how we arrive at and
form opinions regarding laws. (Note that the notion of a law that Carnap assumes is similar
to Hempel's.) Laws are universal statements (at least), hence apply to an at least
potentially infinite domain. However, our empirical data is always finite. (Consider the ideal
gas law.) What does deductive logic give us to evaluate theories?
Suppose that h -> e1, e2, e3, .... If we show the truth of any finite number of the ei, we
haven't established h; however, if we show that even one of these is false, h must also be
false.
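To fix ideas, here is a toy sketch of this asymmetry between verification and falsification (my own illustration, not Carnap's, using a made-up "all observed ravens are black" hypothesis):

def h_predicts(raven):
    # e_i: "raven i is black" -- h's prediction for a single observed raven
    return raven["color"] == "black"

ravens_observed = [{"id": 1, "color": "black"}, {"id": 2, "color": "black"}]

if all(h_predicts(r) for r in ravens_observed):
    # Finitely many true predictions never deductively establish h:
    # the universal claim outruns any finite body of evidence.
    print("h is consistent with the evidence so far (but not proven)")
else:
    # A single false prediction deductively refutes h (modus tollens).
    print("h is falsified")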
Thus deductive logic cannot give us the tools to establish a scientific theory; we cannot
"infer" from evidence to theory. There are different ways in which this can be so. Carnap
distinguishes between empirical and theoretical laws. The latter refer to unobservable
entities or properties; theoretical laws are absolutely necessary in science, but they cannot
simply be derived from large bodies of research. Science postulates new categories of
things, not just generalizations about regularities among things we can observe. (Consider the
kinetic theory of gases.) The upshot is that theories can always be wrong, no matter how
much evidence we've found that is consistent with those theories. Scientific theories are
not "proven" in the sense that given that a certain body of empirical data they are immune
from all refutation. Q: How can we form an opinion regarding theories? Moreover, what
kind of opinion should we form?
Both of these authors assume that the answer to this question will be in the form of a
"logic of scientific discovery." (Logic = formal system of implication, concerned with
relations of implication between statements, but not relations of statements to the world,
i.e., whether they are true or false). Indeed, this is the title of Popper's famous book. The
point on which they differ is the following: Is deductive logic all that we're limited to in
science? (Popper - yes, no inductive logic; Carnap - No, there's an "inductive logic" too,
which is relevant to scientific investigation.)
Popper and Carnap also agree that there is a distinction between the contexts of
justification and discovery. This is a contentious view, as we'll see when we start looking
at Kuhn and Laudan. The traditional approach to induction assumed that gathering
evidence would enable one to formulate and justify a theory. (Consider Popper's example
of a person who gave all his "observations" to the Royal Society. Also, consider the
injunction to "Observe!") Both Carnap and Popper distinguish the two: what they were
concerned with is not how to come up with scientific hypotheses--this is a creative
process (e.g., the formulation of a theoretical framework or language), not bound by rules
of logic but only (perhaps) laws of psychology--but how to justify our hypotheses once
we come up with them. Carnap's answer is that we can't "prove" our hypotheses but we
can increase (or decrease) their probability by gathering evidence. (This is inductive since
you proceed to partial belief in a theory by examining evidence.)
Let's make this a bit more precise: is induction a kind of argument? That is, is there such
a thing as "inductive inference?" This is one view of inductive logic: a formal theory
about relations of (partial) implication between statements. It can be thought of in two
ways. (1) You accept a conclusion (all-or-nothing) based on evidence that confirms a
theory to a certain degree (e.g., if sufficiently high). (2) You accept a conclusion to a
certain degree (i.e., as more or less probable) based on certain evidence. The latter is
what Carnap's approach involves. The basic notion there is degree of confirmation. What
science does when it "justifies" a theory (or tests it) is to provide evidence that confirms
or disconfirms it to a certain degree, i.e., makes it more or less probable than it was
before the evidence was considered.
We've already made informal use of the notion of probability. However, the precise sense
of the term turns out to be of importance when thinking about induction in Carnap's
sense.
• Frequentist (Statistical)
• Inductive (Logical)
• Objective (Physical Propensity)
• Subjective (Personal Degree of Belief)
Carnap thought that the Logical notion was the one operative in scientific reasoning.
Analogy with deductive logic: the relation is formal, holding between statements because of their
formal properties (i.e., irrespective of what facts obtain). Disanalogy: there is no acceptance or
outright belief; pr(h|e) = x means that e partially implies h, to degree x. There is only partial belief, guided
by the logical probabilities. The latter is a matter of the logical relationship between the
two statements; it should guide our opinion (degrees of belief), but it does not on
Carnap's view reduce to it. Scientific reasoning is then the formulation of a broad
framework (language) in which theories can be expressed; inherent in that framework
will be relations of partial implication (conditional logical probabilities) between
evidence and hypotheses, i.e., pr(h|e). Evidence confirms the theory if pr(h|e) > pr(h) (and
if pr(h|e) is sufficiently high). Then making an observation involves determining whether
e is true; we seek to develop and confirm theories in this way.
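To make the confirmation condition concrete, here is a toy worked example with invented numbers (not Carnap's own; his logical probabilities are computed from the structure of a formal language, but the arithmetic of confirmation is the same). Suppose pr(h) = 0.3, pr(e|h) = 0.9, and pr(e) = 0.5. Then pr(h|e) = pr(e|h) × pr(h) / pr(e) = (0.9 × 0.3) / 0.5 = 0.54 > 0.3 = pr(h). Hence e confirms h in this sense, even though h remains far from certain.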
Popper rejected this framework altogether. Popper thought that Carnap's inductive logic--
indeed, the very idea of inductive logic--was fundamentally mistaken. (How do we
establish our starting point in Carnap's theory, i.e., the logical probabilities? It didn't work
out very well.) There is no such thing: the only logic available to science is deductive
logic. (Note that this doesn't mean that you can't use statistics. Indeed, you'll use it
frequently, though the relationship between statistical laws and statistical data will be
deductive.) What is available then? Well, we can't verify a theory (in the sense of
justifying belief or partial belief in a hypothesis by verifying its predictions), but we can
certainly falsify a theory using purely deductive logic. If h -> e1, e2, e3, ..., then if even
one of e1, e2, e3, ... turns out to be false, the theory as a whole is falsified and must be
rejected (unless it can be amended to account for the falsifying evidence).
Thus, Popper argues that science proceeds ("advances") by a process of conjecture and
refutation. He summarizes several important features of this process as follows (page
141).
• It's too easy to get "verifying" evidence; thus, verifying evidence is of no intrinsic
value.
• To be of any use, predictions should be risky.
• Theories are better (have more content) the more they restrict what can happen.
• Theories that are not refutable by some possible observation are not scientific
(criterion of demarcation).
• To "test" a theory in a serious sense is to attempt to falsify it.
• Evidence only "corroborates" a theory if it is the result of a serious ("genuine") test.
The process is then to start with a conjecture and try to falsify it; if that succeeds, move
on to the next conjecture, and so on, until you find a conjecture that you do not falsify.
Keep on trying, though. If you have trouble falsifying it, you say that it has been
"corroborated." This does not mean that it has a high degree of probability, however. It
still may be improbable, given the evidence at hand. (Indeed, it should be improbable if it
says anything of interest.) We only tentatively accept scientific theories, while continuing
to try to refute them. (Here "tentative acceptance" does not mean to believe that they are
true, or even to be highly confident in their truth.)
Lecture 9
3/1/94
An Overview Of Kuhn's The Structure Of Scientific Revolutions
Today we will be looking at Kuhn's The Structure of Scientific Revolutions (SSR) very
broadly, with the aim of understanding its essentials. As you can gather from the title of
Kuhn's book, he is concerned primarily with those episodes in history known as
"scientific revolutions." During periods of this sort, our scientific understanding of the
way the universe works is overthrown and replaced by another, quite different
understanding.
According to Kuhn, after a scientific discipline matures, its history consists of long
periods of stasis punctuated by occasional revolutions of this sort. Thus, a scientific
discipline goes through several distinct types of stages as it develops.
I. The Pre-Paradigmatic Stage
Before a scientific discipline develops, there is normally a long period of somewhat
inchoate, directionless research into a given subject matter (e.g., the physical world).
There are various competing schools, each of which has a fundamentally different
conception of what the basic problems of the discipline are and what criteria should be
used to evaluate theories about that subject matter.
II. The Emergence Of Normal Science
Out of the many competing schools that clutter the scientific landscape during a
discipline's pre-paradigmatic period, one may emerge that subsequently dominates the
discipline. The practitioners of the scientific discipline rally around a school that proves
itself able to solve many of the problems it poses for itself and that holds great promise
for future research. There is typically a particular outstanding achievement that causes the
discipline to rally around the approach of one school. Kuhn calls such an achievement a
"paradigm."
A. Two Different Senses Of "Paradigm"--Exemplar And Disciplinary Matrix
Normal science is characterized by unanimous assent by the members of a scientific
discipline to a particular paradigm. In SSR, Kuhn uses the term paradigm to refer to two
very different kinds of things.
1. Paradigms As Exemplars
Kuhn at first uses the term "paradigm" to refer to the particular, concrete achievement
that defines by example the course of all subsequent research in a scientific discipline. In
his 1969 postscript to SSR, Kuhn refers to an achievement of this sort as an "exemplar."
Among the numerous examples of paradigms Kuhn gives are Newton's mechanics and
theory of gravitation, Franklin's theory of electricity, and Copernicus' treatise on his
heliocentric theory of the solar system. These works outlined a unified and
comprehensive approach to a wide-ranging set of problems in their respective disciplines.
As such, they were definitive in those disciplines. The problems, methods, theoretical
principles, metaphysical assumptions, concepts, and evaluative standards that appear in
such works constitute a set of examples after which all subsequent research was
patterned. (Note, however, that Kuhn's use of the term "paradigm" is somewhat
inconsistent. For example, sometimes Kuhn will refer to particular parts of a concrete
scientific achievement as paradigms.)
2. Paradigms As Disciplinary Matrices
Later in SSR, Kuhn begins to use the term "paradigm" to refer not only to the concrete
scientific achievement as described above, but to the entire cluster of problems, methods,
theoretical principles, metaphysical assumptions, concepts, and evaluative standards that
are present to some degree or other in an exemplar (i.e., the concrete, definitive scientific
achievement). In his 1969 postscript to SSR, Kuhn refers to such a cluster as a
"disciplinary matrix." A disciplinary matrix is an entire theoretical, methodological, and
evaluative framework within which scientists conduct their research. This framework
constitutes the basic assumptions of the discipline about how research in that discipline
should be conducted as well as what constitutes a good scientific explanation. According
to Kuhn, the sense of "paradigm" as a disciplinary matrix is less fundamental than the
sense of "paradigm" as an exemplar. The reason for this is that the exemplar essentially
defines by example the elements in the framework that constitutes the disciplinary
matrix.
B. Remarks On The Nature Of Normal Science
1. The Scientific Community
According to Kuhn, a scientific discipline is defined sociologically: it is a particular
scientific community, united by education (e.g., texts, methods of accreditation),
professional interaction and communication (e.g., journals, conventions), as well as
similar interests in problems of a certain sort and acceptance of a particular range of
possible solutions to such problems. The scientific community, like other communities,
defines what is required for membership in the group. (Kuhn never completed his
sociological definition of a scientific community, instead leaving the task to others.)
2. The Role Of Exemplars
Exemplars are solutions to problems that serve as the basis for generalization and
development. The goal of studying an exemplar during one's scientific education is to
learn to see new problems as similar to the exemplar, and to apply the principles
applicable to the exemplar to the new problems. A beginning scientist learns to abstract
from the many features of a problem to determine which features must be known to
derive a solution within the theoretical framework of the exemplar. Thus, textbooks often
contain a standard set of problems (e.g., pendulums, harmonic oscillators, inclined plane
problems). You can't learn a theory by merely memorizing mathematical formulas and
definitions; you must also learn to apply these formulas and definitions properly to solve
the standard problems. This means that learning a theory involves acquiring a new way of
seeing, i.e., acquiring the ability to group problems according to the theoretical principles
that are relevant to those problems. The "similarity groupings" of the mature scientist
distinguish her from the scientific neophyte.
3. Normal Science As "Puzzle-Solving"
According to Kuhn, once a paradigm has been accepted by a scientific community,
subsequent research consists of applying the shared methods of the disciplinary matrix to
solve the types of problems defined by the exemplar. Since the type of solution that must
be found is well defined and the paradigm "guarantees" that such a solution exists
(though the precise nature of the solution and the path that will get you to a solution is
often not known in advance), Kuhn characterizes scientific research during normal or
paradigmatic science as "puzzle-solving."
III. The Emergence Of Anomaly And Crisis
Though the paradigm "guarantees" that a solution exists for every problem that it poses, it
occasionally happens that a solution is not found. If the problem continues to persist after
repeated attempts to solve it within the framework defined by the paradigm, scientists
may become acutely distressed and a sense of crisis may develop within the scientific
community. This sense of desperation may lead some scientists to question some of the
fundamental assumptions of the disciplinary matrix. Typically, competing groups will
develop strategies for solving the problem, which at this point has become an "anomaly,"
that congeal into differing conceptual "schools" of thought much like the competing
schools that characterize pre-paradigmatic science. The fundamental assumptions of the
paradigm will become subject to widespread doubt, and there may be general agreement
that a replacement must be found (though often many scientists continue to persist in
their view that the old paradigm will eventually produce a solution to the apparent
anomaly).
IV. The Birth And Assimilation Of A New Paradigm
Eventually, one of the competing approaches for solving the anomaly will produce a
solution that, because of its generality and promise for future research, gains a large and
loyal following in the scientific community. This solution comes to be regarded by its
proponents as a concrete, definitive scientific achievement that defines by example how
research in that discipline should subsequently be conducted. In short, this solution plays
the role of an exemplar for the group--thus, a new paradigm is born. Not all members of
the scientific community immediately rally to the new paradigm, however. Some resist
adopting the new problems, methods, theoretical principles, metaphysical assumptions,
concepts, and evaluative standards implicit in the solution, confident in their belief that a
solution to the anomaly will eventually emerge that preserves the theoretical,
methodological, and evaluative framework of the old paradigm. Eventually, however,
most scientists are persuaded by the new paradigm's growing success to switch their
loyalties to the new paradigm. Those who do not may find themselves ignored by
members of the scientific community or even forced out of that community's power
structure (e.g., journals, university positions). Those who hold out eventually die. The
transition to the new paradigm is complete.
Lecture 10
3/3/94
Paradigms and Normal Science
Last time we examined in a very general way Kuhn's account of the historical
development of particular scientific disciplines. To review, Kuhn argued that a scientific
discipline goes through various stages: the pre-paradigmatic, paradigmatic ("normal"),
and revolutionary (transitional, from one paradigm to another). Each stage is
characterized in terms of the notion of a paradigm, so it is highly important that we
discuss this notion in detail. Today, we will limit ourselves primarily to the context of the
transition from pre-paradigmatic science to "normal" (paradigm-governed) science.
Paradigms
Let's look a bit at Kuhn's characterization of the notion of a paradigm. He introduces
paradigms first as "universally recognized scientific achievements that for a time provide
model problems and solutions to a community of practitioners" (page x). A paradigm is
"at the start largely a promise of success discoverable in selected and still incomplete
examples" (pages 23-24), and it is "an object for further articulation and specification
under new or more stringent conditions" (page 23); hence from paradigms "spring
particular coherent traditions of scientific research" (page 10) that Kuhn calls "normal
science." Normal science consists primarily of developing the initial paradigm "by
extending the knowledge of those facts that the paradigm displays as particularly
revealing, by increasing the extent of the match between those facts and the paradigm's
predictions, and by further articulation of the paradigm itself" (page 24). The paradigm
provides "a criterion for choosing problems that, while the paradigm is taken for granted,
can be assumed to have solutions" (page 27). Those phenomena "that will not fit the box
are often not seen at all" (page 24). Normal science "suppresses fundamental novelties
because they are necessarily subversive of its basic commitments." Nevertheless, not all
problems will receive solutions within the paradigm, even after repeated attempts, and so
anomalies develop that produce "the tradition-shattering complements to the tradition-
bound activity of normal science" (page 6).
At the outset of SSR, we are told that a paradigm is a concrete achievement that provides
a model for subsequent research. Such achievements are referred to in textbooks,
lectures, and laboratory exercises; typically, they are the standard problems that a student
is required to solve in learning the discipline. The task given the student provides some of
the content of the mathematical equations (or more generally, theoretical descriptions)
that comprise the main body of the text. In other parts of SSR, however, we are told that a
paradigm is much more, e.g., that it includes law, theory, application, and instrumentation
together (page 10); or that it is a set of commitments of various sorts, including
conceptual, theoretical, instrumental, methodological, and quasi-metaphysical
commitments (pages 41-42). Paradigms sometimes are characterized as definitive,
concrete patterns or models for subsequent research, but at other times seem to be
characterized as vague theories or theory schemas to be subsequently articulated. In its
broadest sense, the paradigm is taken to include theories, laws, models, concrete
applications (exemplars--"paradigms" in the narrower sense), explicit or implicit
metaphysical beliefs, standards for judging theories, and particular sets of theoretical
values. In short, anything that is accepted or presupposed by a particular scientific
community can seemingly be part of a "paradigm."
There is no doubt that all these elements are present in science. The question is whether it
is informative to characterize science in this way. A basic problem is as follows: if
"paradigm" is defined in its broadest sense, where anything that is accepted or
presupposed by a scientific community is part of the "paradigm" that defines that
community, then it is a relatively trivial matter to say that there are paradigms. (That is,
it's not really a substantive historical thesis to say that scientific eras are defined by the
universal acceptance of a paradigm if a paradigm is simply defined as whatever is
universally accepted.)
In his 1969 Postscript to SSR, Kuhn recognizes these problems and distinguishes
between two senses of "paradigm" as used in SSR: a "disciplinary matrix" (framework)
and an "exemplar." The former (disciplinary matrix) is the entire framework--conceptual,
methodological, metaphysical, theoretical, and instrumental--assumed by a scientific
tradition. The latter (exemplars) are the concrete, definitive achievements upon which all
subsequent research is patterned. Kuhn's thesis about paradigms is not empty, since he
argues that the definitive, concrete achievement ("paradigm" in the narrow sense)
provides the foundation of the disciplinary matrix ("paradigm" in the broader sense). In
other words, a scientific tradition is defined not by explicitly stated theories derived by
explicit methodological rules, but by intuitive abstraction from a particular, concrete
achievement. Let's look at this in more detail.
The function of textbook examples - Textbook examples provide much of the content of
the mathematical or theoretical principles that precede them. Here's a common
phenomenon: you read the chapter, think you understand, but can't do the problems. This
doesn't evince any failure on your part to understand the text; it simply shows that
understanding the theory cannot be achieved simply by reading a set of propositions.
Learning the theory consists in applying it (it's not that you learn the theory first and then
learn to apply it). In other words, knowledge of a theory is not always propositional
knowledge (knowledge that--accumulation of facts); it is sometimes procedural or
judgmental knowledge (knowledge how--acquiring judgmental skills). Scientists agree on
the identification of a paradigm (exemplar), but not necessarily on the full interpretation
or rationalization of the paradigm (page 44).
Scientific training is a process of incorporation into a particular community - The goal of
training is to get the student to see new problems as like the standard problems in a
certain respect: to group the problems that he will be faced with into certain similarity
classes, based on the extent to which they resemble the concrete, standard exemplars.
Being able to group things in the right way, and to attack similar problems using methods
appropriate for solving that particular type of problem evinces understanding and so
incorporation into the community. Kuhn's thesis is that it does not evince the
internalization of methodological rules explicit or implicit in scientific procedure. (Here a
"rule" is understood as a kind of statement about how to proceed in certain
circumstances.) That is not to say that none of scientific practice, or even that only a little
of it, can be codified into rules; it is just that rules of how to proceed are not required, nor does
scientific competence (in the sense of judgmental skills) consist in the acquisition of rules
(statements about how to proceed) and facts (statements about the way the world is). This
distinguishes him from Carnap and Popper, who see at least the "justification" of
scientific theories as a rule-governed procedure.
Thus, to adopt a paradigm is to adopt a concrete achievement as definitive for the
discipline. (Example: Newton's Principia, with its application to particular problems such
as the tides, the orbits of the planets, terrestrial motion such as occurs with projectiles,
pendulums, and springs, and so on.) It's definitive in the sense that the methods that were
used, the result that was obtained, and the assumptions behind the methods (mathematical
generalizations, techniques for relating the formalism to concrete situations, etc.) are taken as the model for all subsequent work. The
discipline then grows by extending these procedures to new areas--the growth is not
simply one of "mopping up" however (despite Kuhn's own characterization of it as such),
but rather an extension of the disciplinary matrix (framework) by organic growth. (No
sufficient & necessary conditions; instead, a family resemblance.)
Normal Science
Pre-Paradigmatic Science - Kuhn's model is the physical sciences; almost all of his
examples are taken from Astronomy, Physics, or Chemistry. This affects his view of
science. Consider, on the other hand, Psychology, Sociology, and Anthropology: are they
"pre-paradigmatic"? Is there any universally shared framework? If so, are they sciences at
all?
• Some properties - each person has to start anew from the foundations (or at least, each
subgroup); disagreement over fundamentals (what counts as an interesting problem, what
methods should be used to solve the problem, what problems have been solved); in this
context, all facts seem equally relevant:
In the absence of a paradigm or some candidate for a paradigm, all of the facts that could
possibly pertain to the development of a given science are likely to seem equally relevant.
As a result, early fact-gathering is a far more nearly random activity than the one that
subsequent scientific research makes familiar ... [and] is usually restricted to the wealth
of data that lie ready to hand (page 15).
Thus, some important facts are missed (e.g., electrical repulsion).
Why accept a paradigm?
• It solves a class of important though unsolved problems.
• The solution has great scope and so holds much promise for generating further
research. That is, it must be both broad enough and incomplete enough to provide
the basis for further research: "it is an object for further articulation and
specification under new or more stringent conditions" (page 23).
• It gives direction and focus to research, i.e., selects out a feasible subset of facts,
possible experiments as promising or relevant. It says what kinds of articulations
are permissible, what kinds of problems have solutions, what kind of research is
likely to lead to a solution, and the form an adequate solution can take. A puzzle:
the outcome is known, the path that one takes to it is not (also, it requires
ingenuity to get there).
Important to recognize that the form of the solution is not exactly specified, but only the
categories in which it will be framed and the set of allowable paths that will get one there.
However, this again is not a rule-bound activity, but imitation (and extension) of a
pattern. "Normal science can proceed without rules only so long as the relevant scientific
community accepts without question the particular problem-solutions already achieved.
Rules should therefore become important and the characteristic unconcern about them
should vanish whenever paradigms or models are felt to be insecure" (page 47).
Discuss: What level of unanimity exists in present sciences? Aren't there disagreements
even in physics?
Lecture 11
3/8/94
Anomaly, Crisis, and the Non-Cumulativity of Paradigm Shifts
Last time we discussed the transition from pre-paradigmatic science to paradigmatic or
"normal" science, and what that involves. Specifically, we discussed the nature and role
of a paradigm, distinguishing between the primary, narrow sense of the term (an
exemplar, i.e., a definitive, concrete achievement) and a broader sense of the term (a
disciplinary matrix or framework, which includes conceptual, methodological,
metaphysical, theoretical, and instrumental components). We also noted that though
Kuhn usually talks about each scientific discipline (as distinguished roughly by academic
department) as having its own paradigm, the notion is more flexible than that and can
apply to sub-disciplines (such as Freudian psychoanalysis) within a broader discipline
(psychology). The question of individuating paradigms is a difficult one (though Kuhn
sometimes speaks as if it's a trivial matter to identify a paradigm), but we will not
investigate it any further. Now we want to turn to how paradigms lead to their own
destruction, by providing the framework necessary to discover anomalies, some of which
lead to the paradigm's downfall.
As you recall, normal science is an enterprise of puzzle-solving according to Kuhn.
Though the paradigm "guarantees" that the puzzles it defines have solutions, this is not
always the case. Sometimes puzzles cannot admit of solution within the framework
(disciplinary matrix) provided by the paradigm. For example, the phlogiston theory of
combustion found it difficult to explain the weight gain observed when certain substances
were burned or heated. Since combustion was the loss of a substance on that view, there
should be weight loss. This phenomenon was not taken to falsify the theory, however
(contrary to what Popper might claim); instead, phlogiston theorists attempted to account
for the difference by postulating that phlogiston had "negative mass," or that "fire
particles" sometimes entered an object upon burning. The paradigm eventually collapsed
for several reasons:
• None of the proposed solutions found general acceptance; i.e., there was a
multiplicity of competing solutions to the puzzle presented by the anomaly
(weight gain).
• The proposed solutions tended to create as many problems as they solved; one
method of explaining weight gain tended to imply that there would be weight gain
in cases in which there was none.
• This led to a sense of crisis among many practitioners in the field; the field begins
to resemble the pre-paradigmatic stage (i.e., nearly random attempts at
observation, experimentation, theory formation), signaling the demise of the old
paradigm. Nevertheless, it is not abandoned: to abandon it would be to abandon
science itself, on Kuhn's view.
• Eventually, a competitor arose that seemed more promising than any of the other
alternatives, but which involved a substantial conceptual shift. This was the
oxygen theory of combustion.
Paradigm Change as Non-Cumulative
There is a common but oversimplified picture of science that sees it as a strictly
cumulative enterprise (science progresses by finding out more and more about the way
the world works). The "more and more" suggests that nothing is lost. Kuhn argues that on
the contrary there are substantial losses as well as gains when paradigm shifts occur. Let's
look at some examples.
• Some problems no longer require solution, either because they make no sense in
the new paradigm, or they are simply rejected.
• Standards for evaluating scientific theories alter along with the problems that a
theory must, according to the paradigm, solve.
Example: Newtonian physics introduced an "occult" element (forces), against the
prevailing corpuscular view that all physical explanation had to be in terms of
collisions and other physical interactions between particles. Newton's theory did not
accord with that standard, but it solved many outstanding problems. (Corpuscular view
could not explain, other than in a rough qualitative way, why planets would move in
orbits; Kepler's Laws were separate descriptions of that motion rather than explanations of it. Thus it was a great
achievement when postulating forces led to the derivation of Kepler's Laws.) "Forces"
were perceived by many as odd, "magical" entities. Newton himself tried to develop a
corpuscular theory of gravitation, without success, as did many Newtonian scientists who
followed him. Eventually, when it became apparent that the effort was futile, the standard
that corpuscular-mechanistic explanation was required was simply disregarded, and
gravitational attraction was accepted as an intrinsic, unexplainable property of matter.
Another example: Chemistry before Dalton aimed at explaining the sensory qualities of
compounds (colors, smells, sounds, etc.). Dalton's atomic paradigm was only suited for
explaining why compounds went together in certain proportions; so when it was
generally accepted the demand that a theory explain sensory qualities was dropped.
Other examples: (a) Phlogiston. The problems of explaining how phlogiston combined
with calxes to form particular metals were abandoned when phlogiston itself was
abandoned. Similarly for the ether (the supposed medium for electromagnetic radiation):
the problem of explaining why movement through the ether wasn't detectable vanished
along with the ether itself. There simply was no problem to be solved. (b) Michelson-Morley experiment. This experiment was first
"explained" by Lorentz based on his theory of the electron, which implied that since
forces holding together matter were electromagnetic and hence influenced by movement
through the ether, parts of bodies contract in the direction of motion when moving
through the ether. Relativity explains the contraction, but in a new conceptual framework
that does not include the ether.
• Some new entities are introduced along with the paradigm; indeed, only make
sense (i.e., can be conceptualized) when introduced as such. (Oxygen wasn't
"discovered" until the oxygen theory of combustion was developed.)
• Elements that are preserved may have an entirely different status. (The constant
speed of light is a postulate in special relativity theory, but a consequence of
Maxwell's theory.) Substantive conceptual theses might be treated as "tautologies"
(e.g., Newton's Second Law).
The fact that new standards, concepts, and metaphysical pictures are introduced makes
the paradigms not only incompatible, but also "incommensurable." Paradigm shift is a
shift in worldview.
Lecture 12
3/10/94
Incommensurability
At the end of the lecture last time, I mentioned Kuhn's view that different paradigms are
incommensurable, i.e., that there is no neutral standpoint from which to evaluate two
different paradigms in a given discipline. To put the matter succinctly, Kuhn argues that
different paradigms are incommensurable (1) because they involve different scientific
languages, which express quite different sorts of conceptual frameworks (even when the
words used are the same), (2) because they do not acknowledge, address, or perceive the
same observational data, (3) because they are not concerned to answer the same
questions, or resolve the same problems, and (4) because they do not agree on what counts as an
adequate, or even legitimate, explanation.
Many authors took the first sense of incommensurability (linguistic, conceptual) to be the
primary one: the reason scientists differed with regard to the paradigms, and there was no
neutral standpoint to decide the issues between the two paradigms, is that there is no
language (conceptual scheme) in which the two paradigms can be stated. That is why the
two sides "inevitably talk past each other" during revolutionary periods. Kuhn seems to
assume that because two theories differ in what they say mass is like (it is conserved vs. it
is not, and exchangeable with energy) that the term "mass" means something different to
the two sides. Thus, there is an assumption implicit in his argument about how theoretical
terms acquire meaning, something like the following.
(1) The two sides make very different, incompatible claims about mass.
(2) The theoretical context as a whole (the word and its role within a paradigm) determines the meaning of a theoretical or observational term.
Therefore, the two sides mean something different by "mass."
On this interpretation of Kuhn, the two sides of the debate during a revolution talk past
each other because they're simply speaking different (but very similar sounding)
languages. This includes not only the abstract terms like "planet" or "electron," but
observational terms like "mass," "weight," "volume" and so on. This contrasts with the
older ("positivist") view espoused by Carnap, for example, which held that there was a
neutral observational language by which experimental results could be stated when
debating the merits (or deficiencies) of different candidates for paradigm status. The two
scientists might disagree on whether mass is conserved, but agree on whether a pointer on
a measuring apparatus is in a certain position. If one theory predicts that the pointer will
be in one location, whereas the other predicts it will be in a different location, the two cannot
both be correct and so we then need only check to see which is right. On Kuhn's view (as
we are interpreting him here), this is a naive description, since it assumes a sharp
dichotomy between theoretical and observational language. (On the other hand, if his
view is simply holistic, so that any change in belief counts as a conceptual change, it is
implausible, possibly vacuous.)
This interpretation of Kuhn makes him out to be highly problematic: if the two groups are
talking about different things, how can they really conflict or disagree with one another?
If one group talks about "mass" and another about "mass*" it's as if the two are simply
discussing apples and oranges. Theories are on this interpretation strictly incomparable.
Newtonian and relativistic mechanics could not be rivals. This is important since Kuhn
himself speaks as though the anomalies are the deciding point between the two theories:
one paradigm cannot solve it, the other can. If they are dealing with different empirical
data, then they're not even trying to solve the same thing.
The problem becomes more acute when you consider remarks that Kuhn makes that seem
to mean that the conceptual scheme (paradigm) is self-justifying, so that any debate
expressed in the different languages of the two groups will necessarily be circular at some
point. That is, there is no compelling reason to accept one paradigm unless you already
accept that paradigm: reasons to accept a paradigm are conclusive only within that
paradigm itself. If that is so, however, what reason could anyone have to give up the old
paradigm? (In addition, it cannot be literally correct that the paradigm is self-justifying,
since otherwise there would be no anomalies.)
An alternative view would reject the thesis that the incommensurability of scientific
concepts or language is the primary one; on this view, the primary incommensurability is between
scientific problems. That is, if the two paradigms view different problems as demanding
quite different solutions, and accept different standards for evaluating proposed solutions
to those problems, they may overlap conceptually to a large degree, enough to disagree
and be rivals, but still reach a point at which the disagreement cannot be settled by appeal
to experimental data or logic. "When paradigms change," he says, "there are usually
significant shifts in the criteria determining both the legitimacy of problems and of
proposed solutions...." "To the extent ... that two scientific schools disagree about what is
a problem and what is a solution, they will inevitably talk through each other when
debating the relative merits of their respective paradigms." The resulting arguments will
be "partially circular." "Since no paradigm ever solves all the problems it defines and
since no two paradigms leave all the same problems unsolved, paradigm debates always
involve the question: which problems is it more significant to have solved?"
On this view, what makes theories "incommensurable" with each other is that they differ
on their standards of evaluation; this difference is the result of their accepting different
exemplars as definitive of how work in that discipline should proceed. They are, indeed,
making different value judgments about research in their discipline.
How are different value judgments resolved? One focus of many critics has been Kuhn's
insistence on comparing scientific revolutions with political or religious revolutions, and
on describing paradigm change as a kind of "conversion." Since conversion is not a rational
process, it is argued, then this comparison suggests that neither is scientific revolution,
and so science is an irrational enterprise, where persuasion--by force, if necessary--is the
only way for proponents of the new paradigm to gain ascendancy. Reasoned debate has
no place during scientific revolutions. Whether this is an apt characterization of Kuhn's
point depends on whether conversion to a religious or political viewpoint is an irrational
enterprise.
Kuhn himself does not endorse the radical conclusions just outlined; he does not view
science as irrational. In deciding between different paradigms, people can give good
reasons for favoring one paradigm over another, he says; it is just that those reasons
cannot be codified into an algorithmic "scientific method," that would decide the point
"objectively" and conclusively. There are different standards of evaluation for what
counts as the important problems to solve, and what counts as an admissible solution. For
pre-Daltonian chemistry (with the phlogiston theory of combustion), explaining weight
gains and losses was not viewed to be as important as explaining why metals resembled
each other more than they did their ores. Quantitative comparisons were secondary to
qualitative ones. Thus the weight gain problem was not viewed as a central difficulty for
the theory, though an anomalous one. Things were different in the new theory, in which
how elements combined and in what proportions became the primary topic of research.
Here there was a common phenomenon on which they could agree--weight gain during
combustion--but it was not accorded the same importance by both schools.
Under this interpretation, much of what Kuhn says is misleading, e.g., his highly
metaphorical discussion about scientists who accept different paradigms living in
different worlds. Kuhn sometimes seems to be making an argument, based on Gestalt
psychology, of the following form.
(1) Scientists who accept different paradigms experience the world in different ways; they notice some things the others do not, and vice versa.
(2) The world consists of the sum of your experiences.
Therefore, scientists who accept different paradigms experience different worlds.
Some of his arguments depend on the assumption that to re-conceptualize something, to
view it in a different way, is to see a different thing. Thus, he speaks as if one scientist
sees a planet where another saw a moving star, or that Lavoisier saw oxygen whereas
Priestley saw "dephlogisticated air."
This is an incommensurability of experience; it is dubious, but this does not detract from
the very real incommensurability of standards that Kuhn brought to the attention of the
philosophical, historical, and scientific communities.
Lecture 13
3/22/94
Laudan on Kuhn's Theory of Incommensurable Theories
Before the break, we finished our discussion of Kuhn's theory of scientific revolutions by
examining his notion of incommensurability between scientific theories. To review, Kuhn
claimed that rival paradigms are always incommensurable. Roughly, that means that
there is no completely neutral standpoint from which one can judge the relative worth of
the two paradigms. As we discussed, incommensurability comes in three basic varieties
in Kuhn's SSR:
• Incommensurability of Standards or Cognitive Values - Scientists espousing
different paradigms may agree on certain broadly defined desiderata for theories
(that a theory should be simple, explanatory, consistent with the empirical data, of
large scope and generality, and so on), but they typically disagree on their
application. For example, they might disagree about what needs to be explained
(consider transition to Daltonian chemistry) or what constitutes an acceptable
explanation (consider transition to Newtonian mechanics).
• Incommensurability of Language - Scientists typically speak different languages
before and after the change; the same words may take on new meanings (consider
mass, simultaneous, length, space, time, planet, etc.). The two sides inevitably
"talk past one another."
• Incommensurability of Experience - Scientists see the world in different,
incompatible ways before and after a paradigm shift. Kuhn describes paradigm
change as involving a kind of Gestalt shift in scientists' perceptions of the world.
Sometimes Kuhn speaks as though the world itself has changed (as opposed to the
conceptual framework in which the world is perceived, which is quite different),
though such talk is best construed as metaphorical. (Since otherwise you're
committed to the view that the world is constituted by how we conceive of it.)
As I noted then, the first sense of incommensurability is the fundamental one for Kuhn.
That is because the paradigm (in the sense of exemplar, a concrete, definitive
achievement) defines by example what problems are worth solving and how one should
go about solving them. Since this defines the particular sense in which theories are
deemed "simple," "explanatory," "accurate," and so one, the paradigm one adopts as
definitive determines which standards one uses to judge theories as adequate.
Thus, according to Kuhn one's particular standards or cognitive values are determined by
the paradigm one accepts. There is on his view no higher authority to which a scientist
can appeal. There are no "deeper" standards to which one can appeal to adjudicate
between two paradigms that say that different problems are important; thus, there is no
neutral standpoint from which one can decide between the two theories. It is primarily in
this sense that theories are "incommensurable" according to Kuhn.
In the next two weeks, we will examine the notion of cognitive values in science in detail,
particularly as discussed in Laudan's book Science and Values. (Note that Laudan's book
does not deal with ethical values, but with cognitive ones, and particularly with the notion
that there is no paradigm-neutral algorithm for adjudicating between different sets of
cognitive values.) Laudan's aim: to find a middle ground between the rule-bound view of
Carnap and Popper, and the apparent relativism of Kuhn (i.e., his view that standards of
theoretical worth are paradigm-relative, and that there is no higher authority to adjudicate
between these standards).
Laudan's claim is that each approach fails to explain some aspect or other of science, and
each mispredicts the degree of disagreement or consensus that we actually find in science.
On the rule-bound view, there should normally be consensus as long as scientists are
rational. To decide between two competing hypotheses, one has only to examine the
evidence. Evidence can be inconclusive, but it can never be conclusive in one way for
one person and conclusive in another way for another person. (It is this way according to
Kuhn, since to a large extent paradigms are "self-justifying.") In any case, it is always
apparent how one could proceed in principle to adjudicate between the two hypotheses,
even if it is impossible or impractical for us to do so. If there is a neutral algorithm or
decision procedure inherent in "the scientific method," then you can see how the degree
of consensus that typically exists in science would be easy to explain. On the other hand,
it is difficult to explain how there could be disagreement over fundamentals when people
have the same body of evidence before them. Historical study does seem to suggest that
indeed science is non-cumulative in important respects, i.e., that in scientific revolutions
some standards and achievements are lost at the same time that new ones are gained.
(Examples: Dalton, Newton).
On the other hand, it is hard to see how Kuhn could explain how consensus arises as
quickly as it does in science, given his incommensurability thesis. Indeed, it is difficult to
see how Kuhn can explain why consensus arises at all. Some of his explanations leave
one cold, e.g., that all the young people adopt the new theory, and that the older people,
who remain loyal to the older framework, simply die off. Why shouldn't the younger
scientists be as divided as the older scientists? Similarly, arguing that some groups get
control of the universities and journals does not explain why the others don't go off and
found their own journals. To see this, consider Kuhn's own analogies between scientific
revolutions and political revolutions or religious conversions. In these areas of human
discourse, there is little consensus and little prospect for consensus. (Indeed, in the fields
of philosophy and sociology of science there is little consensus or prospect for consensus,
either.) Here we suspect that the groups differ with regard to their basic (political or
religious) values, and that since there is no way of adjudicating between these values, the
rifts end up persisting. If science is like that, and there is no "proof" but only
"conversion" or "persuasion," then why should there ever be the unanimity that arises
during periods of normal science, as Kuhn describes it?
To complicate matters, Kuhn often notes that it usually becomes clear to the vast majority
of the scientific community that one paradigm is "better" than another. One important
achievement leading to the adoption of the new theory is that it solves the anomaly that
created a sense of crisis within the old paradigm. Additionally, the new paradigm may
include very precise, quantitative methods that yield more accurate predictions, or they
may simply be easier to apply or conceptualize. Often, Kuhn's claim is not that these are not
good considerations in favor of adopting the new paradigm, but that they are
"insufficient" to compel the choice among scientists. In any case, they are not often
present in actual historical cases (e.g., he notes that at first Copernican astronomy was not
sufficiently more accurate than Ptolemaic astronomy).
When Kuhn talks like this, he sounds very little like the person who propounds radical
incommensurability between theories. Instead, the issue at hand seems to be that the
empirical evidence simply does not determine logically which theory is correct. This
thesis is often called the "underdetermination" thesis. This is much less radical than
saying that scientists "live in different worlds" (incommensurability of experience).
Instead, we simply have it that empirical evidence does not determine which theory is
correct, and that to fill in the gap scientists have to import their own cognitive values,
about which they differ. (These may not be supplied by the paradigm, but rather the
paradigm is favored because cognitive values differ.)
Laudan thinks that Kuhn shares many assumptions with Popper and Carnap, in particular
the view that science (and pursuit of knowledge in general) is hierarchically structured
when it comes to justification. That is, we have the following levels of disagreement and
resolution.
Level of Disagreement        Level of Resolution
Factual                      Methodological
Methodological               Axiological
Axiological                  None
According to Laudan, Kuhn disagrees with Carnap and Popper about whether scientists
share the same cognitive values insofar as they are acting professionally (i.e., as scientists
rather than individuals). If so, this would provide a way of resolving any dispute. Kuhn
says No; Carnap and Popper (in different ways) say Yes. Both sides seem to agree that
differences in cognitive values cannot be resolved. Thus, the reason Kuhn sees paradigms
as incommensurable is simply that on his view there is no higher level to appeal to in
deciding between the different values inherent in competing paradigms. Next time, we
will examine the implications of the hierarchical picture in detail.
Lecture 14
3/24/94
Laudan on the Hierarchical Model of Justification
At the end of the last lecture, we briefly discussed Laudan's view that Popper, Carnap,
and Kuhn all shared an assumption, i.e., that scientific justification is hierarchically
structured. To review, Laudan thinks that Kuhn shares many assumptions with Popper
and Carnap, in particular the view that science (and pursuit of knowledge in general) is
hierarchically structured when it comes to justification. That is, we have the following
levels of disagreement and resolution.
Level of Disagreement        Level of Resolution
Factual                      Methodological
Methodological               Axiological
Axiological                  None
According to Laudan, Kuhn disagrees with Carnap and Popper about whether scientists
share the same cognitive values insofar as they are acting professionally (i.e., as scientists
rather than individuals). If so, this would provide a way of resolving any dispute. Kuhn
says No; Carnap and Popper (in different ways) say Yes. Both sides seem to agree that
differences in cognitive values cannot be resolved. Thus, the reason Kuhn sees paradigms
as incommensurable is simply that on his view there is no higher level to appeal to in
deciding between the different values inherent in competing paradigms. Let us look at
Laudan's description of the hierarchical model in detail.
• Factual Disputes - disagreement about "matters of fact," i.e., any claim about
what is true of the world, including both what we observe and the unobservable
structure of the world.
Factual disputes, on the hierarchical view, are to be adjudicated by appealing to the
methodological rules governing scientific inquiry.
• Methodological Disputes - disagreement about "methodology," i.e., both high-
and low-level rules about how scientific inquiry should be conducted. These
range from very specific, relatively low-level rules such as "always prefer double-
blind to single-blind tests when testing a new drug" to high-level rules such as
"avoid ad hocness," "only formulate independently testable theories," "assign
subjects to the test and control groups randomly," and so on. These rules would
also include instructions regarding statistical analysis (when to perform a t-test or
chi-squared test, when to "reject" or "accept" hypotheses at a given level of
significance, and so on; a toy sketch of such a rule in action follows below). As Laudan remarks, settling a factual dispute on the
hierarchical model is somewhat similar to deciding a case in court; the rules are
fairly well established for deciding cases; evidence is presented on either side of
the case, and the rules, when properly applied, result in an impartial, fair, justified
resolution of the dispute.
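As promised above, here is a minimal sketch (my own illustration, not Laudan's, assuming the scipy library is available) of a low-level methodological rule being applied mechanically to a factual dispute, say, whether a hypothetical drug lowers blood pressure:

from scipy import stats

treatment = [118, 122, 115, 119, 121, 117]   # hypothetical measurements (mmHg)
control = [128, 131, 125, 129, 127, 130]

t_stat, p_value = stats.ttest_ind(treatment, control)
alpha = 0.05   # the conventional significance level (itself a methodological choice)
if p_value < alpha:
    print("Rule says: reject the hypothesis of no difference")
else:
    print("Rule says: the evidence is insufficient; gather more evidence")

On the hierarchical model, disputes about the rule itself (e.g., why the 5% level rather than some other?) are settled one level up, by appeal to the aims of inquiry.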
Note that it is not a part of the hierarchical view that any factual dispute can be
immediately adjudicated by application of methodological rules. For starters, the
evidence may simply be inconclusive, or of poor quality. In that case, the rules would
simply tell you to go find more evidence; they would also tell you what type of evidence
would be needed to resolve the dispute. This does not mean, of course, that the evidence
will be found. It may be impractical, or maybe even immoral, to go out and get evidence
of the required sort. That, however, only means that some factual disputes cannot be
settled as a practical matter; all disputes can be settled "in principle."
Methodological disputes, on the hierarchical view, are to be adjudicated by appealing to
the goals and aims of scientific inquiry. The assumption is that rules of testing,
experiment, statistical analysis, and so on, are not ends in themselves but means to
achieving a higher goal.
• Axiological Disputes - disagreements about the aims and goals of scientific
inquiry. For example, do scientists seek truth, or simply empirical adequacy?
Must theories be "explanatory" in a particular sense? As remarked last time, Carnap
and Popper seem to assume that scientists, insofar as they are acting rationally and
as scientists, do not disagree about the fundamental aims and goals of scientific
inquiry. (Carnap and Popper might differ on what those aims and goals are that all
scientists share; but that is another story.) Kuhn, on the other hand, assumes that
scientists who commit themselves to different paradigms will also differ about
what they consider to be the aims and goals of scientific inquiry (in a particular
discipline).
On the hierarchical view of justification, axiological disputes cannot be adjudicated; there
is no higher level to appeal to.
[Note: the goal in the subsequent discussion of the hierarchical view is to clarify what it involves, and what
it would take to refute it. Contrast with the "Leibnizian Ideal."]
Factual Disagreement (Consensus)
Can all factual disputes be settled by appeal to methodological rules? There is a basic
problem with supposing that they can. Although the rules and available evidence will
exclude some hypotheses from consideration, they will never single out one hypothesis
out of all possible hypotheses as the "correct" one given that evidence. In other words,
methodological rules plus the available evidence always underdetermine factual claims.
This may occur if the two hypotheses are different but "empirically equivalent," i.e., they
have the same observational consequences. In this case, it is questionable whether the
theories are even different theories at all (e.g., wave vs. matrix mechanics). In many cases
we might think that the theories really are different, but observation could never settle the
issue between them (e.g., Bohmian mechanics vs. orthodox quantum mechanics).
As noted earlier, this fact does not undercut the hierarchical model. The reason for this is
that the hierarchical model only says that when factual disputes can be adjudicated, they
are adjudicated at the methodological level (by applying the rules of good scientific
inquiry). However, it is not committed to conceiving of methodology as singling out one
hypothesis out of all possible hypotheses, but simply as capable, often enough, of settling
a dispute between the hypotheses that we happen to have thought of, and giving us
direction on how to gather additional evidence should available evidence be insufficient.
That is, on the hierarchical model the rules simply answer the question:
Which hypothesis out of those available to us is
best supported by the available evidence?
The rules, then, do not tell us what hypothesis to believe ("the truth is h"), but simply
which of two hypotheses to prefer. In other words, they provide criteria that partition or
divide the class of hypotheses into those that are permissible, given the evidence, and
those that are not. Thus it may turn out in particular cases that given the available
evidence more than one hypothesis is acceptable.
Consider the following question: would it be rational for a scientist now, given our
present state of empirical knowledge, to believe in Cartesian physics, or phlogiston
theory, and so on? The point here is that though there are periods (perhaps long ones)
during which the available rules underdetermine the choice, so that it is rationally
permissible for scientists to disagree, there comes a time when rational disagreement
becomes impermissible. That is, though whether a person is justified in holding a position
is relative to the paradigm in the short term, in the long term it isn't true that just
"anything goes." (Feyerabend, some sociologists conclude from the fact that "reasonable"
scientists can and do differ, sometimes violently so, when it comes to revolutionary
periods that there is no rational justification of one paradigm over another, that there is no
reason to think that science is progressing towards the truth, or so on.)
• Moral (Laudan): Underdetermination does not imply epistemic relativism with
regard to scientific theories.
How is this relevant to Kuhn? Well, Laudan claims that Kuhn implicitly starts from
the fact that there is no neutral algorithm (methodological rule) that always tells us "This
hypothesis is correct," and concludes from this that the choice between competing hypotheses must be, at
least in part, non-rational. Then he concludes at the end of the book that this shows that
science is not "progressing" towards the truth, considered as a whole. Progress can only
be determined when the relevant and admissible problems (goals and aims of the
discipline) are fixed by a paradigm. (Analogy with Darwinian evolution: no "goal"
towards which evolution is aiming.) However, this may be true only in the short term.
Laudan says (pages 31-32) that though "observational accuracy" might be vaguely
defined, there comes a point at which it's apparent that one theoretical framework is more
accurate observationally than another. Kuhn's problem is emphasizing too strongly the
disagreement that can occur because of the ambiguity of shared standards.
Methodological Disagreement (Consensus)
What are some examples of methodological disagreement? One example: whether predictions must be surprising,
or merely of wide variety. Another example: disagreement over applications of statistics;
statistical inference is not a monolithic area of investigation, free of disagreement.
Laudan says that goals and aims cannot completely resolve disputes over many
methodological claims; there is underdetermination between goals and justification of
methods just as there is between methods and justification of factual claims. For example,
simply accepting that our goal is that scientific theories be true, explanatory, coherent,
and of wide generality does not by itself determine which methodological principles we
should adopt.
This no more implies that methodological disputes cannot be resolved by appeals to
shared goals than that factual claims can never be settled by appeals to shared methods.
We can often show that a certain rule is one way of reaching our goal, or that it is better
than other rules under consideration. Consider the rule that double-blind tests are better
than single-blind tests. NOTE: This example already shows that the hierarchical model is
too simple, since the lower-level facts influence what methods we think will most likely
reach our goal of finding the truth about a subject matter.
Lecture 15
3/29/94
Laudan's Reticulated Theory of Scientific Justification
Last time we examined Laudan's criticisms of the hierarchical model of scientific
justification. As you recall, his point was not that scientific debate is never resolved as
the hierarchical model would predict; it is just that it does not always do so. Laudan
thinks that the hierarchical model is often plausible, so long as it is loosened up a bit. In
particular, the hierarchical model has to allow that not all disputes can be resolved by
moving up to a higher level; also, as I mentioned at the end of the last session, it has to
allow for elements from a lower level to affect what goes on at a higher level. (For
example, it was mentioned that the methodological rule that double-blind tests be
preferred to single-blind tests was based on the factual discovery that researchers
sometimes unintentionally cause patients who have received a medication, but don't
know whether they have, to be more optimistic about their prospects for recovery and
thereby affect how quickly people recover from an illness.) If these adjustments are
made, the hierarchical model becomes less and less "hierarchical."
In the end, however, what makes the hierarchical view essentially "hierarchical" is
that there is a "top" level (the axiological) for which no higher authority is possible. To
put it less abstractly, a theory of scientific justification is hierarchical if it says that there
is no way to adjudicate between disputes at the (axiological) level of cognitive values,
aims, or goals; disagreement at this level is always rationally irresolvable.
Laudan wants to dispute the claim that disagreement at the axiological level is always
irresolvable. Instead, he argues that there are several mechanisms that can be and are used to
resolve disagreements at the axiological level. To see that these mechanisms exist,
however, we have to drop all vestiges of the view that scientific justification is "top-
down." Instead, scientific justification is a matter of coherence between the various
levels; scientific disputes can be rationally resolved so long as one or more of the levels is
held fixed.
Central to this model of scientific justification is the view that the different levels
constrain each other, so that holding some of the levels fixed, there are limits to how far
you can go in modifying the other level(s). This means that it must be possible for some
of the levels to change without there being change at all of the other levels. Before Laudan
describes this model, which he calls the "reticulated" model of scientific justification, in detail,
he first discusses a common but flawed pattern of reasoning that leads many people to
think that there must be "covariation" between all three levels.
The Covariation Fallacy
• Disagreement at one level (e.g., theory) is always accompanied by disagreement
at all higher levels (e.g., method and goals).
• Agreement at one level (e.g., aims) is always accompanied by agreement at all
lower levels (e.g., method and theory).
If these theses are correct, then theoretical disagreements between scientists would indeed
have to be accompanied by disagreements over aims, e.g., what counts as an acceptable
scientific explanation. Laudan, relying on the fact that there is underdetermination
between every level, argues that this is not necessarily so. People can agree over what
counts as a good scientific explanation while differing over whether a specific theory
meets whatever criteria are necessary for a good scientific explanation, or over what
methods would best help promote the acquisition of good scientific explanations. (Kuhn's
view, of course, is that there must be a difference at the higher level if disagreement
occurs at the lower level; thus, he argues that the scientists agree only at a shallow level--
theories ought to be "explanatory"--while disagreeing about what those criteria
specifically amount to.) What is perhaps more important, people can disagree about the
aims of their discipline (e.g., truth vs. empirical adequacy, or consistency with the
evidence vs. conceptual elegance, simplicity, and beauty) while agreeing about
methodology and theory. (TEST: Get a group of scientists who agree on theory and
method and then ask them what the ultimate aims of their discipline are; you might find
surprising differences.) This would occur if the methods in question would promote both
sets of aims. Similarly, the same theory can be deemed preferable by two different and
deeply conflicting methodologies. That is, a theory can win out if it looks superior no
matter what perspective you take. This is how things have often occurred in the history of
science, according to Laudan.
Kuhn commits the covariation fallacy, Laudan argues, in his arguments for the view that
theory, methods, and values, which together make up a paradigm, form an inseparable
whole. As Laudan construes him, Kuhn thinks that a paradigm is a package deal: you
can't modify the theory without affecting the methodological rules, or how the aims of
that discipline are conceived. On the contrary, Laudan argues, change can occur
piecemeal, one or more levels at a time (with adjustments to the other levels coming later).
How Can Goals Be Rationally Evaluated?
Laudan describes two mechanisms that can be used to adjudicate axiological
disputes: (1) you can show that, if our best theories are true, the goals could not be
realized (the goal is "utopian"); and (2) you can show that the explicitly espoused goals of a discipline are
not (or even cannot be) reflected in the actual practice of that discipline (as evinced in its
methods). Mechanism (1) tries to show a lack of fit between theories and aims, keeping
the former fixed; mechanism (2) tries to show a lack of fit between methods and aims,
keeping the former fixed.
Method (1) - Different Kinds of "Utopian" Strategies:
(a) Demonstrable utopianism (goals demonstrably cannot be achieved, e.g., absolute
proof of general theories by finite observational evidence);
(b) Semantic utopianism (goals are stated so vaguely that it is unclear what would count as
achieving them, e.g., beauty or elegance);
(c) Epistemic utopianism (it's impossible to provide a criterion that would enable us to
determine if we've reached our goal, e.g., truth).
Method (2) - Reconciling Aims and Practice
(a) Actual theories and methods cannot achieve those aims. Examples: Theories must be
capable of proof by Baconian induction from the observable evidence; Explanatory
theories must not speculate about unobservables. Both were rejected because the practice
of science necessitated the postulation of unobservables; so Baconian induction was
rejected as an ideal and replaced with the Method of Hypotheses (hypothetical-
deductivism). Here agreement over theories and methods--which didn't make sense if the
explicitly espoused aims were really the aims of science--provided the rational basis for
adjusting the explicitly espoused aims of science.
(b) All attempts at achieving those aims have failed (e.g., certainty; an explanatory,
predictive theory that appeals only to kinematic properties of matter).
Three Important Things to Note about Laudan's Reticulated Theory of Scientific
Rationality: on Laudan's view, (1) because the levels constrain but do not determine the
other levels, it is sometimes the case that disagreements over aims are rationally
irresolvable--but this is not generally the case; (2) the levels are to a large degree
independent of one another, allowing for paradigm change to be piecemeal or gradual
rather than a sudden "conversion" or "Gestalt shift;" and (3) scientific "progress" can only
be judged relative to a particular set of goals. Thus, Laudan's view, like Kuhn's, is
relativistic.
(Important: Like Kuhn, Laudan denies the radical relativistic view that progress does
not exist in science; he simply thinks that whether progress occurs in science can only be
judged relative to certain shared goals, just like whether certain aims are reasonable can
only be judged if either theory or method is held fixed. Judgments that progress has
occurred in science, relative to fixed goals, can therefore be rationally assessed as true or
false.)
Lecture 16
3/31/94
Dissecting the Holist Picture of Scientific Change
Last time, we discussed Laudan's reticulationist model of scientific justification. In this
session, we will examine Laudan's arguments for thinking that his model does better than
Kuhn's quasi-hierarchical, "holist" model at explaining both how agreement and
disagreement emerge during scientific revolutions. As you recall, Laudan argues that
change in the aims or goals of a scientific discipline can result from reasoned
argument if there is agreement at the methodological and/or theoretical (factual) level. In
other words, if any of the three elements in the triad of theories, methods, and aims is
held fixed, this is sufficient to provide reasonable grounds for criticizing the other
elements. Laudan's view is "coherentist," in that he claims that scientific rationality
consists in maintaining coherence or harmony between the elements of the triad.
This picture of reasoned argument in science requires that the aims-methods-theories
triad be separable, i.e., that these elements do not combine to form an "inextricable"
whole, or Gestalt, as Kuhn sometimes claimed. If some of these elements can change
while the others are held fixed, and reasoned debate is possible as long as one or more of the
elements are held fixed, then this leaves open the possibility that scientific debates
during what Kuhn calls "paradigm change" can be rational, allowing (at least sometimes)
for relatively quick consensus in the scientific community. This could occur if scientific
change were "piecemeal," i.e., if change occurred in only some elements of the aims-
methods-theories triad at a time. Indeed, Laudan wants to argue that when examined
closely scientific revolutions are typically piecemeal and gradual rather than sudden, all-
or-nothing Gestalt switches. The fact that it often looks sudden in retrospect is an illusion
accounted for by the fact that looking back often telescopes the fine-grained structure of
the changes.
Kuhn on the Units of Scientific Change
For Kuhn, the elements of the aims-methods-theories triad typically change
simultaneously rather than sequentially during scientific revolutions. For example, he
says: "In learning a paradigm the scientist acquires theory, methods, and standards
together, usually in an inextricable mix." In later chapters of SSR, Kuhn likens paradigm
change to all-or-nothing Gestalt switches and religious conversion. If this were so, it
would not be surprising if paradigm debate were always inconclusive and could never
completely be brought to closure by rational means. Closure must then always be
ultimately explained by non-rational factors, such as the (contingent) power dynamics
within a scientific community. As Laudan argued in Chapter 1 of SV, factors such as
these cannot fully explain why it is that closure is usually achieved in science whereas it
is typically not in religions or other ideologies, or why it is that closure is normally
achieved relatively quickly.
Laudan's solution to the problem of explaining both agreement and disagreement during
scientific revolutions is not to reject Kuhn's view entirely, but to modify it in two ways.
• by dropping the hierarchical picture of scientific rationality, and replacing it with
the "reticulated" picture, in which scientific aims as well as methods and theories
are rationally negotiable
• by dropping the notion that all elements in the aims-methods-theories triad change
simultaneously; change is typically "piecemeal" during scientific revolutions
Question: How could piecemeal change occur? Laudan first sketches an idealized
account of such change, and then attempts to argue that this idealized account
approximates what often happens historically. (Go through some examples of the former
in terms of a hypothetical "unitraditional" paradigm shift.) The fact that it does not look
like that in retrospect is normally due to the fact that history "telescopes" change, so that
a decade-long period of piecemeal change is characterized only in terms of its beginning
and end-points, which exhibit a complete replacement of one triad by another. "...a
sequence of belief changes which, described at the microlevel, appears to be a perfectly
reasonable and rational sequence of events may appear, when represented in broad
brushstrokes that drastically compress the temporal dimension, as a fundamental and
unintelligible change of world view" (78).
(Now go through what might happen if there are different, competing paradigms.) When
there is more than one paradigm, agreement can also occur in the following kinds of
cases.
• The theory in one complex looks better according to the divergent methodologies
in both paradigms.
• The theory in one complex looks like it better meets the aims of both paradigms than
its rival (e.g., predictive accuracy, simplicity).
Because the criteria (aims, methods) are different in the two theories, there may be no
neutral, algorithmic proof; nevertheless, it often turns out that as the theory develops it
begins to look better from both perspectives. (Again, this only makes sense if we deny
three theses espoused by Kuhn--namely, that paradigms are self-justifying, that the aims-
methods-theories mix that comprises a paradigm is an "inextricable" whole, and that
paradigm change cannot be piecemeal.) Thus, adherents of the old paradigm might
adopt the new methods and theories because they enable them to do things that they
recognize as valuable even from their own perspective; then they might modify their aims
as they find out that those aims don't cohere with the new theory. (Examples of
Piecemeal Change: Transition from Cartesian to Newtonian mechanics (theoretical
change led later to axiological change); Transition from Ptolemaic to Copernican
astronomy (methodological change--i.e., it eventually became easier to calculate using
the methods of Copernican astronomy, though this wasn't true at first--led to eventual
adoption of the Copernican theory itself).)
Prediction of the Holist Approach (committed to covariance):
• Change at one level (factual, methodological, axiological) will always be
simultaneous with change at the other levels (follows from the fact that theory,
methods, and standards form an "inextricable" whole).
Counterexamples: piecemeal change given above; cross-discipline changes not tied to
any particular paradigm (e.g., the acceptance of unobservables in theories, the rejection of
certainty or provability as a standard for acceptance of theories)
Because of the counterexamples, it is possible for there to be "fixed points" from which
to rationally assess the other levels. "Since theories, methodologies, and axiologies stand
together in a kind of justificatory triad, we can use those doctrines about which there is
agreement to resolve the remaining areas about which we disagree" (84).
Can Kuhn respond?
(1) The "ambiguity of shared standards" argument - those standards that scientists agree
on (simplicity, scope, accuracy, explanatory value), they often interpret or "apply"
differently. Laudan's criticism: not all standards are ambiguous (e.g., logical consistency)
- A response on behalf of Kuhn: it's enough that some are, and that they play a crucial
role in scientists' decisions
(2) The "collective inconsistency of rules" argument - rules can be differently weighted,
so that they lead to inconsistent conclusions. Laudan's criticism: only a handful of cases,
not obviously normal. No one has ever shown that Mill's Logic, Newton's Principia,
Bacon's or Descartes' methodologies were internally inconsistent. A response on behalf of
Kuhn: Again, it's enough if it happens, and it often happens when it matters most, i.e.,
during scientific revolutions.
(3) The shifting standards argument - different standards applied, so theoretical
disagreements cannot be conclusively adjudicated. Laudan's criticism: It doesn't follow
(see earlier discussion of underdetermination and the reticulationist model of scientific
change).
(4) The problem weighting argument - Laudan's response: one can give reasons why
these problems are more important than others, and these reasons can be (and usually are)
rationally critiqued. "...the rational assignment of any particular degree of probative
significance to a problem must rest on one's being able to show that there are viable
methodological and epistemic grounds for assigning that degree of importance rather than
another" (99). Also, Laudan notes that the most "important" problems are not the most
probative ones (the ones that most stringently test the theory). For example, explaining
the anomalous advance in Mercury's perihelion, Brownian motion, diffraction around a
circular disk. These problems did not become probative because they were important, but
became important because they were probative.
Lecture 17
4/5/94
Scientific Realism Vs. Constructive Empiricism
Today we begin to discuss a brand new topic, i.e., the debate between scientific realism
and constructive empiricism. This debate was provoked primarily by the work of Bas van
Fraassen, whose critique of scientific realism and defense of a viable alternative, which
he called constructive empiricism, first reached a wide audience among philosophers with
the publication of his 1980 book The Scientific Image. Today we will discuss (1) what
scientific realism is, (2) what alternatives are available to scientific realism, specifically
van Fraassen's constructive empiricism.
What Is Scientific Realism?
Scientific realism offers a certain characterization of what a scientific theory is, and what
it means to "accept" a scientific theory. A scientific realist holds that (1) science aims to
give us, in its theories, a literally true story of what the world is like, and that (2)
acceptance of a scientific theory involves the belief that it is true.
Let us clarify these two points. With regard to the first point, the "aims of science" are to
be distinguished from the motives that individual scientists have for developing scientific
theories. Individual scientists are motivated by many diverse things when they develop
theories, such as fame or respect, getting a government grant, and so on. The aims of the
scientific enterprise are determined by what counts as success among members of the
scientific community, taken as a whole. (Van Fraassen's analogy: The motives an
individual may have for playing chess can differ from what counts as success in the
game, i.e., putting your opponent's king in checkmate.) In other words, to count as fully
successful a scientific theory must provide us with a literally true description of what the
world is like.
Turning to the second point, realists are not so naive as to think that scientists' attitudes
towards even the best of the current crop of scientific theories should be characterized as
simple belief in their truth. After all, even the most cursory examination of the history of
science would reveal that scientific theories come and go; moreover, scientists often have
positive reason to think that current theories will be superseded, since they themselves are
actively working towards that end. (Example: The current pursuit of a unified field
theory, or "theory of everything.") Since acceptance of our current theories is tentative,
realists, who identify acceptance of a theory with belief in its truth, would readily admit
that scientists at most tentatively believe that our best theories are true. To say that a
scientist's belief in a theory is "tentative" is of course ambiguous: it could mean either
that the scientist is somewhat confident, but not fully confident, that the theory is true; or
it could mean that the scientist is fully confident that the theory is approximately true. To
make things definite, we will understand "tentative" belief in the former way, as less-
than-full confidence in the truth of the theory.
Constructive Empiricism: An Alternative To Scientific Realism
There are two basic alternatives to scientific realism, i.e., two different types of scientific
anti-realism. That is because scientific realism as just described asserts two things, that
scientific theories (1) should be understood as literal descriptions of what the world is
like, and (2) so construed, a successful scientific theory is one that is true. Thus, a
scientific anti-realist could deny either that theories ought to be construed literally, or that
theories construed literally have to be true to be successful. A "literal" understanding of a
scientific theory is to be contrasted with understanding it as a metaphor, or as having a
different meaning from what its surface appearance would indicate. (For example, some
people have held that statements about unobservable entities can be understood as
nothing more than veiled references to what we would observe under various conditions:
e.g., the meaning of a theoretical term such as "electron" is exhausted by its "operational
definition.") Van Fraassen is an anti-realist of the second sort: he agrees with the realist
that scientific theories ought to be construed literally, but disagrees with them when he
asserts that a scientific theory does not have to be true to be successful.
Van Fraassen espouses a version of anti-realism that he calls "constructive empiricism."
This view holds that (1) science aims to give us theories that are empirically adequate,
and (2) acceptance of a scientific theory involves the belief that it is empirically adequate.
(As was the case above, one can tentatively accept a scientific theory by tentatively
believing that the theory is empirically adequate.) A scientific theory is "empirically
adequate" if it gets things right about the observable phenomena in nature. Phenomena
are "observable" if they could be observed by appropriately placed beings with sensory
abilities similar to those characteristic of human beings. On this construal, many things
that human beings never have observed or ever will observe count as "observable." On
this understanding of "observable," to accept a scientific theory is to believe that it gets
things right not only about the empirical observations that scientists have already made,
but also about any observations that human scientists could possibly make (past, present,
and future) and any observations that could be made by appropriately placed beings with
sensory abilities similar to those characteristic of human scientists.
The Notion Of Observability
Constructive empiricism requires a notion of "observability." Thus, it is important that we
be as clear as possible about what this notion involves for van Fraassen. Van Fraassen
holds two things about the notion of observability:
(1) Entities that exist in the world are the kinds of things that are observable or
unobservable. There is no reason to think that language can be divided into theoretical
and observational vocabularies, however. We may describe observable entities using
highly theoretical language (e.g., "VHF receiver," "mass," "element," and so on); this does
not, however, mean that whether the things themselves (as opposed to how we describe
or conceptualize them) are unobservable or not depends on what theories we accept.
Thus, we must carefully distinguish observing an entity from observing that an
entity exists meeting such-and-such a description. The latter can be dependent upon
theory, since descriptions of observable phenomena are often "theory-laden." However, it
would be a confusion to conclude from this that the entity observed is a theoretical
construct.
(2) The boundary between observable and unobservable entities is vague. There is a
continuum from viewing something with glasses, to viewing it with a magnifying lens,
with a low-power optical microscope, with a high-power optical microscope, to viewing
it with an electron microscope. At what point should the smallest things visible using a
particular instrument count as "observable?" Van Fraassen's answer is that "observable"
is a vague predicate like "bald" or "tall." There are clear cases when a person is bald or
not bald, tall or not tall, but there are also many cases in between where it is not clear on
which side of the line the person falls. Similarly, though we are not able to draw a precise
line that separates the observable from the unobservable, this doesn't mean that the notion
has no content, since there are entities that clearly fall on one side or the other of the
distinction (consider sub-atomic particles vs. chairs, elephants, planets, and galaxies).
The content of the predicate "observable" is to be fixed relative to certain sensory
abilities. What counts as "observable" for us is what could be observed by a suitably
placed being with sensory abilities similar to those characteristic of human beings (or
rather, the epistemic community to which we consider ourselves to belong). Thus, beings
with electron microscopes in place of eyes do not count.
Arguments In Favor Of Scientific Realism: Inference To The Best Explanation
Now that we have set out in a preliminary way the two rival positions that we will
consider during the next few weeks, let us examine the arguments that could be given in
favor of scientific realism. An important argument that can be given for scientific realism
is that we ought rationally to infer that the best explanation of what we observe is true.
This is called "inference to the best explanation." The argument for this view is that in
everyday life we reason according to the principle of inference to the best explanation,
and so we should also reason this way in science. The best explanation, for example, for
the fact that measuring Avogadro's number (a constant specifying the number of
molecules in a mole of any given substance) using such diverse phenomena as Brownian
motion, alpha decay, x-ray diffraction, electrolysis, and blackbody radiation gives the
same result is that matter really is composed of the unobservable entities we call
molecules. If it were not, wouldn't it be an utterly surprising coincidence that things
behaved in very different circumstances exactly as if they were composed of molecules?
This is the same kind of reasoning that justifies belief that an apartment has mice. If all
the phenomena that have been observed are just as would be expected if a mouse were
inhabiting the apartment, isn't it then reasonable to believe that there's a mouse, even
though you've never actually seen it? If so, why should it be any different when you are
reasoning about unobservable entities such as molecules?
Van Fraassen's response is that the scientific realist is assuming that we follow a rule that
says we should infer the truth of the best explanation of what we have observed. This is
what makes it look inconsistent for a person to insist that we ought not to infer that
molecules exist while at the same time insisting that we ought to infer that there is a
mouse in the apartment. Why not characterize the rule we are following differently--i.e.,
that we infer that the best explanation of what we observe is empirically adequate? If that
were the case, we should believe in the existence of the mouse, but we should not believe
anything more of the theory that matter is composed of molecules than that it adequately
accounts for all observable phenomena. In other words, van Fraassen is arguing that,
unless you already assume that scientists follow the rule of inference to the (truth of the)
best explanation, you cannot provide any evidence that they follow that rule as opposed
to following the rule of inference to the empirical adequacy of the best explanation.
We will continue the discussion of inference to the best explanation, as well as other
arguments for scientific realism, next time.
Lecture 18
4/7/94
Inference To The Best Explanation As An Argument For Scientific Realism
Last time, I ended the lecture by alluding to an argument for scientific realism that
proceeded from the premise that the reasoning rule of inference to the best explanation
exists. Today, we will examine in greater detail how this argument would go, and we'll
also discuss the notion of inference to the best explanation in greater detail.
The Reality Of Molecules: Converging Evidence
As I noted last time, what convinced many scientists at the beginning of this century of
the atomic thesis (i.e., that matter is composed of atoms that combine into molecules, and
so on) was that there are many independent experimental procedures all of which lead to
the same determination of Avogadro's number. Let me mention some of the ways in
which the number was determined.
(1) Brownian Motion. Jean Perrin studied the Brownian motion of small, microscopic
particles known as colloids. (Brownian motion was first noted by Robert Brown, in the
early 19th century.) Though visible only through a microscope, the particles were much
larger than molecules. Perrin determined Avogadro's number from looking at how
particles were distributed vertically when placed in colloidal suspensions. He prepared
tiny spheres of gamboge, a resin, all of uniform size and density. He measured how the
particles were distributed vertically when placed in water; by calculating what forces would
have to be in place to account for this distribution, i.e., to keep the particles suspended, he could calculate
their average kinetic energy. If we know the masses and velocities, we can then determine
the mass of a molecule of the fluid, and hence Avogadro's number, which is the
molecular weight divided by the mass of a single molecule.
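As a quick arithmetic check of the relation just stated (Avogadro's number equals the molecular weight divided by the mass of a single molecule), here is a minimal sketch in Python; the mass used for a single water molecule is a rounded modern value, not one of Perrin's own figures.

# N_A = molar mass / mass of one molecule (illustrative check using water)
molar_mass_water = 18.015        # g/mol
mass_one_molecule = 2.99e-23     # g per H2O molecule (rounded modern value)
avogadro = molar_mass_water / mass_one_molecule
print(f"N_A ~ {avogadro:.2e} molecules/mol")   # roughly 6.0e23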
(2) Alpha Decay. Rutherford recognized that alpha particles were helium nuclei. Alpha
particles can be detected by scintillation techniques. By counting the number of helium
atoms that were required to make up a certain mass of helium, Rutherford calculated
Avogadro's number.
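A toy version of Rutherford's bookkeeping, with made-up numbers chosen only to show the ratio involved (the count of helium atoms produced versus the mass of helium they make up); the figures below are illustrative assumptions, not Rutherford's data.

# N_A = (atoms counted / mass of helium collected) * molar mass of helium
molar_mass_helium = 4.0026    # g/mol
atoms_counted = 1.5e18        # hypothetical number of alpha particles (helium atoms) counted
mass_collected = 9.97e-6      # g of helium gas those atoms make up (hypothetical)
avogadro = (atoms_counted / mass_collected) * molar_mass_helium
print(f"N_A ~ {avogadro:.2e} atoms/mol")   # roughly 6.0e23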
(3) X-ray diffraction - A crystal will diffract x-rays, the matrix of atoms acting like a
diffraction grating. From the wavelength of the x-rays and the diffraction pattern, you can
calculate the spacing of the atoms. Since that is regular in a crystal, you could then
determine how many atoms it takes to make up the crystal, and so Avogadro's number.
(Friedrich & Knipping)
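A minimal sketch of the final arithmetic step, using the standard cubic-crystal relation N_A = Z*M/(rho*a^3); the rock-salt figures below are rounded textbook values, not the Friedrich & Knipping data.

# From the x-ray-determined unit-cell edge to Avogadro's number (rock salt, NaCl)
Z = 4                 # formula units of NaCl per cubic unit cell
M = 58.44             # g/mol, molar mass of NaCl
rho = 2.165           # g/cm^3, density of rock salt
a = 5.64e-8           # cm, unit-cell edge length obtained from the diffraction pattern
avogadro = Z * M / (rho * a**3)
print(f"N_A ~ {avogadro:.2e} per mol")   # roughly 6.0e23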
(4) Blackbody Radiation - Planck derived a formula for the law of blackbody radiation,
which made use of Planck's constant (which was obtainable using Einstein's theory of the
photoelectric effect) and macroscopically measurable variables such as the speed of light
to derive Boltzmann's constant. Then you use the ideal gas law PV = nRT; n is the
number of moles of an ideal gas, and R (the universal gas constant) is the gas constant per
mole. Boltzmann's constant k is the gas constant per molecule. Hence R/k = Avogadro's
number (number of molecules per mole).
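The last step is just the division R/k; a one-line check with rounded modern values (not the turn-of-the-century figures) gives the familiar result.

R = 8.314         # J/(mol*K), universal gas constant (per mole)
k = 1.381e-23     # J/K, Boltzmann's constant (per molecule)
print(f"N_A = R/k ~ {R / k:.2e} per mol")   # roughly 6.02e23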
(5) Electrochemistry - A Faraday F is the charge required to deposit a mole of
monovalent metal during electrolysis. This means that you can calculate the number of
molecules per mole if you know the charge of the electron, F/e = N. Millikan's
experimental measurements of the charge of the electron can then be used to derive
Avogadro's number.
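Again, a one-line check of F/e = N with rounded modern values for the Faraday and the electron charge (Millikan's own figures differed slightly).

F = 96485.0       # C/mol, charge needed to deposit one mole of a monovalent metal
e = 1.602e-19     # C, charge of the electron (rounded modern value)
print(f"N_A = F/e ~ {F / e:.2e} per mol")   # roughly 6.02e23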
Now, the scientific realist wants to point to the fact that all of these different
measurement techniques (along with many others not mentioned here) lead to the
same value for Avogadro's number. The argument, then, is: how could you explain this
remarkable convergence on the same result if it were not for the fact that there are atoms
and molecules that are behaving as the theories say they do? Otherwise, it would be a
miracle.
Inference To The Best Explanation (Revisited)
We now have to state carefully what is being claimed here. The scientific realist argues
that the reality of molecules explains the convergence on Avogadro's number
better than its rival hypothesis, that the world is not really molecular but that everything behaves as
if it were. That is because given the molecular hypothesis, convergence would be
strongly favored, whereas if the underlying world were not molecular we wouldn't expect
any stability in such a result. Now it is claimed that being the best explanation of
something is a mark of truth. Thus, we have an inference pattern.
A explains X better than its rivals, B, C, and so on.
The ability of a hypothesis to explain something better
than all its rivals is a mark of its truth.
Therefore, A is true.
Now why should we think that this is a good reasoning pattern? (That is, why should we
think that the second premise is true?) The scientific realist argues that it is a reasoning
pattern that we depend on in everyday life; we must assume the truth of the second
premise if we are to act reasonably in everyday life. To refer to the example we discussed
the last session, the scientific realist argues that if this reasoning pattern is good enough for the
detective work that infers the presence of an unseen mouse, it is good enough for the
detective work that infers the presence of unseen constituents of matter.
Van Fraassen has argued that the hypothesis that we infer to the truth of our best
explanation can be replaced by the hypothesis that we infer to the empirical adequacy of
our best explanation without loss in the case of the mouse, since it is observable. How
then, can you determine whether we ought to follow the former rule rather than the latter?
The only reason we have for thinking that we follow IBE is everyday examples such as
the one about mice; but the revised rule that van Fraassen suggests could account for this
inferential behavior just as well as the IBE hypothesis. Thus, there is no real reason to
think that we follow the rule of IBE in our reasoning.
Van Fraassen's Criticisms Of Inference To The Best Explanation
Van Fraassen also has positive criticisms of IBE. First, he argues that it is not what it
claims to be. In science, you don't really choose the best overall explanation of the
observable phenomena, but the best overall explanation that you have available to you.
However, why should we think that the kinds of explanations that we happen to have
thought of are the best hypotheses that could possibly be thought up by any intelligent
being? Thus, the IBE rule has to be understood as inferring the truth of the best
explanation that we have thought of. However, if that's the case our "best" explanation
might very well be the best of a bad lot. To be committed to IBE, you have to hold that
the hypotheses that we happen to think of are, simply because we thought of them, more likely to
be true than those that we do not. This seems implausible.
Reactions On Behalf Of The Scientific Realist
(1) Privilege - Human beings are more likely to think up hypotheses that are true than
those that are false. For otherwise, evolution would have weeded us out. False beliefs
about the world make you less fit, in that you cannot predict and control your
environment, and are more likely to die.
Objection: The selection pressures that shaped our inferential habits during our evolution
do not depend on whether what we believed was true. Our beliefs just have to not kill
us, and enable us to have children. In addition, we only have to be able to infer what is
empirically adequate.
(2) Forced Choice - We cannot do without inferring what goes beyond our evidence; thus
the choice between competing hypotheses is forced. To guide our choices, we need rules
of reasoning, and IBE fits the bill.
Objection - The situation may force us to choose the best we have; but it cannot force us
to believe that the best we have is true. Having to choose a research program, for
example, only means that you think that it is the best one available to you, and that you
can best contribute to the advancement of science by choosing that one. It does not
thereby commit you to belief in its truth.
The problem that is evinced here, and will crop up again, is that any rule of this sort has
to make assumptions about the way the world is. The things that seem simple to us are
most likely to be true; the things that seem to explain better than any of the alternatives
that we've come up with are more likely to be true than those that do not; and so on.
Lecture 19
4/12/94
Entity Realism (Hacking & Cartwright)
Last time we looked at an argument for scientific realism based on an appeal to a putative
rule of reasoning, called Inference to the Best Explanation (IBE). We examined a case
study where the scientific realist argues that convergent but independent determinations
of Avogadro's number were better explained by the truth of the molecular hypothesis than
its empirical adequacy (Salmon 1984, Scientific Explanation and the Causal Structure of
the World, pages 213-227). (Note that Salmon's treatment of these cases does not
explicitly appeal to IBE, but to the Common Cause principle.) The primary problems
with that argument were (1) the realist has given no reason to think that we as a rule infer
the truth of the best explanation, rather than its empirical adequacy, and (2) it is
impossible to argue for IBE as a justified rule of inference unless one assumes that
human beings are, by nature, more likely to think up true explanations rather than ones
that are merely empirically adequate. There are, of course, responses a realist can make to
these objections, and we examined two responses to problem (2), one that argues that (a)
evolution selected humans based on our ability to generate true rather than false
hypotheses about the world, and the other that (b) accepts the objection but argues that
we are somehow forced to believe the best available explanation. Neither of these
responses seems very convincing (van Fraassen 1989, Laws and Symmetry, pages 142-
150).
Today we will look at a moderate form of realism, which I will call "entity realism," and
arguments for that view that do not depend on IBE. Entity realists hold that one is
rationally compelled to believe in the existence of some of the unobservable entities
postulated by our best scientific theories, but one is not obligated to believe that
everything that our best theories say about those entities is true. Nancy Cartwright, for
example, argues that we are compelled to believe in those entities that figure essentially
in causal explanations of the observable phenomena, but not in the theoretical
explanations that accompany them. The primary reason she gives is that causal
explanations, e.g., that a change in pressure is caused by molecules impinging on the
surface of a container with greater force after heat energy introduced into the container
increases the mean kinetic energy of the molecules, make no sense unless you really
think that molecules exist and behave roughly as described. Cartwright claims that you
have offered no explanation at all if you give the preceding story and then add, "For all
we know molecules might not really exist, and the world simply behaves as if they exist."
Theoretical explanations, on the other hand, which merely derive the laws governing the
behavior of those entities from more fundamental laws, are not necessary to believe,
since a multiplicity of theoretical laws can account for the phenomenological laws that
we derive from experiment. Cartwright argues that scientists often use different and
incompatible theoretical models based on how useful those models are in particular
experimental situations; if this is so, scientists cannot be committed to the truth of all
their theoretical models. However, scientists do not admit incompatible causal
explanations of the same phenomenon; according to Cartwright, that is because a causal
explanation cannot explain at all unless the entities that play the causal roles in the
explanation exist.
Cartwright's argument depends on a certain thesis about explanation (explanations can
either cite causes or can be derivations from fundamental laws) and an associated
inference rule (one cannot endorse a causal explanation of a phenomenon without
believing in the existence of the entities that, according to the explanation, play a role in
causing the phenomenon). As Cartwright sometimes puts it, she rejects the rule of
Inference to the Best Explanation but accepts a rule of Inference to the Most Probable
Cause. Van Fraassen, of course, is unlikely to acquiesce in such reasoning, since he
rejects the notion that a causal explanation cannot be acceptable unless the entities it
postulates exist; on the contrary, if what is requested in the circumstances is information
about causal processes according to a particular scientific theory, it will be no less
explanatory if we merely accept the theory (believe it to be empirically adequate) rather
than believe it to be true. Thus, the constructive empiricist can reject Cartwright's
argument since he holds a different view of what scientific explanation consists in.
Hacking takes a different route in arguing for an entity realist position. Hacking argues
that the mistake that Cartwright and van Fraassen both make is concentrating on
scientific theory rather than experimental practice. His approach can be summed up in the
slogans "Don't Just Peer, Interfere" (with regard to microscopes), and "If you can
manipulate them, they must be real" (with regard to experimental devices that use
microscopic particles such as electrons as tools). Let's look at his arguments for these two
cases.
In his article, "Do We See Through a Microscope?" (Churchland and Hooker, eds., 1985,
Images of Science), Hacking argues that what convinces experimentalists that they are
seeing microscopic particles has nothing to do with the theory of those particles or of
how a microscope behaves, but that they can manipulate those particles in very direct and
tangible ways to achieve certain results.
• The ability to see through a microscope is acquired through manipulation (what is
an artifact of the instrument and what is reality are learned through practice)
• We believe what we see because by manipulation we have found the preparation
process that produces these sights to give stable and reliable results, and to be
related to what we see macroscopically in certain regular ways (the Grid
Argument, the electron beam arguments of "Experimentation and Scientific
Realism," in Kourany)
• We believe what we see through a particular instrument is reliable because we can
invent new and better ways of seeing it (optical microscopes, ultraviolet
microscopes, electron microscopes, etc.)
Hacking's argument contains three elements, that (a) manipulation causes cognitive
changes that give us new perceptual abilities, (b) we can manipulate the world in such a
way as to create microstructures that have the same properties as macrostructures we can
observe, and that, (c) combined with this fact, the convergence of the various instruments
on the same visual results gives us additional reason to believe that what we are seeing is
real, not an artifact of any particular instrument.
The final element (c) seems similar to the convergence argument we looked at last time,
when we were discussing IBE. There is a difference, however, since what is at issue is
not whether a single scientific theory implies things that are verified under many
independent circumstances, but whether we are convinced that we are seeing something
based on the fact of stable features using different viewing techniques. Nevertheless, it is
an argument from coincidence--wouldn't it be a miracle if all these independent viewing
techniques shared stable structural features and those features weren't really present in the
microscopic specimen?--and stands or falls on the same grounds as were discussed last
time.
However, that is not all that Hacking has at his disposal. His greatest strength is
discussing how we acquire new modes of perception by using instruments to manipulate
a world we cannot see. In his words, we don't see through a microscope, we see with a
microscope. That is something that must be learned by interacting with the microscopic
world, just as ordinary vision is acquired by interacting with the macroscopic world
around us.
In addition, Hacking wants to argue that we come to manipulate things in ways that do
not involve direct perception. This is where the example of using electrons to check for
parity violation in weak neutral currents comes in. In this case, Hacking argues that it
might have once been the case that the explanatory virtues of atomic theory led one to
believe in the existence of such particles; but now we have more direct evidence. We can now use
electrons to achieve other results, and thus we are convinced of the existence of entities
with well-defined, stable causal properties. That does not mean that we know everything
there is to know about those particles (thus, we may disbelieve any of the particular
theories of the electron that are in existence); however, that there are entities with certain
causal properties is shown by experience, by manipulating electrons to achieve definite,
predictable results. (Hence the slogan, "If you can manipulate them, they must be real.")
This is why Hacking, like Cartwright, is an entity realist, but not a realist about scientific
theories.
Next time, we will examine various responses that a constructive empiricist such as van
Fraassen might give to such arguments.
Lecture 20
4/14/94
Entity Realism And The "Non-Empirical" Virtues
Last time we discussed arguments due to Cartwright and Hacking for the entity realist
position. Entity realism, as you recall, is the view that belief in certain microphysical
entities can be (and is) rationally compelling. Cartwright argues that we are rationally
required to believe in the existence of those entities that figure essentially in causal
explanations that we endorse. (Van Fraassen's response to her argument is that since the
endorsement of the explanation only amounts to accepting it--i.e., believing it to be
empirically adequate--belief in the unobservable entities postulated by the explanation is
not rationally required.) By contrast, Hacking argued that we are rationally required to
believe in the existence of those entities that we can reliably and stably manipulate. He
argues that once we start using entities such as electrons then we have compelling
evidence of their existence (and not merely the empirical adequacy of the theory that
postulates their existence). To make his case, he gives detailed descriptions of how we
stably and reliably interact with things shown by optical microscopes, and with electron
guns in the PEGGY series. The microscope case is especially interesting since it indicates
that a person can acquire new perceptual abilities by using new instruments and that
"observability" is a flexible notion.
The question that we will examine in the first part of the lecture today is whether
Hacking's arguments simply beg the question against van Fraassen's constructive
empiricism. Let us begin by discussing Hacking's argument that stability of certain
features of something observed using different instruments is a compelling sign of their
reality. In response, van Fraassen asks us to consider the process of developing such
instruments. When we are building various types of microscopes, we not only use theory
to guide the design, but also learn to correct for various artifacts (e.g., chromatic
aberration). As van Fraassen puts it, "I discard similarities that do not persist and also
build machines to process the visual output in a way that emphasizes and brings out the
noticed persistent similarities. Eventually the refined products of these processes are
strikingly similar when initiated in similar circumstances ... Since I have carefully
selected against non-persistent similarities in what I allow to survive the visual output
processing, it is not all that surprising that I have persistent similarities to display to you"
(Images of Science, page 298). In other words, van Fraassen argues that we design our
observational instruments (such as microscopes) to emphasize those features that we
regard as real, and de-emphasize those we regard as artifacts. If that is so, however, we
cannot point to convergence as evidence in the reality of those features, since we have
designed the instruments (optical, ultraviolet, electron microscopes) so that they all
converge on those features we have antecedently decided are real, not artifacts.
The principle that Hacking is using in his argument from stability across different
observational techniques is that if there is a certain kind of stable input-output match in
our instruments, we can be certain that the output is a reliable indicator of what is there at
the microphysical level. Van Fraassen notes that, given the constraints that we place on
design and that the input is the same (say, a certain type of prepared blood sample), it is
not surprising that the output would be the same, even if the microstructure that we "see"
through the microscope has no basis in reality.
Thus, Hacking is correct in seeking a more striking example, which he attempts to
provide with his Grid Argument. (Here, as you recall, a grid is drawn, photographically
reduced, and that reduction is used to manufacture a tiny metal grid. The match between
the pattern we see through the microscope at the end of the process and the pattern
according to which the grid was drawn at the beginning indicates both that the grid-
manufacturing process is a reliable one and that the microscope is a reliable instrument
for viewing microphysical structure. Thus, by analogy, Hacking argues that we should
believe what the microscope reveals about the microphysical structure of things that we
have not manufactured.) Van Fraassen objects on two levels. First, he argues that
argument by analogy is not strong enough to support scientific realism. For analogy
requires an assumption that if one class of things resembles another in one respect, they
must resemble it in another. (To be concrete, if the microscopic grid and a blood cell
resemble one another in being microscopic, and we can be sure that the former is real
because its image in the microscope matches the pattern that we photographically
reduced, then we can by analogy infer that what the microscope shows us about the blood
cell must be accurate as well.) To this van Fraassen replies, "Inspiration is hard to find,
and any mental device that can help us concoct more complex and sophisticated novel
hypotheses is to be welcomed. Hence, analogical thinking is welcome. But it belongs to
the context of discovery, and drawing ingenious analogies may help to find, but does not
support, novel conjectures" (Images of Science, page 299).
Discuss. What could Hacking reply? Consider the following: to show that an analogical
inference is unjustified, you have to argue that there is an important disanalogy between
the two that blocks the inference. Here the obvious candidate is that the grid is a
manufactured object, whereas the blood cell, if it exists, is not. Since we would not
accept any process as reliable that did not reproduce the macroscopic grid pattern we
drew, it is no surprise that what we see through the microscope matches the pattern we
drew--for we developed the process so that things would work out that way. However, we
cannot say the same thing about the blood cell. Is this convincing?
Second, van Fraassen argues that to make Hacking's argument work we have to assume
that we have successfully produced a microscopic grid. How do we know that? Well,
because there is a coincidence between the pattern we observe through the microscope
and the pattern we drew at the macroscopic level. Hacking's argument is that we would
have to assume some sort of cosmic conspiracy to explain the coincidence if the
microscopic pattern did not reflect the real pattern that was there, which would be
unreasonable. Van Fraassen's response is that not all observable regularities require
explanation.
Discuss. Might Hacking fairly object that, while not all coincidences require explanation, coincidences do require explanation unless there is positive reason to think they cannot be explained? For this reason, he might argue that van Fraassen's
counterexamples to a generalized demand for explanation, e.g., those based on the
existence of verified coincidences predicted by quantum physics (the EPR type), are not
telling, since we have proofs (based on the physical theories themselves) that there can be
no explanations of those kinds of events. (Also, discuss van Fraassen's argument that to
explain a coincidence by postulating a deeper theory will not remove all coincidences;
eventually, "why" questions terminate.)
Now we turn to Hacking's claim that the fact that we can manipulate certain sorts of microscopic objects speaks to their reality, not just to the empirical adequacy of the theory that postulates those entities. Van Fraassen would want to ask the following
question: How do you know that your description of what you are manipulating is a
correct one? All you know is that if you build a machine of a certain sort, you get certain
regular effects at the observable level, and that your theory tells you that this is because
the machine produces electrons in a particular way to produce that effect. The
constructive empiricist would accept that theory, too, and so would accept the same
description of what was going on; but for him that would just mean that the description is
licensed by a theory he believes to be empirically adequate. To infer that the observable phenomena described theoretically as "manipulating electrons in such-and-such a way to produce an effect" compel belief in something more than empirical adequacy requires assuming that the theoretical description of what he is doing is not merely empirically adequate but also true. However, that is just what the constructive empiricist would deny.
Non-Empirical Virtues As A Guide To Truth: A Defense?
We have already examined in some detail van Fraassen's reasons for thinking that non-
empirical features of theories that we regard as desirable (simplicity, explanation,
fruitfulness) are not marks of truth, nor even of empirical adequacy, though they do provide
certain pragmatic reasons for preferring one theory over another (i.e., using that theory
rather than another). Let us look briefly at a defense of the view that "non-empirical
virtues" of this sort are indeed guides to truth, due to Paul Churchland.
Churchland's view is basically that van Fraassen's argument against the non-empirical
virtues as guides to truth can be used against him. Since we are unable to assess all the
empirical evidence for or against a particular theory, we have in the end to decide what to
believe based on what is simplest, most coherent, or most explanatory. This may sound much
like the "forced choice" response that we looked at last time, except that Churchland is
arguing something subtly different. He argues that van Fraassen cannot decide which of
two theories is more likely to be empirically adequate without basing his decision on
which of the two theories is simplest, most fruitful and explanatory, and so on. That is
because the arguments against belief in a theory's truth with regard to its unobservable subject matter (available evidence in principle can never logically force the issue one way or another when it comes to unobservable structure) apply just as well to empirical adequacy. We'll
never have complete knowledge of all observable evidence either, and so nothing
compels us one way or another to accept one theory as empirically adequate rather than
another (when both agree on what we have observed so far, or ever will observe).
However, we cannot suspend belief altogether. Since that is the only choice
that van Fraassen's reasoning gives us in the end, it ought to be rejected, and so we can
conclude that non-empirical virtues are just as much a guide to truth as are what we have
observed.
Churchland is a philosopher who works in cognitive psychology (with an emphasis on
neurophysiology). He argues that we know that "values such as ontological simplicity,
coherence, and explanatory power are some of the brain's most basic criteria for
recognizing information, for distinguishing information from noise ... Indeed, they even
dictate how such a framework is constructed by the questing infant in the first place"
("The Ontological Status of Observables," Images of Science, page 42). Thus, he
concludes that since even our beliefs about what is observable are grounded in what van
Fraassen calls the merely "pragmatic" virtues (so-called because they give us reason to
use a theory without giving us reason to believe that it is true), it cannot be
unreasonable to use criteria such as simplicity, explanatory power, and coherence to form
beliefs about what is unobservable to us. Van Fraassen's distinction between 'empirical,
and therefore truth-relevant' and 'pragmatic, and therefore not truth-relevant' is therefore
not a tenable one.
Churchland concludes with a consideration that I will leave you as food for thought.
Suppose that a humanoid race of beings were born with electron microscopes for eyes.
Van Fraassen would then argue that since for them the microstructure of the world would
be seen directly, they, unlike us, would have different bounds on what they regard as
"observable." Churchland regards this distinction as wholly unmotivated. He points out
that van Fraassen's view leads to the absurd conclusion that they can believe in what their
eyes tell them but we cannot, even though if we put our eyes up to an electron microscope
we will have the same visual experiences as the humanoids. There is no difference between the causal chain leading from the perceived objects to the experience of perception in the two cases, but van Fraassen's view leads to the conclusion that
nevertheless we and the humanoids must embrace radically different attitudes towards the
same scientific theories. This he regards as implausible.
Lecture 21
4/19/94
Laudan on Convergent Realism
Last time, we discussed van Fraassen's criticism of Hacking's arguments for entity
realism based on the Principle of Consistent Observation (as in the microscope case, most
notably the grid argument). We also talked about Churchland's criticisms of van
Fraassen's distinguishing empirical adequacy from the "pragmatic" or non-empirical
virtues. Churchland argued that, based on what we know about perceptual learning,
there's no principled reason to think that empirical adequacy is relevant to truth but
simplicity is not.
Today we're going to look at an argument for scientific anti-realism that proceeds not by
arguing a priori that observability has a special epistemological status, but by arguing
that, based on what we know about the history of science, scientific realism is
indefensible. Laudan discusses a form of realism he calls "convergent" realism, and seeks
to refute it. Central to this position is the view that the increasing success of science
makes it reasonable to believe (a) that theoretical terms that remain across changes in
theories refer to real entities and (b) that scientific theories are increasing approximations
to the truth. The convergent realist's argument is a dynamic one: it argues from the
increasing success of science to the thesis that scientific theories are converging on the
truth about the basic structure of the world. The position thus involves the concepts of
reference, increasing approximations to the truth, and success, which we have to
explicate first before discussing arguments for the position.
Sense vs. Reference. Philosophers generally distinguish the sense of a designating term
from its reference. For example, "the current President of the U.S." and "the Governor of
Arkansas in 1990" are different descriptive phrases with distinct senses (meanings), but
they refer to the same object, namely Bill Clinton. Some descriptive phrases are
meaningful, but have no referent, e.g., "the current King of France." We can refer to
something by a description if the description uniquely designates some object by virtue of
a certain class of descriptive qualities. For example, supposing that Moses was a real
rather than a mythical figure, descriptions such as "the author of Genesis," "the leader of
the Israelites out of Egypt" and "the author of Exodus" might serve to designate Moses
(as would the name "Moses" itself). Suppose now that we accept these descriptions as
designations of Moses but subsequently discover that Moses really didn't write the books
of Genesis and Exodus. (Some biblical scholars believe the stories in these books were
transmitted orally from many sources and weren't written down until long after Moses
had died.) This would amount to discovering that descriptions such as "the author of
Genesis" do not refer to Moses, but this wouldn't mean that Moses didn't exist, simply
that we had a false belief about him. While we may pick out Moses using various
descriptions, it doesn't mean that all the descriptions we associate with Moses have to be
correct for us to do so.
Realists hold that reference to unobservable entities works in the same way across changes in scientific theories. Though different theories of the electron have come and gone, all these theories
refer to the same class of objects, i.e., electrons. Realists argue that terms in our best
scientific theories (such as "electron") typically refer to the same classes of unobservable
entities across scientific change, even though the set of descriptions (properties) that we
associate with those classes of objects change as the theories develop.
Approximate Truth. The notion of approximate truth has never been clearly and
generally defined, but the intuitive idea is clear enough. There are many qualities we
attribute to a class of objects, and the set of qualities that we truly attribute to those
objects increases over time. If that is so, then we say that we are moving "closer to the
truth" about those objects. (Perhaps in mathematical cases we can make clearer sense of
the notion of increasing closeness to the truth; i.e., if we say a physical process evolves
according to a certain mathematical equation or "law," we can sometimes think of
improvements of that law converging to the "true" law in a well-defined mathematical
sense.)
The "Success" of Science. This notion means different things to different people, but is
generally taken to refer to the increasing ability science gives us to manipulate the world,
predict natural phenomena, and build more sophisticated technology.
Convergent realists often argue for their position by pointing to the increasing success of
science. This requires that there be a reasonable inference from a scientific theory's
"success" to its approximate truth (or to the thesis that its terms refer to actual entities).
However, can we make such an inference? As Laudan presents the convergent realist's
argument in "A Confutation of Convergent Realism," the realist is arguing that the best
explanation of a scientific theory's success is that it is true (and its terms refer). Thus, the
convergent realist uses "abductive" arguments of the following form.
If a scientific theory is approximately true, it will (normally) be successful.
[If a scientific theory is not approximately true, it will (normally) not be successful.]
Scientific theories are empirically successful.
Therefore, scientific theories are approximately true.
If the terms in a scientific theory refer to real objects, the theory will (normally) be successful.
[If the terms in a scientific theory do not refer to real objects, the theory will (normally) not be successful.]
Scientific theories are empirically successful.
Therefore, the terms in scientific theories refer to real objects.
Laudan does not present the argument quite this way; in particular, he does not include
the premises in brackets. These are needed to make the argument even prima facie
plausible. (Note, however, that the premises in brackets are doing almost all the work.
That's OK since, as Laudan points out, the convergent realist often defends the first
premise in each argument by appealing to the truth of the second premise in that
argument.)
Question: Could the convergent realist use weaker premises, e.g., that a scientific theory is more likely to be successful if it is true than if it is false (with a similar premise serving the role in the case of reference)? Unfortunately, this would not give the convergent realist what he wants, since he could only infer from this that success makes it more likely that the theory is true--which is not to say that success makes it likely that the theory is true. (This can be seen by examining the probabilistic argument below, where pr(T is successful) < 1 and pr(T is true) < 1, and PR(-) = pr(-|T is successful).)
(1) pr(T is successful | T is true) > pr(T is successful | T is false)
(2) pr(T is true | T is successful) > pr(T is true)
(3) PR(T is successful) = 1
Therefore, PR(T is true) > pr(T is true)
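To make the point concrete, here is a small worked Bayesian example in Python; the prior and likelihoods are invented purely for illustration and are not drawn from Laudan or the lecture.

# Illustrative prior and likelihoods (invented numbers)
p_true = 0.01                      # prior probability that the theory is true
p_success_given_true = 0.9         # pr(T is successful | T is true)
p_success_given_false = 0.3        # pr(T is successful | T is false)

# Total probability of success, then Bayes' theorem
p_success = p_success_given_true * p_true + p_success_given_false * (1 - p_true)
p_true_given_success = p_success_given_true * p_true / p_success

print(p_true_given_success)        # about 0.029: higher than the prior 0.01, but still far from likely

The posterior rises from 1 percent to roughly 3 percent: success has made truth more probable without making it probable.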
That said, let's go with the stronger argument that uses the conditionals stated above.
Without the premises in brackets, these arguments are based on inference to the best
explanation of the success of science. As such, they would be suspect, for the reasons given
before. Leaving that issue aside, Laudan argues that any premises needed for the
argument to go through are false. Let us consider the case of reference first. Laudan
argues that there is no logical connection between having terms that refer and success.
Referring theories often are not successful for long periods of time (e.g., atomic theory,
Proutian hydrogen theory of matter, and the Wegenerian theory of continental drift), and successful
theories can be non-referring (e.g., phlogiston theory, caloric theory, and nineteenth
century ether theories).
Next, we consider the premise that if a scientific theory is approximately true, it is
(normally) successful. However, there is no guarantee that theories that are approximately true overall will result in more true than false consequences in the realm of what we have happened to observe. (That's because a false but approximately true scientific theory is
likely to make many false predictions, which could for all we know "cluster" in the
phenomena we've happened to observe.) In addition, since theories that don't refer can't
be "approximately true," the examples given above to show that there are successful but
non-referring theories also show that there can be successful theories that aren't
approximately true (e.g., phlogiston theory, caloric theory, and nineteenth century ether
theories).
Retention Arguments. Because the simple arguments given above are not plausible,
convergent realists often try to specify more precisely which terms in successful scientific
theories we can reasonably infer refer to real things in the world. The basic idea is that we
can infer that those features of a theory that remain stable as the theory develops over
time are the ones that "latch onto" some aspect of reality. In particular, we have the
following two theses.
Thesis 1 (Approximate Truth). If a certain claim appears as an initial member of a
succession of increasingly successful scientific theories, and either that claim or a claim
of which the original claim is a special case appears in subsequent members of that
succession, it is reasonable to infer that the original claim was approximately true, and
that the claims that replace it as the succession progresses are increasing approximations
to the truth.
Thesis 2 (Reference). If a certain term putatively referring to a certain type of entity
occurs in a succession of increasingly successful scientific theories, and there is an
increasingly large number of properties that are stably attributed to that type of entity as
the succession progresses, then it is reasonable to infer that the term refers to something
real that possesses those properties.
There are many ways a theory can retain claims (or terms) as it develops and becomes
more successful. (We will count as retained those claims and terms that remain stable even across radical changes in theories, such as the "paradigm shifts" described by Kuhn.) For
example, the old theory could be a "limiting case" of the new, in the formal sense of
being derivable from it (perhaps only with auxiliary empirical assumptions that according
to the new theory are false); or, the new theory may reproduce those empirical
consequences of the old theory that are known to be true (and maybe also explain why
things would behave just as if the old theory were true in the domain that was known at the
time); finally, the new theory may preserve some explanatory features of the old theory.
Convergent realists argue from the retention of some structure across theoretical change that leads to greater success to the conclusion that whatever is retained must be either approximately true (in the
case of theoretical claims such as Newton's Laws of Motion, which are "limiting cases"
of laws that appear in the new, more successful theory) or must refer to something real
(in the case of theoretical terms such as "electron," which have occurred in an
increasingly successful succession of theories as described by Thesis 2).
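To illustrate the "limiting case" idea with a standard example of my own (not one worked through in the lecture): Newtonian momentum p = mv can be recovered from relativistic momentum p = γmv in the limit where v is small compared with the speed of light. A quick check in Python:

import math

c = 299792458.0                        # speed of light (m/s)

def newtonian_momentum(m, v):
    return m * v

def relativistic_momentum(m, v):
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * m * v

m = 1.0                                # a 1 kg test mass (illustrative)
for v in (10.0, 1.0e6, 0.9 * c):       # slow, fast, and near light speed
    print(v, relativistic_momentum(m, v) / newtonian_momentum(m, v))
# the ratio is ~1.0 at everyday speeds and grows without bound as v approaches c

At everyday speeds the two formulas are indistinguishable; only near the speed of light do they diverge. That is the sense in which the old law is retained as a special case of the new one.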
Next time, we will examine an objection to retention arguments, as well as several replies
that could be made on behalf of the convergent realist.
Lecture 22
4/21/94
Convergent Realism and the History of Science
Last time, we ended by discussing a certain type of argument for convergent realism
known as retention arguments. These arguments rely on the following two theses.
Thesis 1 (Approximate Truth). If a certain claim appears as an initial member of a
succession of increasingly successful scientific theories, and either that claim or a claim
of which the original claim is a special case appears in subsequent members of that
succession, it is reasonable to infer that the original claim was approximately true, and
that the claims that replace it as the succession progresses are increasing approximations
to the truth.
Thesis 2 (Reference). If a certain term putatively referring to a certain type of entity
occurs in a succession of increasingly successful scientific theories, and there is an
increasingly large number of properties that are stably attributed to that type of entity as
the succession progresses, then it is reasonable to infer that the term refers to something
real that possesses those properties.
As I noted last time, there are many ways a theory can retain claims (or terms) as it
develops and becomes more successful. (We will count as retained those claims and terms that remain stable even across radical changes in theories, such as the "paradigm shifts" described by Kuhn.) For example, the old theory could be a "limiting case" of the new, in
the formal sense of being derivable from it (perhaps only with auxiliary empirical
assumptions that according to the new theory are false); or, the new theory may
reproduce those empirical consequences of the old theory that are known to be true (and
maybe also explain why things would behave just as if the old theory were true in the
domain that was known at the time); finally, the new theory may preserve some
explanatory features of the old theory. Convergent realists argue from the retention of some structure across theoretical change that leads to greater success to the conclusion that whatever is retained
must be either approximately true (in the case of theoretical claims such as Newton's
Laws of Motion, which are "limiting cases" of laws that appear in the new, more
successful theory) or must refer to something real (in the case of theoretical terms such as
"electron," which have occurred in an increasingly successful succession of theories as
described by Thesis 2).
Warning: Now I should note that I am presenting the convergent realist's position, and
retention arguments in general, somewhat differently than Laudan does in "A Confutation of Convergent Realism." In that article, Laudan examines the "retentionist" thesis that new
theories should retain the central explanatory apparatus of their predecessors, or that the
central laws of the old theory should provably be special cases of the central laws of the
new theory. This is a prescriptive account of how science should proceed. According to
Laudan, convergent realists also hold that scientists follow this strategy, and that the fact
that scientists are able to do so proves that the successive theories as a whole are
increasing approximations to the truth. He objects, rightly, that while there are certain
cases where retention like this occurs (e.g., Newton-Einstein), there are many cases in
which retention of this sort does not occur (Lamarck-Darwin; catastrophist-
uniformitarian geology; corpuscular-wave theory of light; also, any of the examples that
involve an ontological "loss" of the sort emphasized by Kuhn, e.g., phlogiston, ether,
caloric). Moreover, Laudan points out that when retention occurs (as in the transition
from Newtonian to relativistic physics), it only occurs with regard to a few select
elements of the older theory. The lesson he draws from this is that the "retentionist"
strategy is generally not followed by scientists, so the premise in the retentionist
argument that says that scientists successfully follow this strategy is simply false. I take it
that Laudan is correct in his argument, and refer you to his article for his refutation of that
type of "global" retentionist strategy. What I'm doing here is slightly different, and more
akin to the retentionist argument given by Ernan McMullin in his article "A Case for
Scientific Realism." A retentionist argument that uses Theses 1 and 2 above and the fact
that scientific theories are increasingly successful to argue for a realist position is not
committed to the claim that everything in the old theory must be preserved in the new
(perhaps only as a special case); it is enough that some things are preserved. (McMullin,
in particular, uses a variation on Thesis 2, where the properties in question are structural
properties, to argue for scientific realism. See the section of his article entitled "The
Convergences of Structural Explanation.") That is because the convergent realist whom I
am considering claims only that it is reasonable to infer the reality (or approximate truth)
of those things that are retained (in one of the senses described above) across theoretical
change. Thus, it is not an objection to the more reasonable convergent realist position that
I'm examining here (of which McMullin's convergent realism is an example) to claim that
not everything is retained when the new theory replaces the old, and that losses in overall
ontology occur with theoretical change along with gains.
That said, there are still grounds for challenging the more sensible, selective retentionist
arguments that are based on Theses 1 and 2. I will concentrate on the issue of successful
reference (Thesis 2); similar arguments can be given for the case of approximate truth
(Thesis 1). (Follow each objection and reply with discussion.)
Objection: The fact that a theoretical term has occurred in a succession of increasingly
successful scientific theories as described by Thesis 2 does not guarantee that it will
continue to appear in all future theories. After all, there have been many theoretical terms
that appeared in increasingly successful research programs (phlogiston, caloric, ether) in
the way described by Thesis 2, but those research programs subsequently degenerated
and went the way of the dinosaurs. What the convergent realist needs to show to fully
defend his view is that there is reason to think that those terms and concepts that have
been retained in recent theories (electron, quark, DNA, genes, fitness) will continue to be
retained in all future scientific theories. However, there is no reason to think that: indeed,
if we examine the history of science we should infer that any term or concept that appears
in our theories today is likely to be replaced at some time in the future.
Reply 1: The fact that many terms have been stably retained as described by Thesis 2 in
recent theories is the best reason one could have to believe that they will continue to be
retained in all future theories. Of course, there are no guarantees that this will be the case:
quarks may eventually go the way of phlogiston and caloric. Nevertheless, it is
reasonable to believe that those terms will be retained, and, what is more important, it
becomes more reasonable to believe this the longer such terms are retained and the more
there is a steady accumulation of properties that are stably attributed to the type of
entities those terms putatively designate. Anti-realists such as Laudan are right when they
point out that one cannot infer that the unobservable entities postulated by a scientific
theory are real based solely on the empirical success of that theory, but that is not what
scientists do. The inference to the reality of the entities is also a function of the degree of
stability in the properties attributed to those entities, the steadiness of growth in that class
of properties over time, and how fruitful the postulation of entities of that sort has been in
generating accumulation of this sort. (See McMullin, "The Convergences of Structural
Explanation" and "Fertility and Metaphor," for a similar point.)
Reply 2: It's not very convincing to point to past cases in the history of science where
theoretical entities, such as caloric and phlogiston, were postulated by scientists but later
rejected as unreal. Science is done much more rigorously, rationally, and successfully
nowadays than it was in the past, so we can have more confidence that the terms that
occur stably in recent theories (e.g., electron, molecule, gene, and DNA) refer to
something real, and that they will continue to play a role in future scientific theories.
Moreover, though our conception of these entities has changed over time, there is a
steady (and often rapid) accumulation of properties attributed to entities postulated by
modern scientific theories, much more so than in the past. This shows that the terms that
persist across theoretical change in modern science should carry more weight than those
terms that persisted across theoretical change in earlier periods.
Reply 3: The objection overemphasizes the discontinuity and losses that occur in the
history of science. Laudan, like Kuhn, wants to argue there are losses as well as gains in
the ontology and explanatory apparatus of science. However, in doing so he
overemphasizes the importance of those losses for how we should view the progression
of science. The fact that scientific progress is not strictly cumulative does not mean that
there is not a steady accumulation of entities in scientific ontology and knowledge about
the underlying structural properties that those entities possess. Indeed, one can counter
Laudan's lists of entities that have been dropped from the ontology of science with
equally long lists of entities that were introduced and were retained across theoretical
change. Our conception of these entities has changed, but rather than focusing on the
conceptual losses, one should focus on the remarkable fact that there is a steadily
growing class of structural properties that we attribute to the entities that have been
retained, even across radical theoretical change. (See McMullin, "Sources of Antirealism:
History of Science," for a similar point.)
Lecture 23
4/26/94
The Measurement Problem, Part I
For the next two days we will be examining the philosophical problems that arise in
quantum physics, specifically those connected with the interpretation of measurement.
Today's lecture will provide the necessary background and describe how measurement is
understood according to the "orthodox" interpretation of quantum mechanics, and several
puzzling features of that view. The second lecture will describe various "non-orthodox"
interpretations of quantum mechanics that attempt to overcome these puzzles by
interpreting measurement in a quite different way.
A Little Historical Background
As many of you know, the wave theory of light won out over the corpuscular theory by
the beginning of the nineteenth century. By the end of that century, however, problems
had emerged with the standard wave theory. The first problem came from attempts to
understand the interactions of matter and light. A body at a stable temperature will radiate
and absorb light. The body will be at equilibrium with the light, which has its energy
distributed among various frequencies. The distribution of frequencies of light emitted by the body varies with its temperature, as is familiar from experience. However, by the end of the
nineteenth century physicists found that they could not account theoretically for that
frequency distribution. Two attempts to do so within the framework of Maxwell's theory
of electromagnetic radiation and statistical mechanics led to two laws--Wien's law and
the Rayleigh-Jeans law--that failed at lower and higher frequencies, respectively.
In 1900, Max Planck postulated a compromise law that accounted for the observed
frequency distribution, but it had the odd feature of postulating that light transmitted
energy in discrete packets. The packets at each frequency had energy E = hν, where h is Planck's constant and ν is the frequency of the light. Planck regarded his law as merely a
phenomenological law, and thought of the underlying distribution of energy as
continuous. In 1905, Einstein proposed that all electromagnetic radiation consists of
discrete particle-like packets of energy, each with an energy hν. He referred to these
packets, which we now call photons, as quanta of light. Einstein used this proposal to
explain the photoelectric effect (where light shining on an electrode causes the emission
of electrons).
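To give a feel for the scale of these quanta, here is a quick calculation using E = hν; the frequency chosen below, roughly that of green visible light, is just an illustrative value.

h = 6.62607015e-34      # Planck's constant (J*s)
nu = 5.0e14             # frequency of green visible light (Hz); an illustrative choice
E = h * nu              # energy of a single quantum at that frequency
print(E)                # roughly 3.3e-19 J, i.e., about 2 electron volts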
After it became apparent that light, a wave phenomenon, had particle-like aspects,
physicists began to wonder whether particles such as electrons might also have wave-like
aspects. L. de Broglie proposed that they did and predicted that under certain conditions
electrons would exhibit wave-like behavior such as interference and diffraction. His
prediction turned out to be correct: a diffraction pattern can be observed when electrons
are reflected off a suitable crystal lattice (see Figure 1, below).
Figure 1
Electron Diffraction By A Crystal Lattice
Schrödinger soon developed a formula for describing the wave associated with electrons,
both in conditions where they are free and when they are bound by forces. Famously, his
formula was able to account for the various possible energy levels that had already been
postulated for electrons orbiting a nucleus in an atom. At the same time, Heisenberg was
working on a formalism to account for the various patterns of emission and absorption of
light by atoms. It soon became apparent that the discrete energy levels in Heisenberg's
abstract mathematical formalism corresponded to the energy levels of the various
standing waves that were possible in an atom according to Schrödinger's equation.
Thus, Schrödinger proposed that electrons could be understood as waves in physical
space, governed by his equation. Unfortunately, this turned out to be a plausible
interpretation only in the case of a single particle. With multiple-particle systems, the
wave associated with Schrödinger's equation was represented in a multi-dimensional
space (3n dimensions, where n is the number of particles in the system). In addition, the
amplitudes of the waves were often complex (as opposed to real) numbers. Thus, the
space in which Schrödinger's waves exist is an abstract mathematical space, not a
physical space. Thus, the waves associated with electrons in Schrödinger's wave
mechanics could not be interpreted generally as "electron waves" in physical space. In
any case, the literal interpretation of the wave function did not cohere with the observed
particle-like aspects of the electron, e.g., that electrons are always detected at particular
points, never as "spread-out" waves.
The Probability Interpretation Of Schrödinger's Waves; The Projection Postulate
M. Born contributed to a solution to the problem by suggesting that the square of the
amplitude of Schrödinger's wave at a certain point in the abstract mathematical space of
the wave represented the probability of finding the particular value (of position or
momentum) associated with that point upon measurement. This only partially solves the
problem, however, since the interference effects are physically real in cases such as
electron diffraction and the famous two-slit experiment (see Figure 2).
Figure 2
The Two-Slit Experiment
If both slits are open, the probability of an electron impinging on a certain spot on the
photographic plate is not the sum of the probabilities that it would impinge on that spot if
it passes through slit 1 and that it would impinge on that spot if it passes through slit 2, as
is illustrated by pattern (a). Instead, if both slits are open you get an interference pattern of the sort illustrated by pattern (b). This is the case even if the source is so weak that, in effect, only one particle at a time passes through the slits.
Eventually, an interference pattern emerges, one point at a time.
Interestingly, the interference pattern disappears if you place a detector just behind one of
the slits to determine whether the photon passed through that slit or not, and you get a
simple sum of the single-slit probabilities, as illustrated by pattern (a). Thus, you can't explain the
interference pattern as resulting from interaction among the many photons in the light
source.
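A minimal sketch of the arithmetic behind the two patterns; the complex amplitudes below are invented for illustration and are not derived from any particular slit geometry. With no detector we add the amplitudes and then square; with a which-path detector we add the two one-slit probabilities.

import cmath

# Invented complex amplitudes for the particle reaching one particular spot,
# via slit 1 and via slit 2 (the relative phase depends on the path difference)
psi1 = 0.5 * cmath.exp(0j)
psi2 = 0.5 * cmath.exp(2j)

# Both slits open, no detector: add the amplitudes first, then square (pattern (b))
p_with_interference = abs(psi1 + psi2) ** 2

# Detector behind one slit: add the two one-slit probabilities (pattern (a))
p_without_interference = abs(psi1) ** 2 + abs(psi2) ** 2

print(p_with_interference, p_without_interference)   # ~0.29 vs 0.5: the difference is the interference term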
A similar sort of pattern is exhibited by electrons when you measure their spin (a two-
valued quantity) in a Stern-Gerlach device. (This device creates a magnetic field that is
homogeneous in all but one direction, e.g., the vertical direction; in that case, electrons
are deflected either up or down when they pass through the field.) You get interference
effects here just as in the two-slit experiment described above. (See Figure 3)
Figure 3
Interference Effects Using Stern-Gerlach Devices
When a detector is placed after the second, left-right oriented Stern-Gerlach device as in
(d), the final, up-down oriented Stern-Gerlach device shows that half of the electrons are
spin "up" and half of the electrons are spin "down." In this case, the electron beam that
impinges on the final device is a definite "mixture" of spin "left" and spin "right"
electrons. On the other hand, when no detector is placed after the second, left-right
oriented Stern-Gerlach device as in (e), all the electrons are measured "up" by the final
Stern-Gerlach device. In that case, we say that the beam of electrons impinging on the
final device is a "superposition" of spin "left" and spin "right" electrons, which happens
to be equivalent to a beam consisting of all spin "up" electrons. Thus, placing a detector before the final device destroys some information present in the interference of the
electron wave, just as placing a detector after a slit in the two-slit experiment destroyed
the interference pattern there.
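The same point can be put as a toy calculation; the basis vectors and sign conventions below are assumptions chosen only to mirror the description of the figure. A 50/50 mixture of "left" and "right" electrons gives "up" half the time, whereas an equal-amplitude superposition of "left" and "right" is just the pure "up" state.

import numpy as np

# Spin states along the final (up-down) axis
up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])

# "Left" and "right" spin states written in the up-down basis (assumed convention)
left = (up + down) / np.sqrt(2)
right = (up - down) / np.sqrt(2)

# Case (e): no detector, so the beam is a coherent superposition of left and right;
# with equal amplitudes it recombines into the pure "up" state.
superposition = (left + right) / np.sqrt(2)
p_up_superposition = np.dot(up, superposition) ** 2        # = 1.0

# Case (d): a detector records left or right, so the beam is a 50/50 mixture;
# the probability of "up" is the average over the two definite cases.
p_up_mixture = 0.5 * np.dot(up, left) ** 2 + 0.5 * np.dot(up, right) ** 2   # = 0.5

print(p_up_superposition, p_up_mixture)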
J. von Neumann generalized the formalisms of Schrödinger and Heisenberg. In von
Neumann's formalism, the state of a system is represented by a vector in a complex,
multi-dimensional vector space. Observable features of the world are represented by
operators on this vector space, which encode the various possible values that that
observable quantity can have upon measurement. Von Neumann postulated that the "state
vector" evolves deterministically in a manner consistent with Schrödinger's equation,
until there is a measurement, in which case there is a "collapse," which
indeterministically alters the physical state of the system. This is von Neumann's famous
"Projection Postulate."
Thus, von Neumann postulated that there were two kinds of change that could occur in a
state of a physical system, one deterministic (Schrödinger evolution), which occurs when
the system is not being measured, and one indeterministic (projection or collapse), which
occurs as a result of measuring the system. The main argument that von Neumann gave
for the Projection Postulate is that whatever value of an observable quantity is found upon
measurement will be found with certainty upon subsequent measurement (so long as
measurement does not destroy the system, of course). Thus, von Neumann argued that the
fact that the value has a stable result upon repeated measurement indicates that the system
really has that value after measurement.
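Here is a toy simulation of the repeatability that von Neumann appealed to; the two-level system and its amplitudes are invented for illustration, and the measurement rule is simply the Born rule followed by projection onto the eigenvector found.

import numpy as np

rng = np.random.default_rng(0)

# A toy two-level system in a superposition of the two eigenstates of some
# observable; the amplitudes are invented for illustration.
state = np.array([np.sqrt(0.3), np.sqrt(0.7)])

def measure(state):
    # von Neumann measurement in the eigenbasis: choose an outcome with
    # Born-rule probability, then project ("collapse") onto that eigenvector
    probs = np.abs(state) ** 2
    outcome = rng.choice(len(state), p=probs)
    collapsed = np.zeros_like(state)
    collapsed[outcome] = 1.0
    return outcome, collapsed

first, state = measure(state)
second, state = measure(state)
print(first == second)                 # always True: repeating the measurement returns the same value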
The Orthodox Copenhagen Interpretation (Bohr)
What about the state of the system in between measurements? Do observable quantities
really have particular values that measurement reveals? Or are there real values that exist
before measurement that the measurement process itself alters indeterministically? In
either case, the system as described by von Neumann's state vectors (and Schrödinger's
wave equation) would have to be incomplete, since the state vector is not always an
"eigenvector," i.e., it does not always lie along an axis that represents a particular one of
the possible values that a particular observable quantity can have (such as spin "up"). Heisenberg
at first took the view that measurement indeterministically alters the state of the system
that existed before measurement. This implies that the system was in a definite state
before measurement, and that the quantum mechanical formalism gives an incomplete
description of physical systems. Famously, N. Bohr proposed an interpretation of the
quantum mechanical formalism that denies that the description given by the quantum
mechanical formalism is incomplete. According to Bohr, it only makes sense to attribute
values to observable quantities of a physical system when the system is being measured in a
particular way. Descriptions of physical systems therefore only make sense relative to
particular contexts of measurement. (This is Bohr's solution to the puzzling wave-particle
duality exhibited by entities such as photons and electrons: the "wave" and "particle"
aspects of these entities are "complementary," in the sense that it is physically impossible
to construct a measuring device that will measure both aspects simultaneously. Bohr
concluded that from a physical standpoint it only makes sense to speak about the "wave"
or "particle" aspects of quantum entities as existing relative to particular measurement
procedures.) One consequence of Bohr's view is that one cannot even ask what a physical
system is like between measurements, since any answer to this question would
necessarily have to describe what the physical system is like, independent of any
particular context of measurement.
It is important to distinguish the two views just described, and to distinguish Bohr's view
from a different view that is sometimes attributed to him:
• Heisenberg: A physical system's observable properties always have definite values between measurements, but we can never know what those values are, since the values can only be determined by measurement, which indeterministically disturbs the system.
• Bohr: It does not make sense to attribute definite values to a physical system's observable properties except relative to a particular kind of measurement procedure, and then only when that measurement is actually being performed.
• Pseudo-Bohr: A physical system's observable properties have definite values between measurements, but these values are not precise, as they are when the system's observable properties are being measured; rather, the values of the system's observable quantities before measurement are "smeared out" between the particular values that the observable quantity could have upon measurement.
Each of these views interprets a superposition differently, as a representation of our
ignorance about the true state of the system (Heisenberg), as a representation of the
values that the various observable quantities of the system could have upon measurement
(Bohr), or as a representation of the indefinite, imprecise values that the observable
quantities have between measurements (Pseudo-Bohr). Accordingly, each of these views
interprets projection or collapse differently, as a reduction in our state of ignorance
(Heisenberg), as the determination of a definite result obtained when a particular
measurement procedure is performed (Bohr), or as an instantaneous localization of the
"smeared out," imprecise values of a particular observable quantity (Pseudo-Bohr).
Next time we will discuss several problematic aspects of the Copenhagen understanding
of measurement, along with several alternative views.
Lecture 24
4/28/94
The Measurement Problem, Part II
Last time, we saw that Bohr proposed an interpretation of the quantum mechanical
formalism, the Copenhagen interpretation, that has become the received view among
physicists. Today we will look more closely at the picture that view gives us of physical
reality and the measurement process in particular. Then we will examine several
alternative interpretations of measurement, proceeding from the least plausible to the
most plausible alternatives.
The Copenhagen Understanding Of Measurement
According to Bohr, it only makes sense to attribute values to observable quantities of a
physical system when the system is being measured in a particular way. Descriptions of
physical systems therefore only make sense relative to particular contexts of
measurement. (This is Bohr's solution to the puzzling wave-particle duality exhibited by
entities such as photons and electrons: the "wave" and "particle" aspects of these entities
are "complementary," in the sense that it is physically impossible to construct a
measuring device that will measure both aspects simultaneously. Bohr concluded that
from a physical standpoint it only makes sense to speak about the "wave" or "particle"
aspects of quantum entities as existing relative to particular measurement procedures.)
One consequence of Bohr's view is that one cannot even ask what a physical system is
like between measurements, since any answer to this question would necessarily have to
describe what the physical system is like, independent of any particular context of
measurement.
On Bohr's view (which is generally accepted as the "orthodox" interpretation of quantum
mechanics even though it is often misunderstood), the world is divided into two realms of
existence, that of quantum systems, which behave according to the formalism of quantum
mechanics and do not have definite observable values outside the context of
measurement, and that of "classical" measuring devices, which always have definite values
but are not described within quantum mechanics itself. The line between the two realms
is arbitrary: at any given time, one can consider a part of the world that serves as a
"measuring instrument" (such as the Stern-Gerlach device) either as a quantum system
that interacts with other quantum systems according to the deterministic laws governing
the state vector (Schrödinger's equation) or as a measuring device that behaves
"classically" (i.e., always has definite observable properties) though indeterministically.
There are several difficulties with this view, which together constitute the "measurement
problem." To begin with, the orthodox interpretation gives no principled reason why
physics should not be able to give a complete description of the measurement process.
After all, a measuring device (such as a Stern-Gerlach magnet) is a physical system, and
in performing a measurement it simply interacts with another physical system such as a
photon or an electron. However, (a) there seems to be no principled reason why one
particular kind of physical interaction is either indescribable within quantum physics
itself (as Bohr suggests), or is not subject to the same laws (e.g., Schrödinger's equation)
that govern all other physical interactions, and (b) the orthodox view offers no precise
characterization that would mark off those physical interactions that are measurements
from those physical interactions that are not. Indeed, the orthodox interpretation claims
that whether a certain physical interaction is a "measurement" is arbitrary, i.e., a matter of
choice on the part of the theorist modeling the interaction. However, this hardly seems
satisfactory from a physical standpoint. It does not seem to be a matter of mere
convenience where we are to draw the line between the "classical" realm of measuring
devices and the realm of those physical systems that obey the deterministic laws of that
formalism (in particular, Schrödinger's equation). For example, Schrödinger pointed out
that the orthodox interpretation allows for inconsistent descriptions of the state of
macroscopic systems, depending on whether we consider them measuring devices. For
example, suppose that you placed a cat in an enclosed box along with a device that will
release poisonous gas if (and only if) a Geiger counter measures that a certain radium
atom has decayed. According to the quantum mechanical formalism, the radium atom is
in a superposition of decaying and not decaying, and since there is a correlation between
the state of the radium atom and the Geiger counter, and between the state of the Geiger
counter and the state of the cat, the cat should also be in a superposition, specifically, a
superposition of being alive and dead, if we do not consider the cat to be a measuring
device. If the orthodox interpretation is correct, however, this would mean that there is no
fact of the matter about whether the cat is alive or dead, if we consider the cat not to be a
measuring device. On the other hand, if we consider the cat to be a measuring device,
then according to the orthodox interpretation, the cat will either be definitely alive or
definitely dead. However, it certainly does not seem to be a matter of "arbitrary" choice
whether (a) there is no fact of the matter about whether the cat is alive or dead before we
look into the box, or (b) the cat is definitely alive or dead before we look into the box.
Wigner's Idealism: Consciousness As The Cause Of Collapse
Physicist E. Wigner argued, by emphasizing its counterintuitive consequences, against Bohr's view that the distinction between measurement and "mere" physical interaction can be drawn arbitrarily. Suppose that you put one of Wigner's friends in the box
with the cat. The "measurement" you make at a given time is to ask Wigner's friend if the
cat is dead or alive. If we consider Wigner's friend as part of the experimental setup, quantum
mechanics predicts that before you ask Wigner's friend whether the cat is dead or alive,
he is in a superposition of definitely believing the cat is dead and definitely believing that
the cat is alive. Wigner argued that this was an absurd consequence of Bohr's view.
People simply do not exist in superposed belief-states. Wigner's solution was that,
contrary to what Bohr claimed, there is a natural division between what constitutes a
measurement and what does not--the presence of a conscious observer. Wigner's friend is
conscious; thus, he can by an act of observation cause a collapse of the wave function.
(Alternately, if we consider the cat to be a conscious being, the cat would be definitely
alive or definitely dead even before Wigner's friend looked to see how the cat was doing.)
Few physicists have accepted Wigner's explanation of what constitutes a measurement,
though his view has "trickled down" to some of the (mainly poor quality) popular
literature on quantum mechanics (especially the type of literature that sees a direct
connection between quantum mechanics and Eastern religious mysticism). The basic
source of resistance to Wigner's idealism is that it requires that physicists solve many
philosophical problems they'd like to avoid, such as whether cats are conscious beings.
More seriously, Wigner's view requires a division of the world into two realms, one
occupied by conscious beings who are not subject to the laws of physics but who can
somehow miraculously disrupt the ordinary deterministic evolution of the physical
systems, and the other by the physical systems themselves, which evolve
deterministically until a conscious being takes a look at what's going on. This is hardly
the type of conceptual foundation needed for a rigorous discipline such as physics.
The Many-Worlds Interpretation
Another view, first developed by H. Everett (a graduate student at Princeton) in his Ph.D.
thesis, is called the many-worlds interpretation. It was later developed further by another
physicist, B. DeWitt. This view states that there is no collapse when a measurement
occurs; instead, at each such point where a measurement occurs the universe "branches"
into separate, complete worlds, a separate world for every possible outcome that
measurement could have. In each branch, it looks like there the measuring devices
indeterministically take on a definite value, and the empirical frequencies that occur upon
repeated measurement in almost every branch converge on the probabilities predicted by
the Projection Postulate. The deterministically evolving wave function correctly describes
the quantum mechanical state of the universe as a whole, including all of its branches.
Everett's view has several advantages, despite its conceptual peculiarity. First, it can be
developed with mathematical precision. Second, it postulates that the universe as a whole
evolves deterministically according to Schrödinger's equation, so that you need only one
type of evolution, not Schrödinger's equation plus collapse during a measurement. Third,
the notion of a measurement (which is needed to specify where the branching occurs) can
be spelled out precisely and non-arbitrarily as a certain type of physical interaction.
Fourth, it makes sense to speak of the quantum state of the universe as a whole on
Everett's view, which is essential if you want to use quantum mechanical laws to develop
cosmological theories. By contrast, on the orthodox, Copenhagen view, quantum
mechanics can never describe the universe as a whole since quantum mechanical
description is always relativized to an arbitrarily specified measuring device, which is not
itself given a quantum mechanical description.
On the other hand, the many worlds interpretation has several serious shortcomings. First,
it's simply weird conceptually. Second, it does not account for the fact that we never
experience anything like a branching of the world. How is it that our experience is unified
(i.e., how is it that my experience follows one particular path in the branching universe
and not others)? Third, as noted above, the many worlds interpretation predicts that there
will be worlds where the observed empirical frequencies of repeated measurements will
not fit the predictions of the quantum mechanical formalism. In other words, if the theory
is true, there will be worlds at which it looks as if the theory is false! Finally, the theory
does not seem to give a clear sense to locutions such as "the probability of this electron
going up after passing through this Stern-Gerlach device is 1/2." What exactly does that
number 1/2 mean, if the universe simply branches into two worlds, in one of which the
electron goes up and in the other of which it goes down? The same branching would
occur, after all, if the electron were in a state where the probability of its going up were
3/4 and the probability of its going down were 1/4.
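The worry can be put in a couple of lines of arithmetic, using the lecture's 3/4 and 1/4 example: whatever the amplitudes are, there are just two branches, so counting branches cannot recover the quantum probabilities.

# Born-rule probabilities for "up"/"down" versus naive branch counting
amp_up = 0.75 ** 0.5                # amplitude whose square is 3/4
amp_down = 0.25 ** 0.5              # amplitude whose square is 1/4

born_up = amp_up ** 2               # ~0.75 according to the quantum formalism
branches = 2                        # the world branches once for each possible outcome
branch_count_up = 1 / branches      # 0.5 regardless of the amplitudes

print(born_up, branch_count_up)     # 0.75 vs 0.5: branch counting ignores the amplitudes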
Because of problems like this, the many worlds interpretation has gained much notoriety
among physicists, but almost no one accepts it as a plausible interpretation.
The Ghirardi-Rimini-Weber (GRW) Interpretation
In 1986, three physicists (G.C. Ghirardi, A. Rimini, and T. Weber) proposed a way of
accounting for the fact that macroscopic objects (such as Stern-Gerlach devices, cats, and
human beings) are never observed in superpositions, whereas microscopic systems (such
as photons and electrons) are. (Their views were developed further in 1987 by physicist
John Bell.) According to the Ghirardi-Rimini-Weber (GRW) interpretation, there is a
very small probability (one in a trillion) that the wave functions for the positions of
isolated, individual particles will collapse spontaneously at any given moment. When
particles couple together to form an object, the tiny probabilities of spontaneous collapse
quickly add up for the system as a whole, since when one particle collapses so does every
particle to which that particle is coupled. Moreover, since even a microscopic bacterium
is composed of trillions and trillions of subatomic particles, the probability that a
macroscopic object will have a definite spatial configuration at any given moment is
overwhelmingly close to 100 percent. Thus, GRW can explain in a mathematically precise
way why we never observe macroscopic objects in superpositions, but often observe
interference effects due to superposition when we're looking at isolated subatomic
particles. Moreover, they too can give a precise and non-arbitrary characterization of the
measurement process as a type of physical interaction.
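A rough back-of-the-envelope illustration of how the tiny per-particle probabilities add up; the per-particle figure is the lecture's "one in a trillion," and the particle count for a macroscopic object is an illustrative assumption of my own.

# Per-particle collapse probability at a given moment (the lecture's "one in a trillion")
p = 1e-12

def p_at_least_one_collapse(n_particles):
    # probability that at least one of n coupled particles collapses,
    # which (on GRW) localizes the whole coupled system
    return 1.0 - (1.0 - p) ** n_particles

print(p_at_least_one_collapse(1))        # ~1e-12: an isolated particle essentially never collapses
print(p_at_least_one_collapse(10**20))   # ~1.0: a macroscopic object is localized almost at once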
There are two problems with the GRW interpretation. First, though subatomic particles
are localized to some degree (in that the collapse turns the wave function representing the
position of those particles into a narrowly focused bell curve), they never do have precise
positions. The tails of the bell curve never vanish, extending to infinity. (Draw diagram
on board.) Thus, GRW never give us particles with definite positions or even with small
but spatially extended positions: instead, what we get are particles having "mostly" small
but spatially extended positions. Importantly, this is also true of macroscopic objects
(such as Stern-Gerlach devices, cats, and human beings) that are composed of subatomic
particles. In other words, according to GRW you are "mostly" in this room, but there's a
vanishingly small part of you at every other point in the universe, no matter how distant!
Second, GRW predicts that energy is not conserved when spontaneous collapses occur.
While the total violation of conservation of energy predicted by GRW is too small to be
observed, even over the lifetime of the universe, it does discard a feature of physical
theory that many physicists consider to be conceptually essential.
Bohm's Interpretation
In 1952, physicist David Bohm formulated a complete alternative to standard (non-
relativistic) quantum mechanics. Bohm's theory was put into a relatively simple and
elegant mathematical form by John Bell in 1982. Bohm's theory makes the same
predictions as does standard (non-relativistic) quantum mechanics but describes a
classical, deterministic world that consists of particles with definite positions. The way
that Bohm does this is by postulating that besides particles, there is a quantum force that
moves the particles around. The physically real quantum force, which is determined by the wave function evolving according to Schrödinger's equation, pushes the particles around so that they
behave exactly as standard quantum mechanics predicts. Bohm's theory is deterministic:
if you knew the initial configuration of every particle in the universe, applying Bohm's
theory would allow you to predict with certainty every subsequent position of every
particle in the universe. However, there's a catch: the universe is set up so that it is a
physical (rather than a merely practical) impossibility for us to know the configuration of
particles in the universe. Thus, from our point of view the world behaves just as if it's
indeterministic (though the apparent indeterminism is simply a matter of our ignorance);
also, though the wave function governing the particle's motion never collapses, the
particle moves around so that it looks as if measurement causes it to collapse.
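In the simple first-order formulation favored by Bell, the wave guides each particle through a "guidance equation" for its velocity, v = (hbar/m) * Im(ψ'/ψ), evaluated at the particle's actual position. Here is a minimal one-dimensional sketch; the plane-wave example and the numerical values are my own illustrative choices, not anything from the lecture.

import numpy as np

hbar = 1.054571817e-34          # reduced Planck constant (J*s)
m = 9.109e-31                   # electron mass (kg); the choice of particle is illustrative

def bohm_velocity(psi, dpsi_dx, x):
    # Guidance equation in one dimension: v(x) = (hbar/m) * Im( psi'(x) / psi(x) )
    return (hbar / m) * np.imag(dpsi_dx(x) / psi(x))

# Illustrative plane wave psi(x) = exp(i k x); the guidance equation then
# gives the constant velocity hbar*k/m, as one would expect for a free particle.
k = 1.0e10                      # wave number (1/m), illustrative
psi = lambda x: np.exp(1j * k * x)
dpsi_dx = lambda x: 1j * k * np.exp(1j * k * x)

print(bohm_velocity(psi, dpsi_dx, 0.0))   # equals hbar * k / m
print(hbar * k / m)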
Despite its elegance and conceptual clarity, Bohm's theory has not been generally
accepted by physicists, for two reasons. (1) Physicists can't use Bohm's theory, since it's
impossible to know the configuration of all particles in the universe at any given time.
Because of our ignorance of this configuration, the physical theory we have to use to
generate predictions is standard quantum mechanics. Why then bother with Bohm's
theory at all? (2) What is more important, in Bohm's theory all the particles in the
universe are intimately connected so that every particle instantaneously affects the
quantum force governing the motions of the other particles. Many physicists (Einstein
included) saw this feature of Bohm's theory as a throwback to the type of "action at a
distance" expunged by relativity theory. Moreover, Bohm's theory assumes that
simultaneity is absolute and so is inconsistent with relativity theory in its present form.
(There is no Bohmian counterpart to relativistic quantum mechanics, though people are
currently working on producing one.)
Conclusion
Each of the various interpretations of quantum mechanics we have examined, from the
orthodox Copenhagen interpretation to the four non-orthodox interpretations (idealistic,
many-worlds, GRW, and Bohm) has some sort of conceptual shortcoming. This is what
makes the philosophy of quantum mechanics so interesting: it shows us that the fact that
experiments agree with the predictions of quantum mechanics, under any of these
interpretations, indicates one thing with certainty--the world we live in is a very weird
place!