Eur. J. Phys. 17 (1996) 172–175. Printed in the UK
Derivation of the Dirac equation from
a relativistic representation of spin
L Lerner
Department of Physics and Theoretical Physics, Australian National University, Canberra, Australia 0200
Received 20 December 1995
Abstract. A derivation of the Dirac and Weyl equations is
presented which does not assume a first-order time derivative
but instead is based on the relativistic transformation of
spinors in ordinary quantum mechanics. The Dirac and Weyl
4-currents, the transformation properties of the Dirac and
Weyl spinors, and the Lorentz covariance of the resulting
equations are established in the course of the derivation.
Résumé. On établit les équations de Dirac et de Weyl sans
supposer a priori que la dérivée temporelle est du premier
ordre; on se base plutôt sur la transformation relativiste des
spineurs en mécanique quantique ordinaire. On obtient par
cette méthode les quadricourants de Dirac et de Weyl, les
propriétés de transformation des spineurs de Dirac et de
Weyl, ainsi que la covariance de Lorentz des équations.
1. Introduction
A standard presentation of the Dirac equation to
undergraduate physics students follows the path of
its originator [1].
One attempts to find a Lorentz
invariant classical field equation of first order so as
to cure, it is stipulated, the problems associated with
the Klein–Gordon equation and arrive at a positive
definite charge density. That the resulting minimum-
dimensional solution to this equation takes the form of
a four-component field, which a posteriori is shown
to have spin-1/2, is frequently presented as a special
relativistic prediction of, or demand for, spin (see the
references in [2]). This analysis is normally followed
by a determination of the transformation properties of
the Dirac field, and the establishment of the relevant
current.
Such an approach is typical of a mathematical
investigation,
starting as it does with an initial
mathematical formulation, proceeding to the discovery
of the solution and its properties, and a subsequent
association of these with the features of a physical
system. Physical derivations generally run in the reverse
order.
The former approach has the drawback that
it is not physically transparent, presenting much as
a consequence of mathematics, and this leads many
physics students to feel uncomfortable with the result.
It also leads to the perception that the mathematical
features are amenable to an interpretation in terms of
a physical spin-1/2 particle only by chance.
Another drawback of the approach of [1] is the
misunderstanding it generates as to the origin of
spin in relativistic quantum mechanics.
We refer to
such statements in the literature as ‘Dirac by means of
a proper relativistic treatment of the wave equation was
able to show that the spin was a necessary consequence
of the principle of relativity’ [3] and ‘spin and all
its properties are now understood as a necessary and
natural consequence of the theory of relativity combined
with wave mechanics’ [4].
Reference [2] contains a fuller list
of such quotations from the literature as well as a good
explanation of the problem associated with them (see
also [5]). The essence of the objection to them is as
follows. It is well known that further investigation of the
Dirac equation reveals that it is less physically motivated
than originally appears. The first-order equation and
associated definite sign of probability and charge density
is inadequate to solve the problems of the Klein–Gordon
equation. This is due to the fact that some solutions of
the Dirac equation can only be interpreted in terms of
antiparticles (which are a prediction of relativity) and
these require the existence of a charge density with
indeterminate sign. Other problems also appear, such
as zitterbewegung [6], negative scattering cross sections
[6], and so on. Hence the original reason for a first-
order equation has to be abandoned a posteriori, and
so the derivation is no longer physically based. The
‘prediction’ of a spin-1/2 field is of course due to the
fact that the theories of special relativity and quantum
mechanics together form a very constrained system, so
that the requirement of a first-order equation with a
minimum-dimensional solution is adequate to pick out
the spin-1/2 field.
The current paper presents a physical derivation
of the Dirac equation.
We start from the rotational
transformation properties of a non-relativistic spin-1/2
particle, which are well known to undergraduate physics
students, and proceed to establish the behaviour of this
field under relativistic transformations. This leads to
the discovery of the relativistic spin-1/2 current, that is
the Dirac current. The requirement that this current is
conserved leads to a unique determination of the Lorentz
invariant equation satisfied by the relativistic spin-1/2
field.
The present derivation also shares the advantage
of [7]: it is Lorentz covariant ab initio, so this property
does not have to be checked after the derivation.
A similar approach to the problem was adopted
by several authors [8], who used advanced group
theoretic methods. However, the knowledge of group
theory required is generally beyond that of the average
physicist, let alone an undergraduate student.
The
present method follows along the lines of Feynman [9],
the main point of difference being that the field equation
is derived from the current in a straightforward manner,
without any further suppositions.
Initially it might appear that the present derivation
is considerably longer than the standard one. However,
one should note that most of the standard properties
of the Dirac equation are obtained in the course of
the one derivation, so that the overall length of both
presentations is comparable.
On the other hand, the
present approach is based on an understanding of
spin in the context of ordinary quantum mechanics
and so is claimed to give a much better physical
understanding of the equation and consequently be
superior pedagogically.
(We assume $c = \hbar = 1$ throughout.)
2. Lorentz transformation of a spin-1/2 field
In non-relativistic quantum mechanics a two-component
spin-1/2 wavefunction $u$ transforms under a rotation $R$ by
angle $\theta$ about axis $\hat{\mathbf{n}}$ as [10]
$$u \to e^{i\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}\,\theta/2}\,u \tag{1}$$
where $\boldsymbol{\sigma}$ are the Pauli matrices. It is easily checked by
matrix multiplication that a three-dimensional vector $a^i$
transforming by the usual rotation matrix, $a^i R^j{}_i$, can be
represented by $a^i\sigma_i$ in the spinor representation in (1)
so that
$$e^{-i\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}\theta/2}\,(a^i\sigma_i)\,e^{i\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}\theta/2} = (a^i R^j{}_i)\,\sigma_j. \tag{2}$$
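Multiplying by the spin state $u$ on the right, and by its
Hermitian conjugate on the left we obtain
$$a^i u^\dagger\sigma_i u \to a^i u^\dagger e^{-i\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}\theta/2}\,\sigma_i\,e^{i\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}\theta/2}u = a^i R^j{}_i\,u^\dagger\sigma_j u. \tag{3}$$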
Since the $a^i$ are three arbitrary components we also have
$$u^\dagger\sigma_i u \to R^j{}_i\,u^\dagger\sigma_j u \tag{4}$$
hence the quantity $u^\dagger\sigma_i u$ is a 3-vector.
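The matrix multiplication behind (2) can also be carried out numerically. The following is a minimal sketch, not part of the derivation itself: the helper names (`rotation_spinor`, `dot_sigma`, `components`) and the test vector are illustrative assumptions, and the exponential is evaluated with the closed form $e^{i\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}\theta/2} = \cos(\theta/2) + i\sin(\theta/2)\,\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}$. For a rotation about the $z$ axis the transformed components should coincide with the familiar rotation of $(a_x, a_y)$ through $\theta$.

```python
import math

# Pauli matrices and the 2x2 identity as nested lists.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
I2 = [[1, 0], [0, 1]]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Hermitian conjugate of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def dot_sigma(v):
    """Return v . sigma for a real 3-vector v."""
    return [[sum(v[k] * PAULI[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]


def rotation_spinor(n, theta):
    """exp(i n.sigma theta/2) = cos(theta/2) I + i sin(theta/2) n.sigma, |n| = 1."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    ns = dot_sigma(n)
    return [[c * I2[i][j] + 1j * s * ns[i][j] for j in range(2)]
            for i in range(2)]


def components(m):
    """Recover a_j from m = a . sigma via a_j = Tr(sigma_j m) / 2."""
    comps = []
    for k in range(3):
        prod = matmul(PAULI[k], m)
        comps.append((prod[0][0] + prod[1][1]) / 2)
    return comps


theta = 0.7
a = [1.0, 2.0, 3.0]
U = rotation_spinor([0.0, 0.0, 1.0], theta)
# Left-hand side of (2): exp(-i n.sigma theta/2) (a.sigma) exp(+i n.sigma theta/2).
a_rot = components(matmul(matmul(dagger(U), dot_sigma(a)), U))

# Ordinary rotation of (a_x, a_y) through theta about the z axis.
expected = [
    a[0] * math.cos(theta) - a[1] * math.sin(theta),
    a[0] * math.sin(theta) + a[1] * math.cos(theta),
    a[2],
]
max_dev = max(abs(x - y) for x, y in zip(a_rot, expected))
print("largest deviation from the z-axis rotation:", max_dev)
```

The same helpers can be used with any unit axis; the transformed components then remain real and the length of the vector is preserved, as expected of a rotation.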
We can extend the representation of three-dimensional
rotations (4) to four dimensions by defining a spinor
representation of a 4-vector $a^\mu$ as
$a^\mu\sigma_\mu = a^0\sigma_0 - \mathbf{a}\cdot\boldsymbol{\sigma}$, where we have defined $\sigma_0 = I$. The
matrices of this representation are Hermitian since the
Pauli matrices and the identity are also such. If one
transforms $a^\mu\sigma_\mu$ by the unimodular matrix $\Lambda$, so that
$a^\mu\sigma_\mu \to \Lambda\,a^\mu\sigma_\mu\,\Lambda^\dagger = a'^\mu\sigma_\mu$, the invariant length
$\det(a^\mu\sigma_\mu) = a_0^2 - a_1^2 - a_2^2 - a_3^2$ is unchanged and the
transformed matrix is also Hermitian. Thus $a'^\mu$ represents
another spacetime point. Then $a'^\mu\sigma_\mu$ must be
related to the original 4-vector by a Lorentz transformation
$\Lambda$, just as $a^i\sigma_i$ in (3) was related to its transform
by the rotation matrix, so
$$a^\mu u^\dagger\sigma_\mu u \to a^\mu u^\dagger L(\Lambda)\,\sigma_\mu\,L^\dagger(\Lambda)u = a^\nu\Lambda^\mu{}_\nu\,u^\dagger\sigma_\mu u, \tag{5}$$
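where the requirement is now to find $L(\Lambda)$. If we write
$\Lambda = e^{\alpha}$ then we must have $\mathrm{Tr}\,\alpha = 0$ for $\Lambda$ to be
unimodular. There are six such independent matrices,
which we can choose to be the Pauli matrices $\boldsymbol{\sigma}$ and $i\boldsymbol{\sigma}$,
giving the transformation matrices as
$$L(\Lambda) = e^{i\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}\,\theta/2} \quad\text{and}\quad L(\Lambda) = e^{-\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}\,w/2}. \tag{6}$$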
We see at once that the first set of matrices corresponds
to three-dimensional rotations (1), so that (6) is in accord
with (5) when $\Lambda = R$. The second set represents
boosts and this can be checked by first verifying (5)
for a boost $\Lambda$ in the $z$ direction, $\hat{\mathbf{n}} = (0, 0, 1)$, and then
using the fact that any boost $\Lambda'$ can be represented in
terms of $\Lambda$ by $\Lambda' = R\Lambda R^{-1}$. From (5) we see that
the quantity $u^\dagger\sigma_\mu u$ must be a covariant 4-vector, hence
if $y^\mu$ is a contravariant 4-vector, $y^\mu u^\dagger\sigma_\mu u$ is a Lorentz
invariant quantity. Since relativistic boosts correspond
to a rotation of the vector $x_i = (it, x_1, x_2, x_3)$
by the imaginary angle $\pm iw$ [11], we see that $w$ in (6) is the rapidity
increment of the boost.
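The check for a boost along $z$ can be sketched numerically as follows. This is an illustration under stated assumptions rather than part of the argument: the matrix representing a 4-vector is taken as $a^0 I - \mathbf{a}\cdot\boldsymbol{\sigma}$ as defined above, the boost matrix is evaluated as $e^{-\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}w/2} = \cosh(w/2) - \sinh(w/2)\,\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}$, and the function names and numerical values are hypothetical. The transformed components are compared with a hyperbolic rotation of $(a^0, a^3)$ through the rapidity $w$; the sign convention for the boost direction is not fixed by this check.

```python
import math

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
I2 = [[1, 0], [0, 1]]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Hermitian conjugate of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def dot_sigma(v):
    """Return v . sigma for a real 3-vector v."""
    return [[sum(v[k] * PAULI[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]


def four_vector_matrix(a):
    """a^mu sigma_mu = a^0 I - a.sigma, the convention used in the text."""
    s = dot_sigma(a[1:])
    return [[a[0] * I2[i][j] - s[i][j] for j in range(2)] for i in range(2)]


def boost_spinor(n, w):
    """exp(-n.sigma w/2) = cosh(w/2) I - sinh(w/2) n.sigma for a unit vector n."""
    ch, sh = math.cosh(w / 2), math.sinh(w / 2)
    ns = dot_sigma(n)
    return [[ch * I2[i][j] - sh * ns[i][j] for j in range(2)] for i in range(2)]


def components(m):
    """Invert four_vector_matrix: a^0 = Tr(m)/2, a^j = -Tr(sigma_j m)/2."""
    comps = [(m[0][0] + m[1][1]) / 2]
    for k in range(3):
        prod = matmul(PAULI[k], m)
        comps.append(-(prod[0][0] + prod[1][1]) / 2)
    return comps


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


w = 0.5
a = [2.0, 0.3, -0.4, 1.0]
L = boost_spinor([0.0, 0.0, 1.0], w)
X = four_vector_matrix(a)
X_new = matmul(matmul(L, X), dagger(L))
a_new = components(X_new)

# Hyperbolic rotation of (a^0, a^3) through rapidity w; a^1 and a^2 unchanged.
expected = [
    a[0] * math.cosh(w) + a[3] * math.sinh(w),
    a[1],
    a[2],
    a[0] * math.sinh(w) + a[3] * math.cosh(w),
]
max_dev = max(abs(x - y) for x, y in zip(a_new, expected))
det_dev = abs(det2(X_new) - det2(X))
print("deviation from a rapidity-w boost:", max_dev)
print("change in the invariant length:", det_dev)
```

Because the exponent is linear in $w$, two successive boosts along the same axis multiply to a single boost with the summed rapidity, which is the sense in which $w$ is a rapidity increment.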
3. The Dirac current and field equation
3.1. The two-component relativistic equation
The fact that the time component of the 4-vector
$u^\dagger\sigma_\mu u$ looks exactly like the probability density in
non-relativistic quantum mechanics, and the absence of any
other bilinear combination of spin fields $u$ forming a
4-vector, suggests that $u^\dagger\sigma_\mu u$ should be interpreted as the
relativistic spin-1/2 particle 4-current. Now, the physical
requirement that the total probability is unity as seen
in all Lorentz frames demands that the integral of the
probability density over all infinite spacelike surfaces
$S^\mu$ be invariant [9], $\int J_\mu\,\mathrm{d}S^\mu = \text{constant}$, which requires
that the 4-current be conserved
$$\partial^\mu(u^\dagger\sigma_\mu u) = 0. \tag{7}$$
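Since the Pauli matrices are Hermitian, the left-hand side of (7) is the
sum of $u^\dagger\partial^\mu(\sigma_\mu u)$ and its complex conjugate, so this can be
written as
$$u^\dagger\partial^\mu(\sigma_\mu u) = \text{pure imaginary}. \tag{8}$$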
Now the superposition principle of quantum mechanics
requires that (8) reduces to a linear equation for
field $u$. The only possible equation satisfying the
requirements is
$$i\partial^\mu(\sigma_\mu u) + Mu = 0 \tag{9}$$
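where $M$ is Hermitian. However, as $u^\dagger\partial^\mu(\sigma_\mu u)$ is
an invariant and $u^\dagger \to u^\dagger e^{-\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}w/2}$, the first term of
(9) transforms as $\partial^\mu(\sigma_\mu u) \to e^{\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}w/2}\,\partial^\mu(\sigma_\mu u)$, so that
$M = 0$ is required to ensure covariance. A Fourier
transform of (9) gives
$$p^\mu\sigma_\mu u = 0. \tag{10}$$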
Rewriting (10) as $(E + \mathbf{p}\cdot\boldsymbol{\sigma})u = 0$ and using the fact
that $\hat{\mathbf{p}}\cdot\boldsymbol{\sigma}$ is the helicity operator in quantum mechanics
we see that (9) and its Fourier transform (10) can
only describe a particle with negative helicity. From
relativistic invariance it also follows that the particle
has zero mass. We could alternatively have obtained
a field equation for a particle of positive helicity by
choosing the $+$ sign in (6) so that the new spinor $v$ now
satisfies
$$p^\mu\bar{\sigma}_\mu v = 0 \tag{11}$$
with the corresponding current being $v^\dagger\bar{\sigma}_\mu v$, where
$\bar{\sigma} = (\sigma_0, -\boldsymbol{\sigma})$.
Equations (10) and (11) are
called the Weyl equations.
They can describe
particles of definite helicity (which must thus be
massless) and seem to apply well to the
neutrino/antineutrino.
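The helicity and mass statements for the Weyl equations can be illustrated with a short numerical sketch. It assumes the momentum-space forms $(E + \mathbf{p}\cdot\boldsymbol{\sigma})u = 0$ and $(E - \mathbf{p}\cdot\boldsymbol{\sigma})v = 0$; the function names and the sample momentum are hypothetical. A non-trivial solution exists only when the determinant $E^2 - |\mathbf{p}|^2$ vanishes, and the solution is then an eigenstate of $\hat{\mathbf{p}}\cdot\boldsymbol{\sigma}$.

```python
import math

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
I2 = [[1, 0], [0, 1]]


def dot_sigma(v):
    """Return v . sigma for a real 3-vector v."""
    return [[sum(v[k] * PAULI[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]


def weyl_operator(energy, p, sign):
    """E I + sign * p.sigma (sign=+1 for the u equation, -1 for the v equation)."""
    ps = dot_sigma(p)
    return [[energy * I2[i][j] + sign * ps[i][j] for j in range(2)]
            for i in range(2)]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def null_spinor(m):
    """A non-zero solution of m u = 0 for a singular, non-zero 2x2 matrix m."""
    (a, b), (c, d) = m
    if abs(a) + abs(b) > 1e-12:
        return [-b, a]
    return [-d, c]


def helicity(u, p):
    """Expectation value of p_hat . sigma in the spinor u."""
    norm_p = math.sqrt(sum(x * x for x in p))
    hs = dot_sigma([x / norm_p for x in p])
    hu = [hs[i][0] * u[0] + hs[i][1] * u[1] for i in range(2)]
    num = sum(complex(u[i]).conjugate() * hu[i] for i in range(2))
    den = sum(abs(u[i]) ** 2 for i in range(2))
    return (num / den).real


p = [0.3, -1.2, 0.4]
E = math.sqrt(sum(x * x for x in p))      # massless dispersion, E = |p|

u = null_spinor(weyl_operator(E, p, +1))  # solves (E + p.sigma) u = 0
v = null_spinor(weyl_operator(E, p, -1))  # solves (E - p.sigma) v = 0

print("helicity of u:", helicity(u, p))
print("helicity of v:", helicity(v, p))
print("det at E = |p|:", abs(det2(weyl_operator(E, p, +1))))
print("det at E = 2.0:", abs(det2(weyl_operator(2.0, p, +1))))
```

With $E \neq |\mathbf{p}|$ the determinant is non-zero, so only the trivial solution survives; this is the numerical counterpart of the statement that a spinor of definite helicity must be massless.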
3.2. The four-component equations
It is clear from the above that a two-component spinor
cannot be used to describe a relativistic massive spin-1/2
particle, as the latter can have both signs of helicity. Its
current must therefore be a linear combination of the
available bilinear forms for the spin-1/2 field. Symmetry
of $u$ and $v$ states for a massive particle requires
$$J_\mu = u^\dagger\sigma_\mu u + v^\dagger\bar{\sigma}_\mu v \tag{12}$$
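which is equivalent to the Dirac current [6]. Since this
particle is now described by four independent functions
we choose to define a 4-component state vector $\varphi$ and
matrices $\alpha_\mu$ as
$$\varphi = \begin{pmatrix} u \\ v \end{pmatrix}, \qquad \alpha_\mu = \begin{pmatrix} \sigma_\mu & 0 \\ 0 & \bar{\sigma}_\mu \end{pmatrix} \tag{13}$$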
so that the current can now be written as
$$J_\mu = \varphi^\dagger\alpha_\mu\varphi. \tag{14}$$
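The transformation properties of the spinor $\varphi$ are evident
from (6) and (11):
$$\varphi \to S(\Lambda)\varphi, \qquad S(\Lambda) = \begin{pmatrix} e^{-\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}w/2} & 0 \\ 0 & e^{\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}w/2} \end{pmatrix}. \tag{15}$$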
Applying the requirement of current conservation and
linearity as before, we obtain the field equation
$$i\partial^\mu(\alpha_\mu\varphi) + M\varphi = 0 \tag{16}$$
where $M$ must be Hermitian, and thus it can be
written as a linear combination of the $\gamma$ matrix algebra
$\{I, \gamma_\mu, \sigma_{\mu\nu}, \gamma_5\gamma_\mu, \gamma_5\}$ [6]. Now the requirement that (16)
is Lorentz invariant imposes a constraint on $M$:
$$S^{T}(\Lambda) = MS(\Lambda). \tag{17}$$
Since we wish (16) to describe a spin-1/2 particle in free
space we cannot use any 4-vector or tensor in composing
$M$, whose components must therefore be scalars. From
(17) it then follows that
$$M = a\gamma_5, \qquad \gamma_5 = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} \tag{18}$$
where $a$ is a constant. Its relation to a particle property
can be established by writing the Dirac equation in
momentum space, $(p^\mu\alpha_\mu + a\gamma_5)\varphi = 0$, multiplying by
$p^\mu\alpha_\mu - a\gamma_5$, and using the anticommutation relations
for the $\alpha$ matrices, $\{\alpha_\mu, \alpha_\nu\} = 2\delta_{\mu\nu}$, which follow from
(13), to find $p^2\varphi = a^2\varphi$. Thus from the relativistic
energy equation $p^2 = m^2$ it follows that $a = \pm m$,
and we can choose the positive sign without loss of
generality.
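Both properties of the mass term can be explored numerically. The sketch below rests on several assumptions that are not taken from the text: the covariance requirement is read as $S^\dagger M S = M$; the boost matrix is taken as $S = \mathrm{diag}(e^{-\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}w/2}, e^{\hat{\mathbf{n}}\cdot\boldsymbol{\sigma}w/2})$; the momentum-space operator is taken as $p^\mu\alpha_\mu = \mathrm{diag}(E + \mathbf{p}\cdot\boldsymbol{\sigma},\, E - \mathbf{p}\cdot\boldsymbol{\sigma})$; and the multiplier used is $\gamma_5(p^\mu\alpha_\mu)\gamma_5 - a\gamma_5$, that is, the operator with its two helicity blocks exchanged, chosen here so that the cross terms cancel identically. All function names and numerical values are hypothetical.

```python
import math

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Hermitian conjugate of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def dot_sigma(v):
    """Return v . sigma for a real 3-vector v."""
    return [[sum(v[k] * PAULI[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def add(a, b, scale=1.0):
    """Return a + scale * b."""
    n = len(a)
    return [[a[i][j] + scale * b[i][j] for j in range(n)] for i in range(n)]


def max_abs_diff(a, b):
    """Largest entry-wise deviation between two matrices."""
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))


def boost_2x2(n, w, sign):
    """exp(sign * n.sigma w/2) = cosh(w/2) I + sign * sinh(w/2) n.sigma."""
    ch, sh = math.cosh(w / 2), math.sinh(w / 2)
    ns = dot_sigma(n)
    return [[ch * I2[i][j] + sign * sh * ns[i][j] for j in range(2)]
            for i in range(2)]


I4 = block(I2, Z2, Z2, I2)
GAMMA5 = block(Z2, I2, I2, Z2)

# Covariance of the mass term under a boost (assumed reading: S^dagger M S = M).
n_hat, w = [0.6, 0.0, 0.8], 0.9
S = block(boost_2x2(n_hat, w, -1), Z2, Z2, boost_2x2(n_hat, w, +1))
mass_term_dev = max_abs_diff(matmul(matmul(dagger(S), GAMMA5), S), GAMMA5)
identity_dev = max_abs_diff(matmul(dagger(S), S), I4)

# Dispersion relation from the product of the two first-order operators.
E, p, a = 2.0, [0.3, -1.2, 0.4], 1.5
ps = dot_sigma(p)
upper = [[E * I2[i][j] + ps[i][j] for j in range(2)] for i in range(2)]
lower = [[E * I2[i][j] - ps[i][j] for j in range(2)] for i in range(2)]
A = block(upper, Z2, Z2, lower)                 # assumed form of p^mu alpha_mu
K = add(A, GAMMA5, a)
K_swapped = add(matmul(matmul(GAMMA5, A), GAMMA5), GAMMA5, -a)
scalar = E ** 2 - sum(x * x for x in p) - a ** 2
dispersion_dev = max_abs_diff(matmul(K_swapped, K), add(Z2_4 := [[0] * 4 for _ in range(4)], I4, scalar))

print("S^dagger gamma5 S - gamma5:", mass_term_dev)
print("S^dagger S - I (non-zero for a boost):", identity_dev)
print("product minus (E^2 - p^2 - a^2) I:", dispersion_dev)
```

Under these assumptions the product of the two operators is proportional to the unit matrix, so a non-trivial $\varphi$ requires $E^2 - |\mathbf{p}|^2 = a^2$, while the non-zero value of `identity_dev` indicates that a mass term proportional to the unit matrix would not survive a boost.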
The Dirac equation in (13)–(18) appears in what is
normally called spinor representation form.
Now as
in non-relativistic quantum mechanics we are free to
transform to a different representation of the spinor,
$\varphi \to U\varphi$, where $U$ is unitary. In this case it is clear that the
explicit form of the matrices in the Dirac equation (16)
will change according to $\alpha_\mu \to U\alpha_\mu U^{-1}$. It is clear
that a particle at rest is symmetric in the amount of
negative $u$ and positive $v$ helicity components, and
according to our definition of 4-spinor $\varphi$ in (13), such
a particle will be represented by $u = v$ in the spinor
representation of the wavefunction $\varphi$. In the standard
(Dirac) representation one makes the choice
$$U = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad\Rightarrow\quad \begin{pmatrix} u \\ v \end{pmatrix} \to \frac{1}{\sqrt{2}}\begin{pmatrix} u + v \\ u - v \end{pmatrix}, \tag{19}$$
so that for a particle at rest only the top two
components are non-zero, while the opposite is true
for the antiparticle. Incorporating definition (18) and
transformation (19) into the Dirac equation (16) we
obtain the equation in the Dirac representation
$$(i\partial^\mu\gamma_\mu + M)\varphi = 0, \qquad \gamma_\mu = U\gamma_5\alpha_\mu U^{-1} \tag{20}$$
with $\gamma_\mu$ being the standard Dirac matrices [6].
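The change of representation in (19) and (20) lends itself to a direct numerical check. The sketch below is an illustration only; it assumes $\alpha_0 = I$ and $\alpha_i = \mathrm{diag}(\sigma_i, -\sigma_i)$, as suggested by (13) with $\bar{\sigma} = (\sigma_0, -\boldsymbol{\sigma})$, and compares $U\gamma_5\alpha_\mu U^{-1}$ with the Dirac-representation matrices $\gamma_0 = \mathrm{diag}(I, -I)$ and $\gamma_i$ with off-diagonal blocks $\sigma_i$ and $-\sigma_i$. The variable names are hypothetical.

```python
import math

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def neg(m):
    """Return -m."""
    return [[-x for x in row] for row in m]


def max_abs_diff(a, b):
    """Largest entry-wise deviation between two matrices."""
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))


I4 = block(I2, Z2, Z2, I2)
GAMMA5 = block(Z2, I2, I2, Z2)

# Assumed spinor-representation matrices: alpha_0 = I, alpha_i = diag(sigma_i, -sigma_i).
ALPHA = [I4] + [block(s, Z2, Z2, neg(s)) for s in PAULI]

# U of (19) in block form; it is real, symmetric and equal to its own inverse.
r = 1 / math.sqrt(2)
U = [[r * x for x in row] for row in block(I2, I2, I2, neg(I2))]

GAMMA = [matmul(matmul(U, matmul(GAMMA5, al)), U) for al in ALPHA]

# Dirac-representation matrices used for comparison.
DIRAC = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in PAULI]
rep_dev = max(max_abs_diff(g, d) for g, d in zip(GAMMA, DIRAC))

# Clifford algebra {gamma_mu, gamma_nu} = 2 g_mu_nu with metric (+, -, -, -).
METRIC = [1, -1, -1, -1]
clifford_dev = 0.0
for mu in range(4):
    for nu in range(4):
        ab = matmul(GAMMA[mu], GAMMA[nu])
        ba = matmul(GAMMA[nu], GAMMA[mu])
        target = 2 * METRIC[mu] if mu == nu else 0
        for i in range(4):
            for j in range(4):
                want = target if i == j else 0
                clifford_dev = max(clifford_dev, abs(ab[i][j] + ba[i][j] - want))

# A rest-frame spinor with u = v keeps only its upper two components after U.
phi_rest = [1.0, 0.0, 1.0, 0.0]
phi_dirac = [sum(U[i][j] * phi_rest[j] for j in range(4)) for i in range(4)]

print("deviation from the Dirac matrices:", rep_dev)
print("deviation from the Clifford algebra:", clifford_dev)
print("rest-frame spinor in the Dirac representation:", phi_dirac)
```

The same loop over `ALPHA` can be reused with another unitary $U$ to generate the matrices of a different representation.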
Acknowledgments
I wish to thank Michael Hall and Mark Andrews for
useful discussions.
References
[1] Dirac P A 1958 The Principles of Quantum Mechanics
4th edn (Oxford: Oxford University Press)
[2] Levy-Leblond J M 1974 Rev. Del Nuovo Cimento 4
136–41 and references therein
[3] Heitler W 1956 Elementary Wave Mechanics 2nd edn
(Oxford: Oxford University Press)
[4] Mott N F and Massey H M 1952 The Theory of Atomic
Collisions 3rd edn (Oxford: Oxford University Press)
[5] Feynman R P 1961 Quantum Electrodynamics (New
York: Benjamin)
[6] Itzykson C and Zuber J B 1980 Quantum Field Theory
(New York: McGraw-Hill) ch 2
[7] Buitrago J 1988 Eur. J. Phys. 9 149–50
[8] Ramond P 1990 Field Theory: A Modern Primer (New
York: Addison-Wesley) ch 1
Sterman G 1993 Quantum Field Theory (Cambridge:
Cambridge University Press) ch 5
[9] Feynman R P 1961 Theory of Fundamental Processes
(New York: Benjamin) ch 22, 23
[10] Merzbacher E 1961 Quantum Mechanics (New York:
Wiley) ch 12
[11] Woodhouse N M 1992 Special Relativity (New York:
Springer) ch 2