Chapter 2
The Theory of Special Relativity

The previous considerations concerning the configuration of rigid bodies have been founded, irrespective of the assumption as to the validity of the Euclidean geometry, upon the hypothesis that all directions in space, or all configurations of Cartesian systems of co-ordinates, are physically equivalent. We may express this as the “principle of relativity with respect to direction,” and it has been shown how equations (laws of nature) may be found, in accord with this principle, by the aid of the calculus of tensors. We now inquire whether there is a relativity with respect to the state of motion of the space of reference; in other words, whether there are spaces of reference in motion relatively to each other which are physically equivalent. From the standpoint of mechanics it appears that equivalent spaces of reference do exist. For experiments upon the earth tell us nothing of the fact that we are moving about the sun with a velocity of approximately 30 kilometres a second. On the other hand, this physical equivalence does not seem to hold for spaces of reference in arbitrary motion; for mechanical effects do not seem to be subject to the same laws in a jolting railway train as in one moving with uniform velocity; the rotation of the earth must be considered in writing down the equations of motion relatively to the earth. It appears, therefore, as if there were Cartesian systems of co-ordinates, the so-called inertial systems, with reference to which the laws of mechanics (more generally the laws of physics) are expressed in the simplest form. We may infer the validity of the following theorem: If K is an inertial system, then every other system K′ which moves uniformly and without rotation relatively to K, is also an inertial system; the laws of nature are in concordance for all inertial systems.
This statement we shall call the “principle of special relativity.” We shall draw certain conclusions from this principle of “relativity of translation” just as we have already done for relativity of direction.

In order to be able to do this, we must first solve the following problem. If we are given the Cartesian co-ordinates, xν, and the time, t, of an event relatively to one inertial system, K, how can we calculate the co-ordinates, x′ν, and the time, t′, of the same event relatively to an inertial system K′ which moves with uniform translation relatively to K? In the pre-relativity physics this problem was solved by making unconsciously two hypotheses:

1. The time is absolute; the time of an event, t′, relatively to K′ is the same as the time relatively to K. If instantaneous signals could be sent to a distance, and if one knew that the state of motion of a clock had no influence on its rate, then this assumption would be physically established. For then clocks, similar to one another, and regulated alike, could be distributed over the systems K and K′, at rest relatively to them, and their indications would be independent of the state of motion of the systems; the time of an event would then be given by the clock in its immediate neighbourhood.

2. Length is absolute; if an interval, at rest relatively to K, has a length s, then it has the same length s relatively to a system K′ which is in motion relatively to K.

If the axes of K and K′ are parallel to each other, a simple calculation based on these two assumptions gives the equations of transformation

$$x'_\nu = x_\nu - a_\nu - b_\nu t, \qquad t' = t - b. \tag{28}$$

This transformation is known as the “Galilean Transformation.” Differentiating twice by the time, we get

$$\frac{d^2 x'_\nu}{dt'^2} = \frac{d^2 x_\nu}{dt^2}.$$
Further, it follows that for two simultaneous events,
$$x'^{(1)}_\nu - x'^{(2)}_\nu = x^{(1)}_\nu - x^{(2)}_\nu.$$
The invariance of the distance between the two points results from squaring and adding. From this easily follows the co-variance of Newton’s equations of motion with respect to the Galilean transformation (28). Hence it follows that classical mechanics is in accord with the principle of special relativity if the two hypotheses respecting scales and clocks are made.
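As an illustration of (28), the following minimal Python sketch applies the Galilean transformation to two simultaneous events and checks that their squared distance is unchanged. The function names and the numerical values are assumptions chosen for the example; they do not come from the text.

```python
def galilean(x, t, a, b_vel, b_time):
    """Apply (28): x'_nu = x_nu - a_nu - b_nu * t and t' = t - b."""
    x_new = tuple(xi - ai - bi * t for xi, ai, bi in zip(x, a, b_vel))
    return x_new, t - b_time


def squared_distance(p, q):
    """Sum of squared co-ordinate differences between two points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


# Two simultaneous events in K (same t); all numbers are illustrative.
t = 2.0
p1, p2 = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
a, b_vel, b_time = (0.5, -1.0, 2.0), (3.0, 0.0, -1.0), 0.25

p1_new, t1_new = galilean(p1, t, a, b_vel, b_time)
p2_new, t2_new = galilean(p2, t, a, b_vel, b_time)

# The events stay simultaneous and their squared distance is preserved.
print(t1_new == t2_new)
print(squared_distance(p1, p2), squared_distance(p1_new, p2_new))
```

The check only covers simultaneous events, which is the case the text considers; for events at different times the co-ordinate differences pick up a term in the relative velocity.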

But this attempt to found relativity of translation upon the Galilean transformation fails when applied to electromagnetic phenomena. The Maxwell-Lorentz electromagnetic equations are not co-variant with respect to the Galilean transformation. In particular, we note, by (28), that a ray of light which referred to K has a velocity c, has a different velocity referred to K′, depending upon its direction. The space of reference of K is therefore distinguished, with respect to its physical properties, from all spaces of reference which are in motion relatively to it (quiescent aether). But all experiments have shown that electromagnetic and optical phenomena, relatively to the earth as the body of reference, are not influenced by the translational velocity of the earth. The most important of these experiments are those of Michelson and Morley, which I shall assume are known. The validity of the principle of special relativity can therefore hardly be doubted.

On the other hand, the Maxwell-Lorentz equations have proved their validity in the treatment of optical problems in moving bodies. No other theory has satisfactorily explained the facts of aberration, the propagation of light in moving bodies (Fizeau), and phenomena observed in double stars (De Sitter). The consequence of the Maxwell-Lorentz equations that in a vacuum light is propagated with the velocity c, at least with respect to a definite inertial system K, must therefore be regarded as proved. According to the principle of special relativity, we must also assume the truth of this principle for every other inertial system.

Before we draw any conclusions from these two principles we must first review the physical significance of the concepts “time” and “velocity.” It follows from what has gone before, that co-ordinates with respect to an inertial system are physically defined by means of measurements and constructions with the aid of rigid bodies. In order to measure time, we have supposed a clock, U, present somewhere, at rest relatively to K. But we cannot fix the time, by means of this clock, of an event whose distance from the clock is not negligible; for there are no “instantaneous signals” that we can use in order to compare the time of the event with that of the clock. In order to complete the definition of time we may employ the principle of the constancy of the velocity of light in a vacuum. Let us suppose that we place similar clocks at points of the system K, at rest relatively to it, and regulated according to the following scheme. A ray of light is sent out from one of the clocks, $U_m$, at the instant when it indicates the time $t_m$, and travels through a vacuum a distance $r_{mn}$, to the clock $U_n$; at the instant when this ray meets the clock $U_n$ the latter is set to indicate the time $t_n = t_m + r_{mn}/c$.⁷ The principle of the constancy of the velocity of light then states that this adjustment of the clocks will not lead to contradictions. With clocks so adjusted, we can assign the time to events which take place near any one of them. It is essential to note that this definition of time relates only to the inertial system K, since we have used a system of clocks at rest relatively to K. The assumption which was made in the pre-relativity physics of the absolute character of time (i.e. the independence of time of the choice of the inertial system) does not follow at all from this definition.

The theory of relativity is often criticized for giving, without justification, a central theoretical role to the propagation of light, in that it founds the concept of time upon the law of propagation of light. The situation, however, is somewhat as follows. In order to give physical significance to the concept of time, processes of some kind are required which enable relations to be established between different places. It is immaterial what kind of processes one chooses for such a definition of time. It is advantageous, however, for the theory, to choose only those processes concerning which we know something certain. This holds for the propagation of light in vacuo in a higher degree than for any other process which could be considered, thanks to the investigations of Maxwell and H. A. Lorentz.

From all of these considerations, space and time data have a physically real, and not a mere fictitious, significance; in particular this holds for all the relations in which co-ordinates and time enter, e.g. the relations (28). There is, therefore, sense in asking whether those equations are true or not, as well as in asking what the true equations of transformation are by which we pass from one inertial system K to another, K′, moving relatively to it. It may be shown that this is uniquely settled by means of the principle of the constancy of the velocity of light and the principle of special relativity.

To this end we think of space and time physically defined with respect to two inertial systems, K and K′, in the way that has been shown. Further, let a ray of light pass from one point P1 to another point P2 of K through a vacuum. If r is the measured distance between the two points, then the propagation of light must satisfy the equation

r = cΔt.

If we square this equation, and express r² by the differences of the co-ordinates, Δxν, in place of this equation we can write

$$\sum_\nu (\Delta x_\nu)^2 - c^2 \Delta t^2 = 0. \tag{29}$$

This equation formulates the principle of the constancy of the velocity of light relatively to K. It must hold whatever may be the motion of the source which emits the ray of light.

The same propagation of light may also be considered relatively to K′, in which case also the principle of the constancy of the velocity of light must be satisfied. Therefore, with respect to K′, we have the equation

$$\sum_\nu (\Delta x'_\nu)^2 - c^2 \Delta t'^2 = 0. \tag{30}$$

Equations (30) and (29) must be mutually consistent with respect to the transformation which transforms from K to K′. A transformation which effects this we shall call a “Lorentz transformation.”

Before considering these transformations in detail we shall make a few general remarks about space and time. In the pre-relativity physics space and time were separate entities. Specifications of time were independent of the choice of the space of reference. The Newtonian mechanics was relative with respect to the space of reference, so that, e.g. the statement that two non-simultaneous events happened at the same place had no objective meaning (that is, independent of the space of reference). But this relativity had no role in building up the theory. One spoke of points of space, as of instants of time, as if they were absolute realities. It was not observed that the true element of the space-time specification was the event, specified by the four numbers x1, x2, x3, t. The conception of something happening was always that of a four-dimensional continuum; but the recognition of this was obscured by the absolute character of the pre-relativity time. Upon giving up the hypothesis of the absolute character of time, particularly that of simultaneity, the four-dimensionality of the time-space concept was immediately recognized. It is neither the point in space, nor the instant in time, at which something happens that has physical reality, but only the event itself. There is no absolute (independent of the space of reference) relation in space, and no absolute relation in time between two events, but there is an absolute (independent of the space of reference) relation in space and time, as will appear in the sequel. The circumstance that there is no objective rational division of the four-dimensional continuum into a three-dimensional space and a one-dimensional time continuum indicates that the laws of nature will assume a form which is logically most satisfactory when expressed as laws in the four-dimensional space-time continuum. Upon this depends the great advance in method which the theory of relativity owes to Minkowski.
Considered from this standpoint, we must regard x1, x2, x3, t as the four co-ordinates of an event in the four-dimensional continuum. We have far less success in picturing to ourselves relations in this four-dimensional continuum than in the three-dimensional Euclidean continuum; but it must be emphasized that even in the Euclidean three-dimensional geometry its concepts and relations are only of an abstract nature in our minds, and are not at all identical with the images we form visually and through our sense of touch. The non-divisibility of the four-dimensional continuum of events does not at all, however, involve the equivalence of the space co-ordinates with the time co-ordinate. On the contrary, we must remember that the time co-ordinate is defined physically wholly differently from the space co-ordinates. The relations (29) and (30) which when equated define the Lorentz transformation show, further, a difference in the role of the time co-ordinate from that of the space co-ordinates; for the term $\Delta t^2$ has the opposite sign to the space terms, $\Delta x_1^2$, $\Delta x_2^2$, $\Delta x_3^2$.

Before we analyse further the conditions which define the Lorentz transformation, we shall introduce the light-time, l = ct, in place of the time, t, in order that the constant c shall not enter explicitly into the formulas to be developed later. Then the Lorentz transformation is defined in such a way that, first, it makes the equation

$$\Delta x_1^2 + \Delta x_2^2 + \Delta x_3^2 - \Delta l^2 = 0 \tag{31}$$

a co-variant equation, that is, an equation which is satisfied with respect to every inertial system if it is satisfied in the inertial system to which we refer the two given events (emission and reception of the ray of light). Finally, with Minkowski, we introduce in place of the real time co-ordinate l = ct, the imaginary time co-ordinate

$$x_4 = il = ict \qquad (\sqrt{-1} = i).$$
Then the equation defining the propagation of light, which must be co-variant with respect to the Lorentz transformation, becomes

$$\sum_{(4)} \Delta x_\nu^2 = \Delta x_1^2 + \Delta x_2^2 + \Delta x_3^2 + \Delta x_4^2 = 0. \tag{32}$$

This condition is always satisfied⁸ if we satisfy the more general condition that

$$s^2 = \Delta x_1^2 + \Delta x_2^2 + \Delta x_3^2 + \Delta x_4^2 \tag{33}$$

shall be an invariant with respect to the transformation. This condition is satisfied only by linear transformations, that is, transformations of the type

$$x'_\mu = a_\mu + b_{\mu\alpha} x_\alpha \tag{34}$$

in which the summation over the α is to be extended from α = 1 to α = 4. A glance at equations (33) and (34) shows that the Lorentz transformation so defined is identical with the translational and rotational transformations of the Euclidean geometry, if we disregard the number of dimensions and the relations of reality. We can also conclude that the coefficients bμα must satisfy the conditions

$$b_{\mu\alpha} b_{\nu\alpha} = \delta_{\mu\nu} = b_{\alpha\mu} b_{\alpha\nu}. \tag{35}$$

Since the ratios of the xν are real, it follows that all the aμ and the bμα are real, except a4, b41, b42, b43, b14, b24 and b34, which are purely imaginary.

Special Lorentz Transformation. We obtain the simplest transformations of the type of (34) and (35) if only two of the co-ordinates are to be transformed, and if all the aμ, which determine the new origin, vanish. We obtain then for the indices 1 and 2, on account of the three independent conditions which the relations (35) furnish,

$$x'_1 = x_1 \cos\phi - x_2 \sin\phi, \qquad x'_2 = x_1 \sin\phi + x_2 \cos\phi, \qquad x'_3 = x_3, \qquad x'_4 = x_4. \tag{36}$$
This is a simple rotation in space of the (space) co-ordinate system about the x3-axis. We see that the rotational transformation in space (without the time transformation) which we studied before is contained in the Lorentz transformation as a special case. For the indices 1 and 4 we obtain, in an analogous manner,
$$x'_1 = x_1 \cos\psi - x_4 \sin\psi, \qquad x'_4 = x_1 \sin\psi + x_4 \cos\psi, \qquad x'_2 = x_2, \qquad x'_3 = x_3. \tag{37}$$

On account of the relations of reality ψ must be taken as imaginary. To interpret these equations physically, we introduce the real light-time l and the velocity v of K′ relatively to K, instead of the imaginary angle ψ. We have, first,

$$x'_1 = x_1 \cos\psi - i\,l \sin\psi, \qquad l' = -i\,x_1 \sin\psi + l \cos\psi.$$

Since for the origin of K′, i.e., for x′1 = 0, we must have x1 = vl, it follows from the first of these equations that

$$v = i \tan\psi, \tag{38}$$

and also

$$\sin\psi = \frac{-iv}{\sqrt{1 - v^2}}, \qquad \cos\psi = \frac{1}{\sqrt{1 - v^2}}, \tag{39}$$

so that we obtain

$$x'_1 = \frac{x_1 - vl}{\sqrt{1 - v^2}}, \qquad l' = \frac{l - v x_1}{\sqrt{1 - v^2}}, \qquad x'_2 = x_2, \qquad x'_3 = x_3. \tag{40}$$

These equations form the well-known special Lorentz transformation, which in the general theory represents a rotation, through an imaginary angle, of the four-dimensional system of co-ordinates. If we introduce the ordinary time t, in place of the light-time l, then in (40) we must replace l by ct and v by v/c.
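The passage from the rotation (37) with an imaginary angle to the real form (40) can be checked numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, v is measured in units of c with |v| < 1, and the angle is obtained from (38) as tan ψ = −iv using Python's complex arithmetic.

```python
import cmath
import math


def boost_real(x1, l, v):
    """Equations (40) for the (x1, l) pair; v in units of c, |v| < 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (x1 - v * l), g * (l - v * x1)


def boost_as_rotation(x1, l, v):
    """Equations (37) with x4 = i*l and the imaginary angle from v = i*tan(psi)."""
    psi = cmath.atan(-1j * v)  # tan(psi) = v / i = -i*v
    x4 = 1j * l
    x1_new = x1 * cmath.cos(psi) - x4 * cmath.sin(psi)
    x4_new = x1 * cmath.sin(psi) + x4 * cmath.cos(psi)
    return x1_new, x4_new / 1j  # second value is l' = x4' / i


x1, l, v = 3.0, 5.0, 0.6  # illustrative numbers only
x1_r, l_r = boost_real(x1, l, v)
x1_c, l_c = boost_as_rotation(x1, l, v)

print(x1_r, l_r)  # real form (40)
print(x1_c, l_c)  # same values, carried as complex numbers
# x1^2 - l^2 is the part of s^2 affected by the transformation.
print(x1 * x1 - l * l, x1_r * x1_r - l_r * l_r)
```

Both routes should agree up to rounding, and the combination x1² − l² is left unchanged, which is the invariance of s² restricted to the two transformed co-ordinates.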

We must now fill in a gap. From the principle of the constancy of the velocity of light it follows that the equation

$$\sum \Delta x_\nu^2 = 0$$
has a significance which is independent of the choice of the inertial system; but the invariance of the quantity $\sum \Delta x_\nu^2$ does not at all follow from this. This quantity might be transformed with a factor. This depends upon the fact that the right-hand side of (40) might be multiplied by a factor λ, independent of v. But the principle of relativity does not permit this factor to be different from 1, as we shall now show. Let us assume that we have a rigid circular cylinder moving in the direction of its axis. If its radius, measured at rest with a unit measuring rod, is equal to R0, its radius R in motion might be different from R0, since the theory of relativity does not make the assumption that the shape of bodies with respect to a space of reference is independent of their motion relatively to this space of reference. But all directions in space must be equivalent to each other. R may therefore depend upon the magnitude q of the velocity, but not upon its direction; R must therefore be an even function of q. If the cylinder is at rest relatively to K′ the equation of its lateral surface is
$$x'^2 + y'^2 = R_0^2.$$
If we write the last two equations of (40) more generally

$$x'_2 = \lambda x_2, \qquad x'_3 = \lambda x_3,$$

then the lateral surface of the cylinder referred to K satisfies the equation

$$x^2 + y^2 = \frac{R_0^2}{\lambda^2}.$$
The factor λ therefore measures the lateral contraction of the cylinder, and can thus, from the above, be only an even function of v.

If we introduce a third system of co-ordinates, K″, which moves relatively to K′ with velocity v in the direction of the negative x-axis of K, we obtain, by applying (40) twice,

$$x''_1 = \lambda(v)\lambda(-v)\,x_1, \qquad x''_2 = \lambda(v)\lambda(-v)\,x_2, \qquad x''_3 = \lambda(v)\lambda(-v)\,x_3, \qquad l'' = \lambda(v)\lambda(-v)\,l.$$

Now, since λ(v) must be equal to λ(-v), and since we assume that we use the same measuring rods in all the systems, it follows that the transformation of K to K″ must be the identical transformation (since the possibility λ = -1 does not need to be considered). It is essential for these considerations to assume that the behaviour of the measuring rods does not depend upon the history of their previous motion.

Moving Measuring Rods and Clocks. At the definite K-time, l = 0, the position of the points given by the integers x′1 = n, is with respect to K, given by $x_1 = n\sqrt{1 - v^2}$; this follows from the first of equations (40) and expresses the Lorentz contraction. A clock at rest at the origin x1 = 0 of K, whose beats are characterized by l = n, will, when observed from K′, have beats characterized by

$$l' = \frac{n}{\sqrt{1 - v^2}};$$
this follows from the second of equations (40) and shows that the clock goes slower than if it were at rest relatively to K′. These two consequences, which hold, mutatis mutandis, for every system of reference, form the physical content, free from convention, of the Lorentz transformation.
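To make the two consequences concrete, the following short Python sketch evaluates the contracted positions and the dilated clock beats for one sample velocity. The function names and the value v = 0.6 (in units of c) are assumptions made for the illustration.

```python
import math


def contracted_position(n, v):
    """K-coordinate x1 at l = 0 of the point x1' = n of K' (first of (40))."""
    return n * math.sqrt(1.0 - v * v)


def dilated_beat(n, v):
    """K'-time l' of the n-th beat of a clock at rest at x1 = 0 of K (second of (40))."""
    return n / math.sqrt(1.0 - v * v)


v = 0.6  # illustrative velocity in units of c
positions = [contracted_position(n, v) for n in range(4)]
beats = [dilated_beat(n, v) for n in range(4)]

print(positions)  # marks of the moving rod lie closer together than unit spacing
print(beats)      # successive beats are further apart than unit light-time
```

For this velocity the spacing of the marks shrinks by the factor √(1 − v²) and the interval between beats grows by its reciprocal.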

Addition Theorem for Velocities. If we combine two special Lorentz transformations with the relative velocities v1 and v2, then the velocity of the single Lorentz transformation which takes the place of the two separate ones is, according to (38), given by

$$v_{12} = i \tan(\psi_1 + \psi_2) = i\,\frac{\tan\psi_1 + \tan\psi_2}{1 - \tan\psi_1 \tan\psi_2} = \frac{v_1 + v_2}{1 + v_1 v_2}. \tag{41}$$
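The addition theorem (41) can be checked by composing two special Lorentz transformations directly. The sketch below is a minimal illustration with hypothetical function names; velocities are in units of c and the sample event is arbitrary.

```python
import math


def boost(x1, l, v):
    """Special Lorentz transformation (40) for the pair (x1, l), with |v| < 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (x1 - v * l), g * (l - v * x1)


def add_velocities(v1, v2):
    """Addition theorem (41), velocities in units of c."""
    return (v1 + v2) / (1.0 + v1 * v2)


v1, v2 = 0.5, 0.5
event = (1.0, 2.0)  # illustrative (x1, l)

two_steps = boost(*boost(*event, v1), v2)
one_step = boost(*event, add_velocities(v1, v2))

print(add_velocities(v1, v2))  # combined velocity, below 1
print(two_steps, one_step)     # the two routes agree up to rounding
```

Two successive boosts of 0.5 combine to 0.8 rather than 1.0, so the resultant of two velocities below that of light stays below it.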

General Statements about the Lorentz Transformation and its Theory of Invariants. The whole theory of invariants of the special theory of relativity depends upon the invariant s² (33). Formally, it has the same role in the four-dimensional space-time continuum as the invariant $\Delta x_1^2 + \Delta x_2^2 + \Delta x_3^2$ in the Euclidean geometry and in the pre-relativity physics. The latter quantity is not an invariant with respect to all the Lorentz transformations; the quantity s² of equation (33) assumes the role of this invariant. With respect to an arbitrary inertial system, s² may be determined by measurements; with a given unit of measure it is a completely determinate quantity, associated with an arbitrary pair of events. The invariant s² differs, disregarding the number of dimensions, from the corresponding invariant of the Euclidean geometry in the following points. In the Euclidean geometry s² is necessarily positive; it vanishes only when the two points concerned come together. On the other hand, from the vanishing of

$$s^2 = \sum_{(4)} \Delta x_\nu^2 = \Delta x_1^2 + \Delta x_2^2 + \Delta x_3^2 - \Delta l^2$$
it cannot be concluded that the two space-time points fall together; the vanishing of this quantity s² is the invariant condition that the two space-time points can be connected by a light signal in vacuo. If P is a point (event) represented in the four-dimensional space of the x1, x2, x3, l, then all the “points” which can be connected to P by means of a light signal lie upon the cone s² = 0 (compare Figure 1, in which the dimension x3 is suppressed). The “upper” half of the cone may contain the “points” to which light signals can be sent from P; then the “lower” half of the cone will contain the “points” from which light signals can be sent to P. The points P′ enclosed by the conical surface furnish, with P, a negative s²; PP′, as well as P′P, is then, according to Minkowski, of the nature of a time. Such intervals represent elements of possible paths of motion, the velocity being less than that of light.⁹ In this case the l-axis may be drawn in the direction of PP′ by suitably choosing the state of motion of the inertial system. If P′ lies outside of the “light-cone” then PP′ is of the nature of a space; in this case, by properly choosing the inertial system, Δl can be made to vanish.

Figure 1: The light cone s² = 0 through P, with the dimension x3 suppressed.
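The three cases distinguished above can be sketched in a few lines of Python. The labels "time-like", "space-like" and "light-like" are modern shorthand assumed here for "of the nature of a time", "of the nature of a space" and s² = 0; the function names and the tolerance are likewise assumptions for the example.

```python
def interval_squared(dx1, dx2, dx3, dl):
    """s^2 with real light-time: dx1^2 + dx2^2 + dx3^2 - dl^2."""
    return dx1 ** 2 + dx2 ** 2 + dx3 ** 2 - dl ** 2


def classify(dx1, dx2, dx3, dl, tol=1e-12):
    """Name the nature of the interval between two events from the sign of s^2."""
    s2 = interval_squared(dx1, dx2, dx3, dl)
    if abs(s2) <= tol:
        return "light-like"
    return "time-like" if s2 < 0 else "space-like"


def simultaneity_velocity(dx1, dl):
    """For a space-like separation along x1, the v for which (40) gives dl' = 0."""
    return dl / dx1


print(classify(3.0, 4.0, 0.0, 5.0))     # on the cone s^2 = 0
print(classify(1.0, 0.0, 0.0, 2.0))     # inside the cone
print(classify(2.0, 0.0, 0.0, 1.0))     # outside the cone
print(simultaneity_velocity(2.0, 1.0))  # a velocity below 1, as required
```

The last function only treats a separation along the x1-axis, where the second of equations (40) gives Δl′ proportional to Δl − vΔx1, so that v = Δl/Δx1 makes the two events simultaneous.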

By the introduction of the imaginary time variable, x4 = il, Minkowski has made the theory of invariants for the four-dimensional continuum of physical phenomena fully analogous to the theory of invariants for the three-dimensional continuum of Euclidean space. The theory of four-dimensional tensors of special relativity differs from the theory of tensors in three-dimensional space, therefore, only in the number of dimensions and the relations of reality.

A physical entity which is specified by four quantities, Aν, in an arbitrary inertial system of the x1, x2, x3, x4, is called a 4-vector, with the components Aν, if the Aν correspond in their relations of reality and the properties of transformation to the Δxν; it may be of the nature of a space or of a time. The sixteen quantities Aμν then form the components of a tensor of the second rank, if they transform according to the scheme

$$A'_{\mu\nu} = b_{\mu\alpha} b_{\nu\beta} A_{\alpha\beta}.$$
It follows from this that the Aμν behave, with respect to their properties of transformation and their properties of reality, as the products of components, UμVν, of two 4-vectors, (U) and (V). All the components are real except those which contain the index 4 once, those being purely imaginary. Tensors of the third and higher ranks may be defined in an analogous way. The operations of addition, subtraction, multiplication, contraction and differentiation for these tensors are wholly analogous to the corresponding operations for tensors in three-dimensional space.

Before we apply the tensor theory to the four-dimensional space-time continuum, we shall examine more particularly the skew-symmetrical tensors. The tensor of the second rank has, in general, 16 = 4 · 4 components. In the case of skew-symmetry the components with two equal indices vanish, and the components with unequal indices are equal and opposite in pairs. There exist, therefore, only six independent components, as is the case in the electromagnetic field. In fact, it will be shown when we consider Maxwell’s equations that these may be looked upon as tensor equations, provided we regard the electromagnetic field as a skew-symmetrical tensor. Further, it is clear that the skew-symmetrical tensor of the third rank (skew-symmetrical in all pairs of indices) has only four independent components, since there are only four combinations of three different indices.
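The counting argument can be reproduced by enumerating index combinations. This is only a small illustrative sketch; the variable names are arbitrary.

```python
from itertools import combinations

indices = (1, 2, 3, 4)

# Independent components of a skew-symmetrical tensor correspond to
# unordered selections of distinct indices.
rank_two = list(combinations(indices, 2))
rank_three = list(combinations(indices, 3))

print(len(rank_two), rank_two)      # six index pairs
print(len(rank_three), rank_three)  # four index triples
```

The six pairs are exactly the index pairs that label the components of the electromagnetic field tensor.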

We now turn to Maxwell’s equations (24), (26), (25), (27), and introduce the notation:¹⁰

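$$\left.\begin{array}{cccccc} \phi_{23} & \phi_{31} & \phi_{12} & \phi_{14} & \phi_{24} & \phi_{34} \\ h_{23} & h_{31} & h_{12} & -ie_x & -ie_y & -ie_z \end{array}\right\} \tag{42}$$

$$\left.\begin{array}{cccc} J_1 & J_2 & J_3 & J_4 \\ \frac{1}{c}\,i_x & \frac{1}{c}\,i_y & \frac{1}{c}\,i_z & i\rho \end{array}\right\} \tag{43}$$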

with the convention that ϕμν shall be equal to −ϕνμ. Then Maxwell’s equations may be combined into the forms

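$$\frac{\partial \phi_{\mu\nu}}{\partial x_\nu} = J_\mu, \tag{44}$$

$$\frac{\partial \phi_{\mu\nu}}{\partial x_\sigma} + \frac{\partial \phi_{\nu\sigma}}{\partial x_\mu} + \frac{\partial \phi_{\sigma\mu}}{\partial x_\nu} = 0, \tag{45}$$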

as one can easily verify by substituting from (42) and (43). Equations (44) and (45) have a tensor character, and are therefore co-variant with respect to Lorentz transformations, if the ϕμν and the Jν have a tensor character, which we assume. Consequently, the laws for transforming these quantities from one to another allowable (inertial) system of co-ordinates are uniquely determined. The progress in method which electrodynamics owes to the theory of special relativity lies principally in this, that the number of independent hypotheses is diminished. If we consider, for example, equations (24) only from the standpoint of relativity of direction, as we have done above, we see that they have three logically independent terms. The way in which the electric intensity enters these equations appears to be wholly independent of the way in which the magnetic intensity enters them; it would not be surprising if instead of $\partial e_\mu / \partial l$, we had, say, $\partial^2 e_\mu / \partial l^2$, or if this term were absent. On the other hand, only two independent terms appear in equation (44). The electromagnetic field appears as a formal unit; the way in which the electric field enters this equation is determined by the way in which the magnetic field enters it. Besides the electromagnetic field, only the electric current density appears as an independent entity. This advance in method arises from the fact that the electric and magnetic fields draw their separate existences from the relativity of motion. A field which appears to be purely an electric field, judged from one system, has also magnetic field components when judged from another inertial system. When applied to an electromagnetic field, the general law of transformation furnishes, for the special case of the special Lorentz transformation, the equations

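$$\begin{aligned} e'_x &= e_x, & h'_x &= h_x, \\ e'_y &= \frac{e_y - v h_z}{\sqrt{1 - v^2}}, & h'_y &= \frac{h_y + v e_z}{\sqrt{1 - v^2}}, \\ e'_z &= \frac{e_z + v h_y}{\sqrt{1 - v^2}}, & h'_z &= \frac{h_z - v e_y}{\sqrt{1 - v^2}}. \end{aligned} \tag{46}$$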

If there exists with respect to K only a magnetic field, h, but no electric field, e, then with respect to K′ there exists an electric field e′ as well, which would act upon an electric particle at rest relatively to K′. An observer at rest relatively to K would designate this force as the Biot-Savart force, or the Lorentz electromotive force. It therefore appears as if this electromotive force had become fused with the electric field intensity into a single entity.
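The case of a purely magnetic field in K can be worked through numerically with the transformation equations (46). The sketch below is illustrative only: the function name, the field values and v = 0.6 (in units of c) are assumptions. As an additional consistency check it evaluates e² − h², whose invariance under this transformation is a standard result that the text does not state.

```python
import math


def transform_fields(e, h, v):
    """Equations (46): fields in K' for a boost with velocity v along x, |v| < 1."""
    ex, ey, ez = e
    hx, hy, hz = h
    g = 1.0 / math.sqrt(1.0 - v * v)
    e_new = (ex, g * (ey - v * hz), g * (ez + v * hy))
    h_new = (hx, g * (hy + v * ez), g * (hz - v * ey))
    return e_new, h_new


def e2_minus_h2(e, h):
    """The combination e^2 - h^2 (assumed invariant; not stated in the text)."""
    return sum(c * c for c in e) - sum(c * c for c in h)


e, h, v = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 0.6  # pure magnetic field in K
e_new, h_new = transform_fields(e, h, v)

print(e_new)  # an electric component along y appears in K'
print(h_new)
print(e2_minus_h2(e, h), e2_minus_h2(e_new, h_new))
```

The electric field that appears in K′ is the one that acts on a particle at rest in K′, which is the force an observer in K attributes to the motion of the particle through the magnetic field.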

In order to view this relation formally, let us consider the expression for the force acting upon unit volume of electricity,

k = ρe + [i,h], (47)

in which i is the vector velocity of electricity, with the velocity of light as the unit. If we introduce Jμ and ϕμν according to (42) and (43), we obtain for the first component the expression

ϕ12J2 + ϕ13J3 + ϕ14J4.
Observing that ϕ11 vanishes on account of the skew-symmetry of the tensor (ϕ), the components of k are given by the first three components of the four-dimensional vector

Kμ = ϕμνJν, (48)

and the fourth component is given by

$$K_4 = \phi_{41} J_1 + \phi_{42} J_2 + \phi_{43} J_3 = i(e_x i_x + e_y i_y + e_z i_z) = i\lambda. \tag{49}$$

There is, therefore, a four-dimensional vector of force per unit volume, whose first three components, K1, K2, K3, are the ponderomotive force components per unit volume, and whose fourth component is the rate of working of the field per unit volume, multiplied by $\sqrt{-1}$.
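The relation between (47), (48) and (49) can be verified numerically. The following Python sketch builds the skew-symmetrical array ϕμν from (42) and the current Jμ from (43) and contracts them. It assumes that h23, h31, h12 stand for the components hx, hy, hz, and that the velocity of light is the unit so that the factor 1/c drops out; the function names and sample numbers are hypothetical.

```python
def field_tensor(e, h):
    """Skew-symmetrical phi from (42); index k here stands for k + 1 in the text."""
    ex, ey, ez = e
    hx, hy, hz = h
    entries = {(1, 2): hx, (2, 0): hy, (0, 1): hz,
               (0, 3): -1j * ex, (1, 3): -1j * ey, (2, 3): -1j * ez}
    phi = [[0j] * 4 for _ in range(4)]
    for (m, n), value in entries.items():
        phi[m][n] = value
        phi[n][m] = -value  # convention phi_mu_nu = -phi_nu_mu
    return phi


def force_density(e, h, current, rho):
    """K_mu = phi_mu_nu J_nu, equation (48), with J from (43) and c = 1."""
    phi = field_tensor(e, h)
    J = [current[0], current[1], current[2], 1j * rho]
    return [sum(phi[m][n] * J[n] for n in range(4)) for m in range(4)]


def cross(a, b):
    """Vector product [a, b] in three dimensions."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


e, h = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)  # illustrative field values
current, rho = (0.1, 0.2, 0.3), 2.0      # illustrative current and charge density

K = force_density(e, h, current, rho)
k = [rho * e[n] + cross(current, h)[n] for n in range(3)]  # equation (47)
work = sum(e[n] * current[n] for n in range(3))            # lambda in (49)

print(K[:3], k)      # first three components reproduce (47)
print(K[3], 1j * work)  # fourth component is i times the rate of working
```

The first three components come out real and the fourth purely imaginary, in line with the relations of reality stated for 4-vectors.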


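Figure 2: Space-time diagram of a body acted upon by the electromagnetic field, with axes Ox1 and Ol and the shaded region lying between l = l1 and l = l2.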

A comparison of (48) and (47) shows that the theory of relativity formally unites the ponderomotive force of the electric field, ρe, and the Biot-Savart or Lorentz force [i,h].

Mass and Energy. An important conclusion can be drawn from the existence and significance of the 4-vector Kμ. Let us imagine a body upon which the electromagnetic field acts for a time. In the symbolic figure (Figure 2) Ox1 designates the x1-axis, and is at the same time a substitute for the three space axes Ox1, Ox2, Ox3; Ol designates the real time axis. In this diagram a body of finite extent is represented, at a definite time l, by the interval AB; the whole space-time existence of the body is represented by a strip whose boundary is everywhere inclined less than 45° to the l-axis. Between the time sections, l = l1 and l = l2, but not extending to them, a portion of the strip is shaded. This represents the portion of the space-time manifold in which the electromagnetic field acts upon the body, or upon the electric charges contained in it, the action upon them being transmitted to the body. We shall now consider the changes which take place in the momentum and energy of the body as a result of this action. We shall assume that the principles of momentum and energy are valid for the body. The change in momentum, ΔIx, ΔIy, ΔIz, and the change in energy, ΔE, are then given by the expressions

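$$\begin{aligned} \Delta I_x &= \int_{l_1}^{l_2} dl \int k_x \, dx\,dy\,dz = \frac{1}{i} \int K_1 \, dx_1\,dx_2\,dx_3\,dx_4, \\ \Delta I_y &= \int_{l_1}^{l_2} dl \int k_y \, dx\,dy\,dz = \frac{1}{i} \int K_2 \, dx_1\,dx_2\,dx_3\,dx_4, \\ \Delta I_z &= \int_{l_1}^{l_2} dl \int k_z \, dx\,dy\,dz = \frac{1}{i} \int K_3 \, dx_1\,dx_2\,dx_3\,dx_4, \\ \Delta E &= \int_{l_1}^{l_2} dl \int \lambda \, dx\,dy\,dz = \frac{1}{i} \int \frac{1}{i} K_4 \, dx_1\,dx_2\,dx_3\,dx_4. \end{aligned}$$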

Since the four-dimensional element of volume is an invariant, and (K1, K2, K3, K4) forms a 4-vector, the four-dimensional integral extended over the shaded portion transforms as a 4-vector, as does also the integral between the limits l1 and l2, because the portion of the region which is not shaded contributes nothing to the integral. It follows, therefore, that ΔIx, ΔIy, ΔIz, iΔE form a 4-vector. Since the quantities themselves transform in the same way as their increments, it follows that the aggregate of the four quantities

Ix,Iy,Iz,iE
has itself the properties of a vector; these quantities are referred to an instantaneous condition of the body (e.g. at the time l = l1).

This 4-vector may also be expressed in terms of the mass m, and the velocity of the body, considered as a material particle. To form this expression, we note first, that

$$-ds^2 = d\tau^2 = -(dx_1^2 + dx_2^2 + dx_3^2) - dx_4^2 = dl^2 (1 - q^2) \tag{50}$$

is an invariant which refers to an infinitely short portion of the four-dimensional line which represents the motion of the material particle. The physical significance of the invariant dτ may easily be given. If the time axis is chosen in such a way that it has the direction of the line differential which we are considering, or, in other words, if we reduce the material particle to rest, we shall then have dτ = dl; this will therefore be measured by the light-seconds clock which is at the same place, and at rest relatively to the material particle. We therefore call τ the proper time of the material particle. As opposed to dl, dτ is therefore an invariant, and is practically equivalent to dl for motions whose velocity is small compared to that of light. Hence we see that

$$u_\sigma = \frac{dx_\sigma}{d\tau} \tag{51}$$

has, just as the dxν, the character of a vector; we shall designate (uσ) as the four-dimensional vector (in brief, 4-vector) of velocity. Its components satisfy, by (50), the condition

$$\sum u_\sigma^2 = -1. \tag{52}$$

We see that this 4-vector, whose components in the ordinary notation are

$$\frac{q_x}{\sqrt{1 - q^2}}, \qquad \frac{q_y}{\sqrt{1 - q^2}}, \qquad \frac{q_z}{\sqrt{1 - q^2}}, \qquad \frac{i}{\sqrt{1 - q^2}} \tag{53}$$

is the only 4-vector which can be formed from the velocity components of the material particle which are defined in three dimensions by

$$q_x = \frac{dx}{dl}, \qquad q_y = \frac{dy}{dl}, \qquad q_z = \frac{dz}{dl}.$$
We therefore see that

$$m \frac{dx_\mu}{d\tau} \tag{54}$$

must be that 4-vector which is to be equated to the 4-vector of momentum and energy whose existence we have proved above. By equating the components, we obtain, in three-dimensional notation,

\[ I_x = \frac{m q_x}{\sqrt{1 - q^2}},\quad I_y = \frac{m q_y}{\sqrt{1 - q^2}},\quad I_z = \frac{m q_z}{\sqrt{1 - q^2}},\quad E = \frac{m}{\sqrt{1 - q^2}}. \tag{55} \]

We recognize, in fact, that these components of momentum agree with those of classical mechanics for velocities which are small compared to that of light. For large velocities the momentum increases more rapidly than linearly with the velocity, so as to become infinite on approaching the velocity of light.
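The two limiting statements above can be made concrete with a short numerical sketch. It compares the momentum of (55) with the classical product of mass and velocity, again assuming units in which the velocity of light is 1; the function names are hypothetical and serve only this illustration.

```python
import math

def relativistic_momentum(m, q):
    """Magnitude of the momentum according to (55), for 0 <= q < 1."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must satisfy 0 <= q < 1")
    return m * q / math.sqrt(1.0 - q * q)

def classical_momentum(m, q):
    """Momentum of classical mechanics."""
    return m * q

for q in (0.001, 0.1, 0.5, 0.9, 0.99, 0.999):
    ratio = relativistic_momentum(1.0, q) / classical_momentum(1.0, q)
    print(f"q = {q:<6} ratio = {ratio:.6f}")
```

The ratio stays near 1 for small q and grows without bound as q approaches 1.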

If we apply the last of equations (55) to a material particle at rest ($q = 0$), we see that the energy, $E_0$, of a body at rest is equal to its mass. Had we chosen the second as our unit of time, we would have obtained

\[ E_0 = mc^2. \tag{56} \]

Mass and energy are therefore essentially alike; they are only different expressions for the same thing. The mass of a body is not a constant; it varies with changes in its energy.11 We see from the last of equations (55) that $E$ becomes infinite when $q$ approaches 1, the velocity of light. If we develop $E$ in powers of $q^2$, we obtain,

\[ E = m + \frac{m}{2} q^2 + \frac{3}{8} m q^4 + \cdots . \tag{57} \]

The second term of this expansion corresponds to the kinetic energy of the material particle in classical mechanics.
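How good the truncated expansion (57) is at a given velocity may be checked numerically. The sketch below compares the last of equations (55) with the three terms written out in (57), assuming units in which the velocity of light is 1; the names `energy_exact` and `energy_series` are introduced here for illustration only.

```python
import math

def energy_exact(m, q):
    """Energy according to the last of equations (55)."""
    return m / math.sqrt(1.0 - q * q)

def energy_series(m, q):
    """The three terms of the expansion (57) that are written out explicitly."""
    q2 = q * q
    return m + 0.5 * m * q2 + 0.375 * m * q2 * q2

for q in (0.01, 0.1, 0.3):
    exact = energy_exact(1.0, q)
    approx = energy_series(1.0, q)
    print(f"q = {q}: exact = {exact:.10f}, series = {approx:.10f}, "
          f"difference = {exact - approx:.2e}")
```

The difference is of the order of $q^6$, so that for velocities small compared to that of light the rest energy and the classical kinetic energy account for practically the whole of E.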

Equations of Motion of Material Particles. From (55) we obtain, by differentiating by the time l, and using the principle of momentum, in the notation of three-dimensional vectors,

\[ \mathbf{K} = \frac{d}{dl} \left( \frac{m \mathbf{q}}{\sqrt{1 - q^2}} \right). \tag{58} \]
This equation, which was previously employed by H. A. Lorentz for the motion of electrons, has been proved to be true, with great accuracy, by experiments with β-rays.
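A simple special case, not treated in the text, shows how (58) limits the velocity. Assume a constant force K acting along one axis on a particle which is at rest at l = 0. Integrating (58) then gives $mq/\sqrt{1 - q^2} = Kl$, which may be solved for q. The following sketch evaluates this solution, in units where the velocity of light is 1; the function name is our own.

```python
import math

def speed_under_constant_force(K, m, l):
    """Speed at time l of a particle starting from rest under a constant force K.

    Obtained from (58) by integration: m*q/sqrt(1 - q^2) = K*l.
    """
    a = K * l / m  # momentum per unit mass
    return a / math.sqrt(1.0 + a * a)

for l in (0.1, 1.0, 10.0, 1000.0):
    q = speed_under_constant_force(1.0, 1.0, l)
    print(f"l = {l:<7} relativistic q = {q:.9f}   classical q = {l:.1f}")
```

For small l the result agrees with the classical value Kl/m, whereas for large l the speed approaches 1 without ever reaching it.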

Energy Tensor of the Electromagnetic Field. Before the development of the theory of relativity it was known that the principles of energy and momentum could be expressed in a differential form for the electromagnetic field. The four-dimensional formulation of these principles leads to the conception of the energy tensor, which is important for the further development of the theory of relativity. If in the expression for the 4-vector of force per unit volume,

\[ K_\mu = \phi_{\mu\nu} J_\nu, \]

using the field equations (44), we express $J_\nu$ in terms of the field intensities, $\phi_{\mu\nu}$, we obtain, after some transformations and repeated application of the field equations (44) and (45), the expression

\[ K_\mu = -\frac{\partial T_{\mu\nu}}{\partial x_\nu}, \tag{59} \]

where we have written12

\[ T_{\mu\nu} = -\frac{1}{4} \phi_{\alpha\beta}^2 \, \delta_{\mu\nu} + \phi_{\mu\alpha} \phi_{\nu\alpha}. \tag{60} \]

The physical meaning of equation (59) becomes evident if in place of this equation we write, using a new notation,

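\[
\begin{aligned}
k_x &= -\frac{\partial p_{xx}}{\partial x} - \frac{\partial p_{xy}}{\partial y} - \frac{\partial p_{xz}}{\partial z} - \frac{\partial (i b_x)}{\partial (i l)}, \\
k_y &= -\frac{\partial p_{yx}}{\partial x} - \frac{\partial p_{yy}}{\partial y} - \frac{\partial p_{yz}}{\partial z} - \frac{\partial (i b_y)}{\partial (i l)}, \\
k_z &= -\frac{\partial p_{zx}}{\partial x} - \frac{\partial p_{zy}}{\partial y} - \frac{\partial p_{zz}}{\partial z} - \frac{\partial (i b_z)}{\partial (i l)}, \\
i\lambda &= -\frac{\partial (i s_x)}{\partial x} - \frac{\partial (i s_y)}{\partial y} - \frac{\partial (i s_z)}{\partial z} - \frac{\partial (-\eta)}{\partial (i l)};
\end{aligned}
\tag{61}
\]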

or, on eliminating the imaginary,

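\[
\begin{aligned}
k_x &= -\frac{\partial p_{xx}}{\partial x} - \frac{\partial p_{xy}}{\partial y} - \frac{\partial p_{xz}}{\partial z} - \frac{\partial b_x}{\partial l}, \\
k_y &= -\frac{\partial p_{yx}}{\partial x} - \frac{\partial p_{yy}}{\partial y} - \frac{\partial p_{yz}}{\partial z} - \frac{\partial b_y}{\partial l}, \\
k_z &= -\frac{\partial p_{zx}}{\partial x} - \frac{\partial p_{zy}}{\partial y} - \frac{\partial p_{zz}}{\partial z} - \frac{\partial b_z}{\partial l}, \\
\lambda &= -\frac{\partial s_x}{\partial x} - \frac{\partial s_y}{\partial y} - \frac{\partial s_z}{\partial z} - \frac{\partial \eta}{\partial l}.
\end{aligned}
\tag{62}
\]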

When expressed in the latter form, we see that the first three equations state the principle of momentum; $p_{xx}, \ldots, p_{zx}$ are the Maxwell stresses in the electromagnetic field, and $(b_x, b_y, b_z)$ is the vector momentum per unit volume of the field. The last of equations (62) expresses the energy principle; $\mathbf{s}$ is the vector flow of energy, and $\eta$ the energy per unit volume of the field. In fact, we get from (60) by introducing the well-known expressions for the components of the field intensity from electrodynamics,

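\[
\begin{aligned}
p_{xx} &= -h_x h_x + \tfrac{1}{2}(h_x^2 + h_y^2 + h_z^2) - e_x e_x + \tfrac{1}{2}(e_x^2 + e_y^2 + e_z^2), \\
p_{xy} &= -h_x h_y - e_x e_y, \\
p_{xz} &= -h_x h_z - e_x e_z, \\
b_x &= s_x = e_y h_z - e_z h_y, \\
b_y &= s_y = e_z h_x - e_x h_z, \\
b_z &= s_z = e_x h_y - e_y h_x, \\
\eta &= +\tfrac{1}{2}(e_x^2 + e_y^2 + e_z^2 + h_x^2 + h_y^2 + h_z^2).
\end{aligned}
\tag{63}
\]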

We conclude from (60) that the energy tensor of the electromagnetic field is symmetrical; with this is connected the fact that the momentum per unit volume and the flow of energy are equal to each other (relation between energy and inertia).
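The passage from (60) to (63), and the symmetry of the tensor, may be verified numerically for any chosen field. The sketch below does this in plain Python. It assumes the identification $\phi_{23} = h_x$, $\phi_{31} = h_y$, $\phi_{12} = h_z$, $\phi_{14} = -ie_x$, $\phi_{24} = -ie_y$, $\phi_{34} = -ie_z$ with $\phi_{\mu\nu}$ antisymmetric; this identification is not stated in the present section and is taken here as an assumption. The function names and the sample field values are likewise our own.

```python
def field_tensor(e, h):
    """Antisymmetric field tensor phi (indices 0..3 stand for 1..4).

    Assumed identification, not stated in this section:
    phi_23 = hx, phi_31 = hy, phi_12 = hz,
    phi_14 = -i*ex, phi_24 = -i*ey, phi_34 = -i*ez.
    """
    ex, ey, ez = e
    hx, hy, hz = h
    phi = [[0j] * 4 for _ in range(4)]
    entries = {(1, 2): hx, (2, 0): hy, (0, 1): hz,
               (0, 3): -1j * ex, (1, 3): -1j * ey, (2, 3): -1j * ez}
    for (a, b), value in entries.items():
        phi[a][b] = value
        phi[b][a] = -value
    return phi

def energy_tensor(phi):
    """Energy tensor according to (60), with summation over repeated indices."""
    invariant = sum(phi[a][b] ** 2 for a in range(4) for b in range(4))
    T = [[0j] * 4 for _ in range(4)]
    for mu in range(4):
        for nu in range(4):
            delta = 1.0 if mu == nu else 0.0
            T[mu][nu] = (-0.25 * invariant * delta
                         + sum(phi[mu][a] * phi[nu][a] for a in range(4)))
    return T

e = (1.0, 2.0, 3.0)  # arbitrary sample values
h = (4.0, 5.0, 6.0)
T = energy_tensor(field_tensor(e, h))

symmetric = all(abs(T[m][n] - T[n][m]) < 1e-12 for m in range(4) for n in range(4))
print("symmetric:", symmetric)
print("T_44 (should equal -eta):", T[3][3])
print("T_14 (should equal i*b_x):", T[0][3])
```

For the sample values above, (63) gives $\eta = 45.5$, $b_x = e_y h_z - e_z h_y = -3$, $p_{xx} = 28.5$ and $p_{xy} = -22$, and the components $T_{44}$, $T_{14}$, $T_{11}$ and $T_{12}$ computed from (60) reproduce $-\eta$, $ib_x$, $p_{xx}$ and $p_{xy}$ respectively.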

We therefore conclude from these considerations that the energy per unit volume has the character of a tensor. This has been proved directly only for an electromagnetic field, although we may claim universal validity for it. Maxwell’s equations determine the electromagnetic field when the distribution of electric charges and currents is known. But we do not know the laws which govern the currents and charges. We do know, indeed, that electricity consists of elementary particles (electrons, positive nuclei), but from a theoretical point of view we cannot comprehend this. We do not know the energy factors which determine the distribution of electricity in particles of definite size and charge, and all attempts to complete the theory in this direction have failed. If then we can build upon Maxwell’s equations in general, the energy tensor of the electromagnetic field is known only outside the charged particles.13 In these regions, outside of charged particles, the only regions in which we can believe that we have the complete expression for the energy tensor, we have, by (59),

\[ \frac{\partial T_{\mu\nu}}{\partial x_\nu} = 0. \tag{64} \]

General Expressions for the Conservation Principles. We can hardly avoid making the assumption that in all other cases, also, the space distribution of energy is given by a symmetrical tensor, $T_{\mu\nu}$, and that this complete energy tensor everywhere satisfies the relation (64). At any rate we shall see that by means of this assumption we obtain the correct expression for the integral energy principle. Let us consider a spatially bounded, closed system, which, four-dimensionally, we may represent as a strip, outside of which the $T_{\mu\nu}$ vanish. Integrate equation (64) over a space section. Since the integrals of $\frac{\partial T_{\mu 1}}{\partial x_1}$, $\frac{\partial T_{\mu 2}}{\partial x_2}$ and $\frac{\partial T_{\mu 3}}{\partial x_3}$ vanish because the $T_{\mu\nu}$ vanish at the limits of integration, we obtain

\[ \frac{\partial}{\partial l} \left( \int T_{\mu 4} \, dx_1 \, dx_2 \, dx_3 \right) = 0. \tag{65} \]

Inside the parentheses are the expressions for the momentum of the whole system, multiplied by $i$, together with the negative energy of the system, so that (65) expresses the conservation principles in their integral form. That this gives the right conception of energy and the conservation principles will be seen from the following considerations.

[Figure 3]