Chapter I

Fundamental Definitions and Laws. Hypothesis of Quanta

113. Since a wholly new element, entirely unrelated to the fundamental principles of electrodynamics, enters into the range of investigation with the introduction of probability considerations into the electrodynamic theory of heat radiation, the question arises at the outset, whether such considerations are justifiable and necessary. At first sight we might, in fact, be inclined to think that in a purely electrodynamical theory there would be no room at all for probability calculations. For since, as is well known, the electrodynamic equations of the field together with the initial and boundary conditions determine uniquely the way in which an electrodynamical process takes place, in the course of time, considerations which lie outside of the equations of the field would seem, theoretically speaking, to be uncalled for and in any case dispensable. For either they lead to the same results as the fundamental equations of electrodynamics and then they are superfluous, or they lead to different results and in this case they are wrong.

In spite of this apparently unavoidable dilemma, there is a flaw in the reasoning. For on closer consideration it is seen that what is understood in electrodynamics by “initial and boundary” conditions, as well as by the “way in which a process takes place in the course of time,” is entirely different from what is denoted by the same words in thermodynamics. In order to make this evident, let us consider the case of radiation in vacuo, uniform in all directions, which was treated in the last chapter.

From the standpoint of thermodynamics the state of radiation is completely determined, when the intensity of monochromatic radiation $K_{ν}$ is given for all frequencies $ν$ . The electrodynamical observer, however, has gained very little by this single statement; because for him a knowledge of the state requires that every one of the six components of the electric and magnetic field-strength be given at all points of the space; and, while from the thermodynamic point of view the question as to the way in which the process takes place in time is settled by the constancy of the intensity of radiation $K_{ν}$ , from the electrodynamical point of view it would be necessary to know the six components of the field at every point as functions of the time, and hence the amplitudes $C_{n}$ and the phase-constants $θ_{n}$ of all the several partial vibrations contained in the radiation would have to be calculated. This, however, is a problem whose solution is quite impossible, for the data obtainable from the measurements are by no means sufficient. The thermodynamically measurable quantities, looked at from the electrodynamical standpoint, represent only certain mean values, as we saw in the special case of stationary radiation in the last chapter.

We might now think that, since in thermodynamic measurements we are always concerned with mean values only, we need consider nothing beyond these mean values, and, therefore, need not take any account of the particular values at all. This method is, however, impracticable, because frequently and that too just in the most important cases, namely, in the cases of the processes of emission and absorption, we have to deal with mean values which cannot be calculated unambiguously by electrodynamical methods from the measured mean values. For example, the mean value of $C_{n}$ cannot be calculated from the mean value of $C_{n}^{2}$ , if no special information as to the particular values of $C_{n}$ is available.
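The ambiguity can be illustrated numerically. The sketch below uses two hypothetical sets of amplitudes (the values are invented for illustration and do not come from the text) that share the same mean value of $C_{n}^{2}$ but have different mean values of $C_{n}$.

```python
from statistics import fmean

# Two hypothetical sets of amplitudes C_n (illustrative values only,
# not taken from the text).
amplitudes_a = [1.0, 1.0, 1.0, 1.0]
amplitudes_b = [2.0, 0.0, 0.0, 0.0]


def mean_square(values):
    """Return the mean value of C_n^2 over the given amplitudes."""
    return fmean(v * v for v in values)


# Both sets give the same mean value of C_n^2 ...
print(mean_square(amplitudes_a), mean_square(amplitudes_b))

# ... but different mean values of C_n.
print(fmean(amplitudes_a), fmean(amplitudes_b))
```

Knowing only the mean square therefore leaves the mean itself open, unless something further is assumed about the individual values.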

Thus we see that the electrodynamical state is not by any means determined by the thermodynamic data and that in cases where, according to the laws of thermodynamics and according to all experience, an unambiguous result is to be expected, a purely electrodynamical theory fails entirely, since it admits not one definite result, but an infinite number of different results.

114. Before entering on a further discussion of this fact and of the difficulty to which it leads in the electrodynamical theory of heat radiation, it may be pointed out that exactly the same case and the same difficulty are met with in the mechanical theory of heat, especially in the kinetic theory of gases. For when, for example, in the case of a gas flowing out of an opening at the time $t = 0$ , the velocity, the density, and the temperature are given at every point, and the boundary conditions are completely known, we should expect, according to all experience, that these data would suffice for a unique determination of the way in which the process takes place in time. This, however, from a purely mechanical point of view is not the case at all; for the positions and velocities of all the separate molecules are not at all given by the visible velocity, density, and temperature of the gas, and they would have to be known exactly, if the way in which the process takes place in time had to be completely calculated from the equations of motion. In fact, it is easy to show that, with given initial values of the visible velocity, density, and temperature, an infinite number of entirely different processes is mechanically possible, some of which are in direct contradiction to the principles of thermodynamics, especially the second principle.

115. From these considerations we see that, if we wish to calculate the way in which a thermodynamic process takes place in time, such a formulation of initial and boundary conditions as is perfectly sufficient for a unique determination of the process in thermodynamics, does not suffice for the mechanical theory of heat or for the electrodynamical theory of heat radiation. On the contrary, from the standpoint of pure mechanics or electrodynamics the solutions of the problem are infinite in number. Hence, unless we wish to renounce entirely the possibility of representing the thermodynamic processes mechanically or electrodynamically, there remains only one way out of the difficulty, namely, to supplement the initial and boundary conditions by special hypotheses of such a nature that the mechanical or electrodynamical equations will lead to an unambiguous result in agreement with experience. As to how such an hypothesis is to be formulated, no hint can naturally be obtained from the principles of mechanics or electrodynamics, for they leave the question entirely open. Just on that account any mechanical or electrodynamical hypothesis containing some further specialization of the given initial and boundary conditions, which cannot be tested by direct measurement, is admissible a priori. What hypothesis is to be preferred can be decided only by testing the results to which it leads in the light of the thermodynamic principles based on experience.

116. Although, according to the statement just made, a decisive test of the different admissible hypotheses can be made only a posteriori, it is nevertheless worth while noticing that it is possible to obtain a priori, without relying in any way on thermodynamics, a definite hint as to the nature of an admissible hypothesis. Let us again consider a flowing gas as an illustration (Sec. 114). The mechanical state of all the separate gas molecules is not at all completely defined by the thermodynamic state of the gas, as has previously been pointed out. If, however, we consider all conceivable positions and velocities of the separate gas molecules, consistent with the given values of the visible velocity, density, and temperature, and calculate for every combination of them the mechanical process, assuming some simple law for the impact of two molecules, we shall arrive at processes, the vast majority of which agree completely in the mean values, though perhaps not in all details. Those cases, on the other hand, which show appreciable deviations, are vanishingly few, and only occur when certain very special and far-reaching conditions between the coordinates and velocity-components of the molecules are satisfied. Hence, if the assumption be made that such special conditions do not exist, however different the mechanical details may be in other respects, a form of flow of gas will be found, which may be called quite definite with respect to all measurable mean values—and they are the only ones which can be tested experimentally—although it will not, of course, be quite definite in all details. And the remarkable feature of this is that it is just the motion obtained in this manner that satisfies the postulates of the second principle of thermodynamics.

117. From these considerations it is evident that the hypotheses whose introduction was proven above to be necessary completely answer their purpose, if they state nothing more than that exceptional cases, corresponding to special conditions which exist between the separate quantities determining the state and which cannot be tested directly, do not occur in nature. In mechanics this is done by the hypothesis [34] that the heat motion is a “molecular chaos”; [35] in electrodynamics the same thing is accomplished by the hypothesis of “natural radiation,” which states that there exist between the numerous different partial vibrations (149) of a ray no other relations than those caused by the measurable mean values (compare below, Sec. 148). If, for brevity, we denote any condition or process for which such an hypothesis holds as an “elemental chaos,” the principle, that in nature any state or any process containing numerous elements not in themselves measurable is an elemental chaos, furnishes the necessary condition for a unique determination of the measurable processes in mechanics as well as in electrodynamics and also for the validity of the second principle of thermodynamics. This must also serve as a mechanical or electrodynamical explanation of the conception of entropy, which is characteristic of the second law and of the closely allied concept of temperature. [36] It also follows from this that the significance of entropy and temperature is, according to their nature, connected with the condition of an elemental chaos. The terms entropy and temperature do not apply to a purely periodic, perfectly plane wave, since all the quantities in such a wave are in themselves measurable, and hence cannot be an elemental chaos any more than a single rigid atom in motion can. The necessary condition for the hypothesis of an elemental chaos and with it for the existence of entropy and temperature can consist only in the irregular simultaneous effect of very many partial vibrations of different periods, which are propagated in the different directions in space independent of one another, or in the irregular flight of a multitude of atoms.

118. But what mechanical or electrodynamical quantity represents the entropy of a state? It is evident that this quantity depends in some way on the “probability” of the state. For since an elemental chaos and the absence of a record of any individual element forms an essential feature of entropy, the tendency to neutralize any existing temperature differences, which is connected with an increase of entropy, can mean nothing for the mechanical or electrodynamical observer but that uniform distribution of elements in a chaotic state is more probable than any other distribution.

Now since the concept of entropy as well as the second principle of thermodynamics are of universal application, and since on the other hand the laws of probability have no less universal validity, it is to be expected that the connection between entropy and probability should be very close. Hence we make the following proposition the foundation of our further discussion: The entropy of a physical system in a definite state depends solely on the probability of this state. The fertility of this law will be seen later in several cases. We shall not, however, attempt to give a strict general proof of it at this point. In fact, such an attempt evidently would have no meaning at this point. For, so long as the “probability” of a state is not numerically defined, the correctness of the proposition cannot be quantitatively tested. One might, in fact, suspect at first sight that on this account the proposition has no definite physical meaning. It may, however, be shown by a simple deduction that it is possible by means of this fundamental proposition to determine quite generally the way in which entropy depends on probability, without any further discussion of the probability of a state.

119. For let $S$ be the entropy, $W$ the probability of a physical system in a definite state; then the proposition states that

(162)

$S = f (W)$

where $f (W)$ represents a universal function of the argument $W$ . In whatever way $W$ may be defined, it can be safely inferred from the mathematical concept of probability that the probability of a system which consists of two entirely independent [37] systems is equal to the product of the probabilities of these two systems separately. If we think, e.g., of the first system as any body whatever on the earth and of the second system as a cavity containing radiation on Sirius, then the probability that the terrestrial body be in a certain state $1$ and that simultaneously the radiation in the cavity be in a definite state $2$ is

(163)

$W = W_{1} W_{2},$

where $W_{1}$ and $W_{2}$ are the probabilities that the systems involved are in the states in question.

If now $S_{1}$ and $S_{2}$ are the entropies of the separate systems in the two states, then, according to (162), we have

$S_{1} = f (W_{1}), \quad S_{2} = f (W_{2}) .$

But, according to the second principle of thermodynamics, the total entropy of the two systems, which are independent (see preceding footnote) of each other, is $S = S_{1} + S_{2}$ and hence from (162) and (163)

$f (W_{1} W_{2}) = f (W_{1}) + f (W_{2}) .$

From this functional equation $f$ can be determined. For on differentiating both sides with respect to $W_{1}$ , $W_{2}$ remaining constant, we obtain

$W_{2} \dot{f} (W_{1} W_{2}) = \dot{f} (W_{1}) .$

On further differentiating with respect to $W_{2}$ , $W_{1}$ now remaining constant, we get

$\dot{f} (W_{1} W_{2}) + W_{1} W_{2} \ddot{f} (W_{1} W_{2}) = 0$

or

$\dot{f} (W) + W \ddot{f} (W) = 0 .$

The general integral of this differential equation of the second order is

$f (W) = k \log W + \text{const} .$
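The integration step can be written out explicitly. The following is a sketch of one way to carry it out; the symbol $c$ is introduced here for the additive constant that the text writes as "const", and the dots denote derivatives with respect to the argument $W$.

```latex
\begin{aligned}
\dot{f}(W) + W \ddot{f}(W) &= \frac{d}{dW}\bigl[ W \dot{f}(W) \bigr] = 0 \\
\Rightarrow\quad W \dot{f}(W) &= k \\
\Rightarrow\quad \dot{f}(W) &= \frac{k}{W} \\
\Rightarrow\quad f(W) &= k \log W + c
\end{aligned}
```

The first integration produces the constant $k$ and the second the additive constant $c$, which accounts for the two constants of integration of a differential equation of the second order.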

Hence from (162) we get

(164)

$S = k \log W + \text{const} ,$

an equation which determines the general way in which the entropy depends on the probability. The universal constant of integration $k$ is the same for a terrestrial as for a cosmic system, and its value, having been determined for the former, will remain valid for the latter. The second additive constant of integration may, without any restriction as regards generality, be included as a constant multiplier in the quantity $W$ , which here has not yet been completely defined, so that the equation reduces to

$S = k \log W .$

120. The logarithmic connection between entropy and probability was first stated by L. Boltzmann [38] in his kinetic theory of gases. Nevertheless our equation (164) differs in its meaning from the corresponding one of Boltzmann in two essential points.

Firstly, Boltzmann’s equation lacks the factor $k$ , which is due to the fact that Boltzmann always used gram-molecules, not the molecules themselves, in his calculations. Secondly, and this is of greater consequence, Boltzmann leaves an additive constant undetermined in the entropy $S$ as is done in the whole of classical thermodynamics, and accordingly there is a constant factor of proportionality, which remains undetermined in the value of the probability $W$ .

In contrast with this we assign a definite absolute value to the entropy $S$ . This is a step of fundamental importance, which can be justified only by its consequences. As we shall see later, this step leads necessarily to the “hypothesis of quanta” and moreover it also leads, as regards radiant heat, to a definite law of distribution of energy of black radiation, and, as regards heat energy of bodies, to Nernst’s heat theorem.

From (164) it follows that with the entropy $S$ the probability $W$ is, of course, also determined in the absolute sense. We shall designate the quantity $W$ thus defined as the “thermodynamic probability,” in contrast to the “mathematical probability,” to which it is proportional but not equal. For, while the mathematical probability is a proper fraction, the thermodynamic probability is, as we shall see, always an integer.

121. The relation (164) contains a general method for calculating the entropy $S$ by probability considerations. This, however, is of no practical value, unless the thermodynamic probability $W$ of a system in a given state can be expressed numerically. The problem of finding the most general and most precise definition of this quantity is among the most important problems in the mechanical or electrodynamical theory of heat. It makes it necessary to discuss more fully what we mean by the “state” of a physical system.

By the state of a physical system at a certain time we mean the aggregate of all those mutually independent quantities, which determine uniquely the way in which the processes in the system take place in the course of time for given boundary conditions. Hence a knowledge of the state is precisely equivalent to a knowledge of the “initial conditions.” If we now take into account the considerations stated above in Sec. 113, it is evident that we must distinguish in the theoretical treatment two entirely different kinds of states, which we may denote as “microscopic” and “macroscopic” states. The microscopic state is the state as described by a mechanical or electrodynamical observer; it contains the separate values of all coordinates, velocities, and field-strengths. The microscopic processes, according to the laws of mechanics and electrodynamics, take place in a perfectly unambiguous way; for them entropy and the second principle of thermodynamics have no significance. The macroscopic state, however, is the state as observed by a thermodynamic observer; any macroscopic state contains a large number of microscopic ones, which it unites in a mean value. Macroscopic processes take place in an unambiguous way in the sense of the second principle, when, and only when, the hypothesis of the elemental chaos (Sec. 117) is satisfied.

122. If now the calculation of the probability $W$ of a state is in question, it is evident that the state is to be thought of in the macroscopic sense. The first and most important question is now: How is a macroscopic state defined? An answer to it will dispose of the main features of the whole problem.

For the sake of simplicity, let us first consider a special case, that of a very large number, $N$ , of simple similar molecules. Let the problem be solely the distribution of these molecules in space within a given volume, $V$ , irrespective of their velocities, and further the definition of a certain macroscopic distribution in space. The latter cannot consist of a statement of the coordinates of all the separate molecules, for that would be a definite microscopic distribution. We must, on the contrary, leave the positions of the molecules undetermined to a certain extent, and that can be done only by thinking of the whole volume $V$ as being divided into a number of small but finite space elements, $G$ , each containing a specified number of molecules. By any such statement a definite macroscopic distribution in space is defined. The manner in which the molecules are distributed within every separate space element is immaterial, for here the hypothesis of elemental chaos (Sec. 117) provides a supplement, which insures the unambiguity of the macroscopic state, in spite of the microscopic indefiniteness. If we distinguish the space elements in order by the numbers $1$ , $2$ , $3, \dots$ and, for any particular macroscopic distribution in space, denote the number of the molecules lying in the separate space elements by $N_{1}$ , $N_{2}$ , $N_{3} \dots$ , then to every definite system of values $N_{1}$ , $N_{2}$ , $N_{3} \dots$ , there corresponds a definite macroscopic distribution in space. We have of course always:

(165)

$N_{1} + N_{2} + N_{3} + \dots = N$

or if

(166)

$\frac{N_{1}}{N} = w_{1}, \frac{N_{2}}{N} = w_{2}, \dots$

(167)

$w_{1} + w_{2} + w_{3} + \dots = 1 .$

The quantity $w_{i}$ may be called the density of distribution of the molecules, or the mathematical probability that any molecule selected at random lies in the $i$ th space element.

If we now had, e.g., only $10$ molecules and $7$ space elements, a definite space distribution would be represented by the values:

(168)

$N_{1} = 1, N_{2} = 2, N_{3} = 0, N_{4} = 0, N_{5} = 1, N_{6} = 4, N_{7} = 2,$

which state that in the seven space elements there lie respectively $1$ , $2$ , $0$ , $0$ , $1$ , $4$ , $2$ molecules.

123. The definition of a macroscopic distribution in space may now be followed immediately by that of its thermodynamic probability $W$ . The latter is founded on the consideration that a certain distribution in space may be realized in many different ways, namely, by many different individual coordinations or “complexions,” according as a certain molecule considered will happen to lie in one or the other space element. For, with a given distribution in space, it is of consequence only how many, not which, molecules lie in every space element.

The number of all complexions which are possible with a given distribution in space we equate to the thermodynamic probability $W$ of the space distribution.

In order to form a definite conception of a certain complexion, we can give the molecules numbers, write these numbers in order from $1$ to $N$ , and place below the number of every molecule the number of that space element to which the molecule in question belongs in that particular complexion. Thus the following table represents one particular complexion, selected at random, for the distribution in the preceding illustration

(169)

$\begin{matrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 6 & 1 & 7 & 5 & 6 & 2 & 2 & 6 & 6 & 7 \end{matrix}$

By this the fact is exhibited that the

Molecule $2$ lies in space element $1$ .

Molecules $6$ and $7$ lie in space element $2$ .

Molecule $4$ lies in space element $5$ .

Molecules $1$ , $5$ , $8$ , and $9$ lie in space element $6$ .

Molecules $3$ and $10$ lie in space element $7$ .
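The reading of table (169) just given can be checked mechanically. The sketch below tallies the lower row of the table into occupation numbers for the seven space elements; the variable names are chosen here for illustration.

```python
from collections import Counter

# Lower row of table (169): entry j is the space element
# occupied by molecule j + 1.
complexion = [6, 1, 7, 5, 6, 2, 2, 6, 6, 7]
num_elements = 7

# Count how many molecules fall in each space element 1..7.
counts = Counter(complexion)
occupation = [counts.get(i, 0) for i in range(1, num_elements + 1)]

print(occupation)  # [1, 2, 0, 0, 1, 4, 2]
```

The resulting list reproduces the values $N_{1}, \dots, N_{7}$ of (168).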

As becomes evident on comparison with (168), this complexion does, in fact, correspond in every respect to the space distribution given above, and in a similar manner it is easy to exhibit many other complexions, which also belong to the same space distribution. The number of all possible complexions required is now easily found by inspecting the lower of the two lines of figures in (169). For, since the number of the molecules is given, this line of figures contains a definite number of places. Since, moreover, the distribution in space is also given, the number of times that every figure (i.e., every space element) appears in the line is equal to the number of molecules which lie in that particular space element. But every change in the table gives a new particular coordination between molecules and space elements and hence a new complexion. Hence the number of the possible complexions, or the thermodynamic probability, $W$ , of the given space distribution, is equal to the number of “permutations with repetition” possible under the given conditions. In the simple numerical example chosen, we get for $W$ , according to a well-known formula, the expression

$\frac{10!}{1! \, 2! \, 0! \, 0! \, 1! \, 4! \, 2!} = 37{,}800 .$

The form of this expression is so chosen that it may be applied easily to the general case. The numerator is equal to factorial $N$ , $N$ being the total number of molecules considered, and the denominator is equal to the product of the factorials of the numbers, $N_{1}$ , $N_{2}$ , $N_{3}, \dots$ of the molecules, which lie in every separate space element and which, in the general case, must be thought of as large numbers. Hence we obtain for the required probability of the given space distribution

(170)

$W = \frac{N!}{N_{1}! N_{2}! N_{3}! \dots} .$
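As a minimal sketch, formula (170) can be evaluated directly and compared, for a small case, with an explicit enumeration of complexions. The function names are hypothetical, and the brute-force check is only practical for very few molecules and space elements.

```python
from itertools import product
from math import factorial, prod


def complexions(occupation):
    """Number of complexions W = N! / (N_1! N_2! ...), as in (170)."""
    total = sum(occupation)
    return factorial(total) // prod(factorial(n) for n in occupation)


def brute_force_complexions(occupation):
    """Count complexions by enumerating every assignment of molecules
    to space elements and keeping those with the given occupation."""
    n_molecules = sum(occupation)
    n_elements = len(occupation)
    target = list(occupation)
    return sum(
        1
        for assignment in product(range(n_elements), repeat=n_molecules)
        if [assignment.count(i) for i in range(n_elements)] == target
    )


# The numerical example (168): 10 molecules in 7 space elements.
print(complexions([1, 2, 0, 0, 1, 4, 2]))  # 37800

# Small cross-check: 4 molecules in 3 space elements.
print(complexions([2, 1, 1]), brute_force_complexions([2, 1, 1]))  # 12 12
```

The enumeration grows as the number of space elements raised to the number of molecules, which is why the closed formula is needed for the large numbers considered in the text.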

Since all the $N$’s are large numbers, we may apply to their factorials Stirling’s formula, which for a large number may be abridged [39] to [40]

(171)

$n! = {(\frac{n}{e})}^{n} .$

Hence, by taking account of (165), we obtain

(172)

$W = {(\frac{N}{N_{1}})}^{N_{1}} {(\frac{N}{N_{2}})}^{N_{2}} {(\frac{N}{N_{3}})}^{N_{3}} \dots .$
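How good the abridged form is can be gauged numerically. The sketch below compares the logarithm of the exact value (170) with the logarithm of the abridged value (172) for a hypothetical distribution over four elements in the fixed proportions 1 : 2 : 3 : 4, scaled up to larger and larger numbers of molecules. The proportions and function names are assumptions made for illustration.

```python
from math import lgamma, log


def log_w_exact(occupation):
    """log of W = N! / (N_1! N_2! ...), using lgamma(n + 1) = log(n!)."""
    total = sum(occupation)
    return lgamma(total + 1) - sum(lgamma(n + 1) for n in occupation)


def log_w_abridged(occupation):
    """log of the abridged form (172): sum of N_i * log(N / N_i)."""
    total = sum(occupation)
    return sum(n * log(total / n) for n in occupation if n > 0)


def relative_error(scale):
    """Relative deviation of the abridged log W from the exact one."""
    occupation = [1 * scale, 2 * scale, 3 * scale, 4 * scale]
    exact = log_w_exact(occupation)
    return (log_w_abridged(occupation) - exact) / exact


for scale in (10, 1_000, 100_000):
    print(scale, relative_error(scale))
```

The relative deviation in $\log W$ shrinks as the numbers grow, which is the sense in which the omitted factors contribute only small additive terms to the logarithm.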

124. Exactly the same method as in the case of the space distribution just considered may be used for the definition of a macroscopic state and of the thermodynamic probability in the general case, where not only the coordinates but also the velocities, the electric moments, etc., of the molecules are to be dealt with. Every thermodynamic state of a system of $N$ molecules is, in the macroscopic sense, defined by the statement of the number of molecules, $N_{1}$ , $N_{2}$ , $N_{3}, \dots$ , which are contained in the region elements $1$ , $2$ , $3, \dots$ of the “state space.” This state space, however, is not the ordinary three-dimensional space, but an ideal space of as many dimensions as there are variables for every molecule. In other respects the definition and the calculation of the thermodynamic probability $W$ are exactly the same as above and the entropy of the state is accordingly found from (164), taking (166) also into account, to be

(173)

$S = - k N \sum w_{1} \log w_{1},$

where the sum $\sum$ is to be taken over all region elements. It is obvious from this expression that the entropy is in every case a positive quantity.
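The step from (172) to (173) can be sketched as follows, taking the entropy in the form $S = k \log W$ without additive constant. The summation index $i$ is introduced here to label the region elements.

```latex
\begin{aligned}
S = k \log W
  &= k \sum_{i} N_{i} \log \frac{N}{N_{i}}
  && \text{by (172)} \\
  &= - k \sum_{i} N_{i} \log \frac{N_{i}}{N} \\
  &= - k N \sum_{i} w_{i} \log w_{i}
  && \text{by (166), } N_{i} = N w_{i}
\end{aligned}
```

Since every $w_{i}$ lies between $0$ and $1$, each term $w_{i} \log w_{i}$ is negative or zero, so the sum with its leading minus sign cannot be negative.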

125. By the preceding developments the calculation of the entropy of a system of $N$ molecules in a given thermodynamic state is, in general, reduced to the single problem of finding the magnitude $G$ of the region elements in the state space. That such a definite finite quantity really exists is a characteristic feature of the theory we are developing, as contrasted with that due to Boltzmann, and forms the content of the so-called hypothesis of quanta. As is readily seen, this is an immediate consequence of the proposition of Sec. 120 that the entropy $S$ has an absolute, not merely a relative, value; for this, according to (164), necessitates also an absolute value for the magnitude of the thermodynamic probability $W$ , which, in turn, according to Sec. 123, is dependent on the number of complexions, and hence also on the number and size of the region elements which are used. Since all different complexions contribute uniformly to the value of the probability $W$ , the region elements of the state space represent also regions of equal probability. If this were not so, the complexions would not be all equally probable.

However, not only the magnitude, but also the shape and position of the region elements must be perfectly definite. For since, in general, the distribution density $w$ is apt to vary appreciably from one region element to another, a change in the shape of a region element, the magnitude remaining unchanged, would, in general, lead to a change in the value of $w$ and hence to a change in $S$ . We shall see that only in special cases, namely, when the distribution densities $w$ are very small, may the absolute magnitude of the region elements become physically unimportant, inasmuch as it enters into the entropy only through an additive constant. This happens, e.g., at high temperatures, large volumes, slow vibrations (state of an ideal gas, Sec. 132, Rayleigh’s radiation law, Sec. 159). Hence it is permissible for such limiting cases to assume, without appreciable error, that $G$ is infinitely small in the macroscopic sense, as has hitherto been the practice in statistical mechanics. As soon, however, as the distribution densities $w$ assume appreciable values, the classical statistical mechanics fail.

126. If now the problem be to determine the magnitude $G$ of the region elements of equal probability, the laws of the classical statistical mechanics afford a certain hint, since in certain limiting cases they lead to correct results.

Let $φ_{1}$ , $φ_{2}$ , $φ_{3}, \dots$ be the “generalized coordinates,” $ψ_{1}$ , $ψ_{2}$ , $ψ_{3}, \dots$ the corresponding “impulse coordinates” or “moments,” which determine the microscopic state of a certain molecule; then the state space contains as many dimensions as there are coordinates $φ$ and moments $ψ$ for every molecule. Now the region element of probability, according to classical statistical mechanics, is identical with the infinitely small element of the state space (in the macroscopic sense) [41]

(174)

$d φ_{1} d φ_{2} d φ_{3} \dots d ψ_{1} d ψ_{2} d ψ_{3} \dots .$

According to the hypothesis of quanta, on the other hand, every region element of probability has a definite finite magnitude

(175)

$G = \int d φ_{1} d φ_{2} d φ_{3} \dots d ψ_{1} d ψ_{2} d ψ_{3} \dots$

whose value is the same for all different region elements and, moreover, depends on the nature of the system of molecules considered. The shape and position of the separate region elements are determined by the limits of the integral and must be determined anew in every separate case.


[34] L. Boltzmann, Vorlesungen über Gastheorie 1, p. 21, 1896. Wiener Sitzungsberichte 78, Juni, 1878, at the end. Compare also S. H. Burbury, Nature, 51, p. 78, 1894.

[35] Hereafter Boltzmann’s “Unordnung” will be rendered by chaos, “ungeordnet” by chaotic (Tr.).

[36] To avoid misunderstanding I must emphasize that the question, whether the hypothesis of elemental chaos is really everywhere satisfied in nature, is not touched upon by the preceding considerations. I intended only to show at this point that, wherever this hypothesis does not hold, the natural processes, if viewed from the thermodynamic (macroscopic) point of view, do not take place unambiguously.

[37] It is well known that the condition that the two systems be independent of each other is essential for the validity of the expression (163). That it is also a necessary condition for the additive combination of the entropy was proven first by M. Laue in the case of optically coherent rays. Annalen d. Physik 20, p. 365, 1906.

[38] L. Boltzmann, Vorlesungen über Gastheorie, 1, Sec. 6.

[39] Abridged in the sense that factors which in the logarithmic expression (173) would give rise to small additive terms have been omitted at the outset. A brief derivation of equation (173) may be found on p. 473 (Tr.).

[40] See for example E. Czuber, Wahrscheinlichkeitsrechnung (Leipzig, B. G. Teubner) p. 22, 1903; H. Poincaré, Calcul des Probabilités (Paris, Gauthier-Villars), p. 85, 1912.

[41] Compare, for example, L. Boltzmann, Gastheorie, 2, p. 62 et seq., 1898, or J. W. Gibbs, Elementary principles in statistical mechanics, Chapter I, 1902.