Compositional Thermostatics

applied category theory

physics

entropy

Author

Published

2021-09-09

Abstract

In this post, we explore the compositionality of thermodynamic systems at equilibrium (thermostatic systems). The main body of this post uses no category theory, and reviews the physics of thermostatic systems in a formalism that puts entropy first. Then there is a brief teaser at the end showing how this entropy-first approach can be treated categorically.

1 Introduction

Thermostatics is the study of thermodynamic systems at equilibrium. Typically people call this subject “equilibrium thermodynamics”, but we prefer “thermostatics” because it is in analogy with other physical disciplines such as “electrostatics”.

Traditionally, thermodynamics is a subject impenetrable to anyone with a good sense of mathematical rigor. In the appropriately named “The Tragicomical History of Thermodynamics”, Clifford Truesdell describes the state of thermodynamics in 1980 as “a dismal swamp of obscurity”, and goes on to proclaim that there is “something rotten in the [thermodynamic] state of the Low Countries” (cited in [1]). Unlike other fields of physics, like classical mechanics, relativity, quantum mechanics, or even statistical mechanics, thermodynamics has so far (to my knowledge) eluded a well-accepted formal mathematical framework, and this is in spite of its incredible importance to chemistry, biology and engineering.

In this blog post, I aim to bring the mathematical reader out of this swamp, and show that there is a formulation of thermostatics that is both mathematically precise and simple. To this end, I will start with an axiomatic framework for thermostatics, slightly more rigorous than the axiomatic framework one often sees in the beginning of a physics book, but still not perfectly precise. This axiomatic framework will be accompanied by examples and physical reasoning as motivating material. For the sake of pedagogy, this will be less general than possible, and also somewhat flawed.

I will then reformulate the axiomatic framework in categorical terms, which leads to a more rigorous and general setting. Finally, I will conclude with a teaser of something surprising that can be done within this categorical framework. This blog post is a preview of some content that I will be releasing a paper on with John Baez and some other collaborators, and what I am teasing will be fleshed out in more detail there.

2 An Axiomatic Framework for Thermostatics

Before we begin, a brief note on the use of axioms rather than definitions, which are more typical in a mathematical setting. This section is intended to give a physical intuition for why my mathematical definitions in the next section are natural. Therefore, the real content of these axioms is the assertion that they correspond to physical reality. We have some intuition in our heads already for thermostatic equilibrium, where we expect that things touching each other should have the same temperature, and we expect gas to be distributed uniformly in a container. Rather than just stating definitions, these axioms are claiming that our experience of the world is in accordance with a certain formalism.

Inspiration for this axiomatic framework is due to the opening chapter of [2].

Axiom 1

A thermostatic system \Sigma consists of a state space \mathbb{X}^{\Sigma} = \mathbb{R}^{n}_{> 0} for some natural number n, and a differentiable entropy function S^{\Sigma} \colon \mathbb{X}^{\Sigma} \to \mathbb{R}. We call the coordinates in \mathbb{X}^{\Sigma} extensive variables.

When we are talking about systems that are given by subscripts on \Sigma, i.e. \Sigma_{1}, \Sigma_{2}, we will typically refer to the state space and entropy function by \mathbb{X}^{1}, S^{1}.

Example

The thermostatic state of a lump of metal is completely determined by its energy, U. Thus, a thermostatic system \Sigma_{\mathrm{lump}} representing a lump of metal consists of a state space \mathbb{X}^{\mathrm{lump}} = \mathbb{R}^{1}_{>0}, and an entropy function S^{\mathrm{lump}} \colon \mathbb{X}^{\mathrm{lump}} \to \mathbb{R} given by S^{\mathrm{lump}}(U) = C \log(U), where C is a constant.

The temperature T of this lump of metal is given by the equation:

\frac{1}{T} = \frac{\partial}{\partial U} S^{\mathrm{lump}}(U) = C \frac{1}{U}

so CT = U. Thus we can identify C with the heat capacity of the system; i.e. how much heat do you need to put in to have the temperature increase by one unit. This is how properties of the system are encapsulated in the system entropy.
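As a quick numerical check of the relation CT = U, here is a minimal Python sketch. The values of C and U are hypothetical, and the derivative is approximated by a central finite difference rather than computed symbolically.

```python
import math

def lump_entropy(U, C):
    """Entropy S(U) = C log(U) of a lump of metal with heat capacity C."""
    return C * math.log(U)

def temperature(S, U, h=1e-6):
    """Temperature from 1/T = dS/dU, using a central finite difference."""
    dS_dU = (S(U + h) - S(U - h)) / (2 * h)
    return 1.0 / dS_dU

C = 2.5   # hypothetical heat capacity
U = 10.0  # hypothetical energy
T = temperature(lambda u: lump_entropy(u, C), U)
print(T, U / C)  # the two values should agree up to finite-difference error
```

Everything about the lump (here, just its heat capacity) is read off from the entropy function alone.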

Two lumps of metal, one cold and one hot — A lump of metal with more energy has a higher temperature

Example

The thermostatic state of an ideal gas is determined by its energy, U, volume, V, and number of atoms, N (which we treat as a continuous parameter because it is big). Thus \mathbb{X}^{\mathrm{ideal}} = \mathbb{R}^{3}_{> 0}. The entropy of the ideal gas is a function S^{\mathrm{ideal}} of U, V and N given by the “Sackur-Tetrode” equation, the details of which are not important. Assuming we have this S^{\mathrm{ideal}}, we can write down equations for quantities T (temperature), P (pressure), and \mu (chemical potential), namely:

\frac{1}{T} = \frac{\partial}{\partial U} S^{\mathrm{ideal}}(U,V,N)

\frac{P}{T} = \frac{\partial}{\partial V} S^{\mathrm{ideal}}(U,V,N)

\frac{\mu}{T} = -\frac{\partial}{\partial N} S^{\mathrm{ideal}}(U,V,N)

These quantities obey the ideal gas law

PV = NRT

where R is a constant, and it is possible to recover the Sackur-Tetrode equation from the ideal gas law and the knowledge that S(\lambda U, \lambda V, \lambda N) = \lambda S(U,V,N) for positive \lambda, a fact which we will explain later.

An ideal gas, with regulatory mechanisms — Here is pictured an ideal gas, along with mechanisms for changing its energy, volume, and particle number
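To make the ideal gas example concrete, here is a hedged Python sketch. It assumes a simplified monatomic form of the Sackur-Tetrode entropy, S = NR(\log(V/N) + \frac{3}{2}\log(U/N) + c), where the additive constant c and the sample state are assumptions of this sketch (the post does not fix them). It recovers T, P and \mu by finite differences and checks the ideal gas law and homogeneity numerically.

```python
import math

R = 8.314  # gas constant in J/(mol K); N is then measured in moles

def ideal_entropy(U, V, N, c=0.0):
    """Simplified Sackur-Tetrode form for a monatomic ideal gas.

    The additive constant c is an assumption; it does not affect T or P.
    """
    return N * R * (math.log(V / N) + 1.5 * math.log(U / N) + c)

def partial(f, args, i, rel=1e-6):
    """Central finite-difference partial derivative of f in argument i."""
    h = rel * args[i]
    up = list(args)
    up[i] += h
    dn = list(args)
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2 * h)

state = (3000.0, 0.02, 1.0)  # hypothetical (U, V, N)
U, V, N = state
T = 1.0 / partial(ideal_entropy, state, 0)   # 1/T   = dS/dU
P = T * partial(ideal_entropy, state, 1)     # P/T   = dS/dV
mu = -T * partial(ideal_entropy, state, 2)   # mu/T  = -dS/dN

print(P * V, N * R * T)  # the two sides of the ideal gas law

lam = 3.0
print(ideal_entropy(lam * U, lam * V, lam * N), lam * ideal_entropy(U, V, N))
```

The last line checks that scaling all three extensive variables by \lambda scales the entropy by \lambda.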

Axiom 2

Given two thermostatic systems \Sigma_{1} and \Sigma_{2}, we may form their independent sum \Sigma_{1} \oplus \Sigma_{2}. The state space of \Sigma_{1} \oplus \Sigma_{2} is \mathbb{X}^{1} \times \mathbb{X}^{2}, and the entropy function on \Sigma_{1} \oplus \Sigma_{2} is

S^{\Sigma_{1} \oplus \Sigma_{2}}(X^{1}, X^{2}) = S^{\Sigma_{1}}(X^{1}) + S^{\Sigma_{2}}(X^{2})

The independent sum of two thermostatic systems corresponds physically to putting the two systems together and not letting them interact.

Example

The independent sum \Sigma_{\mathrm{lump}} \oplus \Sigma_{\mathrm{lump}} is the thermostatic system consisting of two lumps of metal not touching. Its state space consists of all possible values of the two energies of the two lumps of metal.

Two lumps of metal separated by an insulator — Any assignment of energies to the individual lumps can be an equilibrium solution, because they do not exchange heat
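The independent sum is easy to express in code. The following Python sketch (the helper names `independent_sum` and `lump` are hypothetical) builds the entropy function of two non-interacting lumps from the entropy functions of each lump.

```python
import math

def independent_sum(S1, S2):
    """Entropy of the independent sum: states are pairs, entropies add."""
    return lambda X1, X2: S1(X1) + S2(X2)

def lump(C):
    """Entropy function U -> C log(U) for a lump with heat capacity C."""
    return lambda U: C * math.log(U)

S_two_lumps = independent_sum(lump(1.0), lump(1.0))

# Both pairs are valid states of the independent sum, even though
# their total entropies differ.
print(S_two_lumps(1.0, 9.0), S_two_lumps(5.0, 5.0))
```

Nothing in the independent sum prefers the second state over the first; that preference only appears once a constraint relates the two energies.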

This is not a particularly exciting way of composing systems. Intuitively, if one has two rocks that are touching, the state where one is very hot and the other is very cold is not an equilibrium state, and we want to capture this. To capture this formally, let’s talk about what equilibrium means in this context.

In physics, there is a “maximum entropy” principle, which says that equilibrium happens at the state of highest entropy. However, this must be modified slightly in order to work, because we need to be careful about which states we are considering. That is, we want to say that equilibrium happens at the state of highest entropy with respect to constraints put on the system.

As an example, consider the following interactive applet, which attempts to lay out a graph at a local maximum of some “goodness of layout” function, left unspecified. The red nodes are unconstrained, and when you click and drag on one, it becomes blue and stays in place. Clicking on it again returns it to an unconstrained red state. The applet attempts to find the best layout with respect to the constraints put on it by the blue nodes.

We describe this by calling a position of all the nodes an “endostate” (or internal state), and a position of the blue nodes an “exostate” (or external state). The system attempts to find an equilibrium endostate compatible with an exostate. We formalize this intuition with an axiom.

Axiom 3

Let \Sigma_{\mathrm{endo}} = (\mathbb{X}, S^{\mathbb{X}}) be a thermostatic system. In this axiom, we refer to system states of \Sigma_{\mathrm{endo}} as endostates, and we say that \Sigma_{\mathrm{endo}} is the internal system. Now, suppose that we have an external description of \Sigma_{\mathrm{endo}} in the form of a state space of exostates \mathbb{Y} and a relation R \subseteq \mathbb{X} \times \mathbb{Y}, where (X,Y) \in R iff X is compatible with Y. Then for a given Y \in \mathbb{Y}, an endostate X compatible with Y is in thermal equilibrium with respect to Y if S^{\mathbb{X}}(X) \geq S^{\mathbb{X}}(X') for every X' such that (X',Y) \in R.

Moreover, we define an entropy measure S^{\mathbb{Y}} on \mathbb{Y} by

S^{\mathbb{Y}}(Y) = \sup_{(X,Y) \in R} S^{\mathbb{X}}(X)

and thus a thermostatic system \Sigma_{\mathrm{exo}} = (\mathbb{Y}, S^{\mathbb{Y}}).
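The construction in this axiom can be sketched in Python on a finite set of candidate endostates. This is only an illustration: real state spaces are continuous, and the function names and the toy grid below are assumptions of the sketch.

```python
import math

def exo_entropy(endostates, S, compatible, Y):
    """Sup of S(X) over candidate endostates X compatible with Y.

    Returns -inf when no candidate is compatible (the sup of an empty set).
    """
    return max((S(X) for X in endostates if compatible(X, Y)),
               default=-math.inf)

def equilibria(endostates, S, compatible, Y):
    """All candidate endostates attaining the supremum for exostate Y."""
    best = exo_entropy(endostates, S, compatible, Y)
    return [X for X in endostates if compatible(X, Y) and S(X) == best]

# Toy example: endostates are pairs of integers, the exostate is their sum.
grid = [(a, b) for a in range(1, 10) for b in range(1, 10)]
S = lambda X: math.log(X[0]) + math.log(X[1])
compatible = lambda X, Y: X[0] + X[1] == Y

print(equilibria(grid, S, compatible, 10))    # [(5, 5)]
print(exo_entropy(grid, S, compatible, 100))  # -inf
```

The second call shows what happens for an exostate with no compatible endostate: the supremum is taken over an empty set.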

One justification for this axiom in terms of classical thermodynamics is that the second law says that entropy always increases. Therefore, a point of equilibrium for any constrained system must be a highest entropy state for that system subject to those constraints.

The problem with the classical second law, however, is that the second law refers to a dynamical system, whereas classical thermodynamics deals only with systems at equilibrium. No wonder thermodynamics was called a swamp!

Also, one should note that with our definition, S^{\mathbb{Y}}(Y) could be the supremum of an empty set, or an unbounded set. Moreover, there’s no guarantee that S^{\mathbb{Y}} is differentiable. In the categorical section, we address these problems; for now… just don’t choose a bad R, OK?

Example

We will now discuss the system consisting of two lumps of metal in thermal contact. The internal system here is \Sigma_{\mathrm{lump}} \oplus \Sigma_{\mathrm{lump}}; that is, the system where states are given by the energies U^{1} and U^{2} of each lump and the entropy is the sum of the two individual entropies.

An exostate here is a total energy U \in \mathbb{R}_{> 0}, which is compatible with (U^{1}, U^{2}) if U^{1} + U^{2} = U.

We want to find the maximum entropy endostate compatible with an exostate U. Substituting U^{2} = U - U^{1} turns this into the unconstrained problem of maximizing S(U^{1}) + S(U-U^{1}) with respect to U^{1}. The maximum will be found where the derivative with respect to U^{1} is 0, which happens when

\frac{\partial}{\partial U^{1}} S(U^{1}) = - \frac{\partial}{\partial U^{1}} S(U - U^{1})

which reduces to

\frac{\partial}{\partial U^{1}} S(U^{1}) = \frac{\partial}{\partial U^{2}} S(U^{2})

that is, thermostatic equilibrium is found at the point where the (inverse) temperatures of the two lumps of metal are the same! In this case, because the lumps are identical this will happen when U^{1} = U^{2} = \frac{1}{2}U, but in the case that each lump had a different heat capacity, this could end up differently.
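To see how different heat capacities change the outcome, here is a small Python sketch that maximizes the total entropy C_{1} \log(U^{1}) + C_{2} \log(U - U^{1}) numerically. The heat capacities and total energy are hypothetical, and ternary search is just one convenient way to maximize a concave function of one variable.

```python
import math

def equilibrium_split(C1, C2, U, steps=200):
    """Maximize C1*log(U1) + C2*log(U - U1) over 0 < U1 < U by ternary search.

    The objective is strictly concave in U1, so ternary search converges
    to the unique maximizer.
    """
    total = lambda U1: C1 * math.log(U1) + C2 * math.log(U - U1)
    lo, hi = 0.0, U
    for _ in range(steps):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if total(m1) < total(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

C1, C2, U = 1.0, 3.0, 8.0  # hypothetical heat capacities and total energy
U1 = equilibrium_split(C1, C2, U)
U2 = U - U1
print(U1, U2)            # close to 2.0 and 6.0
print(U1 / C1, U2 / C2)  # the two temperatures, which should nearly coincide
```

The energy splits in proportion to the heat capacities, so the lump with the larger heat capacity holds more energy at the common temperature.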

Example

Consider two ideal gases which can exchange volume and energy. We will not do this example out as formally as the previous example; we will just give an informal description of the endo and exo states.

An endostate consists of an assignment of energy, volume, and number of atoms to each gas, using variables (U^1, V^1, N^1, U^2, V^2, N^2). The exostates consist of the total energy, total volume, and individual number of atoms for each gas, using variables (U,V,N^1_{\mathrm{exo}}, N^2_{\mathrm{exo}}). The compatibility condition is

U^{1} + U^{2} = U

V^{1} + V^{2} = V

N^{1} = N^{1}_{\mathrm{exo}}

N^{2} = N^{2}_{\mathrm{exo}}

That is, the individual energies and volumes of the endostate must add to get the total energies and volumes of the exostate, and the particle numbers of the endostate are simply fixed in place by the exostate.

Two ideal gases allowed to exchange energy and volume — Note that we have controls for the number of molecules in each chamber, but we only have controls for the *total* energy and the *total* volume

Just as equilibrium in the lumps of metal example corresponded to equalizing temperature, equilibrium here corresponds to equalizing temperature and pressure (which you should recall are related to the partial derivatives of entropy with respect to energy U and volume V).
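The informal claim can be spelled out with the same calculation as for the lumps of metal. The sketch below assumes the maximum is attained at an interior point where both entropies are differentiable; superscripts index the two gases, and T^{i}, P^{i} are defined from the partial derivatives of S^{i} as in the ideal gas example.

```latex
\begin{aligned}
0 &= \frac{\partial}{\partial U^{1}} \left[ S^{1}(U^{1}, V^{1}, N^{1}) + S^{2}(U - U^{1}, V - V^{1}, N^{2}) \right]
   = \frac{1}{T^{1}} - \frac{1}{T^{2}} \\
0 &= \frac{\partial}{\partial V^{1}} \left[ S^{1}(U^{1}, V^{1}, N^{1}) + S^{2}(U - U^{1}, V - V^{1}, N^{2}) \right]
   = \frac{P^{1}}{T^{1}} - \frac{P^{2}}{T^{2}}
\end{aligned}
```

The first line gives T^{1} = T^{2}, and substituting that into the second gives P^{1} = P^{2}. The particle numbers are fixed by the exostate, so no condition on \mu arises.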

For our final axiom, we will first give some physical arguments for a couple of propositions. These arguments are typical of physics textbooks and are not rigorous, relying on our physical intuition for what “equilibrium” means. Their purpose is to motivate Axiom 4, by connecting it to our intuitions about the physical world.

Proposition

Entropy of an ideal gas is positively homogeneous of degree 1, that is

S(\lambda U, \lambda V, \lambda N) = \lambda S(U,V,N)

for all positive \lambda.

Proof. To show this, consider an ideal gas in a box split in half by an imaginary wall. The exostates that we will consider are the standard (U,V,N), and the endostates compatible with (U,V,N) are all (U^{1}, V^{1}, N^{1}), (U^{2}, V^{2}, N^{2}) such that

U^{1} + U^{2} = U

N^{1} + N^{2} = N

V^{1} = \frac{1}{2}V

V^{2} = \frac{1}{2}V

An ideal gas split in two by an imaginary wall — The wall is only imaginary: particles and energy can flow freely through it

Our physical intuition tells us that equilibrium happens when U^{1} = U^{2} = \frac{1}{2} U and N^{1} = N^{2} = \frac{1}{2} N. That is, this is the endostate that maximizes the entropy. Thus, by the definition of exostate entropy as the maximum over compatible endostate entropies,

S(U, V, N) = S(\frac{1}{2} U, \frac{1}{2} V, \frac{1}{2} N) + S(\frac{1}{2} U, \frac{1}{2} V, \frac{1}{2} N) = 2 S(\frac{1}{2} U, \frac{1}{2} V, \frac{1}{2} N)

and thus \frac{1}{2} S(U,V,N) = S(\frac{1}{2} U, \frac{1}{2} V, \frac{1}{2} N).

Splitting the box into n equal parts similarly shows that \frac{1}{n} S(U,V,N) = S(\frac{1}{n} U, \frac{1}{n} V, \frac{1}{n} N), and then some algebra shows that

S(\lambda U, \lambda V, \lambda N) = \lambda S(U,V,N)

for all positive, rational \lambda. Finally, invoking continuity shows it for all \lambda.
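The “some algebra” step can be filled in as follows, writing X = (U,V,N) and using only the rule S(\frac{1}{n} X) = \frac{1}{n} S(X), which holds for every state X and every positive integer n. Here m and n are positive integers, so \lambda = \frac{m}{n} ranges over the positive rationals.

```latex
\begin{aligned}
S(mX) &= m \, S\!\left(\tfrac{1}{m} \, mX\right) = m \, S(X)
  && \text{(the } \tfrac{1}{n} \text{ rule applied to the state } mX \text{ with } n = m\text{)} \\
S\!\left(\tfrac{m}{n} X\right) &= m \, S\!\left(\tfrac{1}{n} X\right) = \tfrac{m}{n} \, S(X)
  && \text{(the first line applied to } \tfrac{1}{n} X \text{, then the } \tfrac{1}{n} \text{ rule)}
\end{aligned}
```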

Proposition

Entropy of an ideal gas is concave, that is

S(\lambda X^{1} + (1-\lambda)X^{2}) \geq \lambda S(X^{1}) + (1-\lambda)S(X^{2})

for all \lambda \in [0,1].

As a reminder what this means geometrically, here is a graph of a concave function.

An illustration of what it means for a function to be concave — The average of S(X^1) and S(X^2) is found at the midpoint of the line that connects them

Proof. We first show this for \lambda = \frac{1}{2}.

Again, consider the ideal gas with an imaginary divider, but this time suppose that the divider can move. Let (U^{1}, V^{1}, N^{1}) and (U^{2}, V^{2}, N^{2}) be two possible states of an ideal gas. Then fix an exostate of the “ideal gas with imaginary divider” system: U = U^{1} + U^{2}, V = V^{1} + V^{2}, N = N^{1} + N^{2}. An endostate that is clearly in thermal equilibrium with this exostate is

U^{1}_{\mathrm{eq}} = U^{2}_{\mathrm{eq}} = \frac{U^{1} + U^{2}}{2}

V^{1}_{\mathrm{eq}} = V^{2}_{\mathrm{eq}} = \frac{V^{1} + V^{2}}{2}

N^{1}_{\mathrm{eq}} = N^{2}_{\mathrm{eq}} = \frac{N^{1} + N^{2}}{2}

Therefore, by definition of thermal equilibrium,

S(U^{1}, V^{1}, N^{1}) + S(U^{2}, V^{2}, N^{2}) \leq 2 S(\frac{U^{1} + U^{2}}{2},\frac{V^{1} + V^{2}}{2},\frac{N^{1} + N^{2}}{2})

\frac{S(U^{1}, V^{1}, N^{1}) + S(U^{2}, V^{2}, N^{2})}{2} \leq S(\frac{U^{1} + U^{2}}{2},\frac{V^{1} + V^{2}}{2},\frac{N^{1} + N^{2}}{2})

This is what we wanted for \lambda = \frac{1}{2}. To show it for all \lambda, we sequentially subdivide as shown in the following picture. That this works is a standard fact about continuous functions: midpoint concavity together with continuity implies concavity.

An iterative proof for general lambda — Just keep subdividing and applying the result for \lambda=\frac{1}{2}, you’ll get there eventually!
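The physical argument can be sanity-checked numerically. The Python sketch below assumes a simplified monatomic Sackur-Tetrode form for the ideal gas entropy (an assumption of this sketch, with the additive constant dropped) and tests the concavity inequality on randomly sampled pairs of states and random \lambda. A passing check is evidence, not a proof.

```python
import math
import random

R = 8.314  # gas constant; only its positivity matters here

def ideal_entropy(U, V, N):
    """Simplified Sackur-Tetrode form (assumed, additive constant dropped)."""
    return N * R * (math.log(V / N) + 1.5 * math.log(U / N))

def mix(X1, X2, lam):
    """Convex combination lam * X1 + (1 - lam) * X2 of two states."""
    return tuple(lam * a + (1 - lam) * b for a, b in zip(X1, X2))

random.seed(0)
worst_gap = math.inf
for _ in range(10_000):
    X1 = tuple(random.uniform(0.1, 10.0) for _ in range(3))
    X2 = tuple(random.uniform(0.1, 10.0) for _ in range(3))
    lam = random.random()
    mixed = ideal_entropy(*mix(X1, X2, lam))
    average = lam * ideal_entropy(*X1) + (1 - lam) * ideal_entropy(*X2)
    worst_gap = min(worst_gap, mixed - average)

# Concavity predicts mixed >= average, so the gap should never be
# negative beyond floating-point rounding.
print(worst_gap >= -1e-9)
```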

As I said before, these arguments are not really proofs, but rather appeals to our physical intuition about what “equilibrium” means. To bring this into mathematics, we take what we have just argued for as an axiom.

Axiom 4

Entropy is concave.

Entropy being concave also means that the partial derivative of entropy with respect to, for instance, energy, never increases as energy increases. This means that temperature never decreases as energy increases, which is physically true, so it’s a good thing that it’s true in our formulation.

But more fundamentally, entropy being concave means that “mixing” two states (via convex combination) always results in an entropy at least as large as the convex combination of the original entropies. Mixing raising entropy is arguably a fundamental property of anything that should be called entropy, just on the intuition that entropy measures how “mixed up” a state is.

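Note that we do not include homogeneity of entropy as an axiom. We could do this in our current framework, because we have restricted state spaces to be \mathbb{R}^{n}_{>0}, where scaling a state by a positive \lambda always gives another state. We choose not to because of the following thermostatic system, which violates Axiom 1 and whose state space is not closed under scaling.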

Example

Let X be a finite set, and consider the probability simplex

\Delta^{X} = \{p \in \mathbb{R}^{X}_{>0} \mid \sum_{i \in X}p_{i} = 1 \}

Then the Shannon entropy S_{\mathrm{sh}} is given by

S_{\mathrm{sh}}(p) = - \sum_{i \in X} p_{i} \log(p_{i})

and is a concave function.
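A short Python sketch illustrates the concavity of Shannon entropy on one pair of distributions. The distributions p and q are arbitrary choices for illustration.

```python
import math

def shannon_entropy(p):
    """S_sh(p) = -sum_i p_i log(p_i), for a distribution with all p_i > 0."""
    return -sum(pi * math.log(pi) for pi in p)

p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
lam = 0.5
mixture = [lam * a + (1 - lam) * b for a, b in zip(p, q)]

lhs = shannon_entropy(mixture)
rhs = lam * shannon_entropy(p) + (1 - lam) * shannon_entropy(q)
print(lhs >= rhs)  # True: the mixture has at least the averaged entropy

uniform = [1 / 3] * 3
print(shannon_entropy(uniform), math.log(3))  # the uniform distribution attains log(3)
```

Note that scaling p by \lambda \neq 1 leaves the simplex, which is why homogeneity is not a sensible requirement here.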

One of the exciting features of the formulation of thermostatics that I will present in the next section is that it is flexible enough to capture the previous example, and talk about how it relates to models like the classical ideal gas given by (U,V,N) coordinates, which means that this formalism edges in on the territory of statistical mechanics.

3 The Categorical Perspective

The categorical perspective on thermostatics repackages the four axioms listed above into a more elegant and rigorous framework. The axioms don’t quite map cleanly onto the parts of the categorical perspective; they rather come together and then come apart in a different organization. However, one can certainly see how each axiom comes into the framework in its own way.

We start with the category \mathsf{ConvRel} which has

as objects convex spaces (i.e., convex subsets of vector spaces)
as morphisms convex relations. A convex relation between \mathbb{X} and \mathbb{Y} is simply a convex subset of \mathbb{X} \times \mathbb{Y}.

In the language of thermostatics, an object of \mathsf{ConvRel} is a state space, and a morphism R \subseteq \mathbb{X} \times \mathbb{Y} is a way of seeing \mathbb{X} as the endostates of a system, and \mathbb{Y} as the exostates. R is the compatibility relation; (X,Y) \in R if X is compatible with Y.

We then make a functor \mathrm{Ent} \colon \mathsf{ConvRel} \to \mathsf{Set}, which sends a state space \mathbb{X} to \mathrm{Ent}(\mathbb{X}) = \{ S \colon \mathbb{X} \to \bar{\mathbb{R}} = \mathbb{R} \cup \{ -\infty, +\infty \} \mid S \text{ is concave}\}. That is, it sends a state space to the set of compatible entropy functions. Note that we have not required these entropy functions to be differentiable, but we have required them to be concave. Moreover, entropy can be positive or negative infinity; this is necessary because the supremum of an empty set is -\infty and the supremum of an unbounded set is +\infty.

On morphisms, we must take a convex relation R \subseteq \mathbb{X} \times \mathbb{Y} (which we view as an endostate/exostate compatibility relation) to a function \mathrm{Ent}(R) \colon \mathrm{Ent}(\mathbb{X}) \to \mathrm{Ent}(\mathbb{Y}). To define \mathrm{Ent}(R), we must take an entropy function on \mathbb{X} and produce an entropy function on \mathbb{Y}. Axiom 3 tells us exactly how to do this: given an entropy function S^{\mathbb{X}} \colon \mathbb{X} \to \bar{\mathbb{R}}, we produce an entropy function S^{\mathbb{Y}} \colon \mathbb{Y} \to \bar{\mathbb{R}} by

S^{\mathbb{Y}}(Y) = \sup_{(X,Y) \in R} S^{\mathbb{X}}(X)

In words, the entropy function on \mathbb{Y} sends an exostate to the supremum of the entropies of compatible endostates. This new entropy function S^{\mathbb{Y}} is also concave because R is a convex relation (which is not trivial to show, but also not too hard).
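Calling \mathrm{Ent} a functor means that pushing entropy forward along a composite relation agrees with pushing forward in two steps. The Python sketch below checks this on a toy example. It uses finite sets in place of convex spaces, so convexity and concavity are not modeled at all; the names `push` and `compose` and the sample data are hypothetical.

```python
import math

def push(R, S):
    """Ent(R)(S): send Y to the sup of S(X) over all (X, Y) in R.

    R is a finite set of pairs; S is a dict from endostates to entropies.
    Exostates with no compatible endostate get -inf.
    """
    def S_exo(Y):
        return max((S[X] for (X, Yp) in R if Yp == Y), default=-math.inf)
    return S_exo

def compose(R, Q):
    """Relational composite: (X, Z) whenever (X, Y) in R and (Y, Z) in Q."""
    return {(X, Z) for (X, Y) in R for (Yp, Z) in Q if Y == Yp}

S = {"a": 1.0, "b": 3.0, "c": 2.0}
R = {("a", 0), ("b", 0), ("c", 1)}
Q = {(0, "u"), (1, "u"), (1, "v")}

S_Y = push(R, S)                                # entropy on the middle space
via_Y = push(Q, {Y: S_Y(Y) for Y in (0, 1)})    # two steps
direct = push(compose(R, Q), S)                 # one step along the composite

# Each pair should agree, including the -inf for the unreachable state "w".
print([(via_Y(Z), direct(Z)) for Z in ("u", "v", "w")])
```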

So far, we have covered axioms 1, 3, and 4. The remaining axiom, axiom 2, allows us to take an entropy function on \mathbb{X}^1 and an entropy function on \mathbb{X}^2 and construct an entropy function on \mathbb{X}^{1} \times \mathbb{X}^{2}. This is simply expressed as a function in \mathsf{Set}

\kappa_{\mathbb{X}^{1},\mathbb{X}^{2}} \colon \mathrm{Ent}(\mathbb{X}^{1}) \times \mathrm{Ent}(\mathbb{X}^{2}) \to \mathrm{Ent}(\mathbb{X}^{1} \times \mathbb{X}^{2})

which is natural in \mathbb{X}^{1} and \mathbb{X}^{2}. That is, \kappa is a natural transformation between two functors \mathsf{ConvRel} \times \mathsf{ConvRel} \to \mathsf{Set}, the first sending (\mathbb{X}^{1}, \mathbb{X}^{2}) to \mathrm{Ent}(\mathbb{X}^{1}) \times \mathrm{Ent}(\mathbb{X}^{2}), and the second sending (\mathbb{X}^{1}, \mathbb{X}^{2}) to \mathrm{Ent}(\mathbb{X}^{1} \times \mathbb{X}^{2}).

Categorically speaking, all of this structure has a name. (\mathrm{Ent}, \kappa) is known as a lax monoidal functor between the monoidal categories (\mathsf{ConvRel},\times) and (\mathsf{Set},\times).

So, in the grand tradition of category theorists boiling down something very complex into a tightly wrapped package, we might say “thermostatics is just the study of a particular lax monoidal functor from (\mathsf{ConvRel}, \times) to (\mathsf{Set},\times)”. And there is a way of repacking this definition into yet another package, using operads and operad algebras, but we will not delve into this here.

The reader who wishes to get more of an intuition for this construction is encouraged to go back to the examples of ideal gases and lumps of metal in the previous section and reformulate them in a more categorical context.

The last example I will give of this formalism, which I mentioned at the end of the previous section, I will give only briefly; you must wait for a subsequent blog post or paper for it to be realized in full detail.

Example

Fix a finite set \mathcal{X}, and a function H \colon \mathcal{X} \to \mathbb{R}. Then H induces a function \mathbb{E}_{-}[H] \colon \mathcal{P}(\mathcal{X}) \to \mathbb{R}, sending a probability distribution p on \mathcal{X} to its expected value \mathbb{E}_{p}[H] = \sum_{i \in \mathcal{X}} H(i) p_{i}. Let the compatibility relation R \subset \mathcal{P}(\mathcal{X}) \times \mathbb{R} be given by (p, h) \in R iff \mathbb{E}_{p}[H] = h.

Note that \mathcal{P}(\mathcal{X}) can be identified with the simplex \Delta^{\mathcal{X}}, and it inherits its convex structure from this. We then put an entropy function on this convex space: Shannon entropy!

S_{\mathrm{sh}}(p) = - \sum_{i \in \mathcal{X}} p_{i} \log p_{i}

Now, \mathrm{Ent} allows us to take S_{\mathrm{sh}} \in \mathrm{Ent}(\mathcal{P}(\mathcal{X})) and construct an entropy measure \mathrm{Ent}(R)(S_{\mathrm{sh}}). This sends a real number h to the maximal Shannon entropy of a distribution p such that \mathbb{E}_{p}[H] = h.

According to a well-known result in statistics, there is a fixed form for this distribution! It is

p_{i} = \frac{\mathrm{e}^{\beta H(i)}}{Z_{\beta}}

where \beta is a parameter we can tweak to make \mathbb{E}_{p}[H] = h, and

Z_{\beta} = \sum_{i \in \mathcal{X}} e^{\beta H(i)}

is a normalizing factor called the partition function. This type of probability distribution is called a Gibbs distribution, and it is very important in many areas of statistics, including… statistical mechanics!
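To see the maximum entropy property in action, here is a hedged Python sketch on a three-element set. The values of H, the target h, and the comparison distribution q are hypothetical, and \beta is found by bisection, which works because the expected value of H under the Gibbs distribution increases with \beta.

```python
import math

def gibbs(H, beta):
    """Gibbs distribution p_i = exp(beta * H_i) / Z_beta on a finite set."""
    weights = [math.exp(beta * h) for h in H]
    Z = sum(weights)  # the partition function Z_beta
    return [w / Z for w in weights]

def expectation(p, H):
    """Expected value E_p[H]."""
    return sum(pi * h for pi, h in zip(p, H))

def shannon_entropy(p):
    """Shannon entropy, skipping zero-probability outcomes."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def solve_beta(H, h, lo=-50.0, hi=50.0, steps=200):
    """Bisection for beta with E_p[H] = h; assumes min(H) < h < max(H)."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if expectation(gibbs(H, mid), H) < h:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

H = [0.0, 1.0, 2.0]  # hypothetical values of H on a three-element set
h = 0.6              # target expected value
beta = solve_beta(H, h)
p = gibbs(H, beta)

# Another distribution with the same expected value, for comparison.
q = [0.5, 0.4, 0.1]
print(expectation(p, H), expectation(q, H))
print(shannon_entropy(p) >= shannon_entropy(q))
```

Both distributions are compatible with the exostate h, but the Gibbs distribution has the larger Shannon entropy, so it is the one that determines \mathrm{Ent}(R)(S_{\mathrm{sh}})(h).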

It was very surprising to me that this framework which I came up with to do macroscale thermostatics also seems to have inherent connections to statistical mechanics; it seems like I must be on to something.

Exactly what I am on to will have to wait until the future, however.

If you find this kind of thing interesting, please reach out to me; my email is “owen at topos dot institute”.

References

[1] W.M. Haddad, A dynamical systems theory of thermodynamics, Princeton University Press, Princeton, NJ, 2019.

[2] S. Friedli, Y. Velenik, Statistical mechanics of lattice systems: A concrete mathematical introduction, Cambridge University Press, Cambridge, United Kingdom; New York, NY, 2017.