Yes, more causality

Next post on fairness in machine learning, I promise

Feb 17, 2024

It all starts from Reichenbach’s principle, really: if variable X is statistically associated with variable Y, then either X causes Y, or Y causes X, or something else causes both. I have seen many attempts at refuting it, often with contrived examples. But I have not seen anywhere an example that, in hindsight, seems obvious: inertial motion.

Let X_i be the positions of particle A taken at times t_i. Let Y_i be the positions of particle B, also taken at times t_i. Make sure the two particles never get near each other, so that they do not interact. We know that their positions can be written as X_i = X₀ + (t_i – t₀)V and Y_i = Y₀ + (t_i – t₀)U, where V and U are the constant velocities of A and B. Take a projection along an arbitrary direction b if you do not like vectors, and you obtain two time series, X_i·b = X₀·b + (t_i – t₀)V·b and Y_i·b = Y₀·b + (t_i – t₀)U·b, that are perfectly correlated (or perfectly anti-correlated if V·b and U·b have opposite signs; both need to be nonzero).
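For the skeptical reader, here is a minimal sketch of the claim above in plain Python. The starting points, velocities and projection direction are made-up numbers, chosen only so that the two particles stay far apart, and `pearson` and `projected_positions` are helper names introduced here for illustration.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))

def projected_positions(start, velocity, b, times, t0=0.0):
    """Projection along b of start + (t - t0) * velocity, for each t in times."""
    return [dot(start, b) + (t - t0) * dot(velocity, b) for t in times]

# Illustrative values only (assumptions, not taken from any real system).
times = [0.5 * i for i in range(50)]
b = (1.0, 0.0, 0.0)                           # arbitrary projection direction
x0, v = (0.0, 0.0, 0.0), (1.0, 2.0, 0.0)      # particle A
y0, u = (1000.0, 500.0, 0.0), (3.0, -1.0, 0.5)  # particle B, far away from A

xs = projected_positions(x0, v, b, times)
ys = projected_positions(y0, u, b, times)
r = pearson(xs, ys)
print(r)
```

With these numbers both projected velocities are positive, so the coefficient comes out at 1 up to floating-point error. Flipping the sign of one projected velocity gives -1 instead, and a zero projected velocity makes the coefficient undefined.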

Yet neither the position of particle A causes that of particle B, nor vice versa: they are just happily going about their own inertial business. Similarly, I would be hard pressed to find a common cause of their motion. What gives?

Superficially, one might take our equations describing inertial motion to be part of a structural causal model, perhaps adding noise[1] as needed: X = f(t, δ) and Y = g(t, ε). This way time t would be causing both X and Y: X ← t → Y. But this would work just as well if, instead of time, we used the position of a third particle, also in inertial motion. Arguably, if we were to do so, we would be using that particle as a clock. What is odd is that we can imagine that particle to be on a trajectory far removed from our two original particles, so that no interaction takes place. The claim that the position of that particle (or time, if you will) causes the positions of the other two sounds weird, especially since nothing special singles out that particle from the other two. For time to work as a common cause, it has to be more natural for us to write X = f(t, δ) than t = h(X, ζ), but this is not the case here. Maybe defining time as a suitable function of the positions of other particles would help?
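To make the clock argument concrete, here is a short worked version in the noise-free case. The symbols Z and W, for the position and constant velocity of the third particle, are introduced here for illustration, and both projected velocities are assumed to be nonzero.

```latex
% Third particle used as a clock (Z, W are illustrative symbols).
\begin{align}
Z_i \cdot b &= Z_0 \cdot b + (t_i - t_0)\, W \cdot b
  \quad\Longrightarrow\quad
  t_i - t_0 = \frac{Z_i \cdot b - Z_0 \cdot b}{W \cdot b}, \\
X_i \cdot b &= X_0 \cdot b + \frac{V \cdot b}{W \cdot b}\,\bigl(Z_i \cdot b - Z_0 \cdot b\bigr), \\
Z_i \cdot b &= Z_0 \cdot b + \frac{W \cdot b}{V \cdot b}\,\bigl(X_i \cdot b - X_0 \cdot b\bigr).
\end{align}
```

The last two lines have exactly the same form with the roles of the particles swapped, which is the sense in which nothing singles out the clock particle: the equations alone do not say which position is the "cause".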

*Non-inertial reference frame. Credit: Wikimedia Commons.*

Perhaps unrelated, but worth noting, is that another way to obtain a statistical dependence between two otherwise independent variables that does not involve a common cause is collider bias, that is, conditioning on a common effect. Imagine that isolated particles can be animated, in fact, by any sort of motion: X and Y would be, in general, completely independent. However, for some reason, we end up observing only those instances of motion where X and Y perfectly correlate. Perhaps all other combinations end up killing us: only a universe where isolated particles undergo inertial motion is survivable[2] – a sort of anthropic principle. This thought is terrifying, as it implies that inertia could end at any moment, for no reason at all, having endured until now by mere coincidence – or should I say synchronicity?
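Collider bias is easy to see in a toy simulation. The sketch below is only an illustration under made-up assumptions: X and Y are independent standard normal draws, and the "survival" rule (keep a pair only if X and Y are within 0.1 of each other) is a hypothetical stand-in for whatever the common effect might be.

```python
import math
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

rng = random.Random(42)  # fixed seed so the run is reproducible

# X and Y are drawn independently: no causal link, no common cause.
pairs = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(20000)]
r_all = pearson(*zip(*pairs))

# Hypothetical selection rule: we only get to observe pairs that nearly agree.
# Conditioning on this common effect of X and Y is what induces the dependence.
survivors = [(x, y) for x, y in pairs if abs(x - y) < 0.1]
r_sel = pearson(*zip(*survivors))

print(len(survivors), r_all, r_sel)
```

Over all pairs the correlation is close to zero, while among the survivors it is close to one, even though nothing about how X and Y were generated has changed.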

[1] You don’t add noise lightly though. Where does the noise come from? Ultimately it must come from interactions with other particles. Structural causal models typically assume that noise terms do not affect more than one variable at a time. This makes our model Markovian (Causality, p. 44). Whether a Markovian structural causal model can be used to describe a physical system seems to me to depend on the nature of the latter. My guess is that the system needs to be open (something something heat bath, canonical ensemble).

[2] Even in a universe that admits inertial frames, finding oneself in a non-inertial frame of reference where isolated particles are observed to be accelerating wildly and unpredictably may not be very survivable.

Mario’s Substack
