Theoretical physics, simplicity. Surely the two do not go together. Theoretical physics has been the archetypal example of the complicated since its invention. So what did Frank Wilczek (b. 1951) mean by the statement quoted in the title? It is the scientist’s trick of taking a well-defined word, such as simplicity, and giving it a technical meaning. In this case, the meaning comes from algorithmic information theory. That theory defines the complexity (Kolmogorov complexity) of a string of numbers as the length of the shortest computer program that reproduces it. Simplicity, as used in the title, is the opposite of this complexity. Science, not just theoretical physics, is driven in part, but only in part, by the quest for this simplicity.
How is that, you might ask? This is best described by Greg Chaitin (b. 1947), a founder of algorithmic information theory. To quote:
“This idea of program-size complexity is also connected with the philosophy of the scientific method. You’ve heard of Occam’s razor, of the idea that the simplest theory is best? Well, what’s a theory? It’s a computer program for predicting observations. And the idea that the simplest theory is best translates into saying that a concise computer program is the best theory. What if there is no concise theory, what if the most concise program or the best theory for reproducing a given set of experimental data is the same size as the data? Then the theory is no good, it’s cooked up, and the data is incomprehensible, it’s random. In that case the theory isn’t doing a useful job. A theory is good to the extent that it compresses the data into a much smaller set of theoretical assumptions. The greater the compression, the better! That’s the idea…”
In many ways this is quite nice: the best theory is the one that compresses the most empirical information into the shortest description or computer program. It provides an algorithmic method to decide which of two competing theories is better (but not an algorithm for generating the best theory). With this definition of best, a computer could do science: generate programs to describe data and check which is the shortest. Note, however, that with this definition it is not clear that Copernicus was better than Ptolemy. The two approaches to planetary motion had a similar number of parameters and similar accuracy.
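Chaitin's criterion can be tried on a toy data set. The following is a minimal Python sketch under stated assumptions, not a definitive method: the data are made up (a linear law plus small integer noise), the function names are hypothetical, and the zlib-compressed size is used only as a crude upper bound on Kolmogorov complexity, which cannot be computed exactly. The cost of the decoding program itself is ignored.

```python
import random
import struct
import zlib


def description_length(payload):
    """Compressed size in bytes; only an upper-bound proxy for Kolmogorov
    complexity, which cannot be computed exactly."""
    return len(zlib.compress(payload, 9))


def make_observations(n=2000, seed=0):
    """Made-up data: the linear law y = 3*i + 7 plus noise of -1, 0 or +1."""
    rng = random.Random(seed)
    return [3 * i + 7 + rng.choice((-1, 0, 1)) for i in range(n)]


def encode_raw(ys):
    """No theory: store every observation as a 4-byte big-endian integer."""
    return struct.pack(">%di" % len(ys), *ys)


def encode_with_theory(ys, slope, intercept):
    """Theory: store two parameters, plus one byte per residual."""
    params = struct.pack(">ii", slope, intercept)
    # Shift residuals by 128 so each fits in an unsigned byte
    # (assumes every residual lies between -128 and 127).
    residuals = bytes(y - (slope * i + intercept) + 128 for i, y in enumerate(ys))
    return params, residuals


def decode_with_theory(params, residuals):
    """Rebuild the observations exactly, showing the encoding is lossless."""
    slope, intercept = struct.unpack(">ii", params)
    return [slope * i + intercept + r - 128 for i, r in enumerate(residuals)]


observations = make_observations()
raw_cost = description_length(encode_raw(observations))
params, residuals = encode_with_theory(observations, slope=3, intercept=7)
theory_cost = len(params) + description_length(residuals)

print("raw data:", raw_cost, "bytes")
print("theory + residuals:", theory_cost, "bytes")
```

In this sketch the theory wins if its parameters plus the compressed residuals take fewer bytes than the compressed raw data. Two rival theories could be compared the same way, by their total cost.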
There are many interesting aspects of this approach. Consider compressibility and quantum mechanics. The uncertainty principle and the probabilistic nature of quantum mechanics put limits on the extent to which empirical data can be compressed. From this point of view, that is the main difference between classical mechanics and quantum mechanics. Classically, given the initial conditions and the laws of motion, the empirical data is compressible to just that input. In quantum mechanics, it is not. The time at which each individual atom in a collection of radioactive atoms decays is unpredictable, and the measured results are largely incompressible. Interpretations of quantum mechanics may make the theory deterministic, but they cannot make the empirical data more compressible.
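The contrast between a lawful record and a random one can be made concrete with an off-the-shelf compressor. This is a rough Python sketch, with assumptions stated plainly: the "classical" record is one coordinate of an idealized circular orbit, and bytes from `os.urandom` merely stand in for decay measurements (they come from the operating system's entropy source, not from a radioactive sample, and real decay times follow a known distribution, so they would compress somewhat). The function names are hypothetical.

```python
import math
import os
import zlib


def compression_ratio(payload):
    """Compressed size divided by original size (smaller = more compressible)."""
    return len(zlib.compress(payload, 9)) / len(payload)


def classical_record(n=10000, period=100):
    """One coordinate of a body on a circular orbit, sampled at equal time
    steps and quantized to a single byte per sample."""
    return bytes(
        round(127.5 + 127.5 * math.cos(2 * math.pi * (i % period) / period))
        for i in range(n)
    )


def random_record(n=10000):
    """Stand-in for decay data: bytes from the OS entropy source."""
    return os.urandom(n)


classical_ratio = compression_ratio(classical_record())
random_ratio = compression_ratio(random_record())

print("classical record ratio:", round(classical_ratio, 3))
print("random record ratio:", round(random_ratio, 3))
```

The periodic orbit shrinks to a small fraction of its original size, while the random record does not shrink at all. A general-purpose compressor only gives an upper bound on the true program-size complexity, so this illustrates the idea rather than measuring it.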
Compressibility highlights a significant property of initial conditions. While the data describing the motion of the planets can be compressed using Newton’s laws of motion and gravity, the initial conditions that started the planets on their orbits cannot be. This incompressibility tends to be a characteristic of initial conditions. Even the initial conditions of the universe, as reflected in the cosmic microwave background, have a large random incompressible component – the cosmic variance. If it weren’t for quantum uncertainty, we could probably take the lack of compressibility as a definition of initial conditions. For the universe, the two coincide, since the lack of compressibility in the initial conditions is due to quantum fluctuations, but that is not always the case.
The algorithmic information approach makes Occam’s razor, the idea that one should minimize assumptions, basic to science. If one considers each character in a minimal computer program to be a separate assumption, then the shortest program does indeed have the fewest assumptions. You might object that some of the characters in a program can be predicted from other characters. However, if that is true, the program can probably be made shorter. This is all a bit counterintuitive, since one generally does not take such a fine-grained approach to what one considers an assumption.
The algorithmic information approach to science, however, does have a major shortcoming. This definition of the best theory leaves out the importance of predictions. A good model must not only compress known data, it must predict new results that are not predicted by competing models. Hence, as noted in the introduction, simplicity is only part of the story.
The idea of reducing science to just a collection of computer programs is rather frightening. Science is about more than computer programs. It is, and should be, a human endeavour. As people, we want models of how the universe works that humans, not just computers, can comprehend and share with others. A collection of bits on a computer drive does not do this.
To receive a notice of future posts follow me on Twitter: @musquod.
From “This Explains Everything”, ed. John Brockman, Harper Perennial, New York, 2013.
 Also known as descriptive complexity, Kolmogorov–Chaitin complexity, algorithmic entropy, or program-size complexity.
 In this regard, I have a sinking feeling that I am fighting a rearguard action against the inevitable.