Not ‘Just Automated Statistics’. Like us, AI is wrong but useful.
Responding to and making sense of the frequent objection in U.K. Government that AI is just automated statistics, just maths.
‘Drop the self-harming rhetoric from authority figures.’ This was one of Rob’s and my recommendations in “Don’t Blink”. I wanted to unpack it a little further.
Throughout the Integrated Review of 2020, more than one senior official would dismiss arguments for AI’s potential with the claim that ‘AI is just automated statistics’ or ‘AI is just maths, not magic’. This is true, but:
(1) human cognition too is reducible to mathematics;
(2) AI is no more just statistics or maths than thermonuclear weapons are just physics, or a tiger is just biology.[1]
Dismissing AI as just maths, just statistics, just matrix multiplication should be deeply embarrassing for any scientist. All of science is, in the end, ‘just maths’. So is our cognition. The problem is that for scientists in some disciplines, the limitations of their mathematical or statistical models can blind them to this fundamental truth.
Wrong But Useful
‘All models are wrong, but some are useful.’ So wrote the statistician George Box, and the quote has been repeated so often it has reached the level of cliché among experts. But I think we have under-extrapolated from this key insight, and that under-extrapolation is what leads scientists I admire to dismiss AI with the ‘just maths/just stats’ wave of the hand.
Scientists know their models are wrong, but useful. Many fields, for example epidemiology and social science, employ hugely complex statistical models to try to make sense of the world. The scientists who develop and employ them know that without their models we would be much the poorer and more vulnerable. Take the epidemiological models for Covid, for example. They helped inform public policy that saved many lives.
But no-one, least of all the scientists who built or used them, pretended they were perfect. The more reliant you are on a model for giving advice in the real world, the more aware you become that it is wrong but useful. Better than guessing, but deeply imperfect.
This sense of the unreliability of maths and statistics in understanding the world must therefore be even more acute for scientists who show the courage to take up the challenge of working in public policy, like our Chief Scientific Advisers (CSAs). If AI too is just a model, how useful can it be? Isn’t it just maths, just statistics?
Evolution, Assembly, Wolfram
There would be little excuse for scientists in any field to dismiss AI as ‘just maths’ or ‘just automated statistics’ if they took a little time to think beyond their expert knowledge of how difficult it is to model the complexity of reality statistically or mathematically, and to consider where that complexity came from.
For centuries we assumed humans were so uniquely complicated and special that creationism was the only explanation contemplated, later joined by intelligent design. But of course humans are just animals, evolved from much simpler creatures – go far enough back, and that means single-celled organisms. From simple, easily modelled maths to ever more complex organisms – but still reducible to just maths.
What if we go back further? Even those simple, early organisms were complex compared with their chemical and atomic component parts. They had to come from somewhere – but where? We now understand, or at least confidently hypothesise in Lee Cronin’s Assembly Theory[2], that we get from the immutable laws of physics to the mutable complexity of biological life via the evolution of first simple, and gradually more complex, chemical compounds. Another way to say this is that incredibly simple combinations later yield the vast complexity of life, including human life and our intelligence. From simple structures, easily reducible to maths, to complex structures and biology – conceptually still reducible to maths.
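The idea is easiest to see in a toy. The sketch below is my own illustration of the assembly index on strings rather than molecules – a stand-in for Cronin’s actual formalism, with example strings chosen purely for display: the index is the minimal number of joining steps needed to build a target, where anything already built can be reused.

```python
def assembly_index(target: str) -> int:
    """Toy assembly index for strings: the minimal number of join
    operations needed to build `target` from single characters,
    where every intermediate product can be reused.
    (An illustration of the concept, not Cronin's molecular formalism.)"""
    # Only contiguous substrings of the target can contribute to a
    # minimal construction, so prune everything else.
    useful = {target[i:j] for i in range(len(target))
              for j in range(i + 1, len(target) + 1)}

    def search(pool: frozenset, depth: int) -> bool:
        """Can we reach `target` from `pool` in at most `depth` joins?"""
        if target in pool:
            return True
        if depth == 0:
            return False
        candidates = {a + b for a in pool for b in pool
                      if a + b in useful and a + b not in pool}
        return any(search(pool | {s}, depth - 1) for s in candidates)

    basic_blocks = frozenset(target)  # the individual characters
    depth = 0
    while not search(basic_blocks, depth):  # iterative deepening
        depth += 1
    return depth

# Repetition is cheap to assemble: eight characters need only three
# joins, because 'AB' and 'ABAB' are each built once and then reused.
print(assembly_index("ABABABAB"))  # -> 3
print(assembly_index("ABCD"))      # -> 3 (four distinct characters, no reuse)
```

The point of the contrast: reuse of earlier sub-assemblies makes an eight-character string as cheap to build as a four-character one, which is exactly how simple combinations compound into complexity.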
This idea has also been explained, at exhausting length, in Wolfram’s A New Kind of Science, a book that required the combined motivation of two smart reading companions, a determination not to give up, and the ability to channel frustration at the way it is written into a focus on getting to the end of each chapter. Nevertheless, Wolfram introduces the profound and important idea that all of life, everything, must have evolved from incredibly simple patterns. It reverses our approach to science, suggesting we are better off trying to explain the world from this simplicity up, rather than down and in from the complexity of the world as viewed from our perspective. Wolfram goes further, suggesting that we cannot but see the world as complex, because we ourselves are. It takes effort to force ourselves to acknowledge that we too are the complexity that results from very simple patterns, just as Darwinian evolution and Assembly Theory show. Another way to see this is that everything evolved from simple mathematical combinations – 1+1+1 and so on – which quickly generate vast complexity.
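Wolfram’s own favourite demonstration is the elementary cellular automaton. The sketch below runs his Rule 30 – a rule simple enough to state in one line of arithmetic, whose output is complex enough that it has been used as a random-number generator. The grid width and step count here are arbitrary display choices of mine.

```python
def step(cells, rule=30):
    """One step of an elementary cellular automaton: each cell's new
    value is looked up from the rule number, using its 3-bit
    neighbourhood (left, self, right) as the bit index."""
    n = len(cells)
    return [(rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
            for i in range(n)]

# A single live cell in a blank row: the simplest possible start.
width, steps = 81, 40
cells = [0] * width
cells[width // 2] = 1

for _ in range(steps):
    print("".join("#" if c else "." for c in cells))
    cells = step(cells)
```

Run it and the left edge of the triangle is orderly while the interior never settles into a pattern: vast complexity from one multiplication and a bit-lookup.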
Human Cognition: Wrong But Useful
Neurons communicate through electrical impulses known as action potentials – a process that lends itself to mathematical modelling. Donald Hebb’s postulate, often summarised as ‘neurons that fire together wire together’, describes how electrical pulses, the zeros and ones of the brain, build the synaptic connections that are the substance of thought: quantifiable at the lowest level as adjustments in mathematical equations. The complexity builds quickly, and so the models we build of this complexity are wrong but useful. Wolfram shows the underlying principle; Assembly Theory and evolution show it operating in the real world. Cognition, too, is just maths.
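Here is a minimal sketch of what ‘quantifiable at the lowest level’ means: Hebb’s postulate as one line of arithmetic. The network sizes, learning rate and toy activity pattern are my own illustrative choices; real synaptic models add decay, bounds and spike timing, but the core is this single multiplication.

```python
import numpy as np

def hebbian_update(w, pre, post, lr=0.01):
    """Hebb's postulate as arithmetic: the weight between two neurons
    grows in proportion to how often they are active together."""
    return w + lr * np.outer(post, pre)

rng = np.random.default_rng(0)
w = np.zeros((3, 5))  # 5 input neurons -> 3 output neurons

for _ in range(1000):
    pre = (rng.random(5) < 0.5).astype(float)  # which inputs fired
    post = pre[:3]                             # outputs that echo the first three inputs
    w = hebbian_update(w, pre, post)

# Pairs that genuinely 'fire together' (w[i, i]) end up with roughly
# twice the weight of pairs that merely coincide by chance.
print(w)
```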
And our mental models too are wrong but useful. We never see the world as it is; we only see deeply flawed approximations. This profound idea is easily understood: all magic tricks rely on exploiting flaws in your approximated perception, with sleight of hand and visual illusion. Less obvious is that our perception is never in the present moment. It is, despite the exhortations of woo-woo spiritualists, never possible for us to live in the moment. We are prediction machines, constantly anticipating the future, seeing things as our brain anticipates they will be a fraction of a second ahead, so that we don’t live constantly in lag. But these models of the future, while useful, are also often wrong, making errors of coordination and judgement. The dropped catch, the stumble, the gasp as the rabbit appears from the hat.
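The prediction-machine point can also be made with a toy of my own devising: track a falling ball with the simplest possible mental model – assume it keeps moving as it just did. Like our perception, the model is always slightly wrong (it ignores acceleration) yet far more useful than waiting for the lagged signal.

```python
# A toy 'prediction machine': extrapolate the next position of a
# falling ball from the last two observations, assuming constant
# velocity. Wrong (gravity accelerates the ball) but useful (far
# better than assuming the ball is where we last saw it).
g, dt = 9.8, 0.1
positions = [0.5 * g * (t * dt) ** 2 for t in range(10)]  # the true fall

for prev, curr, actual in zip(positions, positions[1:], positions[2:]):
    predicted = 2 * curr - prev  # constant-velocity guess
    print(f"predicted {predicted:6.3f}  actual {actual:6.3f}  "
          f"error {actual - predicted:6.3f}")
```

The error is small, systematic and never zero – it is exactly the acceleration the model leaves out. Wrong but useful.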
Furthermore, in addition to this constant, subconscious anticipation of the future, there is the conscious, effortful, abstract ability to imagine and test alternative futures, to think ahead – if this, then that – and plan strategically. To do it, we take an informational input – an image or a puzzle, the words of a book, the advice of a friend, the sweeping panorama before us – process it in the brain (‘cognition’ is the act of processing), and translate it into a behavioural output, a response. Psychologists measure this mathematically. Neuroscientists model this mathematically. Like everything else, cognition is reducible to maths. Put another way, cognition is, like AI, ‘just automated statistics’. Wrong but useful.
This abstract reasoning may be the defining feature of human cognition. It is what made games like chess and Go such obvious arenas to train our brains for battle, for politics – we still refer to someone playing a constant game of calculation, perhaps of Machiavellian scheming or brilliant planning, as someone playing ‘4D chess’. We are trying to build models of the future that are more useful, less wrong, than our rivals’. A win in chess or Go provides definitive evidence that our model was the better one; a successful outcome in a competitive environment in the real world likewise.
This is why IBM’s Deep Blue winning at chess (1997) and DeepMind’s AlphaGo winning at Go (2016) were such shocking and profound breakthroughs. No animal could have beaten a human at these games. For the first time, there was some kind of intelligence that could build models that were less wrong and more useful than our mental models.
Of course, in response, ego-centric humans quickly moved on, suggesting that such games were never a meaningful test of human cognition as it acts in the real world. These were games that were too easily mathematically modelled, they said. True, but irrelevant: the direction of travel was clear, and these games existed precisely to train and test human cognition, which is what made them such a good proxy. Not the final test, but a Rubicon crossed. AlphaGo, Deep Blue, Kasparov’s and Lee Sedol’s cognition were, like the games, all just maths. And AI’s models were now more useful and less wrong than ours.
The Bitter Lesson
Once you accept that the foundations of life, and therefore intelligence, emerge from very simple patterns, you can see why Rich Sutton’s 2019 essay The Bitter Lesson remains perhaps the most important insight into the future of AI progress. Progress is unlikely to come, as computer scientists once imagined, from complex symbolic systems that seek to replicate – or model – the complexity we see when we observe, say, the working of the human brain, the complexity seen in cognition, or in social science. Rather, it will come from general methods that leverage computation. It is worth quoting two paragraphs of Sutton’s essay at length:
"The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin. The ultimate reason for this is Moore's law, or rather its generalization of continued exponentially falling cost per unit of computation. Most AI research has been conducted as if the computation available to the agent were constant (in which case leveraging human knowledge would be one of the only ways to improve performance) but, over a slightly longer time than a typical research project, massively more computation inevitably becomes available. Seeking an improvement that makes a difference in the shorter term, researchers seek to leverage their human knowledge of the domain, but the only thing that matters in the long run is the leveraging of computation. These two need not run counter to each other, but in practice they tend to. Time spent on one is time not spent on the other. There are psychological commitments to investment in one approach or the other. And the human-knowledge approach tends to complicate methods in ways that make them less suited to taking advantage of general methods leveraging computation. …
We have to learn the bitter lesson that building in how we think we think does not work in the long run. The bitter lesson is based on the historical observations that:
AI researchers have often tried to build knowledge into their agents,
this always helps in the short term, and is personally satisfying to the researcher, but
in the long run it plateaus and even inhibits further progress, and
breakthrough progress eventually arrives by an opposing approach based on scaling computation by search and learning.
The eventual success is tinged with bitterness, and often incompletely digested, because it is success over a favored, human-centric approach.”
Wolfram shows us that everything is built from simple patterns. Cronin and Darwin show us how the recombination of such simple patterns allowed life, and everything else, to evolve. Sutton suggests the only way to match what has evolved is to build it the same way, bottom up, with just maths. This is what ML developers at the cutting edge are doing.
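A toy version of Sutton’s observation, under assumptions of my own choosing: pit a hand-crafted model with the ‘right’ human insight baked in (fit a cubic, because the curve looks roughly cubic) against a general method that simply memorises data (nearest-neighbour averaging). The hand-crafted model’s error plateaus at its built-in bias; the general method’s keeps falling as data – a stand-in here for computation – grows.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x) + 0.5 * x  # the 'world' to be modelled
x_test = np.linspace(-2, 2, 500)

def knn_predict(x_train, y_train, x_query, k=5):
    """General method: average the k nearest observations.
    No domain knowledge, just data and computation."""
    idx = np.argsort(np.abs(x_train[None, :] - x_query[:, None]), axis=1)[:, :k]
    return y_train[idx].mean(axis=1)

for n in (30, 300, 3000):
    x = rng.uniform(-2, 2, n)
    y = f(x) + rng.normal(0, 0.1, n)  # noisy observations

    # Hand-crafted method: a cubic, chosen by 'human knowledge'.
    coeffs = np.polyfit(x, y, deg=3)
    err_cubic = np.sqrt(np.mean((np.polyval(coeffs, x_test) - f(x_test)) ** 2))

    # The cubic's error stops improving (its bias is built in);
    # k-NN's keeps falling as n grows.
    err_knn = np.sqrt(np.mean((knn_predict(x, y, x_test) - f(x_test)) ** 2))
    print(f"n={n:5d}  cubic RMSE {err_cubic:.3f}  k-NN RMSE {err_knn:.3f}")
```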
It’s those straight lines on graphs – on computation in particular, but also on algorithmic efficiency – that Rob and I point to in making the case that we should take AI, AGI and ASI much more seriously than we do. AI will not be the #1 priority in the Review if our most influential scientists can make the dismissive argument that AI is ‘just statistics’, ‘just maths’ unchallenged. I hope this post might help to stop such dismissals – and help us ensure our mental models of AI are less wrong, and more useful.
[1] https://x.com/primalpoly/status/1836867444947963984; https://x.com/nearcyan/status/1632661647226462211
[2] Assembly Theory is a framework that quantifies the complexity of objects based on their formation histories. It introduces the concept of the assembly index, which measures the minimal number of steps required to build an object from basic building blocks.