Doctor Weirdlove or: “How I Stopped Worrying and Learned to Love Skynet”
Why betting on a crewed RAF Tempest is a dangerous gamble—and how ignoring the AGI debate leaves us strategically exposed

We’re worried. On the public evidence, current UK military understanding of artificial intelligence, both present capabilities and the trajectory of its development, is dangerously inadequate. The consequences of this misunderstanding are existential: for our service personnel on the future battlefield, for our armed forces in future combat, and for humanity itself.
The danger is twofold:
First, UK Defence radically underestimates current AI capabilities and the speed of their advance. If an adversary adopts AI-enabled decision-making and autonomous capabilities, and we don’t, we lose.
Second, UK Defence is not considering the implications for humanity of Artificial General Intelligence and war. Without solving AI alignment—the challenge of ensuring advanced AI reliably follows human intent—we risk handing power over humanity’s most lethal weapons to misaligned, super-intelligent machines.
We make three arguments, each logically building on the other.
1. AI-driven decision-making and autonomy are essential for our national security.
2. Embracing AI-driven decision-making necessarily involves Defence leaders in the broader AI alignment debate.
3. Therefore, Defence leaders must actively engage in this debate, clearly articulating the "Strangelovian" dilemma we face: the paradox of urgently needing AI-driven military capabilities despite the existential risks posed by AGI.
Caveat
Before we get into that, we want to be clear on what we are and aren’t saying in this article.
First, we recognise the need for a major programme to ensure the UK can, to use the USAF phrase, ‘fly, fight and win’ in future warfare. We are not opposed to investment in this essential capability.
Second, the RAF’s decision to run the podcast series and talk openly about its next-generation fighter aircraft ‘Tempest’, the wider Future Combat Air System (FCAS) of which it is a part, and the Global Combat Air Programme (GCAP), the international collaboration that is to design, manufacture and deliver a next-generation crewed combat aircraft, is an admirable commitment to transparency.
Third, those speaking on the podcast showed courage in speaking publicly, spoke well in the difficult circumstances of being on the record in an informal environment, and are to be commended. Without such transparency, debate is almost impossible.
Fourth, the announcement of the RAF’s procurement of StormShroud, an autonomous support to crewed fighters (or ‘loyal wingman’), is a step in the right direction and is to be celebrated, but it does not change what is argued herein.
We hope to show why debate is necessary, but recognise we risk reducing the willingness to be open in the future. We hope that is not the case. In writing this post we wish to be clear that our concerns relate to the wider issues these comments signal about the Armed Forces’ understanding of Artificial Intelligence and autonomy, and are not aimed at individuals. We offer constructive criticism and genuine concern. We are seeking to play the ball, not the man.
Why pick on Tempest? Because we judge that the air environment is the domain that will be automated first, and because the long timeline of its development – 15 years before it enters service, ~45 years before it is expected to stop serving as the RAF’s frontline fighter – is well in excess of most estimates for when we reach AGI.
AI Capabilities
UK Defence currently misunderstands, and dangerously underestimates, the capability and trajectory of AI systems. A recent illustrative example: The Economist’s Shashank Joshi noted two concerning claims[1] on the RAF’s otherwise excellent podcast[2]:
1. AI today is ‘just a stochastic parrot, it is just advanced auto complete’[3]; and
2. ‘We’re prepared for the time when AGI does catch-up.’
The first claim is heard widely in UK Defence, sounds smart, but isn’t. As Shashank noted on Bluesky, it is indicative of a ‘slightly impoverished view of frontier models that you see in defence & even parts of the intelligence world’. Commenting, Professor Ken Payne suggested this view reflects Defence’s ‘overconfidence, and signalling…expertise via knowing cynicism…’ rather than genuine understanding.[4] Both Payne and Joshi are well connected and trusted commentators and experts within UK defence. They highlight that this is a bigger issue than a throwaway comment on a podcast.
The ‘stochastic parrot’ concept originates from Emily Bender’s 2021 paper, published over 12 months before the release of ChatGPT. Today the idea is a misleading caricature of AI models. LLMs do not just parrot what they were trained on. Here’s one of the earliest academic deconstructions of what is wrong with that claim, from August 2022. Here’s Geoff Hinton, Demis Hassabis, Tomaso Poggio and Ilya Sutskever – leading experts in AI – discussing how AI can reason in analogies and be creative (2024). Here’s a paper from 2024 showing evidence that LLMs can conduct inductive out-of-context reasoning, making inferences beyond their training data. Similarly, mathematician Stephen Wolfram explains how language is a world-model, from which LLMs are likely deriving their output. The ‘stochastic parrot’ claim is a zombie idea: long dead, it walks on.
The problem isn't just a misunderstood technical detail. The problem is that this error compounds into the future. It is a misunderstanding of what is likely to happen and at what speed. It misunderstands the stakes at play and the odds.
Three recent examples of how quickly AI is advancing, and why it matters:
1. ‘Humanity’s Last Exam’ (January 2025): There is a new AI benchmark test called ‘Humanity’s Last Exam’, a selection of extremely difficult questions at which AI was initially very poor.[5] But AI is improving (six times better within two months) and is expected to surpass the test within 12 months.[6] Imagine if you sat the hardest exam ever written, flunked it, and then were improving at a similar rate – how smart would you be in 2 years, in 3, in 5?
2. AI’s Persuasive Power (April 2025): This week, we learn via an unauthorised experiment on Reddit that AI is now performing better than humans in persuading people to change their minds. Six times better.
3. Rapid Task Scaling: The length of tasks, judged by how long they would take a human, that AI can now complete is increasing rapidly. In 2022 AI could complete software coding tasks that would have taken a human 30 seconds. In 2025 it can complete tasks that would have taken a human 1 hour. The pace of advance (2024–2025) has gone from doubling every 7 months to doubling every 4 months. If you understand exponentials, you’ll get the claim that this means agents will be doing tasks that normally take a human 1 month by mid-2027 (a rough sketch of this arithmetic follows below). As one of us noted in a talk this week, at this rate of improvement, by 2029 – the end of this Government’s Parliamentary term and the Trump Presidency – AI will be doing tasks that take humans 2,000 hours of labour, roughly equivalent to a human working year, in days, hours, maybe even minutes. By 2029. …FCAS is planned to come into service in 2040, and likely goes out of service ~2070.[7]
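To make the doubling arithmetic concrete, here is a minimal back-of-envelope sketch. The anchor values (a roughly 1-hour task horizon in early 2025, doubling every 4 months) are the illustrative figures from the paragraph above, not official projections, and a real extrapolation would carry wide error bars.

```python
# Minimal extrapolation of the task-horizon doubling claim above (illustrative assumptions only).
from datetime import date

ANCHOR_DATE = date(2025, 1, 1)   # assumed date at which the horizon is ~1 human-hour
ANCHOR_HOURS = 1.0               # human-equivalent task length AI can complete at the anchor
DOUBLING_MONTHS = 4              # assumed doubling time of the task horizon (the 2024-2025 pace)

def task_horizon_hours(on: date) -> float:
    """Extrapolated human-equivalent task length AI can complete on a given date."""
    months_elapsed = (on.year - ANCHOR_DATE.year) * 12 + (on.month - ANCHOR_DATE.month)
    return ANCHOR_HOURS * 2 ** (months_elapsed / DOUBLING_MONTHS)

for when in (date(2027, 1, 1), date(2027, 7, 1), date(2029, 1, 1)):
    hours = task_horizon_hours(when)
    print(f"{when}: ~{hours:,.0f} human-hours (~{hours / 167:.1f} working months)")
# Under these assumptions the horizon passes a working month (~167 hours) around mid-2027
# and a working year (~2,000 hours) before the end of 2028.
```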
Can those ‘straight lines on graphs’ that have given us exponential progress in AI just keep going? The answer seems likely to be yes, until at least 2030. A recent Epoch AI study suggests “2e29 FLOP training runs will likely be feasible by 2030.” That’s 200 octillion floating-point operations. If that doesn’t help you, over on LessWrong they explain: “In other words, by 2030 it will be very likely possible to train models that exceed GPT-4 in scale to the same degree that GPT-4 exceeds GPT-2 in scale. If pursued, we might see by the end of the decade advances in AI as drastic as the difference between the rudimentary text generation of GPT-2 in 2019 and the sophisticated problem-solving abilities of GPT-4 in 2023.” You’ve probably not really heard of GPT-2; it predates ChatGPT by more than three years. It’s like the jump between the Model T Ford, when you could still argue horses were better for many work tasks, and the cars of the 1980s, when no one sensible was making that argument. We are looking at a similar or greater advance again by 2030 (the scale arithmetic is sketched below).
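As a minimal sketch of that scale comparison: the ~2e25 FLOP figure for GPT-4’s training run is an unofficial, commonly cited external estimate (our assumption, not something stated in the quote), and the GPT-2 figure is then simply what the quoted ‘same degree’ comparison implies.

```python
# Orders-of-magnitude arithmetic behind the quoted claim (GPT-4 figure is an assumed external estimate).
import math

GPT4_TRAIN_FLOP = 2e25        # assumption: commonly cited unofficial estimate for GPT-4's training run
FEASIBLE_2030_FLOP = 2e29     # the Epoch AI feasibility figure quoted above

jump = math.log10(FEASIBLE_2030_FLOP / GPT4_TRAIN_FLOP)
print(f"2030-feasible run vs GPT-4: ~{jump:.0f} orders of magnitude")          # ~4

# "The same degree that GPT-4 exceeds GPT-2" then implies GPT-2 sits ~4 orders lower:
implied_gpt2 = GPT4_TRAIN_FLOP / 10 ** jump
print(f"Implied GPT-2 training scale under these assumptions: ~{implied_gpt2:.0e} FLOP")   # ~2e21
```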
These are some of the latest indicators. This blog has covered where we are going and how quickly we are heading there, particularly in ‘Don’t Blink’ and ‘In Athena’s Arms’. On the Wavell Room, we have all argued independently that ‘the future is uncrewed’. We’re on a track to AGI, perhaps as soon as 2026, but with a median Metaculus estimate for ‘oracle’ AGI by 2028 (a year earlier than was forecast throughout 2024). Taken together, these examples starkly illustrate that AI is advancing at a pace Defence is failing to appreciate or prepare for adequately. To underestimate AI is to risk strategic defeat.
The AI Quarterback: Why Tempest Does Not Need Humans
Let’s explore the claim in the podcast that Tempest, the fighter aircraft at the centre of FCAS, ‘definitely won’t be remote’ – for which we read ‘won’t be fully autonomous’ – and that there is therefore a necessity for a human ‘quarterback’ to coordinate autonomous and crewed platforms in attack and defence. We think this claim rests on multiple flawed assumptions, and misunderstands both technological progress and operational realities.
Why would it be impossible for Tempest to be remote? One answer is that any kind of electronic emission can be detected, and so you can’t have a datalink to the aircraft, as you do for current remotely piloted drones. There is unchanging physics behind this argument. The range of the radio link to control an aircraft is defined by the power of the emission. A long-range aircraft, essential to operate over enemy terrain, would need very high-power emissions, which makes it easier to detect (a simplified sketch of this trade-off follows below).
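A deliberately crude sketch of that physics, assuming free-space propagation, omnidirectional antennas, and no low-probability-of-intercept techniques – all simplifying assumptions of ours, not claims about any real datalink. The point it illustrates: the transmit power needed to hold a control link grows with the square of the link range, and an adversary’s passive detection range grows with the square root of that power, so longer-range control means being detectable from proportionally further away.

```python
# Toy free-space (Friis) link-budget sketch of the emission trade-off described above.
# Assumes isotropic antennas and no LPI waveforms; real datalinks differ substantially,
# so treat the absolute numbers as illustrative only.
import math

C = 3e8
FREQ_HZ = 5e9            # illustrative carrier frequency
RX_SENS_W = 1e-13        # power the aircraft's receiver needs to hold the link (illustrative)
ESM_SENS_W = 1e-15       # a more sensitive adversary passive (ESM) receiver (illustrative)
WAVELENGTH = C / FREQ_HZ

def tx_power_needed(link_range_m: float) -> float:
    """Transmit power required so the aircraft still receives RX_SENS_W at link_range_m."""
    return RX_SENS_W * (4 * math.pi * link_range_m / WAVELENGTH) ** 2

def detection_range(tx_power_w: float) -> float:
    """Range at which the adversary's passive receiver can just hear that transmission."""
    return (WAVELENGTH / (4 * math.pi)) * math.sqrt(tx_power_w / ESM_SENS_W)

for link_km in (100, 500, 1000):
    p = tx_power_needed(link_km * 1e3)
    d = detection_range(p) / 1e3
    print(f"control link of {link_km:>4} km needs ~{p:,.0f} W and is detectable out to ~{d:,.0f} km")
# Required power grows with the square of control range; detection range grows with the
# square root of power, so pushing the link further out makes you detectable further out too.
```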
But this smuggles in an assumption, that there has to be a datalink if there isn’t a human in the cockpit. That ignores the speed of technology development. If an AI is able to take inputs of sensor data, process them faster and respond more effectively than humans, it can be fully autonomous. If it is fully autonomous, it doesn’t need a data link.
The argument that a human ‘quarterback’ – the ‘player’ coordinating the attack – is a necessity also assumes the crewed ‘quarterback’ platform must remain in (relative) close proximity to its autonomous wingmen: close enough that the attack can be coordinated with low-powered or visual communications, reducing (though not eliminating) detectability. If it is not close, the human quarterback has the same datalink problem as the long-range remotely piloted aircraft. But this constraint applies equally whether coordination is human-led or AI-led. Fully ‘zero-emission’ platforms may be unrealistic, but AI-managed coordination would likely minimise emissions at least as well as, and probably better than, low-emission human-led approaches, since machine-to-machine communication can be terser and more efficient. Human presence therefore offers no inherent advantage in emission management or stealth; AI coordination can achieve equal or greater effectiveness with fewer proximity constraints.
A related argument is that you need a pilot because electronic warfare – jamming, spoofing, frying the aircraft’s electronic sensors and systems – might cause a fully autonomous fighter aircraft to drop out of the sky. This is indefensible. Current fighter aircraft are ‘fly by wire’. To optimise for agility they are inherently unstable, and the only thing that keeps them in the air is the computer constantly adjusting the configuration of the aircraft’s control surfaces. If an adversary is capable of jamming, spoofing or frying the electronic systems of an autonomous fighter, then they are equally capable of incapacitating a crewed aircraft, rendering the presence of a human pilot irrelevant to survivability. The distinction between crewed and uncrewed aircraft in terms of vulnerability to electronic attack is therefore illusory, and the solution lies not in insisting on human pilots but in hardening aircraft electronics against such threats.
Then there is the argument that there are minimal to no trade-offs involved in including a pilot in the cockpit, so long as the aircraft, as is the case with Tempest, can be optionally crewed. Again this seems indefensible.
In all major wars, pilot loss has been more of a limiting factor on a nation’s ability to fight than aircraft loss. We can manufacture aircraft faster than we can train pilots. This was true of both RAF pilots in the Battle of Britain and Luftwaffe pilots in the Defence of the Reich. In the Ukraine war, shortages of pilots have limited both Russian and Ukrainian air operations – and have been specifically cited as a reason for delays in providing Ukraine with F-16s.
Another argument is that pilot skill is irreplaceable. Yet today AI-driven drones have begun to outperform human pilots in complex, high-speed drone racing, weaving through woods and around complex obstacle courses at impossible speed and with super-human agility, signalling a significant shift in the capabilities of autonomous flight. The USAF’s Venom programme sees autonomous ‘VISTA’ F-16s dogfighting (Basic Fighter Manoeuvres), something we were told in 2020 wouldn’t happen for ‘in excess of a decade, maybe two’. That estimate came from a USAF pilot who was a relative AI optimist, schooled in DARPA’s simulator-based AlphaDogfight Trials, where the humans lost. As another of those defeated pilots admitted: ‘If I were to walk away from today saying I don’t trust the AI’s ability to perform fine motor movement and achieve kills, I’d have a lack of integrity…’.
The quarterback analogy overlooks another crucial ‘skill’ limitation: humans have extremely limited capacity for simultaneous attention. This is why texting while driving is illegal – it multiplies crash risk by a factor of twenty-three. Coordinating multiple combat aircraft is vastly more complex than texting, demanding constant real-time awareness, quick decisions, and continuous communication across multiple platforms. Human pilots simply don’t have the cognitive bandwidth to fly their own aircraft effectively while simultaneously managing a swarm of others. Swarming technologies – multi-agent systems – can do this better, faster, and more responsively. And surely, after AlphaGo Zero in Go (2017) and AlphaStar in StarCraft II (2019) – which required breakthroughs in AI that could handle game theory, imperfect information, long-term planning, real-time play, and large action spaces – we don’t need to make the case that AI is superior at coordinating complex tactical plays? It’s a big bet against the odds to suggest you’ll need a human quarterback for their tactical skill when Tempest is deployed in 15 years (2040), let alone in 45 (2070), when the aircraft is expected to still be operating.
Then there is the argument that aircraft capability isn’t majorly impacted by having a human in the cockpit. But the limitations this places on how many ‘g’ an aircraft can pull are a problem. Even marginal differences can determine whether an aircraft is taken out or not. The presence of a human pilot imposes physiological limits on an aircraft’s manoeuvrability, particularly regarding g-force tolerance. While modern fighter jets are structurally capable of executing manoeuvres exceeding 9g, human pilots can typically withstand only up to 9g, and only for short durations, even with the aid of g-suits designed to mitigate the effects of high acceleration. Exceeding this threshold risks G-LOC (g-force induced Loss Of Consciousness), compromising mission effectiveness and pilot safety. Uncrewed aircraft, free from these human limitations, can exploit the full aerodynamic potential of their designs, allowing for more aggressive manoeuvres (a worked example follows below).
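To illustrate why even a modest difference in allowable load factor matters, here is a minimal sketch using the standard level-turn relations. The airspeed and the higher uncrewed g-limit are illustrative assumptions of ours, not Tempest figures.

```python
# Standard sustained level-turn relations: radius = v^2 / (g*sqrt(n^2 - 1)),
# turn rate = g*sqrt(n^2 - 1) / v.  Speed and g-limits below are illustrative only.
import math

G = 9.81        # gravitational acceleration, m/s^2
SPEED = 300.0   # illustrative airspeed, m/s

def turn_performance(load_factor: float, speed: float = SPEED) -> tuple[float, float]:
    """Return (turn radius in metres, turn rate in degrees per second) for a level turn."""
    k = math.sqrt(load_factor ** 2 - 1)
    return speed ** 2 / (G * k), math.degrees(G * k / speed)

for n in (9.0, 12.0):   # ~human-limited vs. a notionally higher uncrewed limit (assumed)
    radius, rate = turn_performance(n)
    print(f"{n:.0f}g: turn radius ~{radius:,.0f} m, turn rate ~{rate:.1f} deg/s")
# At this speed the assumed 12g aircraft turns in roughly three-quarters of the radius,
# and about a third faster, than the 9g one - the kind of margin that decides an end-game evasion.
```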
Maximum load factor (‘g’) is not the only aspect of air combat manoeuvrability that full autonomy aids. An AI would likely hold an aircraft at the edge of its performance envelope much more effectively. A further transformation will come from the AI quarterback and autonomous systems’ ability to execute precisely coordinated, multi-aircraft tactics: complex three-dimensional manoeuvres performed with superhuman timing, precision, and strategic coordination. Uncrewed aircraft collaborating dynamically can preserve formation energy, manipulate adversaries into vulnerable positions, and optimise defensive manoeuvres to evade threats with accuracy no human team could manage.
Enhanced agility – individual and formation – can be crucial in combat scenarios, where even marginal improvements in manoeuvrability can significantly reduce the probability of being hit by enemy missiles: high-g manoeuvres in the end game can be the difference between the aircraft being destroyed and the missile being evaded. There’s a reason pilots and intelligence officers scrutinise ‘doghouse plots’ that illustrate an aircraft’s performance envelope against those of adversary aircraft and adversary missiles. It’s the difference between life and death, mission success and mission failure. If having a pilot in the cockpit imposes even small limits on manoeuvrability, then the cost of having a human quarterback is a reduced probability of mission success.
Then there is the question of the wisdom of a programme that takes 10–20 years to deliver a capability in an age of exponential technical progress. It was only 14 years ago that US venture capitalist Marc Andreessen argued ‘software is eating the world’ and it sounded edgy and futuristic – now everyone accepts it, companies such as Anduril are challenging Lockheed and Boeing, and increasingly software-driven drones are transforming warfare and rendering many capabilities obsolete. Twenty-year-old assumptions are likely to be wrong, and costly – and assumptions made today about 20 years hence are likely to be more wrong than those made 20 years ago were about today. Progress is accelerating faster than ever, and is poised to leap to even greater speed. We are on the brink of Dario Amodei’s ‘compressed 21st century’: the idea that after powerful AI is developed (which Amodei suggests could be within 1–3 years) ‘we will in a few years make all the progress in biology and medicine that we would have made in the whole of the 21st century.’ We think Amodei chose biology and medicine as media-palatable fields – the same would be just as true across the sciences, and thus of weapons and military equipment development.
Is it safe to bet on a human ‘quarterback’ in an aircraft in 2040, and change later if that turns out to be wrong? No, there are consequences. Getting this bet wrong means spending money on developing a platform that will be rendered obsolete before its deployment in 15 years’ time, or at some point in the 45 years between now and the aircraft’s projected end of service.
The risk is that we are wasting money investing in the wrong things, and will waste more in the future, since changing designs is a major cause of MOD cost over-runs.
Then there is the issue of opportunity costs. The MOD has budgeted £12bn over ten years for FCAS. DeepMind costs Google between ~£164M and ~£477M a year, or ~£10M per study.[8] Imagine what we could do if we invested similarly well (the rough arithmetic is sketched below).
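As a minimal sketch of the comparison, using only the approximate figures above and in footnote [8]:

```python
# Rough opportunity-cost arithmetic using the approximate figures cited above.
FCAS_BUDGET_GBP = 12e9                      # £12bn over ten years
YEARS = 10
DEEPMIND_LOW, DEEPMIND_HIGH = 164e6, 477e6  # ~£164M (2017) to ~£477M (2020) per year
COST_PER_STUDY = 10e6                       # ~£10M per study (2024 estimate)

annual_fcas = FCAS_BUDGET_GBP / YEARS
print(f"FCAS: ~£{annual_fcas / 1e9:.1f}bn per year")
print(f"Equivalent DeepMinds per year: {annual_fcas / DEEPMIND_HIGH:.1f} to {annual_fcas / DEEPMIND_LOW:.1f}")
print(f"Equivalent frontier-lab studies per year: ~{annual_fcas / COST_PER_STUDY:.0f}")
```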
AGI, Defence & The Future of Warfare
The claim made on the podcast – “We’re prepared for the time when AGI catches up” – implies Defence possesses concrete plans and capabilities for effectively managing advanced, human-level or beyond-human-level artificial intelligence, whether our own or that deployed by an ally or adversary. Yet the evidence presented thus far suggests the opposite.
The current Defence position—that by 2040, a human operator in an aircraft like Tempest will outperform AI in processing data, coordinating multiple platforms, situational awareness, and rapid decision-making—demonstrates precisely how unprepared we are. As shown earlier, AI systems already equal or surpass human performance in complex strategic tasks, and the trajectory of improvement strongly favours AI dominance. Even if human pilots retain marginal advantages today in certain areas, these advantages are diminishing rapidly, soon to disappear entirely.
The claim thus resolves to this: UK Defence believes that in 2040 it is very likely a human systems operator will outperform an AI at processing data and controlling subordinate platforms, will have better situational awareness, and will be able to react faster.
On any reasonable appraisal, as we have shown, this seems overwhelmingly unlikely. Indeed, given AI’s current performance in air-to-air combat and strategy relative to humans, it may not even be true today: processing and analysis of data, control of networks, and speed of reaction are all measures on which humans already do worse.
Let’s push this logic further. If we hold the platform and support constant, which dominates:
1. Human-only control, or
2. Human-plus-AI control, or
3. AI-only control
Whichever of the above dominates will win. Tempest is a bet on the second. All the evidence to date suggests that not long after human-machine teaming wins, the AI beats the human-machine team. Our planning is stuck short of the logical conclusion of its own assumptions.
Defence might understandably prefer retaining human pilots out of a moral, cultural, or aesthetic preference for a pilot in the cockpit. But what matters is whether your adversaries share your preferences. If they don’t, then as soon as one adversary picks truth over beauty, you must change, or you will lose. It is that simple. Some might object that adoption will be uneven. This is true, and also irrelevant. The moment any credible power demonstrates a battlefield advantage via AI, the pressure to replicate it becomes existential. Delay is strategic suicide, not insulation. Unlike many previous candidates, this is a true horse-to-tank moment, except an order of magnitude more so.
Which one will be dominant in the 2040s cannot of course be known. But humans aren’t getting significantly better and artificial intelligences are. Why do you believe humans will win this race?
The implications extend well beyond fighter aircraft. The AI frontier will be jagged: we think air-to-air combat will be fully autonomous first, then air-to-ground, with other domains slower. Air warfare offers straightforward reward functions (seeing what’s there, shooting things down, blowing things up, moving things around) in relatively uncluttered environments compared to ground and sea (surface and subsurface) warfare, so different capabilities will be automated on different time horizons. But unless the trajectory changes, what is true of fighter aircraft is true of every other military system. AI is likely to be better at operating all of them: there isn’t one domain or system where it is obvious that humans will retain a decisive advantage, or any advantage at all.
AGI makes human advantage redundant everywhere, from pilots to operators, commanders to intelligence staff. In doing so, it removes demographic inferiority: a nation with fewer people, or with less well-trained, well-educated, or well-motivated troops, is no longer less able to defend itself.
Our argument then is that betting against these clearly visible trends—hoping humans will somehow maintain a meaningful edge—is dangerously naive. If Defence truly believes human superiority can persist, it must explicitly justify why human capabilities will indefinitely outperform rapidly improving AI systems. Moreover, it must convincingly refute or challenge the well-supported forecasts placing AGI’s arrival around 2028.
On the other hand, if Defence accepts the logic and evidence we have laid out, it should be racing towards our AI and autonomous future to fulfil its duty to protect the nation.
Humanity’s Last Exam: Existential Risk
We now make a larger claim. If you accept the logic of the argument so far, then our generation is sitting humanity’s final exam as a civilisation. To pass this exam, we must resolve at least one of the following critical issues:
Globally prevent further AI development.
Remove the possibility that advanced AI can access advanced weaponry or highly dangerous materials.
Eliminate war entirely as a political instrument.
Solve advanced AI alignment.
If we cannot do at least one of those, the logic takes us to a civilisational test for humanity: what happens when the optimal move is to hand over our most destructive artefacts to non-human intelligences?
Dystopian? Yes, we agree. So what are we going to do about it? Our options are clear:
1. Stop Developing Minds Smarter Than Humans. Humanity can stop developing powerful artificial intelligences. But that has to be all of humanity: any defection at all prevents this strategy from working. This seems very unlikely, and perhaps simply not possible.
2. Remove Tools That Smarter Minds Could Use Decisively. Accept the development of smarter-than-human artificial intelligences but eliminate, globally, all advanced military technologies and, crucially, the underpinning civilian technologies – in practice, anything that an AI could take control of and use against us. Again, it can’t be done with pinky promises; it has to be done by everyone, everywhere, with no defections. Misaligned AGI could theoretically weaponise almost anything – from disease to climate to the economy – against us, but clearly, handing it our most advanced military capabilities is an order of magnitude more immediate a threat.
3. End War. Never go to war again, solve all problems by politics. Clearly, not happening.
4. Solve AI Alignment. Solve AI alignment so that you don’t care if AI controls humanity’s most destructive weapons, because it will always do your bidding (or that of your elected representative, or your elected AI leader). This might be done by luck – maybe AI always remains a tool of humans, aligned to us, despite being able to out-think us. Or, assuming it is possible, it might be done through deliberate and careful research and development that puts in place regulation, guardrails and the like. In any case, even if you think it unlikely (we have split views between us at Cassi), if the chance is not zero, we still need to do it. After all, we insure against war by building a military, and against all kinds of low-probability, high-impact risks in our day-to-day lives. This is what being ‘prepared for the time when AGI does catch-up’ means in practice. Hope is not a strategy.
5. Cede Control of Destructive Systems to misaligned Non-Human Intelligences. Get killed by Skynet.
There aren’t actually any other options. That is your lot.
Can we please start taking this seriously?
So What?
We are not remotely ‘prepared for the time when AGI does catch-up’. Defence’s persistent underestimation and misunderstanding of AI is a national security risk, right now.
For the sake of the security of our nation, we need to address this underestimation, and insufficient understanding, of AI.
Military leaders must recognise that their duty is twofold:
First, to decisively invest in and prepare for an AI-driven future of warfare;
Second, to candidly explain the Strangelovian escalation trap humanity faces. Achieving AI alignment is not primarily a problem to be solved by UK Defence – the excellent UK AI Security Institute should remain the lead for this. But nor is it an abstract concern that Defence can ignore—it is a strategic imperative that Defence’s leaders have a moral and ethical obligation to explain given their part in it. Failure to act and speak out now risks more than defeat; it risks existential disaster.
[2] Episode 14 – Meeting UK operational needs | Future Horizons: The Tempest Podcast (it is all worth a listen, but the most relevant sections are at the start and then from c.40 minutes in).
[3] Ignoring that for a sufficient level of intelligence or skill, everything is auto-complete…
[5] And I am not sure any actual human could individually do well on it: 2,500 challenging questions across 100 different subjects.
[7] This footnote is for the nerdiest of technical pedantry. Measuring how “general” AI is at solving tasks is a very difficult challenge. You cannot test on everything a human might be asked to do; that is an infinite test. What represents a good cross-section of all the tasks you would expect a human to be able to do? Which human? In how long? How do you measure success? There are a million confounding variables. METR’s approach uses the length of time a task normally takes a human as a proxy for task complexity. Does that mean that ‘being a pilot’ or ‘running a tactical battle’ was on that list? No. Does that mean you can ignore the work? Not sensibly, if you want to understand how generality in AI is improving.
[8] This 2017 article https://qz.com/1095833/how-much-googles-deepmind-ai-research-costs-goog suggests costs of £164M; this 2020 article https://www.cnbc.com/2020/12/17/deepmind-lost-649-million-and-alphabet-waived-a-1point5-billion-debt-.html puts the figure at £477M; individual papers/studies are said (2024) to cost ~$13M, as calculated here https://www.reddit.com/r/MachineLearning/comments/1ej5h4b/d_calculating_the_cost_of_a_google_deepmind_paper/