Why military AI needs urgent regulation

Julia Williams

For the first time, in 2023, the UN Security Council discussed the implications of AI for world peace and security, confirming what experts had warned about for years: AI is going to change every aspect of war, from “defense innovation, industry supply chains, civil-military relations, military strategies, battle management, training protocols” to “forecasting, logistical operations, surveillance, data management, and measures for force protection” (Csernatoni, 2024). For years, international forums such as the Convention on Certain Conventional Weapons (CCW) and its Group of Governmental Experts (GGE) have centred discussions on the challenges associated with lethal autonomous weapon systems (LAWS). However, military technologies are rapidly developing from human-supervised autonomous systems into AI-driven ones, raising fresh ethical and legal concerns. From the United States’ Project Maven, which integrates machine learning into military surveillance, to Israel’s use of AI targeting algorithms in Gaza operations and Ukraine’s deployment of semi-autonomous drones, the inclusion of AI in warfare is no longer theoretical but operational in modern conflicts. Since these technologies are already in use, the challenge lies in determining when and how they should be used, and under what circumstances ethical, legal, and accountable use is possible. Without these assurances, the question remains: to what extent are regulatory gaps and operational risks making military AI a threat to global security?

Project Maven, Illustration by Staff Sgt. Alexandre Montes

Operational risks 

While military AI is intended to increase precision and efficiency and to reduce risk to personnel and civilians alike, it introduces uncharted risks into military operations. Two of the most significant operational risks are the “black box nature of AI decision-making” and the “lack of good quality data impacting efficiency and accuracy of AI algorithms,” which can lead to bias. Black-box decisions refer to outputs an AI system cannot explain. For example, an AI may assign features to a target or calculate a score for suspect analysis without any understandable logic behind the system’s conclusion. Bias, meanwhile, refers to systematic error that can lead a military AI system to make different decisions based on, for example, race or gender.

The most apparent risk of black-box decisions is a lack of transparency, which can lead to faulty decision-making and fatal consequences in a battlefield setting. More concerning, however, is the cognitive impact on the user when an AI cannot explain how it reached a decision: the user either distrusts the system or relies heavily on it without understanding why. AI decisions need to be explainable so that their quality can be assessed and future outputs improved, but this assumes the individual has time to understand and evaluate the explanation. In a high-stakes context such as warfare, an individual would most likely not have time to examine how an AI reached a decision and would instead assess only the validity of the output. The issue is that without the explanation, a human cannot discover whether the AI has made a biased decision.

Bias in AI systems is inevitable. From the data they are trained on, to the design choices engineers make, to how users deploy them, bias can be embedded in every layer of an AI system. Experts broadly agree on three major categories of bias: societal, algorithmic, and automation. Societal bias arises from cultural, historical, or structural inequalities in the societies where an AI system is developed and deployed, and it enters the system through data or design. AI is trained on data that reflects the social conditions of the environment it will operate in. In a military context, the AI is usually trained on data from surveillance footage, behavioural patterns, and biometric databases, which can be skewed by profiling based on race, religion, or geography. If trained on flawed datasets, the AI learns to perpetuate these biases. One of the most commonly cited examples is facial recognition, particularly systems developed for identification or surveillance purposes. Many early systems were trained on internet data that overrepresented light-skinned individuals and lacked racial and ethnic diversity. As a result, some of these facial recognition models exhibited a strong bias against individuals with darker skin. In specific use cases such as policing, this has led to disproportionately high rates of false positives for people with darker skin compared to light-skinned individuals. However, not all facial recognition systems are equally flawed: some trained on more representative datasets or applied in less sensitive contexts show reduced bias. Nevertheless, the development and deployment of these systems continue to raise ethical and legal concerns, particularly when used in high-stakes environments such as threat identification.
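To make the false-positive disparity concrete, here is a minimal Python sketch that computes per-group false positive rates. The counts, group labels, and the helper function are hypothetical assumptions for illustration only, not data or code from any real facial recognition system.

```python
# Illustrative only: hypothetical counts for two demographic groups,
# showing how a skewed training set can surface as unequal error rates.
from dataclasses import dataclass

@dataclass
class GroupResults:
    false_positives: int   # non-matches the system wrongly flagged as matches
    true_negatives: int    # non-matches the system correctly rejected

def false_positive_rate(r: GroupResults) -> float:
    """FPR = FP / (FP + TN): how often innocent faces are wrongly flagged."""
    return r.false_positives / (r.false_positives + r.true_negatives)

# Hypothetical evaluation results (assumed numbers, for illustration).
results = {
    "lighter-skinned": GroupResults(false_positives=5, true_negatives=995),
    "darker-skinned":  GroupResults(false_positives=40, true_negatives=960),
}

for group, r in results.items():
    print(f"{group}: FPR = {false_positive_rate(r):.1%}")
# A gap like 0.5% vs 4.0% is the kind of disparity audits of early systems
# reported; in a threat-identification setting it translates into more
# innocent people in one group being wrongly flagged as targets.
```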

The second type of bias in military AI systems is algorithmic bias. Even if the data used to train the AI were unbiased, which may be realistically impossible, algorithms can still produce bias through feedback loops or flawed logical reasoning. The AI can misinterpret an environment, behaviour, or pattern in a way that does not accurately reflect the operational reality needed to complete its task, leading to unpredictability.
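As a hedged illustration of how such a feedback loop can entrench an initial skew, the toy simulation below assumes two areas with identical real activity and a system that allocates attention in proportion to past detections. The numbers and names (`area_A`, `area_B`) are invented for this example and do not describe any fielded system.

```python
# Toy feedback-loop simulation (hypothetical, illustrative only):
# two areas have identical real activity, but the system allocates
# surveillance in proportion to past detections, so an initial skew
# in the data compounds over time.
import random

random.seed(0)

TRUE_ACTIVITY = {"area_A": 0.10, "area_B": 0.10}   # identical ground truth
detections = {"area_A": 3, "area_B": 1}            # skewed starting data
TOTAL_PATROLS = 100

for step in range(20):
    total = sum(detections.values())
    for area, rate in TRUE_ACTIVITY.items():
        # Attention follows past detections, not real activity.
        patrols = round(TOTAL_PATROLS * detections[area] / total)
        # More patrols -> more chances to detect -> more recorded detections.
        detections[area] += sum(random.random() < rate for _ in range(patrols))

print(detections)
# Despite identical true activity, area_A ends up with far more recorded
# detections, and a model trained on this record "learns" that it is the
# riskier area, reinforcing the original skew.
```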

The final type is automation bias, which concerns how human operators trust and interpret an AI system’s outputs. Human operators often experience a cognitive distortion of reality in high-pressure situations, while AI systems present their outputs in a confident tone that lends credibility to their recommendations. This can lead the human operator to over-rely on and over-trust AI outputs, which may result in flawed decisions. Even where there is operational oversight, bias may creep in through the assumption that the AI output is more accurate simply because it computed a decision in less time. And even if the AI is trained to reflect uncertainty in its answers, humans in high-stress, time-constrained environments such as conflict zones are still likely to over-trust its recommendations. This type of bias poses a conundrum for military AI use: even if regulation forces militaries to keep humans in the loop or to maintain legally meaningful human control, human influence may do little to avert flawed AI decision-making.
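The sketch below illustrates the gap that automation bias exploits: a system’s stated confidence versus its actual accuracy. The predictions and confidence values are assumed, purely for illustration.

```python
# Illustrative calibration check (hypothetical predictions): a system that
# sounds confident is not necessarily correct, which is what makes
# automation bias dangerous in time-pressed settings.
predictions = [  # (stated_confidence, was_correct) - assumed values
    (0.95, True), (0.97, False), (0.96, True), (0.94, False),
    (0.98, True), (0.95, False), (0.97, True), (0.96, False),
]

avg_confidence = sum(c for c, _ in predictions) / len(predictions)
accuracy = sum(ok for _, ok in predictions) / len(predictions)

print(f"average stated confidence: {avg_confidence:.0%}")  # ~96%
print(f"empirical accuracy:        {accuracy:.0%}")        # 50%
# An operator who takes the 96% at face value is exhibiting automation
# bias; a calibration gap this large only becomes visible through logging
# and after-action review, which rarely fits a battlefield tempo.
```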

Regulatory gap

AI technology is unlike any disarmament challenge the international community has faced, because of its accessibility to companies, civilians, states, and non-state actors alike, and because of its unknown development ceiling. Current discussions in the CCW and GGE focus particularly on LAWS, and for years they have struggled to reach consensus on an operational definition of meaningful human control. Meaningful human control, a legal safeguard intended to maintain human oversight over autonomous systems, is becoming a buzzword that lacks definitional clarity and operational use. Some experts believe that regulating AI is unnecessary, while others argue that although regulation is needed, international regulation is currently impossible. Under the current structure of war norms, military AI technologies go uncontrolled, except when they make mistakes. The international community’s first question is how to measure and assign accountability.

There are many ways in which human control can be present in an AI system. Still, what constitutes meaningful human control, the legal term used throughout the CCW and GGE, is ambiguous and case-dependent. This ambiguity has allowed states and their weapons manufacturers to apply varying degrees of human involvement in newly developed technologies, making accountability difficult. A human can technically interact with an autonomous system without exercising any substantive moral, legal, or operational oversight.

Accountability through recording usage or logging users does not account for a mistake made because of embedded bias: is the developer or the user responsible? The international community does not seem comfortable assigning accountability to a machine. Some countries, for instance, have called for a framework that would enforce transparency throughout the building process, require training for personnel to fully comprehend system capabilities, and establish a transparent human chain of command to impose human liability for machine decisions. Accountability is clearly necessary for dealing with the real-world consequences of using AI military technology; without it, these systems will proliferate in a responsibility vacuum.

Geopolitical implications 

While international forums have yet to reach consensus on key issues, many states are straying further from regulation to ensure their competitiveness. States currently lack the ability to assess one another’s AI capabilities, and it is this uncertainty that is fuelling an AI arms race. At the centre of this race are the United States and China, alongside the increasing influence of non-state actors and corporations. The AI arms race is about who can produce the best systems, integrate them the fastest, and control how they proliferate.

In October 2022, the US Department of Commerce announced new export controls on semiconductors and computing chips, materials essential to AI production. The following summer, China responded by placing export controls on germanium and gallium, metals used in semiconductor production. Both China and the United States want to be recognised as global powers in AI and are working in this direction; the export controls above are just one example. To remain competitive, both countries require private sector innovation and cooperation.

In an illustration of this, in 2024 OpenAI, one of the largest civilian AI companies in the United States, revised its usage guidelines to allow its technology to be used for certain military purposes.

Meanwhile, less technologically advanced states fear military AI being used against them when they cannot develop the technology themselves. They also fear that whoever dominates this sphere may proliferate the technology to non-state actors, sowing greater tensions and even triggering conflicts. Arguably, an AI-enabled geopolitical landscape could be riddled with unintended escalation, caused by machine failure or human misinterpretation of system outputs, and could accelerate the tempo of warfare. The faster the speed of war, the less likely it is to de-escalate. The point of attack is no longer just physical territory, but essential civilian infrastructure.

The development and deployment of military AI is a global security crisis that continues to grow in urgency. Fuelled by widening regulatory gaps and operational risks, the challenges of military AI are already visible in contemporary conflicts. Life-and-death decisions are increasingly delegated to systems that carry serious operational hazards. From black-box decision-making to algorithmic bias, the design and use of these systems undermine the international law principles of accountability and discrimination.

Equally concerning is the regulatory gap that enables these technologies to proliferate. Humans are present at every stage from development to deployment, yet where responsibility should be assigned remains blurry. If states could agree on legislation that reflects a lifecycle approach, accountability would not just be a legal principle to follow but would become embedded in system operations. Still, many states are unwilling to constrain their own development, so strong governance mechanisms for military AI technology are missing. The ongoing AI arms race between global superpowers further amplifies the crisis: the US and China are pouring resources into military AI capabilities, turning private sector tech companies into defence contractors. Another deepening geopolitical fear is the proliferation of these technologies to non-state actors or rogue states, a fear that could become a future point of negotiation for international disarmament, even among the major powers.

The new era of warfare will likely be dominated by states that never slow the growth of AI’s military capability, learning to innovate and integrate the fastest. If peace is not made the central goal of using military AI, this technology will not just change the nature of war but accelerate us towards uncontrollable conflicts, with consequences we may be unable to survive.
