The Republic of Agora

Rethink Napoleonic Staff


Agentic Warfare and the Future of Military Operations

Benjamin Jensen and Matthew Strohmeyer | 2025.07.17

The Napoleonic staff model is holding back U.S. innovation in military decisionmaking. To prevail in the era of agentic warfare, the United States requires smarter, faster command structures, including Networked, Relational, and Adaptive models, to outpace China and seize the advantage at machine speed.

The United States’ 200-year-old “Napoleonic” staff system is no longer fast enough for an era in which artificial intelligence (AI) agents can sense, decide, and act in milliseconds. This report, produced by the CSIS Futures Lab, explores how the Department of Defense (DOD) can replace today’s industrial-age military staff organizations with smaller, nimbler, AI-enabled command elements able to out-cycle peer adversaries such as China. The United States cannot out-innovate authoritarian rivals with nineteenth-century staff designs. Fielding adaptive, AI-enabled staffs—backed by robust networks, skilled people, and relentless experimentation—will let U.S. forces think and act faster than any adversary, preserving decision superiority in twenty-first-century conflict.

Why It Matters

  • Agentic warfare is here. While AI agents have been a reality for decades, they are now able to automate intelligence fusion, refine threat models, and recommend courses of action, compressing decision timelines from days to minutes. Without new staff designs, U.S. commanders will lose the race for decision superiority.

  • China is optimizing its forces to attack U.S. critical vulnerabilities. People’s Liberation Army (PLA) doctrine aims to disrupt U.S. decision networks by winning the information high ground and paralyzing the United States through a mix of cyber, electronic, and long-range strikes; staffs that are not distributed and capable of dynamic reconstitution will be ineffective in combat, if not destroyed outright in the opening salvo.

Methodology

  • Analytical baseline: Researchers reviewed divergent concepts of how people interact with machines and form knowledge networks to provide options for rethinking the staff outside of the traditional military literature.

  • Mixed human-machine method: Researchers combined retrieval-augmented large-language models (e.g., ChatGPT, Claude, and Gemini) with three workshops of experienced officers to generate and stress-test alternatives.

  • Real PLA scenarios: Each staff concept was run against three demanding missions drawn from Chinese doctrine—a joint blockade, a joint firepower strike, and a joint landing campaign.

image01 ▲ Table 1: Three Agentic Staff Options Examined

Key Findings

  • Adaptive wins. Across all models and scenarios, the Adaptive Staff scored highest and showed the greatest resilience to PLA system disruption tactics.

  • People still matter. Success hinges on a new cadre of officers fluent in AI orchestration, able to audit algorithmic recommendations, and ready to assume control when networks degrade.

  • Tempo beats mass. Smaller, feedback-driven staffs consistently generated more viable options faster than larger legacy organizations, turning speed into a surrogate for force size.

Strategic Challenges

  • Computational Infrastructure: There is no future for AI agents and new staff organizations unless the U.S. military invests in more computational infrastructure to power command decisionmaking.

  • Coordination at Machine Speed: Decentralized agents can fragment efforts if a common command logic is absent, creating a need for deeper human expertise on smaller agentic staffs.

  • Cyber-Resilient Networks: AI dependence widens the PLA’s attack surface unless zero-trust, deception-resistant architectures are baked in from the start.

  • Human Capital: Officers must learn AI literacy, data visualization, and enough basic statistics and computer science to understand machine-generated insights, not just traditional staff processes rooted in historical cases and strategic studies.

Priority Recommendations for Congress, the DOD, and the Services

  • Launch a multiyear experimentation campaign. Direct the Chief Digital and AI Office (CDAO) to run iterative wargames, building on the Global Information Dominance Experiment (GIDE), that pit competing staff designs against common threat scenarios, with mandated annual reports to Congress.

  • Close the compute gap. Fund a classified, distributed high-performance cloud and edge architecture able to host AI agents during contested operations; commission an independent study to size the requirement and options to meet it.

  • Accelerate AI-ready professional military education. Embed explainable AI, agent orchestration, and prompt engineering modules across professional military education and create fast-track “AI facilitator” qualifications for mid-grade officers.

  • Harden decision networks. Resource redundant communications, adversarial learning defenses, and audit trails that trace how agents reach recommendations, enabling commanders to trust—but verify—machine advice.

  • Institutionalize rapid lessons learned. Pair every major exercise with data pipelines that retrain agents and update tactics in weeks, not years; require synthetic data from AI-driven simulations to feed this loop continuously.

Long-Term Outlook

  • Operational art would combine human creativity and judgment with AI agents fusing data and making predictions. Smaller staffs have the potential to out-cycle legacy military organizations, generating tempo and giving commanders more time to think beyond the forward line of troops.

  • Continuous agentic wargames and simulations would make war plans more adaptive and relevant. Smaller staffs could run iterated simulations using AI agents that refine the DOD’s ability to deter a global collection of authoritarian regimes and fight dispersed networks of combined joint task forces across multiple scenarios. These continuous games, guided by AI agents, would drive human and machine learning while identifying new deterrence options and operational concepts.

  • New curriculum would revolutionize professional military education. Officers should study historical campaigns alongside seminars on data science, modern weapon systems, and AI fundamentals to understand how to fight alongside AI agents fusing information. Schools could then capture data and support fine-tuning these agents through techniques like reinforcement learning with human feedback.

Introduction

For all intents and purposes, the staff organizations of militaries around the world today would be recognizable to French military legend Napoleon Bonaparte. While there are enduring aspects of warfare, it is hard to imagine that eighteenth-century staff structures are optimal for twenty-first-century warfare. The question is: What are the alternatives?

This report explores how advances in AI are changing strategy and statecraft. This study combines leading network relational theories with emerging agentic artificial intelligence capabilities to propose three possible agentic military staff structures. It outlines the results of several wargames that applied these three structures to possible crisis and conflict scenarios to determine the strengths and weaknesses of the staff options. This report complements ongoing research programs in the CSIS Futures Lab and a broader intellectual community that sees bundles of algorithms and distributed mosaic networks as the defining trend changing warfare. As new applications diffuse across the national security enterprise, the question is no longer if AI will change war, but how, and which state using it will emerge as the leading military power.

That race will be decided by a mix of infrastructure investments, espionage and counterespionage, and a willingness to experiment with how military organizations conduct planning and operations. While analysis abounds on the race between China and the United States to build the computational infrastructure required to support further advances in AI and the game in the shadows to steal and protect intellectual property, less attention has been paid to how to structure a military staff for a new era of warfare.

While the Napoleonic staff model has served for over 200 years, its end is near, owing to the rise of agentic warfare, a new paradigm in military competition where AI agents operate alongside human commanders to accelerate decisionmaking and outpace adversaries. As the speed and complexity of modern conflict surpasses human cognitive limits, traditional industrial-age staff structures struggle to keep pace. Agentic warfare distributes autonomous AI agents across land, sea, air, space, and cyber domains and joint functions, enabling them to continuously collect intelligence, anticipate adversary intent, and recommend adaptive courses of action in real time. These AI agents are meant not to replace human judgment but to enhance it, compressing decision timelines from days to minutes and allowing commanders to act faster and with greater precision. By integrating AI at the tactical and operational levels, agentic warfare reshapes the character of operational art, ensuring U.S. forces maintain decision superiority in twenty-first-century battlespaces. However, realizing this vision requires a fundamental shift from hierarchical staff structures to networked, AI-enabled command architectures designed for speed, adaptability, and real-time operational synthesis.

Rethinking this legacy staff structure in view of leading network relational theorists results in three distinct options for an AI-enabled agentic staff of the future:

  • a Networked Staff that retains the functional focus of the current staff structure but applies AI agents to those functional areas for better and faster output;

  • a Relational Staff that aligns functionally trained AI agents into teams (termed “netdoms”) that are tasked and managed by cross-functionally focused humans (termed “switchers”); and

  • an Adaptive Staff that aligns on the iterative decisionmaking process of a staff and organizes humans to shepherd agents through that process.

Option 1: The Networked Staff

Drawing on the tenets of actor-network theory, this decentralized staff architecture fuses human and nonhuman actors (e.g., AI agents, sensors, and data-driven tools) into an integrated web of capabilities that shapes, refines, and executes decisions in real time. This network is functional, with smaller staff elements working through functional agents to fuse data. In effect, each staff section has its own AI agent leading to independent intelligence, movement and maneuver, fires, command and control, force protection, logistics, and information nodes.

image02 ▲ Figure 1: The Networked Staff

In the Networked Staff model, based on Bruno Latour’s concept of actants, smaller staff elements work through functional agents ingesting data including live updates, doctrine, history, and military theory. These functional agents also adjudicate inputs from other functional agents. For example, a small G2 team interacts with its functional agent to develop possible enemy courses of action. This data is fed directly to the fires agent, which assesses possible target areas of interest based on named areas of interest, to start developing a concept of fires and cross-checking it against inventories of munitions to create notional essential fire support tasks and even attack guidance matrices. The logic is functional, preserving elements of the legacy staff system, but potentially brittle if the links between functional agents are broken. Humans serve as expert reviewers of agentic inferences offered to commanders. This provides a human-in-the-loop check, but could slow data flows.
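The G2-to-fires handoff above can be sketched as a minimal message-passing network of functional nodes. This is an illustrative stand-in, not an implementation: `FunctionalAgent`, the message fields, and the NAI/TAI strings are all hypothetical, and a real node would wrap a model call rather than a Python list.

```python
from dataclasses import dataclass, field

@dataclass
class FunctionalAgent:
    """One staff-section node (e.g., G2 intelligence, fires) in the network."""
    name: str
    inbox: list = field(default_factory=list)

    def publish(self, peers, message):
        # Push a structured assessment directly to peer functional agents,
        # mirroring the direct G2-to-fires data flow described above.
        for peer in peers:
            peer.inbox.append((self.name, message))

# Hypothetical G2 -> fires handoff: enemy course of action plus named areas
# of interest (NAIs) feed the fires node directly, with no hierarchy in between.
g2 = FunctionalAgent("G2-intel")
fires = FunctionalAgent("fires")
g2.publish([fires], {"enemy_coa": "river crossing", "named_areas_of_interest": ["NAI-3"]})

# The fires node turns each NAI into a candidate target area of interest (TAI).
tais = [f"TAI for {nai}" for _, msg in fires.inbox for nai in msg["named_areas_of_interest"]]
```

The sketch also shows the model's brittleness: if the `publish` link between two functional nodes breaks, the downstream node simply has an empty inbox.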

To reorient legacy staffs to a networked structure, DOD leaders should consider the following recommendations:

  • Invest in computational infrastructure. Build a robust backbone of high-performance computing, data storage, and networking on classified networks to seamlessly integrate AI agents into each node for real-time analysis and planning.

  • Train and integrate AI agents. Develop, refine, and deploy machine learning models that empower distributed decisionmaking, ensuring AI-driven insights are woven throughout the staff network.

  • Build human capital. Equip the radically reduced number of people in the network (“super-empowered personnel”) with the higher-order synthesis skills to operate in a decentralized, AI-driven environment—encompassing data analysis, AI systems operation, and collaborative decisionmaking.

  • Establish new procedures. Develop clear protocols for data sharing, communication, and decentralized planning that harness AI-driven inputs and maintain coherent, agile operations in a setting likely to see fewer staff layers between the battlefield and the commander.

Option 2: The Relational Staff

Drawing on relational sociology, this staff design treats the military organization as a dynamic web of relationships shaping planning and operations. These webs are effectively clusters of existing staff functions that interact independent of humans. The resulting netdom—inspired by Harrison White’s sociological theory of networks—becomes its own staff, where AI agents tailor recommendations and plans based on discrete datasets. Multiple netdoms support the commander and command element, or human switchers, allowing them to see competing perspectives in near real time. The switchers bridge these netdoms, forging critical links and fostering negotiation among alternatives. This model has fewer people than the Networked Staff model and has a different AI agent configuration as well as a different role for humans in decisionmaking. In a Relational Staff, human commanders adjudicate among competing netdoms, whereas in the networked model above, functional staff play a larger role.

image03 ▲ Figure 2: The Relational Staff

In the Relational Staff model, based on Harrison White’s concept of netdoms, a human switcher—likely the commander—evaluates different options generated by competing clusters of functional agents aligned with traditional staff sections. Each of these netdoms runs entire planning cycles within seconds and can be tuned (e.g., through model temperature) to adjust variance. Using methods like retrieval-augmented generation (RAG), the netdoms could even be aligned with the commander’s personal corpus of references, ranging from historical cases to a treatise on how the commander envisions fighting the unit. As a result, the commander and command element have competing agentic planning teams (i.e., netdoms), allowing them to always explore alternative options and even red team one agentic cluster using another netdom. Return to the intelligence fires example: The commander has one netdom develop a course of action based on its best judgment and the commander’s desire to conduct an area defense, in which the netdom will explore how to combine canalizing terrain with indirect fires to prioritize engagement area development and attrition as the enemy moves through the battlespace. The commander tasks another netdom to focus on a preemptive fires-based spoiling attack. The different course of action (COA) concepts, each developed by a competing netdom, allow the commander to more rapidly reach a decision and issue warning orders to subordinate formations. The logic is command-centric and turns legacy staff sections into AI agents. It is brittle at the point of the commander. If the leader does not have the necessary depth of military experience, education, and judgment, they will struggle to ask the netdoms the right questions.
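The competing-netdom idea can be sketched as two tunable planning functions tasked by a single switcher. Everything here is a hypothetical stand-in: `netdom_plan` simulates output variance with a seeded random draw where a real netdom would run model-backed planning cycles, and the `temperature` parameter only mimics how sampling temperature widens the option space.

```python
import random

def netdom_plan(task: str, stance: str, temperature: float, seed: int) -> dict:
    """Stand-in for a cluster of functional agents running a full planning cycle.
    A real netdom would call language models; here variance is simulated."""
    rng = random.Random(seed)
    options = [f"{stance}: option {i}" for i in range(3)]
    # Higher temperature -> more likely to diverge from the first-ranked option.
    pick = 0 if rng.random() > temperature else rng.randrange(len(options))
    return {"task": task, "coa": options[pick]}

# Two competing netdoms, tuned differently, as the switcher (commander) would
# task them: one explores an area defense, the other a spoiling attack.
defense = netdom_plan("defend river line", "area defense", temperature=0.2, seed=1)
spoiling = netdom_plan("defend river line", "fires-based spoiling attack", temperature=0.8, seed=2)
competing_coas = [defense["coa"], spoiling["coa"]]
```

The design choice the sketch highlights is that divergence is deliberate: the switcher compares alternatives rather than receiving a single fused recommendation.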

In addition to investments in computational infrastructure, the DOD should implement the following recommendations to reorient legacy staffs to a relational structure:

  • Map the organizational web. Conduct a thorough network analysis to uncover existing relationships, clusters, and information flows, laying the groundwork for distinct netdoms aligned with functional communities of practice (e.g., command and control, information, fire, maneuver, intelligence, protection, and sustainment).

  • Stand up functional netdoms. Create specialized nodes (e.g., operations, intelligence, and logistics) where AI agents recursively iterate among themselves to provide tailored insights, ensuring each community generates unique perspectives while remaining interconnected.

  • Train human switchers. Develop leaders adept at bridging netdoms, facilitating negotiations, and fostering collaborative problem-solving across diverse communities and AI-driven analytics.

  • Embed AI agents in each community. Integrate machine learning tools at the netdom level to process relational data, refine strategies, and deliver real-time recommendations that spark cross-domain synergy.

Option 3: The Adaptive Staff

Inspired by Andrew Abbott’s approach to adaptive planning, this staff model is built around a series of interconnected processes (e.g., planning, execution, and assessment) that continuously adjust to shifting operational realities but are rooted in a deeper context connected to military history and theory. AI agents embedded within each process furnish real-time data and insights, while human facilitators ensure alignment with higher-level objectives. This fusion of machine-driven analysis and human oversight enables a fluid, iterative decision cycle that rapidly evolves with changing circumstances. The defining feature that differentiates this model from the Networked and Relational models is the reliance on feedback loops that help human staff and commanders adapt their plans to changing circumstances.

image04 ▲ Figure 3: The Adaptive Staff

The Adaptive Staff model is based on Andrew Abbott’s insights about non-linear dynamics governing how people create knowledge. In this perspective, the entire planning and operations process is seen as an agent-informed evolutionary system. There is still a small staff element, albeit smaller than in the Networked Staff model. This staff interacts with a planning agent, which integrates data on doctrine, lessons learned, history, and military theory. These planning agents, based on staff prompting, inform operations agents. As in the netdom concept, the operations agents work at a pure agentic level across functional agents, providing feedback that adjusts the planning agents’ output to provide more refined, up-to-date options for the commander. This allows the commander to steer the staff, creating in effect two key feedback loops: (1) human-agent and (2) agentic. The result is an accelerated decisionmaking process that makes planning and operations more tightly coupled. Decisions still reside in human judgment but adjust to rapid changes in the environment mediated by agentic insights. Back to the example: The commander tasks the staff to develop a defensive plan along a ridgeline and river (i.e., tie into terrain). The staff works through the planning agent, adding insights from historical cases and defensive tactics, to update possible courses of action, which the operations agent adjusts based on readiness, logistics, and estimates of enemy avenues of approach. The feedback loops are the defining feature, balancing human-agent interaction and supporting more rapid adjustments to changing context. This aligns the approach most with perspectives that see war as a complex, nonlinear system.
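The two feedback loops described above can be sketched as an iterate-until-stable cycle between a planning agent and an operations agent. Both functions are hypothetical stand-ins for model-backed agents, and the critique logic is a toy placeholder; the point is the loop structure, not the content.

```python
def planning_agent(guidance, feedback):
    # Stand-in: drafts a plan from commander guidance plus accumulated critiques.
    return guidance + "".join(f" | adjusted for {f}" for f in feedback)

def operations_agent(plan):
    # Stand-in: checks the draft against readiness/logistics and returns a
    # critique string, or None when the plan addresses its concerns.
    if "logistics" not in plan:
        return "logistics shortfall"
    return None

# Loop 2 (agentic): planner and operations agent iterate until no critique remains.
feedback = []
plan = planning_agent("defend ridgeline and river", feedback)
while (critique := operations_agent(plan)) is not None:
    feedback.append(critique)
    plan = planning_agent("defend ridgeline and river", feedback)
# Loop 1 (human-agent): the commander reviews `plan` and issues new guidance,
# restarting the agentic loop with updated intent.
```

The agentic inner loop runs at machine speed while the human outer loop sets direction, which is the coupling the Adaptive Staff model depends on.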

In addition to investments in computational infrastructure, to adopt an adaptive structure, the DOD should consider the following recommendations:

  • Operationalize interconnected processes. Structure planning, execution, and assessment as continuous loops that feed into each other, ensuring real-time synchronization among all phases of an operation.

  • Embed AI-driven analytics. Integrate AI agents into each process to provide near-instant insights, detect emerging patterns, and support iterative decisionmaking under fluid conditions.

  • Empower human facilitators. Train staff to guide, validate, and interpret AI recommendations across processes, keeping the overall mission in focus while harnessing data-driven inputs.

  • Integrate continuous feedback loops. Implement real-time tracking and assessment tools that automatically inform subsequent planning cycles, allowing for seamless course corrections.

  • Cultivate an iterative mindset. Foster a culture of adaptive thinking and rapid learning, encouraging experimentation and flexible responses to evolving operational realities.

Each of the three options offers a viable path into the future and a guide for how to rethink everything from the structure of current military staffs to how to educate future leaders. While the purpose of this report is to define pathways toward agentic warfare, the Adaptive Staff emerged as the preferred option according to model-based evaluation. When three foundational models—ChatGPT, Gemini, and Claude—were tasked with evaluating each of the three options against a mix of PLA campaigns, the consensus choice was the Adaptive Staff.

Despite this consensus across foundational models, each model offers a unique approach for experimentation. More than anything else, the DOD needs to start an aggressive campaign of experimentation that cuts across legacy service interests and equities to imagine the future of agentic warfare. This effort is less about getting the tech right than about understanding human-machine interactions at scale and how to build agentic staff organizations. This will require an iterated approach that uses a common set of threat scenarios to evaluate how best to restructure staffs—similar to the proof of concept explored below using PLA campaigns. This campaign of learning will also likely require external evaluation beyond the services and the DOD to ensure objectivity and accountability, to possibly include required congressional reporting. Regardless of its final form, which can be mapped by organizations like the CDAO, among others, the experiments must intentionally explore alternatives to legacy staff structures that have changed little in hundreds of years and are poorly aligned to take advantage of advances in AI and machine learning.

This report proceeds by establishing the historical foundation of the modern staff. From this vantage point, it adapts prominent ideas from political science and sociology to propose alternatives to the legacy Napoleonic staff for waging agentic warfare. Given these alternatives, the report shifts to stress-testing them, using common threat scenarios adapted from PLA doctrine to support algorithmic analysis, with three foundational models (ChatGPT, Gemini, and Claude) used to analyze how an agentic staff would perform. The report concludes with an inventory of recommendations that include launching a campaign of experimentation and building the infrastructure and human capital required to wage agentic warfare.

The Evolution of Military Staff Organizations

The modern military staff is a central component of operational and strategic decisionmaking in armed forces worldwide. Staffs essentially manage complexity through a division of labor that coordinates and synchronizes key inputs to commanders. This process is essential to modern command and the estimative processes that support it (i.e., military planning). Put simply, without a staff organization and coordinating process, military operations become kids’ soccer: There is no game plan and everyone rushes to the ball, compounding chaos.

In their current configuration, these organizations typically operate under a common framework that divides responsibilities into specialized functions. For example, in NATO and U.S. doctrine, the typical staff structure is divided as follows:

  • J1/G1–Personnel (Manpower and Human Resources): Manages personnel policies, manpower allocation, and morale.

  • J2/G2–Intelligence: Collects, analyzes, and disseminates intelligence to inform decisionmaking.

  • J3/G3–Operations: Plans and directs current operations.

  • J4/G4–Logistics: Ensures sustainment and support, including supply chains and mobility.

  • J5/G5–Plans: Develops future operational plans and strategic initiatives.

  • J6/G6–Communications and Cyber: Manages communications, cybersecurity, and digital infrastructure.

  • J7/G7–Training and Education: Works on training plans and concepts.

  • J8/G8–Force Structure and Budget: Handles financial planning and capability development.

  • J9/G9–Civil-Military Affairs: Engages with civilian agencies and multinational partners.

These staff sections largely align with military functions. In U.S. military doctrine, warfighting functions are how commanders group core capabilities to integrate, synchronize, and direct operations. These functions vary by service but largely include command and control, information, intelligence, fires, movement and maneuver, protection, and sustainment. The core concept is that combat power emerges from integrating these functions. Wars are won not just by shooting bullets or firing missiles, but by combining fire, maneuver, force protection, and intelligence.

The concept of a military staff has evolved from its early iterations in ancient and medieval warfare to the highly bureaucratized organizations seen in modern armed forces. While there are historical antecedents, the modern staff formation is largely a function of the eighteenth and nineteenth centuries. During the French Revolution, the ideas of Paul Thiébault informed a series of reforms that led to the état-major général (general staff) and staff organization employed by Napoleon. This staff system allowed for greater operational flexibility, rapid planning, and the synchronization of large-scale maneuver warfare.

The Prussians further refined this model in the nineteenth century, institutionalizing the general staff system, which became a hallmark of modern military planning. Over time, the system evolved to address increased scale and complexity, including large continuous fronts and mass mobilization in World War I and deeper joint and coalition integration in World War II.

Multiple authors have concluded that the Napoleonic staff system holds foundational characteristics of the industrial age that thwart flexibility and adaptability in both planning and execution. Staffs are often seen as too large, too resistant to adaptation, and too top-heavy, resulting in a likely decision disadvantage in future conflict. Traditional approaches treat the commander in an overly individualistic manner that tends to result in centralized decisionmaking at the expense of dynamically responding to change.

Furthermore, it is not clear that an eighteenth-century military process is ideal for twenty-first-century warfare. Despite their evolution, modern military staff organizations struggle to manage mass volumes of data, synchronize multi-domain effects, and integrate coalition and interagency partners. By continuing to break down complex tasks into subordinate processes, the legacy staff has grown over the years, leading to larger, often cumbersome, bureaucratic organizations. In place of mission command—centralized planning and decentralized execution—large military staffs attempt to serve both functions in a futile effort to manage complexity. This tendency toward inflexibility leads forward-deployed units to continue to experiment with augmenting the staff process to gain efficiencies.

As units integrate multi-domain effects, “the increased speed in the division’s main command post (the central orders and targeting hub) can often outpace support from sustainment and protection warfighting functions concentrated in satellite command nodes.” The DOD needs a more rapid planning cycle that allows staffs to both integrate these effects and work across larger distributed geographic areas. These military organizations can further accelerate tempo by becoming flatter and managed more by algorithm than by hierarchical chains of command devoid of context at the edge.

Decision support is enhanced when functional expertise from across the staff and from external mission partners is brought together in direct support of the commander’s decision requirements. This idea was central to the notion of the “strategic corporal” and experiments with integrating long-range fires with distributed maneuver units as part of the Sea Dragon experiments in the 1990s. It also was a central feature of the concept of Force XXI in the U.S. Army.

From the Iron Cage to Networks

The military staff system reflects a larger nineteenth-century push toward rational legal authority and modern bureaucracy. The objective is control, rather than creativity, creating an “iron cage.” This concept dominates military innovation literature, where the goal is to define when, how, and why any novel idea escapes from its cage to change warfare.

Modern sociology and political science no longer focus just on large institutions, bureaucratic mazes, and their role in shaping policy. Increasingly, studies focus on networks and how people—as agents—connect with each other and their environment to exchange information and catalyze change. This process has utility for visualizing and describing options for adapting legacy military staff organizations to wars that will be increasingly waged through algorithms. In other words, calls for data-centric warfare will stall if they do not simultaneously seek to reimagine what a military staff is and how it operates.

Three authors provide maps for escaping the iron cage of the Napoleonic staff and waging agentic warfare: Bruno Latour (1947–2022), Andrew Abbott (1948–present), and Harrison White (1930–2024).

A French academic, Bruno Latour explored how knowledge is produced. The central idea is seeing scientists not as individuals but as actors in a larger network that connects people and things (termed “actants”) to form networks, a concept known as actor-network theory. Latour’s work lends a perspective that characterizes the military planning process and larger legacy staff organization as a series of “translations,” where each actor tries to enroll others into their own vision of the operation. This process is central to knowledge production and, by extension, decisionmaking, as actants negotiate and compromise, often leading to novel outcomes. More importantly, the process leads to what Latour called inscriptions: ideas and strategies that are translated into maps, documents, simulations, and orders, which then shape how, for example, military plans and orders are understood. These inscriptions are not neutral; they carry their own biases and limitations, implying that a military staff casts a long shadow across its orders even when the officers are not present. Flows of information and influence are capable of bending rigid hierarchies. AI agents allowed to exchange information could create network effects that undermine traditional staff structures. However, as these plans and orders take shape, they often fall victim to “black-boxing”: when the logic is opaque and complexity is hidden behind simplified representations. The price of control is plans lost in translation.

Latour’s critiques of modern science can help shape the design parameters for adapting the Napoleonic staff for agentic warfare. First, seeing data flows as a translation process and understanding inscription help architects design more efficient and explainable systems. Agentic staffs—likely smaller than current military staffs—will need to be calibrated to the types of inscriptions that fuse tactical data with a knowledge ontology linked to military history and operational art, but at the functional level. This simple idea is where multiple automation schemes tend to fall down. They view military planning from the perspective of an engineer and optimize tagging data to support faster and more efficient workflows as opposed to treating military history, doctrine, and even structured lessons learned as forms of tacit knowledge that should inform how any AI model analyzes, routes, and synthesizes large data flows. Actor-network theory also brings to light a possible unintended consequence of the application of AI agents to the military staff: The same model training, fine-tuning, and resulting weights will produce an agentic form of Latour’s inscription. Staff officers should be aware that the training method for a frontier AI model could influence staff decisions as much as a well-crafted Concept of Operations briefing slide can shape the direction of a military plan.

Methods for better integrating humans in an agentic military decisionmaking network will require deeper AI literacy and deliberate design choices. Super-empowered staff officers—fewer in number but interacting with more information—will need to move beyond basic concepts like AI prompt engineering to more complex tasks, such as how to scaffold multiple AI agents together to complete a planning workflow. They will require higher-order skills, including sensing Latour’s inscription process in their staff and mapping it to the AI agents in a way that will maximize the AI’s effectiveness in the organization. This task requires users to have the ability to not only surface relevant staff data for an AI context but also recognize how the information they provide to the model will shape its output. Similarly, these staff officers should be aware of more complex AI interaction concepts such as graph theory, in which the structure of language itself is used to extract key ideas and better optimize how models search large volumes of information.
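The idea of scaffolding multiple AI agents into a single planning workflow can be sketched in a few lines. Everything below is a hypothetical illustration: the agent roles, the stubbed model call, and the workflow shape are assumptions for exposition, not any fielded system or vendor API.

```python
# Hedged sketch: chaining functional AI agents into one planning workflow.
# call_model is a stub standing in for a real LLM call.

def call_model(role: str, context: str) -> str:
    """Stand-in for a real model call; returns a labeled assessment."""
    return f"[{role}] assessment based on: {context[:40]}..."

def run_planning_workflow(mission: str) -> dict:
    """Chain functional agents so each stage's output becomes part of
    the next stage's context (a loose analogue of Latour's 'translation')."""
    outputs = {}
    context = mission
    for role in ["intelligence", "maneuver", "fires", "sustainment"]:
        outputs[role] = call_model(role, context)
        context += " | " + outputs[role]   # the inscription carried forward
    return outputs

plan = run_planning_workflow("Counter a maritime blockade of a key strait")
for role, text in plan.items():
    print(role, "->", text)
```

The design choice to pass each agent’s output forward as context is exactly where inscription bias enters: every upstream agent shapes what downstream agents can see.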

Harrison White was an American sociologist central to the rise of network sociology. In his major work, Identity and Control, White sees relational networks as the defining feature of human social systems in which individuals and groups assume roles (i.e., identities) and seek to influence connections (i.e., control). This process creates distinct sites of social action where specific roles prevail as a form of emergent (i.e., nonlinear) behavior, or what White calls “netdoms.” Actors switch among netdoms to shape how other actors across the network perceive the world around them and thus their interests and choices.

Applied to imagining alternatives to the Napoleonic staff for agentic warfare, White’s relational networks suggest that designers will need to enable—and understand—the switching process. The concept of switching likely implies that human agents will need to move between peer work and working with multiple functional agent clusters, with each cluster generating alternative military options. Furthermore, it implies a need to resurrect explainable AI (XAI) efforts and find novel methods for allowing human actors in an agentic network to express causal logic and understand how algorithms generate answers.

A third model emerges from the work of Andrew Abbott, an American sociologist famous for his work studying professions and the sociology of knowledge production. A central feature in Abbott’s work is how interactions among actors forge processes that unfold over time rather than static structures, creating in effect a nonlinear depiction of how social organizations—such as professions—evolve. Social phenomena tend to be shaped more by contingency (e.g., unexpected events) and unique context (e.g., emergent behavior), which makes traditional approaches to science ill-suited to explain the course of human history. The world cannot be reduced to simple cause-and-effect modeling.

This perspective is particularly relevant to the almost certain blurring of lines between traditional military staff roles and AI agents on the horizon, and the challenges with modeling complex systems such as war. Furthermore, it is not enough for AI agents to process large amounts of information and complete a doctrinal planning checklist. They need to be balanced against deeper knowledge ontologies that help interpret context (known as contextual embedding) and integrate human feedback (known as reinforcement learning with human feedback, or RLHF) as a way of capturing contingent events and their interpretation. These measures will help overcome modeling predictions based on standard transformer processes that have emerged over the last 10 years and defined the latest AI breakthroughs.

Outlines of an Agentic Future

These three models serve as a guide for thinking about applying AI agents to military staff processes and planning. An AI agent is an autonomous system capable of perceiving its environment, reasoning about it, and taking actions to achieve specific goals. At its core, an agent is defined by the agent function—a mapping from data inputs to goal-driven actions—that operates based on internal models, learning mechanisms, and contextual information. While the foundational concept of AI agents dates back to the 1970s and 1980s, the modern understanding emerged based on research in the early 1990s that defined agents as “anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators . . . [establishing] a fundamental perception-action loop that characterizes agent systems.”
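The agent function described above—a mapping from percept history to goal-driven actions operating over internal state—can be made concrete with a toy reflex agent. The percept and action labels below are invented for illustration only.

```python
# Minimal sketch of the perception-action loop: an agent function that
# maps observed percepts to actions while retaining internal memory.
# Percept/action names are hypothetical.

class SimpleAgent:
    def __init__(self):
        self.percepts = []  # internal memory of everything observed

    def agent_function(self, percept: str) -> str:
        """Map the percept sequence to a goal-driven action."""
        self.percepts.append(percept)
        if percept == "threat-detected":
            return "alert-and-reposition"
        if percept == "all-clear":
            return "continue-mission"
        return "gather-more-data"

agent = SimpleAgent()
print(agent.agent_function("threat-detected"))  # alert-and-reposition
```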

The architecture of AI agents often reflects a balance between reactive and deliberative components. Early models, like the subsumption architecture proposed by Rodney Brooks, emphasized reactive behaviors layered hierarchically to produce intelligent action without centralized planning. In contrast, deliberative planning approaches rooted in classical symbolic AI—which represents and manipulates explicit knowledge using symbols and rules—rely on a detailed internal model and forward simulation to generate plans that achieve specified objectives. Hybrid models, such as those proposed by Michael Georgeff and Amy Lansky, blend these approaches—allowing agents to react in real time while also engaging in strategic planning when time permits.

The modern usage of the term “AI agent” implies a range of attributes that describe an ability to operate intelligently and autonomously in a defined environment. “Intelligent” does not mean general human intelligence but an understanding of particular rules, goals, and inferences that enable AI agents to make decisions based on awareness of the environment. Furthermore, agents should be able, with the right architecture, to interact with other agents and humans to create more complex workflows. This means AI agents should be able to draw logical inferences that support planning and adapting to changes based on processing new data. As data changes, the agent should learn and adjust how it executes core tasks.

Recent developments in large language models (LLMs) have expanded the concept of agency to include language-based agents that combine reasoning, memory, and tool use. Approaches like ReAct integrate explicit reasoning traces with application programming interface calls, allowing agents to interweave thoughts and actions in complex tasks. Agents can be embedded in simulations that allow for agentic observation, planning, and reflection to better simulate human behavior and synthesize stored data (i.e., memory). These developments mark a shift from static models to agents capable of open-ended learning and complex interaction, skills required for dealing with the fog and friction that define military planning and operations.
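A ReAct-style loop can be sketched as alternating explicit reasoning traces with tool calls. The tools, the scripted “thoughts,” and the final answer below are hypothetical stubs standing in for a real LLM and real APIs; they illustrate the interleaving pattern, not an actual system.

```python
# Hedged sketch of a ReAct-style loop: Thought -> Action -> Observation,
# repeated, then a final Answer. All content is illustrative stubs.

TOOLS = {
    "lookup_doctrine": lambda q: "Doctrine: mass fires at the decisive point",
    "query_sensor": lambda q: "Sensor: 3 contacts bearing 270",
}

def react_loop(question: str, script):
    """Interleave scripted reasoning traces with tool calls."""
    trace = [f"Question: {question}"]
    for thought, tool, tool_input in script:
        trace.append(f"Thought: {thought}")
        observation = TOOLS[tool](tool_input)   # the 'Action' step
        trace.append(f"Action: {tool}[{tool_input}]")
        trace.append(f"Observation: {observation}")
    trace.append("Answer: recommend massing fires on the western approach")
    return trace

steps = [("Need current contact picture", "query_sensor", "sector west"),
         ("Check doctrinal guidance on fires", "lookup_doctrine", "fires")]
for line in react_loop("Where should we mass fires?", steps):
    print(line)
```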

Yet this agentic future is not without costs and concerns. If the adage is true that military plans are worthless but planning is priceless, agents put at risk the value gained through the painful planning process. Also, the reasoning of a model (assuming there is something beyond deep pattern recognition), if abstracted or blocked from human oversight, could induce reasonable concerns of AI hallucination or fabrication of information even when results are reliable. As models become more performant, alignment of the model to human intent (whether perceived or actual) will be an increasing concern, and in the process, will renew calls for mission command. These issues offer even more reasons to increase experimentation and learning to ensure the United States fields the most performant and trustworthy AI agents supporting the national security enterprise.

To imagine possible staff organizations for agentic warfare, the CSIS Futures Lab used a mix of data-driven AI methods and qualitative judgment from workshops with military professionals.

image05 ▲ Figure 4: Research Design

As seen in Figure 4, first, the team structured datasets on each theorist, the history of military staff organizations, modern U.S. joint doctrine, and texts of Chinese military doctrine. The research team then used Scale AI’s Donovan platform to create a combined corpus of over 200 texts as a basis for retrieval-augmented generation (RAG). RAG is a method for grounding LLM-generated results based on key external sources of knowledge that provide context to the model’s representation of information. By integrating external references into the generation process, RAG not only enhances factual correctness but also provides interpretable, context-aware insights, aiding more rigorous and transparent analysis. Chinese military doctrine in the dataset emphasized joint campaigns and systems confrontation and destruction. The method ensured the team used a consistent structure of questions against a common set of structured data (i.e., knowledge ontology).
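The RAG pattern the team used can be sketched minimally: retrieve the corpus passages that best match a query, then prepend them to the prompt so the model’s answer is grounded in external sources. The toy corpus and word-overlap scoring below are illustrative assumptions; real systems use embedding-based retrieval over the full document set.

```python
# Minimal RAG sketch. Word-overlap scoring is a toy stand-in for
# embedding similarity; the three-document corpus is hypothetical.

CORPUS = [
    "PLA joint blockade doctrine emphasizes isolating the target island.",
    "Systems destruction warfare targets enemy decision networks.",
    "Napoleonic staffs centralized planning under a chief of staff.",
]

def retrieve(query: str, k: int = 2):
    """Return the k documents sharing the most words with the query."""
    q = set(query.lower().split())
    scored = sorted(CORPUS, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

def build_grounded_prompt(query: str) -> str:
    """Prepend retrieved context so the model answers from sources."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."

print(build_grounded_prompt("How does PLA blockade doctrine isolate a target?"))
```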

The research team developed a prompting strategy based on chain of thought: a method that breaks down complex tasks into intermediate reasoning steps in order to refine the results from LLMs. These intermediate steps make the model’s decisionmaking process visible, enabling more transparent analysis. Applied to the curated dataset referenced above, the major steps consisted of the following:

  1. Verifying that the LLM understood each theorist (i.e., Latour, White, Abbott).

  2. Using this understanding of the theorist to imagine how to adapt legacy staff organizations for a new era of agentic warfare using the definition offered in the introduction.

  3. Having the LLM catalog possible strengths and weaknesses of the new staff structure (networked, relational, and adaptive).

  4. Assessing how well each new staff structure optimized for agentic warfare would perform against major PLA scenarios, including the following:

    a. Joint Blockade: The coordinated use of all branches of the military to isolate a target, typically Taiwan, by severing its maritime and aerial connections through a combination of naval, air, missile, space, and information effects ranging from electronic warfare and cyber operations to propaganda and subversion.

    b. Joint Firepower Strike: Systematically degrading an adversary’s operational capabilities by delivering precise, overwhelming strikes against high-value targets across multiple domains.

    c. Joint Landing Campaign: The integrated use of amphibious, airborne, and special operations forces—to include unconventional and subversive elements—to seize key terrain and establish a lodgment for follow-on major combat operations.

  5. Conducting a workshop analysis comparing the different approaches to counter the PLA.
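The prompting sequence above can be sketched as a simple chain: each step’s prompt is issued in order and the intermediate answers are retained for inspection. The model call is a stub, the prompt templates are paraphrases of steps 1 through 4, and step 5 (the workshop comparison) is human analysis and so is omitted.

```python
# Sketch of the chain-of-thought prompting sequence. ask_model is a
# stub; the templates paraphrase the report's steps 1-4.

def ask_model(prompt: str) -> str:
    return f"(model response to: {prompt[:50]}...)"

STEPS = [
    "Summarize the core ideas of {theorist}.",
    "Using {theorist}, redesign the legacy staff for agentic warfare.",
    "List strengths and weaknesses of the proposed staff structure.",
    "Assess the structure against PLA blockade, firepower strike, and landing campaigns.",
]

def chain_of_thought(theorist: str):
    """Issue each prompt in order, keeping the full transcript so the
    intermediate reasoning chain can be audited."""
    transcript = []
    for template in STEPS:
        prompt = template.format(theorist=theorist)
        transcript.append((prompt, ask_model(prompt)))
    return transcript

for prompt, answer in chain_of_thought("Latour"):
    print(prompt, "->", answer)
```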

The team used the same prompts with three different foundation models—ChatGPT, Gemini, and Claude. This approach ensured a larger set of observations, but the results are still only illustrative and designed to explore different options for changing legacy staff structures. The findings below reflect the research team’s comparative analysis of the cross-model outputs based on workshop discussions with military professionals. Three workshops were held in winter 2024, each consisting of a small group averaging 10 officers, all with over 10 years of military experience, discussing the model-generated results. The results of the workshops inform the qualitative assessments of each approach (networked, relational, adaptive) that follow, while the statistical appendix uses AI-enabled scoring.

Option 1: Using Actor-Network Theory to Imagine a Networked Staff

This approach reimagines traditional staff processes as a network of interconnected human and AI actors collaborating to generate operational knowledge and decisions. The concept aligns with Bruno Latour’s early work, which sees knowledge production not as the work of isolated individuals, but as a process of negotiations among people, data, and material artifacts. Unlike the hierarchical general staff (G-staff) model, the Networked Staff is built around a series of “translations” at the level of functional nodes (i.e., command and control, intelligence, movement and maneuver, fires, force protection, sustainment, and information). AI agents embedded in each node dynamically process large-scale data flows, propose recommendations, and refine functional plans based on real-time feedback from across the network. This makes the network model dependent on functional agent-to-agent data exchange. This design also requires functional model fine-tuning and other techniques like semantic indexing and metadata enrichment.
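The flat, agent-to-agent exchange among functional nodes can be sketched as a publish/subscribe bus in place of a hierarchical routing chain. The node names mirror the warfighting functions listed above; the bus itself and the message are hypothetical illustrations.

```python
# Toy sketch of functional nodes sharing data over a flat network
# (publish/subscribe) instead of routing everything up a hierarchy.

from collections import defaultdict

class StaffBus:
    """Flat message bus connecting functional nodes."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, node):
        self.subscribers[topic].append(node)

    def publish(self, topic, message):
        # Every subscribed node sees the update at the same time,
        # with no intermediate approval layer.
        for node in self.subscribers[topic]:
            node.receive(topic, message)

class FunctionalNode:
    def __init__(self, name):
        self.name, self.inbox = name, []

    def receive(self, topic, message):
        self.inbox.append((topic, message))

bus = StaffBus()
fires, maneuver = FunctionalNode("fires"), FunctionalNode("maneuver")
bus.subscribe("new-contact", fires)
bus.subscribe("new-contact", maneuver)
bus.publish("new-contact", "3 surface contacts, bearing 270")
print(fires.inbox, maneuver.inbox)
```

The same property that makes this fast—every node hears every relevant update—also widens the attack surface the report warns about: poison one published message and every subscriber inherits it.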

There are key trade-offs in this approach. For one, planning is not neutral. Each staff process and functional node, whether human led or AI enabled, transforms knowledge into inscriptions that shape how decisions are understood and executed. These inscriptions—maps, orders, intelligence estimates, and wargaming simulations—carry biases and constraints that influence operational outcomes. Moreover, as plans are formulated, they risk falling victim to black-boxing, where the logic behind decisions becomes opaque, hidden within layers of AI-generated recommendations or simplified doctrinal templates.

To counter these risks, the Networked Staff requires a fundamental shift in military planning that prioritizes adaptability, explainability, and AI-driven agentic augmentation of a series of human decisionmakers rather than merely accelerated workflows. The key structural differences in a Networked Staff include the following:

  • Flat, Decentralized Structure: Nodes operate independently but share information dynamically, ensuring that decisionmaking is distributed rather than top-down. This design could fundamentally alter rank structures in military headquarters and staffs. This shift has the potential to increase the speed and accuracy of decisionmaking and enable new forms of collaboration that would be impossible in rigid, hierarchical command structures.

  • AI-Driven Planning Integration: AI agents analyze incoming functional data, identify patterns, and generate real-time recommendations, allowing for more agile, adaptive planning cycles. They become, in essence, the members of an operational planning team serving a smaller set of top military thinkers crafting courses of action.

  • Seamless Cross-Node Collaboration: Unlike traditional staff sections that operate in silos, the Networked Staff fosters continuous interaction among functional agents, each with a human interlocutor. This data flow will need to be managed with alerts and augmented information requirements in a similar but not fully Bayesian manner to ensure humans in the network are not overwhelmed and subject to common decisionmaking pathologies.

  • Enhanced Situational Awareness: AI models synthesize and visualize complex operational data, making it easier for planners to interpret battlefield dynamics and assess risks. The challenge will be dimensionality and determining at what level of abstraction to assess complex and nonlinear systems that define competition and combat.

For the Networked Staff to reach its full potential, the U.S. military needs to invest in computational infrastructure, deeper AI integration, and human capital development—not as a faddish tech fetish, but as a means of out-cycling adversaries like China that are almost certain to have advantages in time, space, and forces. Tempo becomes the theory of victory. The force able to generate viable options and maintain freedom of action is more responsive and resilient even if it is outnumbered.

The Networked Staff is a step toward faster, more adaptive functional decisionmaking, but without deliberate safeguards, it could collapse under its own complexity. Coordination without command authority risks turning a synchronized fight into a series of disconnected actions, where misaligned priorities and conflicting decisions slow the tempo rather than accelerating it. AI-driven planning floods the system with data, but more information is not the same as context and better understanding—without structure, staff officers risk drowning in insights, leading to cognitive overload, hesitation, or misplaced confidence in an incomplete picture of the fight. Cybersecurity is another Achilles’ heel—by linking every functional node in a decentralized web, the Networked Staff expands the attack surface, making it a prime target for PLA system disruption warfare, AI poisoning, and deep-penetration cyber intrusions. Overreliance on AI introduces yet another risk: If operators do not understand how AI reaches its conclusions, they will either blindly trust or outright dismiss its recommendations, both of which can lead to failure at machine speed. To make the Networked Staff a true force multiplier, the U.S. military needs to tame the chaos—structuring data flows, reinforcing cybersecurity, and integrating explainable AI to ensure that humans remain in command, not just in the loop.

STRESS-TESTING OPTION 1: COUNTERING PLA CAMPAIGNS WITH A NETWORKED STAFF

As discussed in the previous section, the research team evaluated the different options for replacing legacy staff structures for a new era of agentic warfare through three foundation models and compared the results to generate the human analysis below. Specifically, the research team did multiple iterations of the same prompt structure against each model. A statistical analysis is available in the Statistical Appendix. The team then analyzed these to identify where they converged and diverged as well as to synthesize insights on the ability of a military using a Networked Staff to counter each PLA campaign.

While all models agreed that the Networked Staff concept provided increased tempo, adaptability, and decisionmaking driven by intelligence, surveillance, and reconnaissance (ISR), outputs differed on how resilient it would be in the face of PLA system disruption tactics and how well it would maintain operational cohesion in complex, high-intensity fights. In some of the model runs, the joint firepower strike and blockade scenarios exposed the Networked Staff’s vulnerability to cyber and electronic warfare, while the landing campaign scenario raised concerns about synchronization and the risks of decentralized functional decisionmaking in large-scale, multi-domain operations. Across all perspectives, the need for structured AI oversight, resilient communications, and human-machine synchronization protocols remained a universal recommendation to ensure the Networked Staff does not become a liability in the face of PLA system warfare doctrine.

  • Countering a PLA Joint Blockade

In a PLA joint blockade scenario, the Networked Staff’s ability to rapidly synthesize functional nodes gave the United States and its allies a critical edge in situational awareness. The model outputs assumed AI-driven analytics could identify patterns in PLA naval and aerial movements in a manner that enabled friendly forces to anticipate blockading maneuvers, detect maritime chokepoints, and optimize blockade countermeasures in real time. The decentralized structure would allow independent nodes to generate concepts of support and staff estimates simultaneously, reducing delays inherent in traditional G-staff decisionmaking. Additionally, the LLM outputs—across models—emphasized how cross-domain functional integration enhanced the U.S. military’s ability to disrupt PLA communications, jam targeting systems, and exploit weaknesses in China’s electronic warfare operations, eroding the blockade’s effectiveness.

However, some LLM outputs across models noted that conducting a counter-blockade operation required unity of effort in a manner that could prove challenging given the decentralized character of the Networked Staff alongside the concepts of translation and actants. Unlike a rigidly hierarchical PLA command structure that centralizes control over blockade operations, the Networked Staff’s distributed functional decisionmaking could lead to discrepancies in enforcement, inconsistent engagement criteria, or competing operational priorities across naval, air, and cyber domains. The resulting friction would undermine combat effectiveness. Additionally, the PLA’s system destruction warfare doctrine would target the network’s reliance on shared information flows, using cyber and electronic warfare attacks to inject false intelligence, disrupt node communication, or create decision paralysis through AI manipulation.

  • Surviving and Responding to a PLA Firepower Strike

In a PLA joint firepower strike scenario, the aim would be to systematically degrade U.S. and allied operational capabilities through precision multi-domain attacks targeting high-value assets such as command centers, logistics hubs, and integrated air and missile defense systems. In such an environment, the Networked Staff’s adaptability and resilience depend on the extent to which functional nodes survive the blow. Across models, the LLM outputs assumed the decentralized structure would enable forces to disperse key functions across multiple nodes, preventing a single-point failure that could cripple operations. Similarly, model outputs highlighted that AI-driven analytics allow real-time battle damage assessment and automated reallocation of surviving assets, ensuring that command continuity is preserved and friendly forces can manage maintenance and logistics required to replace combat losses and generate counterattack options.

Despite these advantages, the Networked Staff’s heavy reliance on real-time data and translations among functional nodes (i.e., actants) makes it a prime target for PLA electronic warfare and cyber operations. The PLA would likely jam communication links, inject deceptive targeting data, or disable AI-driven decision support systems, disrupting the Networked Staff’s ability to rapidly respond and coordinate a defense. Additionally, AI-generated battlefield assessments may overwhelm human operators or even produce hallucinations, leading to decision fatigue or confusion in identifying the most pressing threats during a fast-moving, high-intensity campaign.

  • Defending Against a PLA Joint Landing Campaign

In a joint landing campaign in which the PLA seeks to seize key terrain and establish a lodgment for follow-on combat operations, the Networked Staff’s ability to dynamically allocate forces, synchronize multi-domain fires, and disrupt adversary landing sequences becomes a decisive advantage. Multiple LLM outputs stressed that ISR nodes could continuously track PLA amphibious and airborne movement, fusing data from intelligence functional nodes with fires nodes to preemptively strike critical transport, logistics, and command elements before they reach the shore. In addition, some model outputs emphasized how AI-driven functional wargaming simulations would support instant adaptation to new threats, ensuring that counter-landing operations are fluid and responsive rather than rigid and predetermined.

However, large-scale defensive operations require tight sequencing and precise coordination, areas where multiple model outputs stressed that the Networked Staff’s flat structure presents risks. Without clear hierarchical arbitration—whether via legacy or algorithmic approaches to mission command—different functional nodes might develop conflicting interpretations of the battlefield, leading to misaligned fires, redundant asset allocation, or missed opportunities to disrupt PLA landings before they gain momentum. The PLA’s electronic and cyber warfare operations would specifically target battlefield communications, further compounding this friction and disrupting the ability of friendly forces to conduct combined arms. The effect would be a digital defeat, severing ties among ISR, logistics, and maneuver elements to isolate defending forces and create openings for Chinese forces to establish a beachhead. Additionally, an overreliance on AI-generated assessments may fail to capture the true tactical complexity of landing operations, leading to misguided countermeasures based on incomplete or manipulated data.

  • The Networked Staff’s Role in Defeating PLA Offensive Campaigns

The Networked Staff provides the functional coordination needed to counter multiple PLA military campaigns. By leveraging functional fusion—actants coordinating intelligence, fires, and logistics—it enables forces to counteract PLA blockades with real-time maneuvering, survive and recover from firepower strikes, and dynamically respond to landing threats before they establish a foothold. However, its greatest strength—its reliance on a highly networked, AI-enhanced decision ecosystem linked to warfighting functions—is also its greatest vulnerability. The PLA’s system destruction warfare strategy is designed to sever these networks, inject deception, and create decision paralysis that could render decentralized operations ineffective. Unlike the legacy G-staff model, which emphasizes hierarchical control and redundancy, the Networked Staff must actively compensate for its structural weaknesses by incorporating resilient communication networks, deception-resistant AI models, and structured synchronization protocols to maintain operational coherence under contested conditions. Done right, the Networked Staff can turn the PLA’s strengths—precision strikes, cyber warfare, and centralized execution—against it, forcing the adversary into a fight it is neither optimized for nor prepared to win. But this outcome would require significant investments in computational infrastructure and training.

Option 2: The Relational Staff

Drawing on Harrison White’s theory of relational sociology, this staff design moves beyond static roles and rigid chains of command, instead functioning as a constellation of interdependent functional planning agents—called netdoms—that continuously interact, adapt, and generate operational insights. A human switcher adjudicates among the outputs of multiple agentic netdoms, each in effect conducting parallel planning. Unlike the Networked Staff, where functional agents work alongside human staff to form actants, each agentic netdom contains multiple functional AI agents independent of staff oversight. This difference implies a trade-off between deep functional dives managed by human staff (networked) and humans playing the role of switchers and comparing outputs from multiple AI agent clusters (relational). The idea is that the human staff officer will see a wide range of variation based on multi-agent interaction rather than deep functional dives guided by staff. In other words, there are fewer people in a Relational Staff approach, and they analyze whole outputs rather than functional snapshots.
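The contrast described above—parallel netdom clusters producing whole plans for a human switcher to compare—can be sketched as follows. The netdom behavior and the toy risk score are invented for illustration; a real system would involve clusters of interacting agents, not a single function.

```python
# Hedged sketch of the Relational Staff pattern: several agentic
# "netdoms" each produce a complete course of action in parallel, and
# a human switcher compares whole outputs side by side.

def netdom_plan(netdom: str, problem: str) -> dict:
    """Stand-in for an AI agent cluster producing a complete plan."""
    return {"netdom": netdom,
            "coa": f"{netdom}-led option for: {problem}",
            "risk": len(netdom) % 10}  # toy risk score, not a real metric

def switcher_review(problem: str, netdoms):
    """Run each netdom (conceptually in parallel) and surface whole
    outputs for the human switcher to adjudicate, lowest risk first."""
    options = [netdom_plan(n, problem) for n in netdoms]
    return sorted(options, key=lambda o: o["risk"])

for option in switcher_review("counter joint blockade",
                              ["maneuver-centric", "fires-centric",
                               "cyber-centric"]):
    print(option["netdom"], "|", option["coa"], "| risk:", option["risk"])
```

Note what the pattern does not include: no netdom talks to another before the switcher sees its output, which is the source of both the variation the report values and the signal-to-noise problem it warns about.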

Furthermore, rather than relying on a linear, top-down flow of information, the Relational Staff fosters emergent decisionmaking based on evaluating how different netdoms approach solving military problems. This process has the potential to create divergent and novel findings but also to generate noise. In fact, it could create a signal-to-noise ratio challenge as it merges real-time information from the battlefield with deep contextual ideas about warfighting functions and even the history of operational art depending on how programmers train agents.

As a result, the humans in the middle of the Relational Staff will likely have to play the role of switchers. Officers will connect netdom agents managing typical staff functions (e.g., maneuver, fires, intelligence, logistics) to ensure cross-functional collaboration that mitigates the signal-to-noise ratio challenge. Their role will be to translate insights, negotiate competing priorities, and synthesize perspectives across different domains so that staff decisions reflect an integrated real-time assessment of the battlespace. That is a tremendous amount to ask any staff officer, implying the human agent acting as switcher will need deeper training and possibly even entirely new skill sets compared to those currently taught in professional military education.

Unlike the Napoleonic G-staff, where functional elements operate in bureaucratic isolation, the Relational Staff thrives on continuous dialogue, leveraging AI agents to illuminate trade-offs, risks, and opportunities across joint functions in ways that traditional processes cannot match. Key differences between the Relational Staff and legacy G-staff model designs include the following:

  • Interconnected Netdoms: Unlike G-staff sections, which operate in isolation, netdoms are continuously linked by human switchers and AI-enhanced data flows, ensuring real-time information sharing.

  • High-Context References: Unlike the Networked Staff, the Relational Staff will need to balance constant negotiations among netdoms as a source of information against deep embedding and high-context data. This difference will require deeper, differentiated training for agents. The result will likely be ontology-enhanced representations that help agents learn.

  • Flexible Orientation and Operations: The G-staff requires manual coordination to reconfigure itself for different missions, whereas the Relational Staff dynamically reshapes itself in response to battlefield conditions through the process of negotiation. The main effort can shift quickly based on analytical insights.

  • Human-Centered Analysis: Given the pivotal role of human switchers, the Relational Staff is the least likely to see full autonomy in operations. Highly trained staff officers and commanders will be required to mediate negotiations among netdoms processing large volumes of information and analyzing against high-context datasets. While this staff model may have fewer people, it places a higher premium on their knowledge base, since they will have to analyze netdom output that factors in multiple warfighting functions.

A core strength of this model is its ability to incorporate human intuition into AI-driven recommendations. The Relational Staff breaks from the slow, sequential logic of legacy command structures, treating decisionmaking as an ongoing negotiation rather than a bureaucratic approval process. By networking intelligence, operations, and logistics into interconnected netdoms, it dismantles the stovepipes that delay response times, replacing them with parallel, AI-enhanced collaboration that allows planning cycles to move at machine speed. Instead of drowning planners in raw data, AI agents embedded within each netdom continuously filter, refine, and prioritize insights, ensuring decisionmakers focus on strategy rather than information management.

Despite this promise, the Relational Staff introduces risks that, if left unchecked, could turn agility into chaos. Without a clear arbitration mechanism to assist human switchers in guiding how netdom agents negotiate priorities, decisionmaking could spiral into misalignment, competing priorities, or outright fragmentation. AI-driven insights could flood the system with real-time updates, filtered by domain-specific context that may obscure more than it clarifies, leaving planners struggling to interpret information. The result would be cognitive overload, decision fatigue, and delays in identifying what truly matters. Without explainable AI, commanders risk falling into one of two traps: blindly following AI outputs without question or dismissing machine-driven insights altogether, undermining the very advantage this model is designed to deliver.

STRESS-TESTING OPTION 2: COUNTERING PLA CAMPAIGNS WITH A RELATIONAL STAFF

The Relational Staff, with its AI-enhanced netdoms and human switchers, offers a flexible approach to military decisionmaking. However, its effectiveness against a PLA joint blockade, firepower strike, or landing campaign depends on how well its agentic planning clusters—the netdoms—create viable courses of action to counter system disruption warfare.

  • Countering a PLA Joint Blockade

Across models, the Relational Staff was described as offering a distinct advantage in multi-domain coordination, where air, naval, space, and cyber operations must operate in concert, orchestrated by a human-centered switcher, to counter a blockade force. Multiple LLM outputs assumed that, unlike a traditional staff with rigid silos, the Relational Staff fosters a more integrated approach, with maritime, air, and cyber netdoms continuously sharing intelligence, adapting force posture, and countering PLA maneuvers. Model outputs tended to assume that AI-driven insights within each netdom would analyze PLA movements, optimize force placement, and anticipate potential breaches, while human switchers would bridge information gaps, ensuring synchronization among communities. This structure would allow for dynamic adaptation, meaning the blockade does not operate as a static perimeter but as a fluid, evolving operational framework.

However, this very decentralization introduces coordination challenges in a prolonged blockade. Unlike the PLA’s hierarchical command system, which enforces unified action, the Relational Staff relies on negotiation among netdoms, which may lead to disjointed enforcement efforts if human switchers are disconnected or uncoordinated. The staff must either make a single commander the central switcher and authority, or accept the risk that multiple switchers negotiating among netdoms and with each other will slow the flow of information, produce conflicting objectives, and ultimately create exploitable gaps. Additionally, the PLA’s system destruction warfare doctrine would likely target human switchers and AI-enabled netdoms, seeking to jam communications, disrupt network integrity, and inject false data into operational planning. This targeting would likely also start before a conflict and could include targeted assassinations, blackmail, and other unconventional methods to restrict the effectiveness of an AI-enabled conventional force. Highly trained switchers will be few in number, making them both a critical requirement and a critical vulnerability.

  • Surviving and Responding to a PLA Firepower Strike

A preponderance of model outputs described a Relational Staff that excelled in distributing decisionmaking authority to survive a PLA joint firepower strike designed to systematically degrade command structures, ISR assets, and logistics nodes. Yet this approach was the most dependent on human operators, implying that this distribution relies on switchers surviving and quickly reconnecting data links to support the negotiation process central to a Relational Staff approach. Models tended to assume that, unlike in a traditional command structure, where a successful strike on a key headquarters could cripple operations, the Relational Staff’s dispersal of expertise across multiple netdoms would reduce single points of failure. That assumption depends on battle networks and concepts like mosaic warfare more than on different staff design principles, and it assumes that military organizations can rapidly replace dead switchers with new commanders.

However, while decentralized execution enhances survivability, it can also introduce vulnerabilities in synchronization and strike timing. Effectively responding to a multi-domain firepower strike requires tight sequencing among ISR, targeting, and engagement elements. The Relational Staff’s dependence on human switchers to mediate algorithmic negotiations could slow kill-chain execution by deferring to human decisionmakers rather than machines, at the cost of speed. Furthermore, the PLA’s ability to target the connective tissue of the Relational Staff—namely, its information-sharing architecture centered around the human switcher—creates a major weakness. By jamming netdom communications, launching cyber-intrusions into AI decision support systems, and attacking key switchers in targeting and fires domains, the PLA could degrade the cohesion of U.S. counterstrikes, creating windows of opportunity for follow-on attacks. Lastly, because of the centrality of the human switcher, the PLA could tailor cyber-enabled information operations to skew decisionmaking.

  • Defending Against a PLA Joint Landing Campaign

Across the LLM outputs and model runs, PLA joint landing campaigns were characterized as requiring complex, cross-domain defense, where logistics, airpower, ISR, and maneuver forces are synchronized to deny PLA forces a foothold. The Relational Staff was well suited to coordinate these intricate operations, as its netdoms foster multi-domain and functional integration, allowing defenders to rapidly adapt to PLA landing strategies as fast as switchers can select agentic options from competing netdoms. Unlike a rigidly hierarchical staff, which may struggle to rapidly adjust preplanned defensive postures, the Relational Staff’s negotiation-based decisionmaking enables it to flexibly shift resources and adapt force deployments in response to PLA maneuvers.

However, this tightly coupled defense relies on both the quality of the human switcher and the netdom’s high-context training. A landing campaign demands tight sequencing of defensive fires, maneuver forces, and sustainment logistics, and if different netdoms interpret PLA movements differently or prioritize conflicting responses, operational cohesion could break down at a critical moment. Human switchers, who must bridge this negotiation-based decisionmaking, are themselves fallible.

Knowing these limitations, the PLA would target communication nodes among netdoms, using electronic warfare and cyber operations to disrupt coordination among logistics, ISR, and operational planning. If switchers become overwhelmed with competing information flows, decision latency could prevent defenders from concentrating combat power at decisive points, allowing PLA forces to establish a lodgment before the defense can fully respond.

  • The Relational Staff’s Role in Defeating PLA Offensive Campaigns

The Relational Staff’s AI-enhanced netdoms and human switchers provide a dynamic approach to countering PLA campaigns, but its effectiveness hinges on balancing adaptability with coordination under system disruption warfare. Against a joint blockade, its multi-domain integration ensures real-time intelligence sharing and force posture adjustments, making the blockade fluid rather than static. However, the decentralized negotiation process introduces risks in synchronization, as human switchers—critical for bridging netdoms—become high-value targets for PLA cyber, electronic, and even unconventional attacks. In a joint firepower strike scenario, the Relational Staff’s dispersal of expertise reduces single points of failure, increasing survivability, yet this advantage depends on switchers rapidly reestablishing network links to coordinate counterstrikes. The PLA’s ability to degrade communications and AI decision support could slow kill-chain execution, creating exploitable windows for follow-on attacks. In a joint landing campaign, the Relational Staff’s flexibility would enable rapid adaptation to PLA maneuvering, allowing independent netdoms fusing intelligence, fires, and logistics data to continuously adjust defensive plans. However, this approach relies heavily on the competency of switchers to bridge decision gaps, and PLA electronic warfare and cyberattacks could sever these critical links, causing decision latency at decisive moments. While the Relational Staff’s networked design fosters agility, its dependence on human mediation for synchronization introduces vulnerabilities that PLA system destruction warfare is specifically designed to exploit.

Option 3: The Adaptive Staff

Drawing from Andrew Abbott’s insights into how professions and organizations evolve through nonlinear interactions rather than rigid structures, the Adaptive Staff approach envisions a continuously evolving system built around feedback loops connecting humans and machines, as well as clusters of agents, to support faster adaptation to shifting operational realities. As a model for replacing the Napoleonic staff, it sits between the previous approaches, envisioning more agent autonomy than the Networked Staff and less reliance on human switchers than the Relational Staff. Unlike traditional hierarchical staff models, which rely on static roles and bureaucratic workflows, or even the Relational Staff, which depends on switching, the Adaptive Staff structures decisionmaking around iterative cycles of planning, execution, and assessment that operate faster than legacy processes, essentially accelerating and fusing mission command and deliberate planning. This iteration requires higher degrees of model fine-tuning in relation to human feedback and key context, including military theory and history, independent of functional considerations.

Whereas the Networked Staff disperses decisionmaking across functionally independent nodes, and the Relational Staff emphasizes human switchers adjudicating clusters of functional agents, the Adaptive Staff treats decisionmaking as a continuous, iterative process, where AI agents and human facilitators work in tandem to generate, refine, and adjust plans based on real-time data. Each process—whether intelligence fusion, operational planning, logistics coordination, or fires management—is linked to others in a way that ensures synchronization without rigid command hierarchies.

However, AI is not merely a tool for processing vast amounts of information—it must be embedded within deeper knowledge ontologies that allow for contextual interpretation of contingent events. Abbott’s work underscores the limitations of cause-and-effect modeling in social phenomena, a critical insight for military organizations seeking to integrate AI into complex decisionmaking processes. Rather than simply analyzing historical patterns, AI must be paired with reinforcement learning with human feedback to capture emergent battlefield dynamics and unforeseen variables based on evolving human judgment and collective intelligence. This approach prevents AI from becoming overly deterministic or optimizing for efficiency at the expense of adaptability. Key differences between the Adaptive Staff and the Networked and Relational Staff include the following:

  • Dynamic, Process-Oriented Structure: Unlike the Networked Staff, which relies on decentralized, function-specific nodes, or the Relational Staff, which organizes around netdoms mediated by human switchers, the Adaptive Staff operates through interdependent processes that continuously refine decisionmaking based on real-time feedback loops that change the system. Unlike the G-staff, which is organized around functional sections that operate in parallel but often in isolation, the Adaptive Staff structures decisionmaking around fluid, interacting processes that continuously feed information into one another, preventing stovepipes and lag in operational assessments but also creating the risk of chaos.

  • Iterative Decisionmaking Over Static Hierarchies: While the Networked Staff flattens decisionmaking into distributed nodes exchanging information (i.e., actants and translation), and the Relational Staff prioritizes collaboration among netdom agents, the Adaptive Staff continuously revises its courses of action through structured iteration, ensuring that no decision is ever final but instead always subject to improvement based on feedback from the environment, agentic analysis, and even simulated outcomes derived from autonomous wargames run thousands if not millions of times in search of optimal conditions. While traditional staffs rely on step-by-step sequential planning (e.g., intelligence feeds operations, which in turn direct logistics, and so on), the Adaptive Staff allows for real-time iteration, recognizing that warfare does not unfold in a linear cause-and-effect sequence. This dynamic puts a premium on agent-generated branches and sequels, generating large numbers of options, often invisible to the human eye, to test possible scenarios. These simulated tactics and even whole courses of action provide reference points (i.e., synthetic data) to guide how agents respond to inevitable change in the environment. The value of this approach only increases when the Adaptive Staff integrates AI agents trained and weighted toward a perspective outside of traditional military planning, such as economic and diplomatic actions. Integrating these whole-of-government approaches in an Adaptive Staff would lead to uniquely effective options, emerging from a staff thinking in wholes rather than through a myopic military lens.

  • AI as a Cognitive Enabler, Not Just a Data Processor: While all three models integrate AI to enhance decisionmaking, the Adaptive Staff differs in that AI is embedded within processes, not just as a tool for automation but as a mechanism for refining, contextualizing, and learning from human decisions. AI agents are not just tools for information processing but integrated actors within the decisionmaking cycle, influencing planning and execution in real time, unlike the G-staff model, where intelligence products are generated in discrete cycles.

  • Emergent Order over Predefined Structures: The Networked Staff functions as a decentralized collection of nodes that interact horizontally, and the Relational Staff operates through negotiated relationships between netdoms, but the Adaptive Staff actively reconstructs itself in response to operational contingencies, ensuring that no plan is ever static. Rather than relying solely on pattern-matching from historical datasets, AI within the Adaptive Staff is trained to incorporate human feedback, operational art, and historical analogies, ensuring that contingency and emergent behavior are factored into decisionmaking.
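A minimal sketch can illustrate the autonomous wargaming loop described in the bullets above: candidate courses of action are replayed many times against a noisy outcome model, and the highest-scoring options become synthetic reference points for the agents. The course-of-action names, payoff parameters, and Gaussian outcome model below are illustrative assumptions, not validated combat models.

```python
import random

# Hypothetical courses of action (COAs) with assumed (base success odds, variance).
COAS = {
    "mass_fires_north": (0.55, 0.25),
    "distributed_denial": (0.50, 0.10),
    "deception_then_strike": (0.45, 0.35),
}

def simulate(coa: str, rng: random.Random) -> float:
    """One synthetic wargame run: success odds jittered by environmental noise."""
    base, spread = COAS[coa]
    return min(1.0, max(0.0, rng.gauss(base, spread)))

def rank_coas(runs: int = 10_000, seed: int = 7) -> list[tuple[str, float]]:
    """Replay every COA `runs` times and rank by mean simulated success."""
    rng = random.Random(seed)
    scores = {
        coa: sum(simulate(coa, rng) for _ in range(runs)) / runs for coa in COAS
    }
    # The ranked results become synthetic reference points ("branches and
    # sequels") that guide how agents respond to changes in the environment.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranking = rank_coas()
print(ranking[0][0])  # COA with the best expected outcome across runs
```

The point of the sketch is scale: a staff cannot replay three options ten thousand times, but an agentic process can, surfacing options "often invisible to the human eye" before committing forces.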

The Adaptive Staff enhances agility and responsiveness, making it ideal for complex, multi-domain conflicts where rapid shifts in the operational environment demand continuous reassessment. By operationalizing interconnected processes, it ensures that planning, execution, and assessment operate as continuous loops rather than discrete, linear steps. This avoids the pitfalls of both traditional staff models, which often suffer from bureaucratic inertia, and the Networked Staff, which may struggle with synchronization due to negotiations among functionally independent nodes. AI-driven analytics enhance situational awareness, detecting patterns, shifts in adversary behavior, and potential opportunities or threats faster than human planners—the switchers in the Relational Staff. This removes potential bottlenecks caused by misalignment across netdoms while still maintaining flexibility and adaptability.

While the Adaptive Staff is designed to enhance resilience and agility, it introduces risks that must be actively managed to prevent fragmentation, cognitive overload, and susceptibility to cyber threats. Unlike the Relational Staff, which is rooted in human switchers mediating how netdom agents exchange information, or the Networked Staff, which operates through translations among predefined function-specific nodes, the Adaptive Staff lacks a fixed structure—presenting a double-edged sword. Its fluidity allows for rapid iteration, but in high-intensity conflicts, coordination failures could emerge if facilitators fail to align operational priorities across interdependent processes. If competing AI-generated recommendations are not reconciled effectively, decisionmaking could become chaotic rather than adaptive. This challenge is compounded by existing military education, training, and professional-development paradigms that emphasize static hierarchical structures. Emergent, rhizomatic organizational structures stand in near diametric opposition to prevailing military operating paradigms. This mismatch creates significant risk to adopting the Adaptive Staff design.

An additional challenge is managing information overload. While AI enables real-time intelligence processing and course-of-action refinement, it also generates vast volumes of data that could overwhelm human facilitators sifting through background options and a collage of courses of action generated by agents searching alternative futures for advantage. Unlike the Networked Staff, which allows for compartmentalization within function-specific nodes, the Adaptive Staff processes are deeply interconnected, increasing the likelihood of decision fatigue if information flows are not properly managed and presented to human decisionmakers. The Adaptive Staff will require more humans than the Relational Staff, but they will have to be just as well trained and educated to interact with multiple functional agents across planning and execution considerations.

STRESS-TESTING OPTION 3: COUNTERING PLA CAMPAIGNS WITH AN ADAPTIVE STAFF

The Adaptive Staff’s process-driven, iterative approach provides a unique advantage in countering a PLA joint blockade, surviving a joint firepower strike, and defending against a joint landing campaign. Across LLM outputs, the Adaptive Staff was consistently characterized as the most capable of leveraging real-time monitoring, AI-enhanced assessment processes, and dynamic force reallocation to maintain the operational advantage against PLA system destruction warfare. Unlike the Relational Staff, which relies on human switchers negotiating among netdoms, or the Networked Staff, which disperses decisionmaking across independent function-based nodes, the Adaptive Staff continuously synchronizes planning, execution, and assessment, ensuring that responses evolve dynamically to meet battlefield conditions. However, this adaptability introduces vulnerabilities, as the reliance on iterative decision cycles and feedback loops creates risks of operational fragmentation, cyber manipulation, and decision latency under contested conditions. Effectively countering PLA campaigns with the Adaptive Staff requires robust redundancy measures, AI deception resistance, and clear thresholds for decisionmaking under uncertainty.

  • Countering a PLA Joint Blockade

The vast majority of LLM outputs across the models characterized the Adaptive Staff as best suited to leverage real-time monitoring and rapid adaptation across multiple domains to counter a PLA joint blockade. The outputs characterized AI-enhanced planning processes as continuously generating a range of new options based on changing maritime, air, and cyber conditions, allowing facilitators to dynamically adjust interdiction strategies in response to PLA maneuvers, electronic warfare tactics, and missile threats. Unlike the Relational Staff, which relies on human switchers to negotiate among netdom agents, or the Networked Staff, which operates through translation among functionally independent nodes, the Adaptive Staff maintains an iterative, process-driven decision cycle, ensuring that planning, execution, and assessment continuously inform one another. This continuous feedback loop prevents bureaucratic lag and allows the blockade to function as an adaptive operational framework rather than a static defensive perimeter.

However, constantly responding to feedback loops could undermine unity of effort. Furthermore, it could impose an information tax: agents awaiting updates would delay tempo, a situation the PLA could compound through cyber and electronic attacks. The Adaptive Staff’s strength in fluid decisionmaking could become a vulnerability if the PLA systematically degrades key AI-human integration points, jams real-time synchronization mechanisms, or introduces false data streams into AI-driven assessments to create disjointed operational responses. Without structured redundancy and resilient data validation measures, along with conditional logic gates that determine when environment feedback is sufficient for action, the Adaptive Staff risks decision fragmentation, allowing the PLA to exploit gaps in blockade enforcement.
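One way to implement such a conditional logic gate is a feedback-sufficiency check that releases an action only when enough independent sources agree, with a hard deadline as a tempo safeguard so the staff never waits on feedback indefinitely. The class, thresholds, and source names below are hypothetical illustrations of the concept, not doctrine or a fielded system.

```python
from dataclasses import dataclass, field
import time

@dataclass
class FeedbackGate:
    """Hypothetical conditional logic gate: act only when enough independent,
    high-confidence feedback has arrived, or when a deadline expires."""
    min_sources: int = 2         # independent netdoms/sensors that must report
    min_confidence: float = 0.7  # aggregate confidence needed to act early
    max_wait_s: float = 30.0     # deadline fallback to protect tempo
    opened_at: float = field(default_factory=time.monotonic)
    reports: list = field(default_factory=list)  # (source, confidence) pairs

    def add(self, source: str, confidence: float) -> None:
        self.reports.append((source, confidence))

    def sufficient(self) -> bool:
        sources = {s for s, _ in self.reports}
        if len(sources) >= self.min_sources:
            # Take each source's best report and average across sources.
            best = {s: max(c for s2, c in self.reports if s2 == s) for s in sources}
            if sum(best.values()) / len(best) >= self.min_confidence:
                return True
        # Deadline fallback prevents the "information tax" of waiting forever.
        return time.monotonic() - self.opened_at >= self.max_wait_s

gate = FeedbackGate()
gate.add("maritime_isr", 0.8)
print(gate.sufficient())   # False: only one source has reported
gate.add("cyber_netdom", 0.75)
print(gate.sufficient())   # True: two sources, average confidence 0.775
```

The design choice worth noting is the deadline fallback: requiring cross-source agreement resists single-stream data injection, while the timeout bounds how long the PLA can stall action simply by suppressing one feed.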

  • Surviving and Responding to a PLA Firepower Strike

Model outputs also consistently presented the Adaptive Staff as well suited to respond to a joint firepower strike. LLM responses across the three different foundation models (i.e., ChatGPT, Gemini, Claude) characterized this approach as denying the PLA a single set of high-value targets. This assumes, again, a battle network architecture that is distributed and resilient, with sufficient computational power and connectivity to update AI models and analyze possible planning scenarios. This architecture allows the Adaptive Staff to rapidly restructure command relationships and authorities based on environmental feedback—for example, optimizing fires allocation based on battle damage assessment.

However, this fluidity introduces risks in decision clarity and synchronization of fires across multiple domains. The PLA’s doctrine of systems confrontation is designed to exploit vulnerabilities in process-dependent networks, and if AI-driven recommendations overwhelm facilitators with conflicting targeting priorities, the strike response could become desynchronized. Moreover, the reliance on AI-generated insights in a contested electromagnetic environment could introduce cyber vulnerabilities, with PLA forces actively jamming data flows or injecting false targeting recommendations that cause strange loops as the Adaptive Staff’s model of the battlefield changes.

  • Defending Against a PLA Joint Landing Campaign

Multiple models and LLM outputs also described how the Adaptive Staff would enable a force to deny a joint landing campaign. A PLA joint landing campaign represents one of the most complex operational challenges, requiring multi-domain coordination to deny beachheads, disrupt airborne forces, and neutralize supporting fires. The Adaptive Staff’s strength lies in its ability to dynamically synchronize defensive operations, ensuring that ISR, logistics, fires, and maneuver forces adjust in real time to PLA actions as feedback loops, essentially aligning options against possible futures as they emerge. Unlike the Relational Staff, in which negotiation across netdom agents could slow rapid countermoves, the Adaptive Staff allows for immediate shifts in operational priorities, keeping pace with PLA advances and logistical constraints.

However, this high degree of adaptation comes at the cost of operational cohesion. A landing scenario requires precise sequencing of defensive actions, and the Adaptive Staff’s decentralized processes and dependence on feedback loops may create decision bottlenecks in moments that demand intuition and rapid top-down command authority. Additionally, the reliance on AI-driven assessments could introduce transparency issues, as commanders may struggle to trace how specific defensive recommendations were generated and whether critical battlefield adjustments align with strategic intent.

  • The Adaptive Staff’s Role in Defeating PLA Offensive Campaigns

While the Adaptive Staff’s ability to rapidly process environmental feedback and restructure decisionmaking cycles provides an asymmetric advantage against PLA military campaigns, its strength in adaptability can also create vulnerabilities in coordination, tempo, and resilience in the face of information warfare (e.g., cyber, electronic, and propaganda operations). Against a joint blockade, the Adaptive Staff optimizes interdiction through dynamic force adjustments, but without safeguards against cyber and electronic warfare, it risks decision fragmentation and exploitable operational gaps. In a firepower strike scenario, its distributed nature enhances survivability, yet overreliance on AI-generated recommendations in a contested electromagnetic environment could lead to synchronization failures. Finally, in a joint landing campaign, its fluid decisionmaking enables rapid countermoves, but the absence of structured command authority may slow critical responses during high-stakes engagements. To fully leverage its advantages while mitigating its risks, the Adaptive Staff must integrate structured redundancy, deception-resistant AI, and explicit decision thresholds to prevent operational paralysis under adversarial disruption.

Comparative Evaluation of AI Agentic Warfare Staff Models

Expanding on the original research findings, the research team conducted a comparative analysis evaluating how three leading LLMs—ChatGPT, Claude, and Gemini—assessed the performance of the Networked, Relational, and Adaptive Staff models against notional PLA campaigns. These assessments, paired with a structured analysis of variance (ANOVA), not only confirmed the overall superiority of the Adaptive Staff but also offered critical insights into where different AI agentic warfare staff structures would likely succeed or fail based on campaign demands.

Across all three models, the Adaptive Staff consistently outperformed its alternatives. It achieved the highest mean score (4.39) compared to the Relational Staff (3.90) and Networked Staff (3.59), a difference that was statistically significant according to ANOVA results (p < 0.05). This quantitative superiority reflects the Adaptive Staff’s underlying design principles: a dynamic, process-oriented structure emphasizing real-time feedback, contextual learning, and iterative decisionmaking. These attributes align closely with the challenges posed by PLA system confrontation warfare, particularly the need for rapid operational reconfiguration under cyber and electronic duress.
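The one-way ANOVA behind this comparison can be reproduced in a few lines. The per-scenario scores below are hypothetical, chosen only so the group means mirror the reported values (4.39, 3.90, 3.59); they are not the study’s underlying data.

```python
# Hypothetical per-scenario scores (e.g., blockade, firepower strike, landing,
# across three LLM evaluators); illustrative values only.
scores = {
    "networked": [3.5, 3.7, 3.6, 3.4, 3.8, 3.5, 3.6, 3.7, 3.5],
    "relational": [3.9, 4.0, 3.8, 3.9, 4.1, 3.8, 3.9, 4.0, 3.7],
    "adaptive": [4.4, 4.5, 4.3, 4.4, 4.6, 4.2, 4.4, 4.5, 4.2],
}

def one_way_anova(groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA."""
    all_vals = [v for g in groups for v in g]
    grand_mean = sum(all_vals) / len(all_vals)
    # Between-group sum of squares: group means' spread around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: scores' spread around their own group mean.
    ss_within = sum(sum((v - sum(g) / len(g)) ** 2 for v in g) for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_vals) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f, dfb, dfw = one_way_anova(list(scores.values()))
means = {k: round(sum(v) / len(v), 2) for k, v in scores.items()}
print(means)  # → {'networked': 3.59, 'relational': 3.9, 'adaptive': 4.39}
```

Significance at p < 0.05 then reduces to checking the F statistic against the critical value for the degrees of freedom (here roughly 3.40 for 2 and 24); in practice, `scipy.stats.f_oneway` would return the exact p-value directly.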

In the joint blockade campaign, Gemini’s analysis aligned with Claude’s findings in recognizing the Adaptive Staff’s strength, emphasizing its real-time monitoring and dynamic adjustment capabilities linking planning and execution. However, Gemini placed greater emphasis on the vulnerabilities of the Networked Staff, particularly its challenges in maintaining operational cohesion under stress, a concern that was less pronounced in Claude’s evaluation. Claude highlighted the Relational Staff’s ability to better enable multi-domain coordination, but Gemini’s analysis noted the risk of disjointed enforcement due to the Relational Staff’s reliance on human switchers. ChatGPT similarly aligned with Claude, stressing the Adaptive Staff’s superior responsiveness and decisionmaking capacity. Notably, ChatGPT provided an even stronger endorsement of the Adaptive Staff model, offering it a clear numerical advantage over other models. In contrast to Claude’s moderate concerns about Networked Staff cybersecurity risks, ChatGPT categorically downgraded its viability. Likewise, while Claude identified coordination challenges in the Relational Staff, ChatGPT more severely criticized its limited operational effectiveness under blockade conditions.

In the joint firepower strike campaign, both Gemini and Claude again identified the Adaptive Staff as the superior performer, particularly for its assumed ability to adapt, making it harder for the PLA to concentrate fires against high-value targets. Gemini placed more emphasis on the dangers of decision fatigue from high-volume AI-generated battlefield assessments than Claude did. Similarly, ChatGPT concurred with Claude on the Adaptive Staff’s strengths, especially its distributed resilience. However, ChatGPT rated the Networked Staff’s performance lower than Claude, emphasizing greater concerns about synchronization failures and information saturation. Unlike Claude, which acknowledged some relational model contributions, ChatGPT marked the Relational Staff’s performance as distinctly secondary, stressing the risks of human bottlenecks in rapidly evolving battlespaces.

In the joint landing campaign, the Gemini analysis diverged slightly from Claude’s. Claude strongly favored the Relational Staff, highlighting its ability to facilitate complex, cross-domain defensive operations through robust human-AI teaming. Gemini, however, viewed the Adaptive Staff as a near-equal competitor, emphasizing its dynamic synchronization capabilities and ability to adapt under evolving battlefield conditions. Gemini also raised concerns about potential decision bottlenecks in the Adaptive Staff model due to its reliance on feedback loops, although these concerns were less prominent in Claude’s evaluation. ChatGPT largely agreed with Claude in recognizing the Relational Staff’s strength in coordinating defenses during large-scale amphibious assaults. However, ChatGPT, like Gemini, rated the Adaptive Staff nearly as highly, emphasizing its structured feedback-driven adaptability. Importantly, ChatGPT more explicitly highlighted the potential vulnerabilities of the Adaptive Staff model, specifically the risk of feedback loop dependencies leading to decision delays in high-intensity, decentralized operations—a subtle but important distinction compared to Claude’s assessment.

However, an antithesis emerges when considering the importance of human judgment in the early stages of a blockade. Historical examples, such as the Cuban Missile Crisis and the restraint shown by Soviet submarine commanders, underscore the critical role that human decisionmakers can play in preventing escalation. In scenarios characterized by ambiguous rules of engagement and high political stakes, human switchers embedded within the Relational Staff could provide vital discretion at the tactical edge, ensuring that localized actions do not inadvertently escalate into broader conflicts. While the AI-driven Adaptive Staff offers speed and dynamic adjustment, it may lack the intuitive judgment necessary to interpret subtle political cues or apply proportional responses under strategic ambiguity. Thus, relational models, despite their vulnerabilities, offer an irreplaceable safeguard during the initial, unstable phases of blockade operations.

While the Relational Staff’s human-mediated discretion is invaluable under certain political-military conditions, the broader operational environment demands an organization capable of adapting to emergent complexity at speed and scale. The Adaptive Staff, with its dynamic, iterative structure and continuous feedback loops, best meets these demands. Its ability to process real-time information, adjust operational designs fluidly, and maintain resilience under systemic adversarial disruption ensures that it can both mitigate escalation risks when necessary and transition rapidly to decisive operational maneuver once clarity emerges. Therefore, despite the critical role of human judgment early in crisis scenarios, the Adaptive Staff remains the optimal foundation for future military operations characterized by complexity, speed, and uncertainty.

While the comparative analysis highlights the operational advantages of the Adaptive Staff, translating these insights into practical implementation demands deliberate policy and organizational action. As adversaries such as the PLA continue to refine strategies aimed at degrading decisionmaking and disrupting operational tempo, the DOD must ensure that future command structures are not only technologically advanced but also resilient and adaptable under contested conditions. An agentic staff model without corresponding investments in cybersecurity, synchronization protocols, and human-machine teaming architectures risks replicating the very vulnerabilities it seeks to overcome.

Moreover, even the most adaptive technical systems must be complemented by human personnel trained to operate in complex, degraded, and ambiguous environments. Building future staffs requires a dual focus: enhancing technological capacity while cultivating adaptive cognitive frameworks among leaders and operators. The ability to balance algorithmic recommendations with human judgment, particularly in high-stakes, politically charged scenarios, will define the effectiveness of AI-integrated military operations in the coming decades.

Finally, strategic investment decisions must account for the dynamic nature of agentic warfare itself. The iterative, feedback-driven nature of the Adaptive Staff suggests that future operational designs will not be static. Instead, they will require continuous experimentation, rigorous red-teaming, and the rapid institutionalization of lessons learned. Organizations that fail to build these adaptive learning cycles into the very fabric of command and control risk falling behind faster-evolving adversaries.

Recommendations

As the DOD and Congress consider the future of AI-enabled military decisionmaking, investment in resilient, adaptive, and secure staff structures is critical. The Networked Staff, Relational Staff, and Adaptive Staff models each offer advantages in speed, agility, and operational integration, but also introduce new risks that must be actively managed. Across these models, three core challenges emerge:

  • Ensure coordination and synchronization in decentralized, AI-enhanced decisionmaking environments. The findings consistently showed that while decentralized decisionmaking structures like the Networked and Relational Staff models enhanced tempo and flexibility, they introduced substantial risks of fragmentation, conflicting priorities, and desynchronization—especially under adversary pressure during blockade, firepower strike, and landing scenarios. These risks were most pronounced in scenarios where distributed nodes operated without clear arbitration mechanisms, as observed in the expert discussions on system disruption vulnerabilities. Coordination protocols, synchronized command logic, and robust human oversight mechanisms must be deliberately designed to counteract these risks, ensuring unity of effort even in highly distributed operational environments. These efforts will require designating a lead agency and common protocols for evaluating agentic experiments. Standards matter and must be governed within the military to ensure a framework for adapting and scaling alternatives to the Napoleonic staff construct.

  • Build cybersecurity resilience against adversary system destruction warfare, particularly PLA cyber and electronic attacks. Across all campaign scenarios, the LLM analyses and expert evaluations stressed that PLA system destruction doctrine would target the network dependencies, information-sharing architectures, and real-time data flows critical to agentic staffs. This threat was particularly acute for the Networked Staff, where reliance on real-time AI inputs and decentralized decision nodes created expanded attack surfaces. Findings indicated that without hardened communication nodes, deception-resistant AI models, and resilience by design in information networks, even highly adaptive structures like the Adaptive Staff could be systematically degraded. Investments must therefore prioritize survivability of communications, deception filtering, and cyber hardening to ensure decision continuity under contested electromagnetic conditions. Sadly, these are often afterthoughts in budget submissions, meaning the DOD will need to play a more proactive role in requiring deeper cybersecurity investments.

  • Train personnel to operate effectively in AI-integrated command structures while maintaining human oversight and adaptability in degraded conditions. This study has repeatedly highlighted the crucial role of human facilitators (in the Adaptive Staff) and human switchers (in the Relational Staff) as critical nodes ensuring synchronization, context-driven decisionmaking, and error mitigation. Across all models, the success of AI-human integration depended on personnel trained not only to interpret AI outputs but also to actively manage feedback loops, prioritize conflicting insights, and preserve operational cohesion under dynamic conditions. Without deeper professional military education reforms emphasizing AI literacy, agent orchestration, critical reasoning under uncertainty, and dynamic decision control, future staffs risk either over-trusting or underutilizing AI systems, amplifying vulnerabilities exposed in scenarios like firepower strikes and contested landings. Thus, human capital development must be a central pillar of agentic warfare transformation. Given that this will extend across services, the Under Secretary of Defense for Personnel and Readiness likely has a large role to play in mandating changes to policy along these lines.

Taken together, these insights frame the following recommendations, which outline key investments and policy measures needed to operationalize AI-enabled staffs while mitigating their vulnerabilities.

1. Launch a multiyear campaign of experimentation.

The single most important thing the DOD can do is sustain aggressive experiments that test different approaches to building new staff structures better suited to agentic warfare than their Napoleonic precursors. This will require an agile approach to experimentation that, consistent with the CDAO’s Global Information Dominance Experiment (GIDE), uses live user feedback to iteratively develop and field prototypes.

  • Launch a follow-on to GIDE that tests different staff organizations against a common set of threat scenarios. These experiments should take the form of wargames, ideally linked to standing PLA campaigns, at both unclassified and secret levels. Using a mix of open and closed games will support a broader network of evaluation and analysis.

  • Create and fund a congressionally mandated reporting cycle that ensures a mix of service, DOD, and external evaluations are presented to DOD leadership, likely the deputy secretary of defense, with Congress tracking progress. These evaluations should focus on benchmarking AI models relative to different military missions and provide external evaluation of progress made in deploying AI agents.

  • Leverage opportunities through the J7/G7 to connect the campaign of experimentation to professional military education to ensure both a common pool of players in game-driven experiments and opportunities to connect the campaign to ongoing student research. These games should provide both a test bed and a feedback mechanism for fine-tuning models and optimizing AI agents.

2. Invest in computational infrastructure.

A robust computational backbone is essential to ensure seamless integration of AI across decisionmaking nodes, enabling high-speed, secure, and resilient communication even in contested environments. Despite calls for broad budget cuts, it is not clear the DOD currently has the depth of computational infrastructure required to support agentic warfare in peacetime, much less in contested wartime environments.

  • Launch a study to estimate the following: (1) current DOD computational infrastructure; (2) estimates for how much more computational power might be required to wage agentic warfare; and (3) options for addressing any gap.

  • Invest in high-performance computing and distributed AI processing to allow for real-time analysis at the edge of operations, even when disconnected from centralized networks. These investments should be linked to the aforementioned study and include cooperative efforts with entities like the National Science Foundation and Department of Commerce to ensure there is a robust, resilient network of infrastructure required to both run an information-age economy and support the emergence of agentic warfare.

  • Prototype scalable cloud and edge computing solutions that enable staff structures to operate autonomously in degraded environments on classified networks.

  • Enhance secure, high-speed networking architectures to maintain decision continuity under adversary cyber and electronic warfare attacks. This should include parallel red-teaming events during the campaign of experimentation that explore both traditional cybersecurity and new concepts to combat adversary bots that seek to deliberately poison data used by agents.

3. Train and integrate AI for military decisionmaking.

To ensure AI serves as a force multiplier rather than a liability, models must be carefully trained, integrated, and refined through operational feedback. This effort will require a broad-based educational renaissance across the U.S. military. Assuming that the United States can wage agentic warfare without warriors who understand it would be a recipe for disaster akin to historic failures like the 1942 Battle of Savo Island, when sailors did not understand how to use new radars.

  • Invest in explainable AI initiatives that ensure AI models provide transparent, auditable recommendations rather than black-box outputs. These investments should build on earlier Defense Advanced Research Projects Agency (DARPA) efforts in this area and create new ways of better integrating university-aligned research.

  • Sustain a campaign of experimentation across professional military education that refines AI models through wargaming and operational exercises, leveraging real-world feedback to improve model reliability and contextual adaptability. Just as interwar education focused on decision games and map exercises, there is an opportunity to help officers learn about modern joint all-domain warfare while generating new approaches to agentic warfare in the classroom.

  • Work with the J7/G7 and key combatant commands like U.S. Indo-Pacific Command to leverage the iterated wargames discussed above to propose entirely new approaches to generating staff estimates and plans. This process will likely include reimagining the deliberate planning process and finding better ways to align crisis and contingency planning. This effort should directly engage schoolhouse experimentation and identify key gaps, including needed adjustments to professional military education as well as to command and control systems and computational infrastructure.

  • Build and test new decisionmaking protocols for agentic staffs. This effort should include mechanisms for building an AI-enabled decision audit trail and testing protocols for dealing with comms-degraded environments almost certain to impact the performance of AI agents by limiting information updates.
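One way to make the audit trail concept above concrete is an append-only, hash-chained log of agent recommendations and the human decisions taken on them. The sketch below is purely illustrative: the class, field names, and chaining scheme are assumptions for demonstration, not an actual DOD schema or fielded system.

```python
# Hypothetical sketch of an AI-enabled decision audit trail: an append-only,
# hash-chained log of agent recommendations and human decisions.
import hashlib
import json
import time


class DecisionAuditTrail:
    """Append-only log; each entry's hash chains to the previous entry."""

    def __init__(self):
        self.entries = []

    def record(self, agent_id, recommendation, human_decision):
        """Append one entry and return its hash."""
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {
            "timestamp": time.time(),
            "agent_id": agent_id,
            "recommendation": recommendation,
            "human_decision": human_decision,
            "prev_hash": prev_hash,
        }
        # Hash the entry body (sort_keys makes serialization deterministic).
        body["hash"] = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        self.entries.append(body)
        return body["hash"]

    def verify(self):
        """Check the hash chain is intact (no entry altered or removed)."""
        prev = "0" * 64
        for entry in self.entries:
            if entry["prev_hash"] != prev:
                return False
            body = {k: v for k, v in entry.items() if k != "hash"}
            digest = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            if digest != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```

Because each entry's hash covers the previous entry's hash, altering or deleting any record breaks verification, giving reviewers a tamper-evident record of what each agent recommended and what the human decided, even in comms-degraded conditions where entries are reconciled after the fact.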

4. Build resilience against PLA system disruption warfare.

The PLA’s doctrine of system confrontation specifically targets AI-reliant networks, making the attack surface of agentic staffs a first-order concern and cybersecurity standards a foundational requirement for any AI-integrated staff model.

  • Develop new concepts for defending agentic networks at both the service and combatant command levels. This should include, but is not limited to, (1) deploying redundant communication systems and AI deception detection measures to ensure decisionmaking continuity under contested conditions; and (2) employing network segmentation and zero-trust security protocols to prevent adversary lateral movement within staff systems.

  • Train AI models using adversarial learning techniques, ensuring they can identify and counteract PLA efforts to deceive and manipulate an agentic staff. This deception-detection training should extend to the aforementioned campaign of experimentation, using iterated games to learn how to design and fight an agentic staff.

5. Enhance human capital for AI-driven decisionmaking.

Personnel remain the critical link between AI-driven insights and strategic execution. Training programs must equip staff officers with the skills necessary to operate in decentralized, AI-augmented environments.

  • Develop AI literacy programs focused on data analysis, visualization, and AI optimization techniques (e.g., few-shot learning and retrieval-augmented generation). This should include training officers in structured decision arbitration, ensuring they can synchronize competing AI recommendations without creating decision bottlenecks. While the J7/G7 can mandate this training, the DOD should coordinate with the services and Congress to ensure it is properly resourced.

  • Train personnel in AI-human teaming strategies, ensuring they can interpret, challenge, and validate AI-driven recommendations rather than blindly accepting them. This training will need to be continuous and mix individual and collective training events. For example, imagine a U.S. Army soldier taking an online course on prompt engineering and asking tactical questions and analyzing model-generated content using red-teaming techniques. This process can include learning agents tailored to the soldier that aggregate data to help the commander better understand the formation and support mission command. Similar to the previous recommendation, while the J7 can mandate a requirement, the services and Congress need to be brought into the discussion to ensure that new education requirements are matched with resources.

  • Create new approaches to lessons learned that capture synthetic data and establish real-time tracking, after-action assessments, and automated feedback loops to continuously improve AI performance and synchronization mechanisms. This process should be built from the bottom up, starting at service-level lessons learned and merging into joint structures. To be effective, it will require funding a mix of data scientists and computational infrastructure to support the effort, much of which could build on the existing Advana system deployed across the DOD.
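As a concrete illustration of the hands-on training described above, a trainee learning prompt engineering might begin by assembling a few-shot prompt by hand. The sketch below is a minimal example; the reports, assessments, and system instruction are invented for demonstration.

```python
# Hypothetical few-shot prompt construction exercise; all scenario text is
# invented for illustration, not drawn from any real reporting.
FEW_SHOT_EXAMPLES = [
    ("Report: Two unidentified vessels loitering near the strait.",
     "Assessment: Possible reconnaissance; recommend increased ISR coverage."),
    ("Report: Sharp increase in adversary radio traffic overnight.",
     "Assessment: Indicator of imminent exercise or operation; alert watch officers."),
]


def build_few_shot_prompt(new_report):
    """Assemble a few-shot prompt: labeled examples followed by the new case."""
    lines = ["You are a staff officer drafting concise tactical assessments."]
    for report, assessment in FEW_SHOT_EXAMPLES:
        lines.append(report)
        lines.append(assessment)
    lines.append(new_report)
    lines.append("Assessment:")  # cue the model to complete the pattern
    return "\n\n".join(lines)


prompt = build_few_shot_prompt(
    "Report: Convoy movement detected along coastal highway."
)
print(prompt)
```

In a classroom setting, trainees could then submit the prompt to a model, red-team the generated assessment, and compare how changing the examples changes the output, building exactly the interpret-challenge-validate habits the recommendation calls for.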

Conclusion

Agentic warfare is becoming a reality. AI-enabled military staff structures offer transformational advantages in speed, adaptability, and operational integration, but their success depends on strategic investment in infrastructure, cybersecurity, and human-AI collaboration. Without deliberate safeguards, the PLA’s system destruction warfare approach could exploit AI dependencies, disrupt synchronization, and degrade decisionmaking effectiveness. To ensure U.S. military staffs remain resilient and operationally dominant, the DOD and Congress should prioritize investments in computational capacity, AI training, cyber resilience, and structured decision frameworks. By embedding redundancy, explainability, and continuous optimization into AI-driven decisionmaking, the U.S. military can leverage AI as a strategic advantage rather than a potential vulnerability in twenty-first-century warfare.

Statistical Appendix

The tables below provide the data documentation for each scenario run by model. The first table reports the complete scoring across models and PLA scenario types. All scoring was based on providing the same inputs to the models as discussed in the report to ensure consistent prompts and context that supported comparative statistical analysis of the outputs. The subsequent tables analyze how each AI agent staff performed using analysis of variance (ANOVA) single-factor analysis. In this study, ANOVA provides insight into statistically significant differences across the models in general and, in particular, into how the AI agent staffs performed across different PLA campaign scenarios.

Each table contains a summary section and an ANOVA Table. The summary table section is organized into the following columns:

  • Groups: The different models being compared (Networked, Relational, Adaptive).

  • Count: Number of observations (i.e., how many scores were collected per group; here, it is 90 for each model).

  • Sum: Total of all the scores for each model.

  • Average: Mean score for each model.

  • Variance: Statistical variance showing how spread out the scores are within each model group.

The ANOVA table is organized into the following columns:

  • Source of Variation: Divides the variance into “Between Groups” (differences between models) and “Within Groups” (variance inside each model).

  • Sum of Squares (SS): Measures variability.

  • Degrees of Freedom (df): Number of values free to vary.

  • Mean Square (MS): SS divided by df.

  • F statistic (F): MS Between divided by MS Within; used to determine whether group means are statistically significantly different.

  • P value: Critical for interpreting ANOVA—if p < 0.05, the difference between group means is considered statistically significant.

  • F critical: The critical value from F-distribution tables to compare against the F-statistic.

If the P value is less than 0.05, it suggests that there are statistically significant differences between at least some of the model ratings. In addition, if the F statistic is larger than F critical, it confirms that the differences between groups are statistically significant.
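The computation behind these tables can be sketched in a few lines. The scores below are hypothetical stand-ins (the study used 90 scores per staff model, not 10), and the critical value is the standard table value for F(0.05; 2, 27) for this example's group sizes.

```python
# Minimal one-way (single-factor) ANOVA sketch with hypothetical 1-5 ratings.
def mean(xs):
    return sum(xs) / len(xs)


def anova_one_way(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean([x for g in groups for x in g])
    # Between-groups SS: size-weighted squared deviation of each group mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within-groups SS: squared deviations of scores inside each group.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)   # MS = SS / df
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k


# Hypothetical ratings for each staff model (illustrative only).
networked = [3, 4, 3, 4, 3, 4, 3, 3, 4, 3]
relational = [4, 3, 4, 4, 3, 4, 4, 3, 4, 4]
adaptive = [5, 4, 5, 4, 5, 4, 4, 5, 4, 5]

f_stat, df_b, df_w = anova_one_way([networked, relational, adaptive])
f_critical = 3.35  # F(0.05; 2, 27) from standard tables
print(f"F = {f_stat:.2f} vs F critical = {f_critical}")
```

With these illustrative inputs, F exceeds the critical value, so the group means differ significantly, which is the pattern the tables below report for most comparisons.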

In an ideal scenario of AI-enabled scoring, there would be robust differences in scoring across the agentic staff approaches: statistically significant differences between networked, relational, and adaptive designs, both overall and across different PLA campaign scenarios, combined with minimal differences between the foundation models, meaning either no statistically significant differences in mean scores or only small ones (i.e., an F statistic only slightly above F critical). Together, these data trends would indicate robust support for one staff design over its counterparts.

image06 ▲ Table 2: ANOVA: Single-Factor Combined Analysis Across PLA Scenarios

As seen in Table 2 above, all three AI foundation models (i.e., ChatGPT, Claude, and Gemini) rated the Adaptive Staff the highest across all three PLA campaign scenarios. It had the highest average performance score (4.39), and the F statistic is greater than the F critical, reinforcing the finding that the means are different between groups (networked, relational, adaptive) and the difference is statistically significant. Though generated by AI-enabled scoring, these insights reinforce the qualitative assessment that the Adaptive Staff is the best candidate for initial agentic staff experimentation.

image07 ▲ Table 3: ANOVA: Single-Factor Combined Analysis of the Joint Blockade Scenario

Table 3 reports the findings just for the blockade scenario. Statistical analysis using a single-factor ANOVA confirmed significant differences in AI scoring across the staff models. The Adaptive Staff achieved the highest mean score of 4.07, outperforming the Relational Staff (mean 3.60) and the Networked Staff (mean 3.37). The F statistic is greater than the F critical, reinforcing the finding that the means are different between groups (networked, relational, adaptive) and the difference is statistically significant. These results reinforce the broader findings: The Adaptive Staff maintained its assessed advantage across multiple evaluation runs, providing robust support for its prioritization in future agentic warfare force design.

image08 ▲ Table 4: ANOVA: Single-Factor Combined Analysis of the Joint Firepower Strike Scenario

In the joint firepower strike scenario, single-factor ANOVA results demonstrated clear performance differentiation among the staff models. The Adaptive Staff again achieved the highest mean score (4.4), edging out the Networked Staff (4.10) and significantly outperforming the Relational Staff (3.50). The F statistic is greater than F critical—though the magnitude was smaller than in the joint blockade analysis—reinforcing the finding that the means are different between groups (networked, relational, adaptive) and the difference is statistically significant.

image09 ▲ Table 5: ANOVA: Single-Factor Combined Analysis of the Joint Landing Scenario

In the joint landing campaign scenario, the ANOVA results highlight a notable shift compared to the blockade and firepower findings. While the Adaptive Staff dominated performance in both the blockade and firepower scenarios, the Relational Staff emerged as the top performer in the landing scenario, achieving the highest mean score of 4.60. This significantly exceeded the Adaptive Staff (4.23) and the Networked Staff (mean 3.30). The F statistic is greater than the F critical, reinforcing the finding that the means are different between groups (networked, relational, adaptive) and the difference is statistically significant. These findings suggest that unlike in blockade or firepower campaigns, where dynamic adaptation and distributed decisionmaking are paramount, complex cross-domain operations like an amphibious landing may benefit more from the human-machine teaming, coordinated judgment, and synchronized operational integration that characterize the Relational Staff model.

image10 ▲ Table 6: ANOVA: Single-Factor Analysis of Networked Staff Agents by Model

The Networked Staff had varying results across models. Claude gave the AI agentic approach the highest score (3.83), and the result was statistically significant. Furthermore, the F statistic is greater than the F critical—albeit only slightly—reinforcing the finding that the means are different between groups (ChatGPT, Claude, and Gemini) and the difference is statistically significant. In other words, there are small but statistically significant differences in how the AI foundation models scored the Networked Staff across the three PLA campaign scenarios.

image11 ▲ Table 7: ANOVA: Single-Factor Analysis of Relational Staff Agents by Model

As seen in the table above, unlike with the Networked Staff, there were no statistically significant differences in how the three foundation models scored the agentic warfare approach across the three PLA campaign scenarios. The F statistic was also smaller than the F critical, reinforcing this finding.

image12 ▲ Table 8: ANOVA: Single-Factor Analysis of Adaptive Staff Agent by Model

The table above analyzes how the three foundation models compare in their ratings for the Adaptive Staff approach across the three PLA campaign scenarios. Similar to the Networked Staff scoring, there are small but statistically significant differences across the models, with ChatGPT scoring the adaptive approach slightly higher (4.6) than Claude and Gemini. All three foundation models scored the Adaptive Staff higher than the alternatives despite the small difference in mean scoring across them. Furthermore, the F statistic is greater than the F critical—albeit only slightly—reinforcing the finding that the means are different between groups (ChatGPT, Claude, and Gemini) and the difference is statistically significant. In other words, there are small but statistically significant differences in how AI foundation models scored the Adaptive Staff.


Benjamin Jensen is director of the Futures Lab and a senior fellow for the Defense and Security Department at the Center for Strategic and International Studies (CSIS) in Washington, D.C.

Matthew Strohmeyer is the Director of Agentic Warfare and Strategy at Scale AI. He previously served as Director for the Global Information Dominance Experiments (GIDE) at the OSD Chief Digital and AI Office (CDAO) and is a 2021–2022 CSIS Military Fellow.
