Calhoun, Institution...

ADVERSARIAL EXPERIMENTATION IN ATLATL: LEVERAGING SENSITIVITY ANALYSIS, NEIGHBORHOOD SEARCH HEURISTICS, AND PROBABILISTIC SCENARIO GENERATION TO EXPOSE AI WEAKNESSES

Authors: Smith, Timothy J.

Abstract

Modern military decision aids must remain reliable under adversarial conditions that typically exceed their developers' testing regimens. This thesis presents a reproducible experimentation framework, built atop the Atlatl hex-grid wargame, that probes artificial intelligence (AI) vulnerabilities through probabilistic scenario generation, global sensitivity analysis, and local adversarial search. To test the framework, three reference agents are evaluated on a small scenario: NAMaiV5 and NAMaiV9 (scripted AIs) and Pascal (a neural network trained on the test scenario). Latin Hypercube Sampling generates 20,000 diverse scenarios, each evaluated using a score differential between Blue-vs-Red and Red-vs-Red matches, from which Sobol indices isolate influential parameters. A neighborhood search heuristic then degrades model performance by up to 65%, running more efficiently than differential evolution while achieving a larger reduction in score differential. Behavioral heatmaps reveal consistent spatial biases, particularly when terrain near the map center is perturbed. Results show that the scripted AIs fail most under force imbalance and opponent variation, while the neural network is more sensitive to scenario length and unseen terrain clusters. This testbed provides a scalable and interpretable process and tool for adversarial validation of military AI systems, offering actionable insight into operational robustness.

Distribution Statement A. Approved for public release: Distribution is unlimited.

Outstanding Thesis

Lieutenant, United States Navy
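
The sampling and local-search steps named in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the two scenario parameters (`force_ratio`, `turns`), their bounds, the step size, and `toy_score_differential` are hypothetical stand-ins for an Atlatl game evaluation, and the search shown is a plain random-neighbor descent rather than the author's specific heuristic. It uses only the Python standard library.

```python
import random


def latin_hypercube(n_samples, bounds, rng):
    """Draw n_samples points so that, in every dimension, each of the
    n_samples equal-width strata contains exactly one point."""
    columns = []
    for low, high in bounds:
        # One uniform draw inside each stratum, then shuffle the stratum order
        # independently per dimension.
        col = [low + (high - low) * (i + rng.random()) / n_samples
               for i in range(n_samples)]
        rng.shuffle(col)
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


def toy_score_differential(x):
    # Hypothetical placeholder for a real game evaluation: lower values
    # mean the agent under test performs worse in that scenario.
    force_ratio, turns = x
    return (force_ratio - 0.3) ** 2 + 0.01 * (turns - 20.0) ** 2


def neighborhood_search(start, bounds, score, rng, step=0.05, iters=200):
    """Random-neighbor descent: perturb the incumbent scenario within a
    small box, clip to bounds, and keep the candidate only if it lowers
    the score differential."""
    best, best_val = start, score(start)
    for _ in range(iters):
        cand = tuple(
            min(high, max(low, v + rng.uniform(-step, step) * (high - low)))
            for v, (low, high) in zip(best, bounds)
        )
        val = score(cand)
        if val < best_val:
            best, best_val = cand, val
    return best, best_val


rng = random.Random(0)                      # fixed seed for reproducibility
bounds = [(0.0, 1.0), (10.0, 40.0)]         # assumed parameter ranges
samples = latin_hypercube(50, bounds, rng)  # global, space-filling sample
start = min(samples, key=toy_score_differential)
worst_case, worst_val = neighborhood_search(
    start, bounds, toy_score_differential, rng)
```

Seeding the local search from the worst sampled scenario mirrors the global-then-local ordering described in the abstract; a sensitivity analysis step (for example, Sobol indices) would sit between the two and is omitted here.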

Keywords

artificial intelligence (AI), adversarial machine learning (AML), central processing unit (CPU), comma-separated values (CSV), differential evolution (DE), derivative-free optimization (DFO), diametrical risk minimization (DRM), empirical risk minimization (ERM), JavaScript object notation (JSON), Latin Hypercube Sampling (LHS), Monte Carlo tree search (MCTS), machine learning (ML), Office of Naval Research (ONR), reinforcement learning (RL), stochastic gradient descent (SGD)
