Navy - 25.1 SBIR - Active Scenarios Learning of Evolving Situations, Multimodal Counterfactual Reasoning, and Explanations Toward Artificial Intelligence-assisted Wargaming

Navy SBIR 25.1 - Topic N251-065
Office of Naval Research (ONR)
Pre-release 12/4/24 Opens to accept proposals 1/8/25 Closes 2/5/25 12:00pm ET

N251-065 TITLE: Active Scenarios Learning of Evolving Situations, Multimodal Counterfactual Reasoning, and Explanations Toward Artificial Intelligence-assisted Wargaming

OUSD (R&E) CRITICAL TECHNOLOGY AREA(S): Human-Machine Interfaces; Integrated Sensing and Cyber; Trusted AI and Autonomy

The technology within this topic is restricted under the International Traffic in Arms Regulations (ITAR), 22 CFR Parts 120-130, which control the export and import of defense-related material and services, including export of sensitive technical data, or the Export Administration Regulations (EAR), 15 CFR Parts 730-774, which control dual-use items. Offerors must disclose any proposed use of foreign nationals (FNs), their country(ies) of origin, the type of visa or work permit possessed, and the statement of work (SOW) tasks intended for accomplishment by the FN(s) in accordance with the Announcement. Offerors are advised that foreign nationals proposed to perform on this topic may be restricted due to the technical data under US Export Control Laws.

OBJECTIVE: Develop a multimodal Artificial Intelligence (AI)-based scenario learning technology that continually adapts to the formation of emerging situations. Develop counterfactual augmentation machine reasoning and explanation techniques to correct human feedback behaviors that may cause bias in scenario learning. Scenarios forewarn of risks, elicit decisions, and induce human-AI collaboration to exploit vulnerabilities. Apply large language models to explain scenarios and risks and to recommend decisions and courses of action. These explanations serve as a crucial tool for evaluating the efficacy of human-AI wargaming collaboration.

DESCRIPTION: Creating unbiased adaptive scenarios as situations unfold is crucial for effective wargaming and conflict simulations. The aim is to predict events and trends that could have a significant impact on U.S. National Security Interests. It requires decision-makers to focus on various situational details, such as adversary strength, leadership temperament, past and present operational performance, logistics, and exploitation opportunities for friendly cross-domain actions and effects. Currently, a diverse team consisting of decision-makers, analysts, and warfighters invests significant time and resources into anticipating adversarial strategies and tactics through wargaming and brainstorming. However, this human-centric approach is vulnerable to costly errors, biases, and omissions, which can seriously undermine the assessment of evidence, statistical analysis, and the understanding of cause and effect.

To achieve the objectives, this SBIR topic will develop the following technologies:

Multimodal active AI-scenario generator that continually monitors, assesses, and exploits all-source-INT (ASI) datasets and streaming ISR data to detect, understand, and reason about hostile activities, interactions, and operational changes over time. It tracks and identifies assets, including deceptive decoys, based on their distinct deployment patterns. It provides warning signals as events develop and evaluates the potential consequences. The system remains impartial and helps to reduce human cognitive biases through counterfactual reasoning. It calculates various engagement options and outcomes for human consideration that may not have been recognized or properly understood. Additionally, it assesses the risk of escalation, identifies potential triggers of escalation, and helps with preparations. This capability is critical when the situation is too uncertain to rely solely on human judgment about potential engagements and their implications.
Collaborative human-AI course of action learning, reasoning, and explanations of engagement plans. It is a collaborative interplay of machine-to-machine (M2M) prediction of alternative futures integrated with the human-to-machine (H2M) supervisory system that examines and validates the end-state scenario risks. The H2M interactive path allows for joint sensemaking, contextual reasoning, logical consistency checks, and Q&A queries to probe AI-generated scenarios and observed warning signs. The key technology development components are as follows:

Machine learning (ML) to uncover the opponent’s multimodal assets (people, places, things), movements, and activities.
AI Red Team Reason-and-Act (ReAct) agents that discern and simulate the opponent’s Tactics, Techniques, and Procedures (TTP) and reactions.
AI Blue Team ReAct agents, as a collaborative human-AI team executing strategic and tactical plans and maneuvers.
Counterfactual augmentation multimodal ML to prevent human perception biases from influencing the course of action.
Large language models to explain multimodal events (text, voice, video, electro-optical/infrared (EO/IR) imagery, acoustics, synthetic aperture radar (SAR), etc.), decision points, courses of action, interactions between Red and Blue teams, players’ behaviors, and engagement outcomes, all coherently expressed in natural language.

Wargaming applications may include scenarios that capture joint military and commercial mobilization activities or exercise activities to control contested waters, such as amphibious landings and sea-lane blockades.
Analytic tools that support the development include wargaming databases, engagement rules for all players, whether human or machine, and a multimodal exploitation gaming environment.

PHASE I: Determine the technical feasibility of designing and developing collaborative human-AI wargaming and AI-generated scenarios technologies as described in the Description section. Testing and demonstrations may use datasets from the Department of the Navy (DoN), Marine Corps Warfighting Laboratory (MCWL), Automatic Identification System (AIS) maritime traffic, commercial satellite imagery, and open-source intelligence (OSINT). The wargaming datasets and engagement rules need to take into consideration the littoral maritime environment and seaside terrain, including weather, view/geo-effects, routes; the maritime order-of-battle and movement; engagement rules/doctrine; engagement attrition; victory, standoff, and defeat conditions and status; logistics and supply demands, etc. Utilize associative data mining techniques for entity extraction (people, places, and objects) and related transactional activities. Accuracy metrics for ingesting and classifying multimodal data: structured data mining and interpretation - accuracy of 95% over 98% captured content; unstructured data mining and interpretation - accuracy of 90% over 95% captured content.
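
The accuracy metrics above can be read as a pair of conditions per data type: how much of the source content is captured, and how accurately the captured content is classified. That pairing is an assumption, since the topic does not define "captured content" further. A minimal Python sketch under that reading, with hypothetical names (`IngestTally`, `meets_threshold`) that do not come from the topic:

```python
from dataclasses import dataclass

# Thresholds quoted in the Phase I text. Reading each pair as
# (minimum accuracy on captured content, minimum capture rate)
# is an ASSUMED interpretation, not a definition from the topic.
THRESHOLDS = {
    "structured": (0.95, 0.98),
    "unstructured": (0.90, 0.95),
}


@dataclass
class IngestTally:
    total_items: int     # items present in the source data
    captured_items: int  # items the pipeline extracted
    correct_items: int   # captured items that were classified correctly


def meets_threshold(kind: str, tally: IngestTally) -> bool:
    """Return True when both the capture rate and the accuracy pass."""
    min_accuracy, min_capture = THRESHOLDS[kind]
    if tally.total_items == 0 or tally.captured_items == 0:
        return False  # nothing to measure, so the check cannot pass
    capture_rate = tally.captured_items / tally.total_items
    accuracy = tally.correct_items / tally.captured_items
    return capture_rate >= min_capture and accuracy >= min_accuracy


if __name__ == "__main__":
    print(meets_threshold("structured", IngestTally(1000, 985, 940)))
```

A performer using a different definition of capture or accuracy would swap in its own counts; only the threshold values are taken from the text.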

Software validation and verification must assess AI scenario structuring and logic-tree performance, consistency, and credibility as it relates the initial scenario states to the final scenario states through intervening events and processes. Performance criteria must include sensitivity (true-positive rate), specificity (true-negative rate), precision (positive predictive value), miss rate (false-negative rate), false discovery rate, and false omission rate. Conduct performance assessment on the following aspects of human sensemaking and decision-making:

TTP Confidence on engagement plans, options, and risk reduction associated with the ups and downs of encounters.
Cause and effect sensitivity analysis on contextual understanding of AI-generated scenarios.
Efficiency gains in human responsiveness through timely decision-making, chain-of-actions, and resources spent.
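
The six performance criteria named in the validation paragraph above (sensitivity, specificity, precision, miss rate, false discovery rate, false omission rate) all follow from the four cells of a binary confusion matrix. The sketch below uses their standard definitions; the function name and the choice to return NaN for an empty denominator are illustrative assumptions, not requirements from the topic.

```python
def classification_rates(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute the six rates from confusion-matrix counts.

    tp/fp/tn/fn are true-positive, false-positive, true-negative,
    and false-negative counts for one binary decision (for example,
    "warning signal raised" versus "event actually occurred").
    """
    def ratio(numerator: int, denominator: int) -> float:
        # Assumed convention: undefined rates are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),           # true-positive rate
        "specificity": ratio(tn, tn + fp),           # true-negative rate
        "precision": ratio(tp, tp + fp),             # positive predictive value
        "miss_rate": ratio(fn, fn + tp),             # false-negative rate
        "false_discovery_rate": ratio(fp, fp + tp),  # 1 - precision
        "false_omission_rate": ratio(fn, fn + tn),
    }


if __name__ == "__main__":
    for name, value in classification_rates(tp=80, fp=20, tn=90, fn=10).items():
        print(f"{name}: {value:.3f}")
```

Sensitivity and miss rate sum to 1, as do precision and false discovery rate, so reporting all six is a consistency check as much as a set of independent measures.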

Deliverables include end-to-end initial prototype technology, T&E, demonstration, a plan for Phase II, and a final report.

Note 1: Phase I will be UNCLASSIFIED and classified data is not required.

Note 2: Awardees must provide appropriate dataset release authorization for use in their case studies, tests, and demonstrations, and certify that there are no legal or privacy issues, limitations, or restrictions with using the proposed data for this SBIR project.

PHASE II: Develop a prototype of the candidate technologies. Test and demonstrate the prototype with representative operational data sources. Assess the prototype’s performance against the metrics detailed in Phase I. Conduct an end-user satisfaction assessment, on a scale of 0 to 5, on the following matters: a) Situational understanding for events that go dark, disguised activities and maneuvers, and dormant targets; b) Alignment with formal warning signals; c) Alignment with prioritized deterrence and engagement options; and d) Timeliness for responsive decision-making across different domains and effective collaboration. Deliver prototype software, systems interface requirements for mobile and stationary devices, design documentation, source code, user manual, and a final report. Additionally, develop a plan for the Phase III transition into a program of record.

Note 3: Work produced in Phase II may become classified. However, the proposal for Phase II will be UNCLASSIFIED. The prospective contractor(s) must be U.S. owned and operated with no foreign influence as defined by 32 C.F.R. § 2004.20 et seq., National Industrial Security Program Executive Agent and Operating Manual, unless acceptable mitigating procedures can and have been implemented and approved by the Defense Counterintelligence and Security Agency (DCSA), formerly the Defense Security Service (DSS). The selected contractor must be able to acquire and maintain a Secret-level facility and Personnel Security Clearances. This will allow contractor personnel to perform on advanced phases of this project as set forth by DCSA and ONR in order to gain access to classified information pertaining to the national defense of the United States and its allies; this will be an inherent requirement. The selected company will be required to safeguard classified material during the advanced phases of this contract in accordance with the National Industrial Security Program Operating Manual (NISPOM), which can be found at Title 32, Part 2004.20 of the Code of Federal Regulations.

Note 4: If the selected Phase II contractor does not have the required certification for classified work, the Office of Naval Research (ONR) or the related DoN Program Office will work with the contractor to facilitate certification of related personnel and facility.

PHASE III DUAL USE APPLICATIONS: Advance these capabilities to TRL-7 and integrate the technology into the Maritime Tactical Command and Control Program of Record (POR) or Intelligence, Surveillance and Reconnaissance (ISR) processing platforms at the Marine Corps Information Operations Center. Once conceptually and technically validated, demonstrate the dual-use applications of this technology in the video gaming industry.

REFERENCES:

1. Robinson, E.; Egel, D. and Bailey, G. "Machine Learning for Operational Decision-making in Competition and Conflict, A Demonstration Using the Conflict in Eastern Ukraine." RAND Corp, 2023.

2. Johnson, B.; Miller, S.; Green, J.M.; Godin, A.; Nagy, B.; Lee, B.; Badalyan, R.; Nixt, M.; Graham, A. and Sanchez, J. "Game Theory and Prescriptive Analytics for Naval Wargaming Battle Management Aids." Naval Postgraduate School, Monterey, California, NPS-SE-22-002, Oct. 2022.

3. Wilner, A.S. and Babb, C. "New Technologies and Deterrence: AI and Adversarial Behavior." Springer, Dec. 2020.

4. Schechter, B.; Schneider, J.G. and Shaffer, R. "Wargaming as a Methodology: The International Crisis Wargame and Experimental Wargaming." Simulation & Gaming, 52(4), 2021, pp. 513-526.

5. Yannakakis, G.N. and Togelius, J. "Artificial Intelligence and Games." Springer, 2018.

6. "National Industrial Security Program Executive Agent and Operating Manual (NISP), 32 C.F.R. § 2004.20 et seq. (1993)." https://www.ecfr.gov/current/title-32/subtitle-B/chapter-XX/part-2004

KEYWORDS: Artificial Intelligence; Machine Learning; Machine Reasoning; Scenario; Multimodal; Counterfactual; Wargame; Bias; Explanations

** TOPIC NOTICE **

The Navy Topic above is an "unofficial" copy from the Navy Topics in the DoD 25.1 SBIR BAA. Please see the official DoD Topic website at www.dodsbirsttr.mil/submissions/solicitation-documents/active-solicitations for any updates.

The DoD issued its Navy 25.1 SBIR Topics pre-release on December 4, 2024. The BAA opens to receive proposals on January 8, 2025, and closes February 5, 2025 (12:00pm ET).

Direct Contact with Topic Authors: During the pre-release period (December 4, 2024, through January 7, 2025), proposing firms have an opportunity to directly contact the Technical Point of Contact (TPOC) to ask technical questions about the specific BAA topic. Once DoD begins accepting proposals on January 8, 2025, no further direct contact between proposers and topic authors is allowed unless the Topic Author is responding to a question submitted during the pre-release period.

DoD On-line Q&A System: After the pre-release period, until January 22, 2025, at 12:00 PM ET, proposers may submit written questions through the DoD On-line Topic Q&A at https://www.dodsbirsttr.mil/submissions/login/ by logging in and following instructions. In the Topic Q&A system, the questioner and respondent remain anonymous, but all questions and answers are posted for general viewing.

DoD Topics Search Tool: Visit the DoD Topic Search Tool at www.dodsbirsttr.mil/topics-app/ to find topics by keyword across all DoD Components participating in this BAA.

Help: If you have general questions about the DoD SBIR program, please contact the DoD SBIR Help Desk via email at [email protected]

Topic Q & A

1/5/25	Q.	1. What specific types of multimodal data (e.g., ISR data, EO/IR imagery, SAR, text, audio) are expected to be prioritized in the initial phases? Are there particular datasets or formats the tool must be compatible with?
		2. What level of granularity is expected for counterfactual reasoning? Should it include detailed scenario rewinds or only high-level alternative outcomes?
		3. Are there specific requirements for the human-to-machine (H2M) supervisory system? For instance, should it support natural language interaction, and if so, are there preferred languages or frameworks?
		4. Beyond accuracy metrics for structured and unstructured data mining, what specific benchmarks will be used to evaluate the AI scenario generator’s performance in reducing human bias and improving decision-making efficiency?
		5. What level of detail is expected from scenario explanations? Should the system generate visualizations (e.g., timelines, maps) in addition to natural language descriptions?
		6. Will representative datasets from sources like the Marine Corps Warfighting Laboratory (MCWL) or AIS be provided during Phase I? Are there restrictions on using simulated or publicly available datasets for initial demonstrations?
		7. For commercial applications, such as video gaming, are there specific functionalities or use cases (e.g., interactive scenario design or player strategy explanations) that should influence the design?
	A.	1. A broad range of data modalities for multispectral imagery and text representation is important. There is no requirement for data formats.
		2. Machine-reasoning technology must be traceable and supported by fine-grained organic evidence to ensure minimum false-alarm rates. Scenarios and potential outcomes must be supported by the evidence at all levels.
		3. Provide natural language processing and visualization (geospatial and temporal) tools for human-to-machine interaction. Commercial translation and communication technologies can be utilized.
		4. Minimum performance requirements and metrics benchmarks must be followed as detailed in the topic. However, depending on scenario applications, each performer must detail their proposed performance measures to ensure accuracy, timely processing, and minimization of false-alarm rates.
		5. See the answers to questions 2 and 3.
		6. Simulated and publicly available data sources are acceptable. Performers with access authorization to DoD datasets must provide a letter of support from the appropriate agencies for the Phase I development.
		7. Interactive scenarios, collaborative human-AI gaming, and explanations of player strategy, tactics, and operations are critical capabilities for both national security and commercial applications.
