Machine Clustered and Labeled Decision Tracks Derived from AI-enabled Intent Recognition
Navy SBIR 2020.1 - Topic N201-077
ONR - Ms. Lore-Anne Ponirakis
Opens: January 14, 2020 - Closes: February 26, 2020 (8:00 PM ET)


TITLE: Machine Clustered and Labeled Decision Tracks Derived from AI-enabled Intent Recognition


TECHNOLOGY AREA(S): Human Systems, Information Systems


The technology within this topic is restricted under the International Traffic in Arms Regulation (ITAR), 22 CFR Parts 120-130, which controls the export and import of defense-related material and services, including export of sensitive technical data, or the Export Administration Regulation (EAR), 15 CFR Parts 730-774, which controls dual use items. Offerors must disclose any proposed use of foreign nationals (FNs), their country(ies) of origin, the type of visa or work permit possessed, and the statement of work (SOW) tasks intended for accomplishment by the FN(s) in accordance with section 3.5 of the Announcement. Offerors are advised foreign nationals proposed to perform on this topic may be restricted due to the technical data under US Export Control Laws.

OBJECTIVE: Develop a watchfloor decision aid service, enabled by recent advances in gaming artificial intelligence (AI), that identifies clusters of sub-decision tracks within a decision track for an AI-enabled game plan in which a similar objective or state was met. To be operationally relevant, the service must regulate the frequency of its recommendations and improve their explainability.
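
As a purely illustrative sketch of what regulating recommendation frequency could look like, the Python below gates an agent's recommendations on a confidence threshold and a minimum spacing in time. The class name `RecommendationGate`, the 0.8 threshold, the 60-second interval, and the sample stream are all hypothetical; the topic does not prescribe any particular mechanism.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecommendationGate:
    """Passes a recommendation only if it is confident and not too soon after the last one."""
    min_confidence: float   # hypothetical threshold in [0, 1]
    min_interval_s: float   # hypothetical minimum spacing between recommendations
    _last_emit_s: Optional[float] = None

    def should_emit(self, confidence: float, now_s: float) -> bool:
        # Reject low-confidence recommendations outright.
        if confidence < self.min_confidence:
            return False
        # Reject recommendations that arrive too soon after the previous one.
        if self._last_emit_s is not None and now_s - self._last_emit_s < self.min_interval_s:
            return False
        self._last_emit_s = now_s
        return True


gate = RecommendationGate(min_confidence=0.8, min_interval_s=60.0)
# (time in seconds, agent confidence) pairs; values are invented for illustration.
stream = [(0.0, 0.9), (10.0, 0.95), (70.0, 0.5), (75.0, 0.85), (200.0, 0.99)]
emitted = [t for t, c in stream if gate.should_emit(c, t)]
print(emitted)  # [0.0, 75.0, 200.0]
```

In this toy stream the recommendation at 10 s is suppressed for arriving too soon and the one at 70 s for low confidence, so an operator would see three recommendations instead of five.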

DESCRIPTION: The goal of this SBIR topic is to understand the mechanisms of AI-enabled game play in order to produce optimal strategies for multiple objectives and game states. As the Navy moves towards leveraging AI for decision support, maturing intelligent algorithms for execution plans and explainable AI is imperative. AI algorithms have been shown to produce not only optimal or close to optimal solutions, but also a larger set of eclectic strategies otherwise not derived by humans. An understanding of decision tracks leading to differing solutions/strategies will enable the Navy to be strategic given different mission states. The Navy seeks AI that recommends plans that consist of a set of clustered micro-tasks that optimally lead to the achievement of a specific objective.

Advances in deep reinforcement learning [Ref 1] have enabled agents to take low-level actions at a very high pace in support of higher-level plan execution. Researchers have also shown near-human-level performance for full games [Ref 2] that involve decisions that cut across classic warfighting domains. For the Naval domain, AI that can act confidently but less often, and at the plan level, is desired since it is not feasible to send human-based forces commands at machine speeds. To accomplish this, the Navy seeks a product that can learn clusters of sub-decision tracks (micro-tasks) within a decision track for an AI-enabled plan in which a similar objective or state was met. Given multiple objectives in an AI-enabled game, the topic’s challenge is to use machine learning (ML) to cluster subsets of decisions (micro-tasks) that produce a given objective. These clusters will enable labels for specific game states and provide explainability for an otherwise black-box AI agent. Tracks of micro-tasks will be approved for execution at the plan level as required by an objective. Newly published methods suggest technical feasibility [Ref 3]. Re-planning will have to be done if the state of a mission significantly changes. Furthermore, within a cluster the mature product should be able to rank sub-tracks from optimal to suboptimal. While proposers may utilize any data sets where AI was used, it may be helpful to utilize already published StarCraft data [Ref 4]. Inferring explainability from the actions of agents in StarCraft is an active research area whose accomplishments can be leveraged [Ref 5].
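
To make the clustering and ranking idea concrete, the following is a minimal Python sketch under stated assumptions, not a proposed solution. It assumes each sub-decision track is a list of action names with an outcome score attached. It represents each sub-track as a bag of actions, groups them with a simple greedy "leader" clustering on cosine similarity, and ranks the members of each cluster by score. The action names, scores, 0.8 threshold, and function names are all invented for illustration; a bag of actions also ignores ordering, which a real approach would likely need to capture.

```python
from collections import Counter
from math import sqrt


def action_vector(sub_track):
    """Bag-of-actions representation: how often each action appears in a sub-track."""
    return Counter(sub_track)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_sub_tracks(sub_tracks, threshold=0.8):
    """Greedy leader clustering: join the first cluster whose leader is similar enough,
    otherwise start a new cluster."""
    clusters = []  # each cluster is a list of indices; the first index is the leader
    for i, track in enumerate(sub_tracks):
        vec = action_vector(track["actions"])
        for members in clusters:
            leader = action_vector(sub_tracks[members[0]]["actions"])
            if cosine(vec, leader) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def rank_within_cluster(sub_tracks, members):
    """Order cluster members from best to worst by their (hypothetical) outcome score."""
    return sorted(members, key=lambda i: sub_tracks[i]["score"], reverse=True)


# Toy sub-tracks; action names and scores are invented.
sub_tracks = [
    {"actions": ["scout", "scout", "move", "hold"], "score": 0.6},
    {"actions": ["build", "build", "train", "build"], "score": 0.9},
    {"actions": ["scout", "move", "scout", "hold"], "score": 0.8},
    {"actions": ["train", "build", "build", "build"], "score": 0.4},
]
clusters = cluster_sub_tracks(sub_tracks)
ranked = [rank_within_cluster(sub_tracks, m) for m in clusters]
print(ranked)  # [[2, 0], [1, 3]]
```

Here the two scouting sub-tracks fall into one cluster and the two build-up sub-tracks into another, and each cluster lists its best-scoring member first. A cluster's dominant actions could then serve as a candidate label for the plan it represents.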

Work produced in Phase II may become classified. Note: The prospective contractor(s) must be U.S. owned and operated with no foreign influence as defined by DoD 5220.22-M, National Industrial Security Program Operating Manual, unless acceptable mitigating procedures can and have been implemented and approved by the Defense Security Service (DSS). The selected contractor and/or subcontractor must be able to acquire and maintain a secret level facility and Personnel Security Clearances, in order to perform on advanced phases of this project as set forth by DSS and ONR in order to gain access to classified information pertaining to the national defense of the United States and its allies; this will be an inherent requirement. The selected company will be required to safeguard classified material IAW DoD 5220.22-M during the advanced phases of this contract.

PHASE I: Demonstrate the feasibility of developing operationally relevant techniques to cluster and label decision tracks as plans in an AI-enabled game. Conduct a detailed analysis of literature, commercial capabilities, and state-of-the-art AI/ML techniques relevant to this topic. Identify and begin to mitigate key technical risks to a Phase II prototype. Demonstrate progress. Develop Phase II plans with a technology roadmap, development milestones, and projected Phase II achievable performance.

PHASE II: Move development of prototype techniques from a commercial game to a military simulator such as JSAF, OneSAF, or NGTS. Agent interfaces using JSON messaging can be leveraged. Develop and test against an increasingly complex mission plan that spans all warfighting domains. Develop metrics for decision track clustering and similarity measures. Attempt to identify or develop decision track rankings within clusters. Demonstrate an end-to-end AI-enabled capability at the plan level for at least 3 mission contexts (e.g., sea control or amphibious assault). Work with programs of record and training sites to transition the Phase II prototype.
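
One possible starting point for a decision track similarity measure, offered only as a sketch, is a normalized edit distance over action sequences, which unlike a bag-of-actions comparison is sensitive to ordering. The Python below assumes a decision track is a list of action labels; the labels, plan names, and function names are hypothetical, and nothing here is specific to JSAF, OneSAF, or NGTS.

```python
def edit_distance(a, b):
    """Levenshtein distance between two action sequences (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute, or match at no cost
        prev = curr
    return prev[-1]


def track_similarity(a, b):
    """Similarity in [0, 1]: 1.0 for identical tracks, 0.0 when the edit distance
    equals the length of the longer track."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


# Invented action labels for three toy decision tracks.
plan_a = ["scout", "move", "screen", "strike"]
plan_b = ["scout", "move", "strike"]
plan_c = ["resupply", "hold"]
print(track_similarity(plan_a, plan_b))  # 0.75
print(track_similarity(plan_a, plan_c))  # 0.0
```

A pairwise similarity of this kind could feed a clustering step or serve as one input to a cluster-quality metric; weighting edits by action type or timing would be a natural refinement.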

It is probable that the work under this effort will be classified under Phase II (see Description section for details).

PHASE III DUAL USE APPLICATIONS: Produce a final prototype capable of deployment to training centers, operational command and control centers, and as a virtual application. Adapt the system to transition as a component to a larger system or as a standalone commercial product. Provide a means for performance evaluation with metrics for analysis (e.g., accuracy of assessments) and a method for operator assessment of product interactions (e.g., display visualizations). The Phase III system should have an intuitive human-computer interface. The software and hardware should be modified and documented in accordance with guidelines provided by the engaged Programs of Record and any commercial partners. Technology development should be applicable to any domain that requires the training of end-to-end AI for a complex game or mission simulation.


REFERENCES:

1. Alghanem, Basel and Keerthana, P.G. “Asynchronous Advantage Actor-Critic Agent for StarCraft II.”

2. Sun, Peng, Sun, Xinghai, Han, Lei, Xiong, Jiechao, Wang, Qing, Li, Bo, Zheng, Yang, Liu, Ji, Liu, Yongsheng, Liu, Han and Zhang, Tong. “TStarBots: Defeating the Cheating Level Builtin AI in StarCraft II in the Full Game.”

3. Vezhnevets, Alexander (Sasha), Mnih, Volodymyr, Agapiou, John, Osindero, Simon, Graves, Alex, Vinyals, Oriol and Kavukcuoglu, Koray. “Strategic Attentive Writer for Learning Macro-Actions.” 30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.

4. Lin, Zeming, Gehring, Jonas, Khalidov, Vasil and Synnaeve, Gabriel. “STARDATA: A StarCraft AI Research Dataset.”

5. Penney, Sean, Dodge, Jonathan, Hilderbrand, Claudia, Anderson, Andrew, Simpson, Logan and Burnett, Margaret. “Toward Foraging for Understanding of StarCraft Agents: An Empirical Study.”

KEYWORDS: Artificial Intelligence; StarCraft; Decision Support; Deep Reinforcement Learning; Machine Learning; Plans; AI; ML