DON26BZ01-NV010 — E-2D Large Language Model Entity (ELLMENT)
Award Maximum: $140,000 (Base) / $100,000 (Option)
Period of Performance: 6 months (Base) + 6 months (Option)
Phase Type: Phase I
OBJECTIVE: Develop and implement a traceable, explainable, referenced, and reasoned Large Language Model (LLM) that functions as an on-demand Natural Language Processing (NLP) decision-support assistant for Naval Flight Officers (NFOs) and mission crew aboard a carrier-based, all-weather, tactical battle management, airborne early warning, and command and control aircraft.
DESCRIPTION: Artificial Intelligence/Machine Learning (AI/ML) technologies are transforming how complex data is understood and acted upon in operational environments. This SBIR topic seeks the development of a domain-specific LLM system to support rapid insight generation from structured and unstructured sources: documents (e.g., Tactics, Techniques, and Procedures [TTPs]), mission logs, communications, and other high-volume data relevant to tactical operations.
The goal is to deliver a modular, self-contained AI/NLP solution that can assist NFOs and mission crew by summarizing, reasoning over, and extracting meaning from dense operational material in real time. This LLM must be specifically designed to operate in a stand-alone configuration in accordance with information assurance policies, with mechanisms for traceability (i.e., where the information came from and how it connects to the mission goal), source attribution, and model transparency. The system must also support future extensibility to multi-modal data ingestion.
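To make the traceability and source-attribution requirements concrete, the sketch below shows one way a response payload could carry its supporting citations alongside the generated answer. This is an illustrative assumption, not a specified interface; the field names, document identifiers, and example content are invented for demonstration.

```python
# Sketch of a traceable answer payload: every generated response carries its
# source attributions and a short rationale, so an operator can see where the
# information came from and how it connects to the mission goal.
from dataclasses import dataclass, field

@dataclass
class SourceRef:
    doc_id: str    # identifier of a corpus document (e.g., a TTP or mission log)
    excerpt: str   # the passage the answer draws on

@dataclass
class TraceableAnswer:
    query: str
    answer: str
    sources: list[SourceRef] = field(default_factory=list)
    reasoning: str = ""  # how the cited material supports the answer

# Hypothetical example: the document ID and text are illustrative only.
ans = TraceableAnswer(
    query="What is the reporting procedure for a new radar contact?",
    answer="Cross-check the contact against the track database before classification.",
    sources=[SourceRef("ttp-001", "Radar contact reports are cross-checked "
                                  "against the track database before classification.")],
    reasoning="Procedure restated from the cited TTP passage.",
)
print(ans.sources[0].doc_id)  # every claim remains attributable to a source
```

A structure like this lets downstream evaluation (Phase I activity 4) check mechanically that no answer is emitted without at least one attributed source.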
Work produced in Phase II may become classified.
PHASE I: Define and develop the foundational architecture and baseline capability for implementing Large Language Model Operations (LLMOps) in support of mission decision-aid tools for the E-2D platform. Activities include:
(1) Security, Ethics, and Data Governance Planning — collaborate with relevant Navy civilian representatives to establish appropriate data classification levels, define a cybersecurity framework, and incorporate an ethical AI governance structure.
(2) LLM Selection and Mission Alignment — select an appropriate LLM architecture based on mission-specific demands, with consideration for performance in tactical and technical language domains, model transparency and explainability, and compatibility with in-theater deployment constraints.
(3) Corpus Curation and Model Training — train the selected LLM on an aircraft-relevant corpus including mission-specific TTPs, doctrine documents, and communication logs, using prompt engineering, fine-tuning, and Retrieval-Augmented Generation (RAG).
(4) Evaluation and Output Validation — assess model performance using a comprehensive metrics suite including response accuracy, relevance, bias detection, and trustworthiness.
(5) Deployment Pathways and Phase II Readiness — evaluate and down-select hardware and software deployment options and develop a baseline implementation roadmap.
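The Retrieval-Augmented Generation approach named in activity (3) can be sketched as follows: rank corpus passages against the operator's query, then assemble an augmented prompt in which each passage keeps its source identifier so the model's answer can cite it. The corpus snippets, the term-overlap scoring (a stand-in for a real vector-search index), and the prompt template are all illustrative assumptions, not program requirements.

```python
# Minimal RAG sketch: retrieve the best-matching corpus passage for a query
# and build a source-attributed prompt for the downstream LLM.
from collections import Counter

# Toy stand-in for an aircraft-relevant corpus (TTPs, doctrine, logs).
CORPUS = {
    "ttp-001": "Radar contact reports are cross-checked against the track "
               "database before classification.",
    "doc-014": "Early warning sorties maintain a continuous data link with "
               "the carrier strike group.",
}

def tokenize(text: str) -> Counter:
    """Bag-of-words token counts; a real system would use embeddings."""
    return Counter(text.lower().split())

def retrieve(query: str, k: int = 1) -> list[tuple[str, str]]:
    """Rank corpus passages by term overlap with the query."""
    q = tokenize(query)
    scored = sorted(
        CORPUS.items(),
        key=lambda kv: sum((tokenize(kv[1]) & q).values()),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str) -> str:
    """Assemble the augmented prompt; each passage keeps its source ID so
    the generated answer can cite it (traceability requirement)."""
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in retrieve(query))
    return f"Answer using only the cited context.\n{context}\nQuestion: {query}"

print(build_prompt("How are radar contact reports classified?"))
```

Because the retrieved context is injected at inference time, the underlying model can remain frozen in a stand-alone deployment while the curated corpus is updated independently.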
PHASE II: The developed LLM will be deployed to a stand-alone laboratory environment for rigorous evaluation in an Operator-in-the-Loop (OITL) configuration. NFOs and mission operators will engage with the LLM across representative command and control mission scenarios. Subject Matter Experts will conduct structured evaluations using predefined metrics. To support future scale-up, candidate computing architectures will be assessed, including emerging platforms such as quantum-accelerated processing. A lifecycle monitoring framework will also be established. Work in Phase II may become classified.
PHASE III DUAL USE APPLICATIONS: Upon successful completion of final Verification and Validation (V&V) testing, the developed system will be authorized for transition to designated operational platforms. The capability has garnered interest from ONR Code 32 in connection with Anti-Submarine Warfare (ASW) mission domains. Examples of dual-use applications include predictive maintenance, supply chain optimization, threat detection, and security auditing.
KEYWORDS: Large language model; LLMs; Natural Language Processing; NLP; Multi-modal approaches