Imitation learning for modelling air combat behaviour — an exploratory study

FFI-Report 2023

About the publication

Report number

22/02423

ISBN

978-82-464-3454-4

Format

PDF document

Size

2.8 MB

Language

English

Patrick Gorton, Martin Asprusten, Karsten Bråthen
Fighter pilots commonly use simulators to practise their required tactics, techniques and procedures. The training may involve computer-generated forces controlled by predefined behaviour models. Such behaviour models are typically crafted manually by eliciting knowledge from experienced pilots, and they take a long time to develop. Even so, they generally fall short because they are predictable and lack adaptivity, and instructors must spend time manually monitoring and controlling aspects of these forces. Recent advances in artificial intelligence (AI) research, however, have produced methods capable of training intelligent agents that beat expert human players in complex games such as Go and StarCraft II. Similarly, one may use AI methods to compose advanced behaviour models for air combat, allowing instructors to focus on the pilots' training progression rather than on manually controlling their opponents and teammates. To be useful for pilot training, such intelligent behaviour must be realistic and follow the correct military doctrines. One possible way of achieving this is imitation learning, a type of machine learning (ML) in which agents learn to imitate examples given by expert pilots.

This report summarizes work on optimizing air combat behaviour models using an imitation learning technique. The behaviour models are expressed as behaviour transition networks (BTNs) that control the computer-generated forces. The forces are simulated by the Next Generation Threat System (NGTS), a military simulation application aimed mainly at the air domain. An adapted version of the genetic algorithm NeuroEvolution of Augmenting Topologies (NEAT) optimizes the BTNs to behave similarly to demonstrations of pilot behaviour. As with most ML methods, NEAT requires many consecutive behaviour simulations to yield satisfactory solutions. NGTS is not designed for ML purposes, so a system was developed around NGTS that automatically handles simulation and data management and controls the optimization process.

A set of experiments was performed in which the developed ML system optimized BTNs to imitate example behaviours across three simple air combat scenarios. The experiments show that the adapted version of NEAT (BTN-NEAT) produces BTNs that successfully imitate simple demonstrations. However, the optimization process took considerable time: up to 44 hours of computation, corresponding to 92 days of simulated flight time. The slow optimization was mainly due to NGTS's inability to run fast while remaining reliable. This reliability issue is caused by NGTS's lack of time management, which would have associated the agents' states with simulation time stamps. To optimize behaviour successfully with more complex scenarios and demonstrations, one should simulate the behaviours much faster than real time and with high reliability. We therefore consider NGTS not to be well suited for future ML work. Instead, a lightweight air combat simulation designed for ML purposes, capable of running fast and reliably, is needed.
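
To put the reported cost in perspective, the two figures quoted above (44 hours of computation and 92 days of simulated flight time) can be turned into an aggregate throughput ratio. The short Python sketch below does only that arithmetic. The function and constant names are illustrative. The abstract does not say whether the throughput came from running a single simulation faster than real time, from running several simulations in parallel, or from both, so the result should be read as an overall ratio rather than a per-simulation speed.

```python
# Figures quoted in the abstract for the longest optimization run.
COMPUTE_HOURS = 44    # wall-clock hours of computation
SIMULATED_DAYS = 92   # total simulated flight time, in days


def aggregate_speedup(simulated_days: float, compute_hours: float) -> float:
    """Return simulated hours produced per wall-clock hour of computation."""
    simulated_hours = simulated_days * 24
    return simulated_hours / compute_hours


# Overall ratio only; how it splits between faster-than-real-time execution
# and parallel simulation instances is not stated in the abstract.
speedup = aggregate_speedup(SIMULATED_DAYS, COMPUTE_HOURS)
print(f"Aggregate throughput: about {speedup:.1f}x real time")
```

Under these figures the system produced roughly 50 hours of simulated flight per hour of computation, which gives a sense of scale for the "much faster than real time" requirement stated for more complex scenarios.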
