11 May 2015 Subset selection of training data for machine learning: a situational awareness system case study
M. McKenzie, S. C. Wong
Abstract
Recent advances in machine learning with big data sets have allowed for significant improvements in the optimization of classification and recognition systems. However, for applications such as situational awareness systems, the available data far exceeds what a training set can contain if machine learning optimization times are to remain tractable. Furthermore, the performance of any optimized system is highly dependent on the training set correctly and completely representing the entire data space of scenarios. In this paper we present a technique to characterize the entire data space to ascertain the key factors for representation, and subsequently to select a subset that statistically represents the correct mix of scenarios. We demonstrate the effectiveness of these characterization and subset selection techniques by using a genetic algorithm to optimize the performance of a gunfire recognition system.
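The abstract does not specify how the representative subset is drawn. As a rough illustration of the general idea only, and not the authors' method, the sketch below uses proportional stratified sampling: records are grouped by a scenario label and each group is sampled in proportion to its share of the full set. The function name `stratified_subset`, the "vehicle" and "speech" labels, and the data sizes are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_subset(records, key, subset_size, seed=0):
    """Pick about subset_size records so each scenario keeps its share of the full set.

    Illustrative assumption: scenarios are already labeled, and proportional
    allocation is an acceptable stand-in for "statistically representative".
    """
    rng = random.Random(seed)

    # Group records by their scenario label.
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)

    total = len(records)
    subset = []
    for scenario, members in sorted(groups.items()):
        # Proportional allocation, with at least one sample per scenario
        # so that rare scenarios are never dropped entirely.
        quota = max(1, round(subset_size * len(members) / total))
        quota = min(quota, len(members))
        subset.extend(rng.sample(members, quota))
    return subset


# Hypothetical data: (scenario label, recording id) pairs.
data = (
    [("gunfire", i) for i in range(50)]
    + [("vehicle", i) for i in range(900)]
    + [("speech", i) for i in range(50)]
)
subset = stratified_subset(data, key=lambda r: r[0], subset_size=100)
counts = Counter(label for label, _ in subset)
```

Because each quota is rounded independently and floored at one, the subset size can differ slightly from `subset_size` when there are many small scenarios; a production version would need to reconcile the quotas.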
© (2015) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
M. McKenzie and S. C. Wong "Subset selection of training data for machine learning: a situational awareness system case study", Proc. SPIE 9494, Next-Generation Robotics II; and Machine Intelligence and Bio-inspired Computation: Theory and Applications IX, 94940U (11 May 2015); https://doi.org/10.1117/12.2176536
KEYWORDS
Machine learning
Data modeling
Genetic algorithms
Situational awareness sensors
Error analysis
Statistical analysis
Classification systems
