This research compares alternative performance metrics with those more commonly used in target detection system evaluation. The alternative metrics examined here are the Fisher ratio, a modification of the Dice similarity coefficient, and the Youden index. They are compared with metrics previously introduced for such evaluation: the receiver operating characteristic (ROC) curve (and the related summary area under the ROC curve (AUC) value) and the confidence error generation (CEG) curve (and the related summary root square deviation (RSD) value). The ROC curve is a discrimination metric that measures the ability of a target detection system to distinguish between targets and non-targets. The CEG curve quantifies a detection system's knowledge of its own performance. An approach that combines such metrics is presented; the combination may be dynamically adjusted and updated as current and future evaluation requirements for particular target detection systems change.
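For concreteness, the sketch below shows how two of the metrics named above, the empirical AUC and the Youden index, might be computed from sampled SUT scores. The beta parameters, sample sizes, and function names are illustrative assumptions and are not taken from the research.

```python
import numpy as np

def empirical_auc(target_scores, clutter_scores):
    """Empirical AUC via the Mann-Whitney statistic: the probability that a
    randomly drawn target score exceeds a randomly drawn non-target score
    (ties counted as one half)."""
    t = np.asarray(target_scores)[:, None]
    c = np.asarray(clutter_scores)[None, :]
    return np.mean((t > c) + 0.5 * (t == c))

def youden_index(target_scores, clutter_scores):
    """Youden index J = max over thresholds of (TPR - FPR)."""
    thresholds = np.unique(np.concatenate([target_scores, clutter_scores]))
    tpr = np.array([np.mean(target_scores >= th) for th in thresholds])
    fpr = np.array([np.mean(clutter_scores >= th) for th in thresholds])
    return np.max(tpr - fpr)

# Hypothetical SUT scores drawn from beta densities for illustration only.
rng = np.random.default_rng(0)
targets = rng.beta(5, 2, size=500)   # scores assigned to true targets
clutter = rng.beta(2, 5, size=500)   # scores assigned to non-targets
print(empirical_auc(targets, clutter), youden_index(targets, clutter))
```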
The ability of certain performance metrics to quantify how well a target recognition system under test (SUT) can correctly identify targets and non-targets is investigated. The SUT, which may employ optical, microwave, or other inputs, assigns a score between zero and one that indicates the predicted probability of a target. Sampled target and non-target SUT score outputs are generated using representative sets of beta probability densities. Two performance metrics, the area under the receiver operating characteristic (AURC) and the confidence error (CE), are analyzed. The AURC quantifies how well the target and non-target distributions are separated, and the CE quantifies the statistical accuracy of each assigned score. The CE and AURC were generated for many representative sets of beta-distributed scores, and each metric was calculated and compared using both continuous and discrete (sampling) methods. For the AURC, the two methods agree closely. The continuous and discrete CE are also shown to be similar, but differences arise among discrete CE approaches when bins of different sizes are used. An alternative weighted CE calculation, based on maximum likelihood estimation of density parameters, is identified; this method enables sampled data to be processed using continuous methods.
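A minimal sketch of the continuous-versus-discrete AURC comparison described above, assuming hypothetical beta parameters for the target and non-target score densities: the continuous AURC is obtained by numerical integration, and the discrete AURC from finite samples. The parameter values and sample sizes are illustrative, not those used in the study.

```python
import numpy as np
from scipy import stats, integrate

# Beta parameters chosen for illustration (not taken from the paper).
a_t, b_t = 5.0, 2.0   # target score density
a_c, b_c = 2.0, 5.0   # non-target (clutter) score density

# Continuous AURC: P(target score > clutter score)
#   = integral over x of f_t(x) * F_c(x) dx
auc_cont, _ = integrate.quad(
    lambda x: stats.beta.pdf(x, a_t, b_t) * stats.beta.cdf(x, a_c, b_c),
    0.0, 1.0)

# Discrete (sampling) AURC from finite score sets, as a Mann-Whitney estimate.
rng = np.random.default_rng(1)
t = rng.beta(a_t, b_t, 2000)
c = rng.beta(a_c, b_c, 2000)
auc_disc = np.mean(t[:, None] > c[None, :])

print(f"continuous AURC = {auc_cont:.4f}, sampled AURC = {auc_disc:.4f}")
```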
Probability densities for target recognition performance metrics are developed. These densities assist in the evaluation of systems under test (SUTs), which are systems that predict the presence of a target after examination of an input. After such examination, a SUT assigns a score that indicates the predicted likelihood that a target is present. From scores for a series of many inputs, the suitability of a SUT can be evaluated through performance metrics such as the receiver operating characteristic (ROC) and the confidence error (CE) generation curve. The ROC is a metric that describes how well the probability densities of target and clutter scores are separated, where clutter refers to the absence of a target. The CE generation curve and the corresponding scalar CE are metrics that evaluate the accuracy of the score. Since only a limited number of test scores (scores for which the truth state is known by the evaluator) is typically available to evaluate a SUT, it is critical to quantify uncertainty in the performance metric results. A process for estimating such uncertainty through probability densities for the performance metrics is examined here. Once the probability densities are developed, confidence intervals are also obtained. The process that develops the densities and related confidence intervals is implemented in a fully Bayesian manner. Two approaches are examined: one that makes initial assumptions regarding the form of the underlying target and clutter densities, and a second that avoids such assumptions. The target and clutter density approach is applicable to additional performance metrics.
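As one possible illustration of the assumption-free branch described above, the sketch below approximates a density and credible interval for the AUC with a Bayesian bootstrap over observed scores. This is a generic nonparametric device, not necessarily the construction used in the paper, and the scores and sample sizes are hypothetical.

```python
import numpy as np

def auc_bayes_bootstrap(target_scores, clutter_scores, n_draws=2000, seed=0):
    """Approximate a posterior-style density for the AUC with a Bayesian
    bootstrap: Dirichlet(1,...,1) weights over the observed target and
    clutter scores, weighted Mann-Whitney statistic per draw."""
    rng = np.random.default_rng(seed)
    t = np.asarray(target_scores)
    c = np.asarray(clutter_scores)
    gt = (t[:, None] > c[None, :]) + 0.5 * (t[:, None] == c[None, :])
    draws = np.empty(n_draws)
    for k in range(n_draws):
        wt = rng.dirichlet(np.ones(t.size))   # weights over target scores
        wc = rng.dirichlet(np.ones(c.size))   # weights over clutter scores
        draws[k] = wt @ gt @ wc
    return draws

# Hypothetical test scores for illustration.
rng = np.random.default_rng(2)
draws = auc_bayes_bootstrap(rng.beta(5, 2, 100), rng.beta(2, 5, 100))
lo, hi = np.percentile(draws, [2.5, 97.5])
print(f"95% interval for AUC: [{lo:.3f}, {hi:.3f}]")
```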
The ability of certain performance metrics to quantify how well target recognition systems under test (SUT) can correctly identify targets and non-targets is investigated. The SUT assigns a score between zero and one that indicates the predicted probability of a target. Sampled target and non-target SUT score outputs are generated using representative sets of beta probability densities. Two performance metrics, the area under the receiver operating characteristic (AURC) and the confidence error (CE), are analyzed. The AURC quantifies how well the target and non-target distributions are separated, and the CE quantifies the statistical accuracy of each assigned score. The CE and AURC are generated for many representative sets of beta-distributed scores, and the metrics are calculated and compared using both continuous and discrete (sampling) methods. For the AURC, the two methods agree closely. Differences are shown, however, between calculating the CE from sampled data and calculating it from continuous distributions. These differences arise from collecting similar sampled scores in bins, which weights the CE in proportion to the number of target and non-target scores in each bin. An alternative weighted CE calculation, based on maximum likelihood estimation of density parameters, is identified; this method enables sampled data to be processed using continuous methods.
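The sketch below illustrates a binned, count-weighted CE-style calculation together with maximum likelihood fitting of beta parameters to sampled scores, which allows the sampled data to be handled afterwards with continuous methods. The exact CE definition used in the paper is not reproduced here, so this should be read as an illustrative calibration-type statistic under assumed beta score models.

```python
import numpy as np
from scipy import stats

def binned_ce(scores, labels, n_bins=10):
    """Binned CE-style statistic: in each score bin, compare the mean assigned
    score with the observed fraction of targets, weighting each bin by the
    fraction of all scores (target plus non-target) that fall in it."""
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(scores, edges) - 1, 0, n_bins - 1)
    ce = 0.0
    for b in range(n_bins):
        m = idx == b
        if m.any():
            ce += m.mean() * abs(scores[m].mean() - labels[m].mean())
    return ce

# Hypothetical sampled scores; beta parameters fitted by maximum likelihood
# with the support fixed to [0, 1].
rng = np.random.default_rng(3)
t = rng.beta(5, 2, 500)
c = rng.beta(2, 5, 500)
a_t, b_t, _, _ = stats.beta.fit(t, floc=0, fscale=1)
a_c, b_c, _, _ = stats.beta.fit(c, floc=0, fscale=1)
scores = np.concatenate([t, c])
labels = np.concatenate([np.ones(t.size), np.zeros(c.size)])
print(binned_ce(scores, labels), (a_t, b_t), (a_c, b_c))
```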