The most basic problem of engineering is the design of optimal operators. Design takes different forms depending on the random process constituting the scientific model and the operator class of interest. For classification, the random process is a feature-label distribution, and a Bayes classifier minimizes classification error. Rarely do we know the feature-label distribution or have sufficient data to estimate it. To best use available knowledge and data, this book takes a Bayesian approach to modeling the feature-label distribution and designs an optimal classifier relative to a posterior distribution governing an uncertainty class of feature-label distributions. The origins of this approach lie in estimating classifier error when there are insufficient data to hold out test data, in which case an optimal error estimate can be obtained relative to the uncertainty class. A natural next step is to forgo classical ad hoc classifier design and find a classifier that is optimal relative to the posterior distribution over the uncertainty class, namely an optimal Bayesian classifier.