A machine-printed OCR problem is selected as an example of a large class of pattern recognition problems. We consider discrimination of alpha and numeric fields, recognition of all numerals, recognition of key words (street suffixes, personal titles), state/city/street names, etc. These operations are performed on destination address blocks (DABs) in the face of numerous variations in type face (laser writer, dot matrix, typewriter, etc.), font, data dropout (due to printing errors), point size, ±5° rotations, etc. An optical correlator with banks of distortion-invariant hierarchical/inference filters appears to be an ideal adjunct to other OCR techniques (AI, parsing, context, use of lexicons, etc.).
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks.
You are receiving this notice because your organization may not have SPIE eBooks access.*
*Shibboleth/Open Athens users: please sign in to access your institution's subscriptions.
To obtain this item, you may purchase the complete book in print or electronic format on SPIE.org.
Markov/Gibbs random fields have been used to pose a variety of computer vision and image processing problems. Many of these problems are then solved using a simulated-annealing-type method that involves varying the temperature, a scale parameter of the model. In this paper we analyze the effect of temperature on random field texture patterns. We obtain new results relating structure in the texture co-occurrence matrix to temperature. We also show the existence of multiple transition temperatures which delimit regions of different bandwidth in the co-occurrence matrix, and hence can be used to control pattern formation.
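As a rough illustration of how temperature controls both pattern formation and co-occurrence structure, the sketch below uses a minimal binary Ising-type Gibbs field (an assumed stand-in, not the paper's model) sampled by Gibbs sampling, and measures a 2x2 co-occurrence matrix over horizontally adjacent pixels:

```python
import numpy as np

def gibbs_texture(n=32, beta=1.0, sweeps=50, seed=0):
    """Sample a binary Ising-type texture by Gibbs sampling.

    beta = 1/T is the inverse temperature: high beta (low T) yields
    large smooth patches; low beta (high T) yields noise-like texture.
    """
    rng = np.random.default_rng(seed)
    s = rng.choice([-1, 1], size=(n, n))
    for _ in range(sweeps):
        for i in range(n):
            for j in range(n):
                # sum of the 4-neighbour spins (toroidal boundary)
                nb = (s[(i - 1) % n, j] + s[(i + 1) % n, j]
                      + s[i, (j - 1) % n] + s[i, (j + 1) % n])
                p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * nb))
                s[i, j] = 1 if rng.random() < p_up else -1
    return s

def cooccurrence(s):
    """2x2 co-occurrence matrix for horizontally adjacent pixel pairs."""
    a = (s[:, :-1] > 0).astype(int)
    b = (s[:, 1:] > 0).astype(int)
    c = np.zeros((2, 2))
    for x, y in zip(a.ravel(), b.ravel()):
        c[x, y] += 1
    return c / c.sum()
```

Lowering the temperature (raising beta) concentrates co-occurrence mass on the diagonal, since neighbouring pixels increasingly agree; this is the kind of temperature-dependent structure in the co-occurrence matrix that the paper analyzes.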
We address the restoration problem for noisy and degraded signals. Novel algorithms for suboptimal MAP estimates have been developed using the A* and genetic algorithms (GAs). The experiments carried out have shown suboptimal A* (SA*) and suboptimal genetic (SGA*) algorithms to be competitive with dynamic programming (DP) for MAP estimation, and that the use of GAs (in SGA*) provides limited gains over SA*. In terms of restoration quality, the suboptimal approaches yield a solution that on the average is only 5% worse than that provided by DP as the noise and/or signal size increase. Our experiments suggest that for limited amounts of noise (about 10%) suboptimal MAP estimates compare favorably against DP in terms of runtime complexity.
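For context, the DP baseline against which the suboptimal estimates are compared can be sketched for a 1-D signal, assuming a quadratic noise term and a quadratic smoothness prior (an illustrative energy, not necessarily the paper's); the Viterbi-style recursion then gives the exact MAP estimate:

```python
import numpy as np

def dp_map_restore(y, levels, noise_w=1.0, smooth_w=1.0):
    """Exact MAP restoration of a 1-D signal over a discrete level set
    by dynamic programming. Energy per site i:
      noise_w * (x_i - y_i)^2 + smooth_w * (x_i - x_{i-1})^2
    """
    n, k = len(y), len(levels)
    cost = noise_w * (levels - y[0]) ** 2       # best energy ending in each level
    back = np.zeros((n, k), dtype=int)
    for i in range(1, n):
        trans = smooth_w * (levels[None, :] - levels[:, None]) ** 2
        total = cost[:, None] + trans           # [prev_level, cur_level]
        back[i] = total.argmin(axis=0)          # best predecessor per level
        cost = total.min(axis=0) + noise_w * (levels - y[i]) ** 2
    # backtrack the optimal label sequence
    x = np.empty(n, dtype=int)
    x[-1] = cost.argmin()
    for i in range(n - 1, 0, -1):
        x[i - 1] = back[i, x[i]]
    return levels[x]
```

The suboptimal A* and GA searches explore the same energy landscape but prune the state space rather than enumerating it exhaustively as DP does.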
We present a method, robust to environmental conditions, that detects the presence of space-filling objects such as cars and people. The method is unaffected by uniform or local variations in image brightness induced by lighting conditions, weather, or the shadows of other objects such as buildings, trees, and clouds. Moreover, it does not depend on the position, size, or background pattern of the regions, or on the shapes and colors of the objects to be detected. The method consists of three steps: (1) normalize the brightness of the target and reference images using the mean and variance of the brightness in the respective regions, where the reference image represents the background scene without objects and is taken from the same camera position as the target images; (2) calculate normalized principal component features from both sets of normalized image brightness; (3) use the features to construct a classifier by statistical learning. The proposed features are determined by the variance and covariance of the brightness of both images. They are a better measure of the correspondence of two images than conventional features such as the correlation coefficient of image brightness or statistics calculated from a difference image. The method is applied to car detection in a parking lot. The experimental images were collected over a one-year period under various conditions; at least 98% of the cars were always correctly detected. Applications to moving-car detection and to person detection in a hall are also presented. Since the proposed algorithm for object detection is robust under various environmental conditions and is object independent, it is well suited to a wide range of facilities offering automatic surveillance, automatic counting, automatic recognition of scene situations, etc.
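A minimal sketch of the normalization step, assuming plain mean/variance normalization per region; the covariance of the two normalized regions (here equivalent to the correlation coefficient, an illustrative stand-in for the paper's normalized principal component features) is invariant to gain and offset changes in illumination:

```python
import numpy as np

def normalize_brightness(region):
    """Zero-mean, unit-variance normalization of a region's brightness,
    which cancels uniform gain/offset changes caused by lighting."""
    r = region.astype(float)
    std = r.std()
    return (r - r.mean()) / (std if std > 0 else 1.0)

def correspondence_features(target, reference):
    """Variance/covariance features comparing a target region with the
    empty-background reference region (illustrative, not the paper's
    exact feature set)."""
    t = normalize_brightness(target)
    r = normalize_brightness(reference)
    cov = (t * r).mean()              # covariance of normalized brightness
    diff_var = ((t - r) ** 2).mean()  # residual after normalization
    return np.array([cov, diff_var])
```

A region that differs from the background only by illumination yields covariance 1 and zero residual, while an occupied region breaks the correspondence regardless of the object's shape or color.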
We study traffic flow measurement systems that recognize vehicles and measure traffic parameters by image processing. We propose a new recognition method for moving vehicles in such systems using the directional-temporal plane transform (DTT). The DTT transforms spatiotemporal data into 2-D data on a directional-temporal plane. The transformation is performed by projecting the feature data of vehicles onto a directional axis roughly parallel to their moving loci, and by placing the projected data streams side by side in temporal order. We also present an effective algorithm for extracting vehicles from the 2-D data obtained by the DTT. Experimental results using real images demonstrate the effectiveness of this method.
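The projection-and-stacking idea can be sketched as below, assuming the directional axis coincides with an image axis (the paper allows any axis roughly parallel to the motion); a moving vehicle then appears as a slanted streak in the 2-D DTT image:

```python
import numpy as np

def dtt(frames, axis=0):
    """Directional-temporal transform sketch: project each frame's
    feature data onto a directional axis by summing along the other
    axis, then place the 1-D projections side by side in time order."""
    profiles = [f.sum(axis=axis) for f in frames]
    return np.stack(profiles, axis=1)   # columns ordered by time
```

Extracting vehicles then reduces to finding streaks in this much smaller 2-D array instead of searching the full spatiotemporal volume.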
The role of features versus the whole in the learning of human facial expressions is explored. A pyramid-like modular network has been developed to learn and identify hand-drawn facial expressions. Because of the nature of the network architecture, image size becomes less of an issue in network learning. The network exhibits a parallel learning capability which could be used to speed up the training process. An analysis of the hidden units of the network reveals that features are used in learning when there is commonality of facial features in the training patterns. We have also demonstrated attention focusing in the network by masking off specific areas of the face during testing. Our network model creates a "leaner" representation of the original face object, and classification is based on this representation. By including the leaner representation and separate key features in the final training set we can simulate a coarse-to-fine search method, as in image processing.
This paper proposes a new approach for extracting features from face images that offer robust face identification against image variations. We combine the K-L expansion technique with two new operations that transform the face pattern into an invariant feature space. The two operations are the affine transformation, which yields a standard face view from the input face image, and its transformation into the Fourier spectrum domain, which develops the property of shift-invariance. Although the basic idea of applying the K-L expansion to extract features for face recognition originates from the eigenface approach proposed by Turk and Pentland, our scheme offers superior performance due to the transformation into the invariant feature space. The performance of the two schemes for face identification against various imaging conditions is compared.
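The second operation and the K-L expansion can be sketched as follows, assuming a plain 2-D FFT amplitude spectrum for shift invariance and PCA computed by SVD (the function name and the choice of k are illustrative, not the paper's):

```python
import numpy as np

def shift_invariant_features(images, k=2):
    """Map each (already view-normalized) face image to its Fourier
    amplitude spectrum, which is invariant to circular translation,
    then apply the K-L expansion (PCA) to the ensemble of spectra."""
    spectra = np.array([np.abs(np.fft.fft2(im)).ravel() for im in images])
    x = spectra - spectra.mean(axis=0)
    # K-L expansion: principal axes of the centered spectrum ensemble
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    basis = vt[:k]
    return spectra @ basis.T, basis
```

Because a spatial shift changes only the phase of the spectrum, two shifted copies of the same face map to identical feature vectors, which is what distinguishes this scheme from eigenfaces on raw pixels.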
Automatic recognition of human faces is a frontier topic in computer vision. In this paper, a novel approach to human face recognition is proposed, based on a statistical model in the optimal discriminant space. The singular value vector has been proposed to represent the algebraic features of an image. This kind of feature vector has important algebraic and geometric invariance properties and is insensitive to noise. Because the singular value vector is usually of high dimensionality, and a recognition model based on such feature vectors poses a small-sample-size problem that has not been solved completely, dimensionality compression of the singular value vector is necessary. In our method, an optimal discriminant transformation is constructed to map the original singular value vector space into a new space of significantly lower dimensionality. Finally, a recognition model is established in the new space. Experimental results show that our method has very good recognition performance: recognition accuracies of 100 percent are obtained for all 64 facial images of 8 classes of human faces.
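The algebraic feature extraction can be sketched as below, keeping the leading singular values of the image matrix (the truncation length k is an assumed parameter; the discriminant-space compression that follows it in the paper is omitted):

```python
import numpy as np

def singular_value_vector(image, k=8):
    """Algebraic feature vector: the k largest singular values of the
    image matrix. Singular values are unchanged by orthogonal transforms
    of the rows/columns (e.g. permutations, reflections, transposition)
    and degrade gracefully under additive noise."""
    s = np.linalg.svd(image, compute_uv=False)  # sorted descending
    return s[:k]
```

These invariance properties are why the vector serves as a stable algebraic signature of the face image before the discriminant transformation is applied.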
We have obtained a means for identifying objects from their contours, despite affine-transform-induced distortions, that takes the form of a linear signal-space decomposition. This new technique also yields robust estimates of the affine transformation, from which the 3-D rotations of a near-planar object may be obtained. The ability to determine object identity and orientation from a single model representation, without iteration or combinatorial search, proceeds from the use of affine invariant differential measures that may be derived via Lie group theory. The resulting technique is extremely robust in the presence of noise (or nonplanarity of the object) owing to the error rejection properties of the signal space projection operations. The resulting algorithm is amenable to high-speed implementation with digital signal processing hardware architectures because it can be reduced to a sequence of linear 1-D signal processing operations. Included in this paper are a number of demonstration results that illustrate the resilience of the solutions in the presence of severe nonaffine distortion and pixelization error.
The interpretation and recognition of noisy contours, such as silhouettes, have proven to be difficult. One obstacle to the solution of these problems has been the lack of a robust representation for contours. In this paper, we present an analytical representation for contours. We introduce a smoothing criterion for the contour that optimizes the tradeoff between the complexity of the contour and proximity of the data points. We describe the computation of the contour representation, the computation of relevant properties of the contour, and the potential application of the representation and smoothing paradigm to contour interpretation and recognition.
A simple method for recognizing printed alphanumerics is discussed. The proposed method is a simple rule-based structural method that recognizes printed alphanumerics in image scanner data based on the thinning operation. This paper also presents a major achievement made toward the development of a fast hierarchical recognition scheme for printed and handwritten facsimile data. Conventional thinning techniques give good results for high-resolution image scanner data, but they suffer drawbacks for low-resolution data. Our scheme recognizes 55 characters per second in an IBM PC/386 environment, and the recognition rate is 98%.
A preliminary investigation confirmed the possibility of assessing the translation and rotation information content of simple binary images viewed at the output of a sensor array. In this paper, following a brief summary of the essence of the techniques used, we show how the translation and rotation component of the information are related to the overall information associated with a particular pattern. The overall information may be regarded as the information associated with a particular input pattern which may occupy any of the possible orientations and positions with equal probability. Simple rectangular patterns are used to illustrate the results, which are discussed in detail, but the technique is applicable to any shape.
The performance of existing algorithms is generally assessed without any consideration of the architecture or technology that would be used for implementation. In many instances, the performance of an algorithm is degraded when it is implemented on a specific architecture. To date, no formal assessment method exists that considers the effect of implementation constraints on the performance of algorithms. In this investigation, the authors present a novel structured approach that can be used to assess the suitability of implementing algorithms on a specific architecture and to compare the performance of different architectures. The performance is measured in terms of a figure of merit that combines both accuracy of results and implementation efficiency. Some special considerations that can further enhance the performance are also mentioned.
A hierarchical shape decomposition method called Convex-Hull Carving, derived from Sklansky's Concavity Tree and designed to accommodate the incorporation of human flexible resolution visual perception strategies in machine recognition, is proposed. The method characterizes an arbitrary complex shape at multiple hierarchical levels starting from a gross perspective of the entire shape itself, and progressing to decomposed and quantified convex sub-shapes, etc. Calculation complexity and the amount of data to be processed for object recognition applications are reduced. Sklansky's Concavity Tree is a hierarchical arrangement for describing nonconvex shapes. The concavity tree of a shape is defined as a tree describing the hierarchical arrangement of concavities; i.e., concavities within concavities. In the proposed Convex-Hull Carving method, the concavity tree structure is converted to a structure analogous to a chemical molecule. Tree components represent the `atoms' of the molecule and are characterized by their geometric position and a recently defined quantitative shape attribute called the shape quantifier. In addition, the number of hierarchical levels of shape description employed during recognition is driven by: (1) meeting `need to discriminate' criteria; or (2) the determination that all components (`atoms') are convex within predefined acceptance criteria (i.e., no further reduction is possible). The method was implemented to classify a set of two-dimensional aircraft shapes. Results showed that the method is stable with variation of rotation, scaling, and image resolution factors, as well as small viewing angle projection changes.
A modification of the Hough transform has been devised and tested. It incorporates a post-processing stage in which the voting edge points are tested according to perceptual criteria. The perceptual criteria are derived from the Gestalt psychologists' work on characterizing the human vision system and include similarity in intensity, similarity in color, good boundary continuity, etc. Edge points which fail these criteria are eliminated before the final vote in the Hough transform is taken. The method allows weak, but perceptually significant, information to be retained, even in the presence of noise. The method has been applied to the detection of curved boundaries in images of the human colon.
Transformations play an important role in image processing. This paper describes an algorithm for transforming a discrete gray level image from square to hexagonal pixel representation. Pixels in a square grid have four closest neighbors, while those in a hexagonal grid have six. The algorithm is explained and then presented in the language Dataparallel C -- a portable high level language being developed at Oregon State University to express data parallel algorithms.
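The paper's implementation is in Dataparallel C; as a serial illustration, one common square-to-hex resampling scheme (a "brick wall" layout in which alternate rows are shifted half a pixel, an assumed scheme that may differ from the paper's exact interpolation) can be sketched in Python:

```python
import numpy as np

def square_to_hex(img):
    """Resample a square-grid image onto a hexagonal grid approximated
    as a brick wall: odd rows are shifted half a pixel horizontally,
    and each shifted pixel takes the average of the two square pixels
    it straddles. In this layout each interior pixel has six nearest
    neighbours instead of four."""
    out = img.astype(float)
    # linear interpolation at half-pixel positions on odd rows
    out[1::2, :-1] = 0.5 * (img[1::2, :-1] + img[1::2, 1:])
    return out
```

Each output row is independent of the others, which is what makes the transform a natural fit for the data-parallel formulation the paper describes.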
Set erosion is an efficient algorithm which has been used to recognize shapes irrespective of orientation, translation, and scaling. The technique has successfully recognized complex shapes, even when two shapes overlap. The uncertainty in the measured estimate of scaling rose to 8% from the 2% figure obtained for separate shapes. The image picture is segmented between shape and background. The orientation and length of each side or arc on the perimeter of the shape is extracted using a chain code based technique and a set composed of the orientation and angle information formed. This set of data is then morphologically eroded with the orientation/angle spectra of each of the shapes in a predefined library of reference shapes, the reference shapes being scaled to the acquired image data. If the set of angle/weight reference data is contained within the acquired set, the reference shape is recognized as being part of the solution. The required shift of the reference spectrum to match the acquired spectrum yields the rotation of the shape relative to the reference data. Scale information is generated as part of the preconditioning of the reference data prior to the erosion process. Location data are generated by tagging extracted vertices within the chain code extraction of side data.
We define the concept of size functions. They are functions from the real plane to the natural numbers which describe the `shape of the objects' (seen as submanifolds of a Euclidean space). We give two different techniques of computation of size functions and some actual examples of computation. Moreover, we present the concept of deformation distance between manifolds (i.e., curves, surfaces, etc.). It is a distance which measures the `difference in shape' of two manifolds. Finally we point out the link between deformation distances and size functions.
It is well known that edges have a wide variety of intensity profiles. Current edge detection strategies are generally based on one or at most a very few intensity profiles. Some strategies require the use of a detector 'tuned' to the profile of the edge under consideration, thereby implementing a matched filter. Since the edge profile is unknown a priori, many detectors are used at each pixel. The problem is then to decide which detector output is to be believed. This is the response combination problem. If the profile changes along the edge, these strategies either detect more than one edge, displaced from the actual edge location, or break the edge into segments, losing connectivity information. The problem with these edge detection approaches is that they invariably incorporate a profile-dependent edge model, either implicitly or explicitly. Further, they make little use of the 2-dimensional information inherent in the image. The proposed algorithm addresses both of these issues. Our edge model is based on three assumptions: (1) edge detection and edge localization are two distinct operations; (2) large magnitudes of second directional derivatives of intensity exist in the close neighbourhood of valid edges; and (3) the variation in orientation of edge elements (edgels) in the close neighbourhood of edges is very low, since the directions which maximize the second directional derivatives at all pixels in a small neighbourhood of the edge are similar. Research in human visual psychophysics and constraints on the dynamic intensity range of practical imaging systems support assumptions 1 and 2. Assumption 3 is based on the spatial coherence of three-dimensional objects, and the corresponding spatial coherence of their images. Thus, in the proposed algorithm, edge detection is the task of locating narrow regions, or edge ribbons, containing edgels having sufficiently similar orientations at which the magnitude of the second directional derivative of intensity is maximized.
The consistency of edgel orientation is shown to be a measure of the local signal-to-noise ratio. Edge localization is performed by computing the centroid of the distribution of second directional derivative magnitudes over segments of the edge ribbon which span its width, and for which edgel orientation is sufficiently similar. Use of the centroid for edge localization permits sub-pixel resolution. Psychophysical evidence supports this localization paradigm.
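The centroid localization step along one cross-section of an edge ribbon can be sketched in a few lines (a 1-D illustration under the stated assumptions; the paper applies it to ribbon segments spanning the ribbon's width):

```python
import numpy as np

def subpixel_edge_location(positions, second_deriv_mags):
    """Localize an edge as the centroid of second-directional-derivative
    magnitudes across a ribbon cross-section, giving a sub-pixel result."""
    m = np.asarray(second_deriv_mags, dtype=float)
    p = np.asarray(positions, dtype=float)
    return (p * m).sum() / m.sum()
```

Because the estimate is a weighted average rather than a pixel index, it is not quantized to the sampling grid, which is the source of the sub-pixel resolution claimed above.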
Mask matching is a method for edge detection that convolves patterns in various orientations with the given image. The orientation that gives the best match at a given point is taken as the edge orientation at that point, and the magnitude of this best match as a measure of the edge strength. However, detectors are usually designed heuristically or are based on some assumed distribution of pixels, so selecting an appropriate edge detector for a specific type of image is by no means easy. This paper presents a connectionist procedure for learning the appropriate edge detector for a specific class of images. The delta learning rule is used to train a neural network, automating the effort of designing edge detectors. The experimental results show that this learning approach is promising.
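A minimal sketch of delta-rule learning of a single 3x3 edge mask from labeled patches (one sigmoid unit rather than the paper's full network; patch size, learning rate, and labels are assumed for illustration):

```python
import numpy as np

def train_edge_detector(patches, labels, epochs=1000, lr=0.5, seed=0):
    """Delta-rule training of one sigmoid unit whose 9 weights form a
    3x3 convolution mask; labels are 1 for edge patches, 0 otherwise."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=9)
    b = 0.0
    x = np.array([p.ravel() for p in patches], dtype=float)
    y = np.array(labels, dtype=float)
    for _ in range(epochs):
        out = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        err = y - out
        # delta rule: step along error * sigmoid derivative * input
        grad = err * out * (1.0 - out)
        w += lr * (x.T @ grad)
        b += lr * grad.sum()
    return w.reshape(3, 3), b
```

The learned weight array plays the role of the hand-designed mask: convolving it over an image and thresholding the unit's output flags edge pixels for the patch class it was trained on.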
There is considerable ambiguity in the literature as to how much information can be derived from the occluding contour in shape-from-shading reconstruction. It is often argued that because the derivatives p = ∂z/∂x and/or q = ∂z/∂y are infinite on occluding contours, such contours cannot supply the initial slope values needed for integrating the image irradiance equation by the characteristic strip method. By this method surface shape along the strip is uniquely determined by image data and the initial slopes at just one starting point. By contrast, the variational approach to shape-from-shading claims to make full use of information contained in the shape of the occluding contour in the form of boundary values set for the appropriate second-order Euler PDEs, which are solved as though the solution had a global dependence on its boundary values. To analyze this controversy, we transform the characteristic strip equations in surface gradients into equivalent equations in surface normals. A nice system of equations was obtained, with no singularities on the limb of the object, which, we hoped, could be integrated starting right from the occluding contour. It turned out, however, that for most reflectance functions (one exception is the familiar eikonal equation) the occluding contour is the envelope of the family of characteristics and is itself a solution to the characteristic strip equations, which at the occluding boundary turn out to have multiple solutions. This is the true reason precluding numerical integration of these equations from the occluding contour.
A new method, termed the weighted Hough transform (WHT), is proposed. The advantage of the WHT is that it can be applied to the differential image directly, without the need for thresholding: the contribution of each pixel to the parameter domain is weighted according to its value. It is well known that the performance of the conventional Hough transform depends on the threshold value used. The new method is therefore a generalization of the Hough transform that overcomes this problem.
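The weighting idea can be sketched for line detection in the (rho, theta) parameterization, with each pixel of the gradient-magnitude image voting in proportion to its value (bin counts and discretization are illustrative choices, not the paper's):

```python
import numpy as np

def weighted_hough(grad, n_theta=90):
    """Weighted Hough transform for lines: every nonzero pixel of the
    gradient-magnitude image votes for all lines through it, with a
    weight equal to its value, so no edge threshold is needed."""
    h, w = grad.shape
    diag = int(np.ceil(np.hypot(h, w)))       # max |rho|
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((2 * diag + 1, n_theta))
    ys, xs = np.nonzero(grad)
    for y, x in zip(ys, xs):
        rho = x * np.cos(thetas) + y * np.sin(thetas)
        idx = np.round(rho).astype(int) + diag
        acc[idx, np.arange(n_theta)] += grad[y, x]   # vote ∝ pixel value
    return acc, thetas, diag
```

A long, faint line thus accumulates more total weight than a few strong noise pixels, which is exactly the behaviour lost when a fixed threshold discards weak gradients before voting.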
Segmentation paradigms are based on manipulating some distinguishing characteristic of the various objects that are present in the image being analyzed. These characteristics are often well known for given objects in a given class of images. The intelligent use of this knowledge can simplify the segmentation process without necessarily targeting it for a particular class of images. This paper outlines a segmentation paradigm that uses models which characterize the expected presentation of possible image objects. It explains how knowledge of the expected localization of certain objects can be used to refine the segmentation process, to optimize object extraction and identification, and to learn some invariant characteristics of the objects and their surroundings, for use by high level intelligent processes. We present results of experiments with MRI human brain scans, dental radiographs, and transmission electron microscope (TEM) serial sections of hemocytes (insect blood cells).
This paper addresses the boundary detection problem for textured images using weak continuity constraints on the local statistics by a constrained graduated nonconvexity (CGNC) method. A parameter vector consisting of a set of first- and second-order statistics of the textured image assumed to be a Gaussian Markov random field (GMRF) is estimated locally for each pixel. This vector is then compressed to a single parameter and this parameter is considered as the data value at that pixel. We assume a model for textured images where these data values are allowed to change smoothly within textures of the image and abruptly across texture boundaries. The problem can then be considered as the reconstruction of piecewise smooth parameter surfaces measured in noise. For the solution of this problem, we adopted a weak continuity constraints approach. The weak-membrane is specified by its associated energy function constrained by a line process that organizes the boundaries, and the estimates for the parameter values are obtained by minimizing this energy using a continuation method.
Noise and digitization effects are a real concern in the segmentation of 3-D images, and there is a need for a technique that handles them without greatly increasing the computational effort. In this paper we present a new, computationally efficient 3-D object segmentation technique suitable for noisy images. The technique is based on detecting edges in the image; each edge belongs to one of four categories: fold edges, semi-step edges, boundary edges, and smooth edges. The 3-D image is sliced to create equidepth contours (EDCs), and four types of critical points extracted from the EDCs indicate edge regions in the 3-D image. Sparse, reliable edge pixels are extracted first using these critical points, and edges are then grown from them through the application of a set of proposed masks. The constraints of the masks can be adjusted according to the noise present in the image. The total computational effort remains low because the masks are applied only in small neighborhoods of the critical points (edge regions), rather than at every pixel in the scene. Further, the algorithm can run in parallel, since edge growing from different edge regions can be carried out independently.
Image segmentation, the partitioning of an image into meaningful parts, is a major concern of any computer vision system. The meaningful parts of a text image are lines of text, words, and characters. In this paper, the segmentation of pages of text into lines, and of lines into characters, on a parallel machine is examined. Using a parallel machine for text image segmentation allows techniques that are impractical on a serial machine because of the computation time they require. On a parallel machine, text images can be segmented into lines using spatial histograms with an accuracy of 97.9% at a speed of 30 milliseconds or less per character. Statistically adaptive rules based on dynamic adaptive sampling are used for line segmentation and for improved accuracy of character segmentation. Line segmentation can also be accomplished with a set of statistically adaptive rules that allow sloped lines of text to be segmented; using these statistical rules on a parallel machine increases processing time by no more than 1 millisecond per character. Combining statistical rules with knowledge about the printed style increases segmentation accuracy to 99.2% correct for machine-printed text and 89.6% for hand-printed text.
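The spatial-histogram step can be sketched serially: project ink counts onto image rows and split at empty rows. The function below is a hypothetical single-pass version; the paper's implementation distributes this work across a parallel machine and adds statistically adaptive rules for sloped lines:

```python
import numpy as np

def segment_lines(binary_img):
    """Split a binary text image (1 = ink) into line bands using a
    row-projection histogram. A serial sketch of the spatial-histogram
    idea; returns (start_row, end_row_exclusive) pairs."""
    profile = binary_img.sum(axis=1)           # ink count per row
    ink = profile > 0
    lines, start = [], None
    for r, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = r                          # a line begins
        elif not has_ink and start is not None:
            lines.append((start, r))           # the line ends
            start = None
    if start is not None:
        lines.append((start, len(ink)))
    return lines
```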
The computation of structural descriptions of objects in images can contribute to many image analysis tasks including measurement, registration, and object formation and identification. Edge-based structural descriptions fail to support the creation of coherent objects. Medial descriptions provide better support for object formation and measurement. We demonstrate an artificial visual system that uses outputs of Gaussian derivative filters to infer a multiscale medial axis (MMA) in 2-D grayscale images. Properties of the MMA are illustrated.
This paper describes the application of regularization techniques to the problem of segmenting range images. We propose a new energy functional that varies the amount of smoothing according to the gradient of the data. Iterative reconstruction using this functional improves the signal-to-noise ratio of the noisy input image while preserving discontinuities well, circumventing the usual difficulty of regularization-based segmentation, namely smoothing over discontinuities. The results indicate that the algorithm performs especially well on noisy range images, and reconstruction with the new energy functional also shows promise for image enhancement. An algorithm is described for the detection of zeroth-order discontinuities and surface reconstruction; we also discuss how the same algorithm can be applied to detect first-order discontinuities and to perform gradient reconstruction.
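A rough 1-D analogue of gradient-dependent smoothing can be sketched as follows. The weight form `w = 1/(1 + (grad/k)^2)` and all parameter values are assumptions standing in for the paper's energy functional:

```python
import numpy as np

def adaptive_smooth(d, lam=2.0, k=0.5, iters=500, step=0.05):
    """1-D sketch of discontinuity-preserving regularization: the
    smoothness weight w falls off where the data gradient is large,
    so steps are not blurred away."""
    dg = np.abs(np.diff(d))
    w = 1.0 / (1.0 + (dg / k) ** 2)   # small weight across steep edges
    u = d.copy()
    for _ in range(iters):
        du = np.diff(u)
        g = 2.0 * (u - d)             # data-fidelity gradient
        s = 2.0 * lam * w * du        # weighted smoothness gradient
        g[:-1] -= s
        g[1:] += s
        u -= step * g
    return u
```

On a noisy step, noise within each flat region is averaged out while the step itself survives, because the smoothness weight across the jump is small.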
In their range image segmentation algorithm, Besl and Jain proposed the use of erosion for extracting seed regions; based on these seed regions, an initial coarse segmentation is refined by an iterative region growing method. In this paper we suggest a simple distance transform for seed region extraction and discuss the various ways of actually computing it. In particular, we compare two algorithms for this purpose: the well-known two-scan algorithm and another based on a fast computation method for the morphological erosion operation. Both theoretical analysis and experimental results show that our new seed region extraction algorithm is more efficient than that of Besl and Jain.
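A two-scan (chessboard) distance transform of the kind compared in the paper can be sketched as follows; treating the image border as background is an assumption of this sketch. Pixels of maximal distance then serve as seed candidates:

```python
import numpy as np

def two_scan_chessboard(mask):
    """Two-pass chessboard distance transform over a binary region mask
    (True = inside). Forward scan propagates from the upper-left causal
    neighbors, backward scan from the lower-right ones."""
    h, w = mask.shape
    d = np.where(mask, h + w, 0).astype(int)
    fwd = ((-1, -1), (-1, 0), (-1, 1), (0, -1))
    bwd = ((1, 1), (1, 0), (1, -1), (0, 1))
    for ys, xs, nbrs in ((range(h), range(w), fwd),
                         (range(h - 1, -1, -1), range(w - 1, -1, -1), bwd)):
        for y in ys:
            for x in xs:
                if mask[y, x]:
                    for dy, dx in nbrs:
                        yy, xx = y + dy, x + dx
                        if 0 <= yy < h and 0 <= xx < w:
                            d[y, x] = min(d[y, x], d[yy, xx] + 1)
                        else:
                            d[y, x] = min(d[y, x], 1)  # border = background
    return d
```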
This paper presents a new region segmentation method that uses a perceptual color system. In a typical algorithm using color information, an input image is first divided into two parts, regions with high saturation and regions with low saturation, determined by a single chroma thresholding operation. However, for outdoor images such as traffic scenes, in which the chroma distribution is likely to be concentrated around low saturation values, this approach is not useful. Perceptually, it is not only chroma but also hue and brightness value that decide whether a region is highly saturated. We propose a new method that determines a variable chroma threshold value: a chroma function with two parameters, hue and brightness value, is constructed from the well-known Munsell color solid, whose complex shape is determined by hue and brightness. Using this function, the chroma threshold for a region is computed from the region's hue and brightness value. We provide experimental evidence for this method on outdoor images.
This paper presents a new hybrid method that combines the scale space filter (SSF) and Markov random field (MRF) models for color image segmentation. Using the scale space filter, we separate the histogram, smoothed at different scales, into intervals corresponding to peaks and valleys. The MRF is constructed as a joint probability over label images given the original data: the original data is the image obtained from the source, and the result is called the label image. Because the MRF requires the number of segments before it converges to the global minimum, we use the scale space filter for coarse segmentation and then the MRF for fine segmentation. Finally, we compare experimental results obtained using SSF alone with those obtained by combining it with MRF optimization via iterated conditional modes (ICM) and Gibbs sampling.
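The coarse SSF stage can be sketched as histogram smoothing followed by valley detection. The single fixed scale below is an assumption of the sketch, whereas the paper tracks peak/valley structure across a stack of scales before MRF refinement:

```python
import numpy as np

def histogram_intervals(hist, sigma=4.0):
    """Smooth a histogram with a Gaussian and cut it at the local minima
    (valleys); each returned interval corresponds to one peak."""
    x = np.arange(-int(3 * sigma), int(3 * sigma) + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    s = np.convolve(hist, g / g.sum(), mode='same')
    valleys = [i for i in range(1, len(s) - 1)
               if s[i] < s[i - 1] and s[i] <= s[i + 1]]
    cuts = [0] + valleys + [len(hist)]
    return [(cuts[i], cuts[i + 1]) for i in range(len(cuts) - 1)]
```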
Image segmentation is one of the most important problems in computer vision. Recently, Fairfield proposed an interesting approach to image segmentation using toboggan enhancement followed by naive contrast segmentation, a noniterative, linear execution time method. The way it operates can be thought of as tobogganing down the first-derivative terrain, i.e., the graph surface of a discontinuity measure computed from the first derivative of the image intensity. The segmentation results it produces appear equal in quality to those of more complex optimal region growing methods. In this paper, an improved version of Fairfield's method, called keep-sliding toboggan segmentation, is presented. With our method, the toboggan keeps sliding across flat plateaus in the derivative terrain where the original method would stop; our method therefore produces far fewer regions than the original. Further improvements are as follows. Instead of being followed by a contrast segmentation post-process, our keep-sliding tobogganing is preceded by a prefiltering step that suppresses small fluctuations in the first-derivative terrain; because of this prefiltering, the tobogganing process can automatically merge regions with small inter-region contrast. A new discontinuity measure is also proposed that allows small target regions to be detected without over-segmenting the images. Experimental results indicate that the segmentations produced by the keep-sliding toboggan method are less noisy and therefore better suited as initial segmentations for higher level image segmentation techniques.
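Basic tobogganing (without the keep-sliding and prefiltering improvements) can be sketched as follows: every pixel slides to its strictly lowest 4-neighbor until a local minimum, and pixels reaching the same minimum share a region:

```python
import numpy as np

def toboggan(grad):
    """Label pixels by sliding each one downhill in the gradient-magnitude
    terrain; the slide is strictly decreasing, so it always terminates."""
    h, w = grad.shape
    label = -np.ones((h, w), dtype=int)
    for sy in range(h):
        for sx in range(w):
            if label[sy, sx] >= 0:
                continue
            y, x, path = sy, sx, []
            while label[y, x] < 0:
                path.append((y, x))
                by, bx = y, x
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w and grad[yy, xx] < grad[by, bx]:
                        by, bx = yy, xx
                if (by, bx) == (y, x):        # local minimum: new region
                    label[y, x] = y * w + x
                    break
                y, x = by, bx                 # keep sliding downhill
            lab = label[y, x]
            for p in path:
                label[p] = lab
    return label
```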
An edge detection algorithm extracts edge points that are joined in a later stage to form lines. Performed on a single image of a scene, edge detection is difficult because it is not easy to distinguish the essential information (linked to the shapes to be recovered) from contingent details; meaningful edge points may be missed or, on the contrary, superfluous contours may be selected. If multispectral images are available, the separation between essential and contingent information is easier to achieve: the essential information may be considered common to the different spectral images, just as a `species' is characterized not by a single member but by a population. The proposed method for edge detection in multispectral images relies on the fusion of statistics computed within small corresponding neighborhoods of the images. The statistics are related to similarities between images, and the particularity of the method is that they are only meaningful when all the corresponding neighborhoods are considered together; the same parameters for a single spectral image are far less meaningful and do not lead to the right decision. This information, hidden at the monochromatic level but revealed at the multispectral level, is called genetic information, which is why the proposed method is called `genetic fusion'. As an example, genetic fusion is applied to edge detection in SPOT multispectral images.
In this paper we describe a general-purpose segmentation module, defined as a set of basic operators with parameters. This module uses explicit knowledge about these operators, in order to select the ones that are best suited to the present treatment and to determine their parameters and thresholds dynamically, according to the class of the image to be segmented. Three examples of the segmentation process involved in different applications are given.
Research is underway to apply computerized tomography (CT) imaging to hardwood log inspection in the forest products industry. For this purpose, an intelligent vision system is being created to locate, identify, and quantify internal defects in logs by analyzing their CT image data. The inspection system is designed to be wood species independent and is composed of three components: a CT scanner-based data acquisition system, a low-level module for image segmentation, and a high-level module for defect recognition. Defect quantification is attained by computing the volume and orientation of each defect. This paper discusses the problems of segmenting CT image sequences and of 3-D object detection using a rule-based expert system approach. Experimental results with real-world images of different hardwood log species demonstrate the usefulness, efficacy, and robustness of the proposed inspection system. The approach offers solutions for hardwood log inspection as well as for other nondestructive testing applications in which image analysis plays an important role.
New performance measures for evaluating fuzzy partitions obtained through c-shells clustering are introduced. It is shown that the conventional measures for fuzzy partitions do not perform well for fuzzy c-shells (FCS) clustering. A new set of indices is introduced to evaluate the structure characterized by the FCS algorithms, and examples are presented to demonstrate the superiority of the proposed criteria over existing ones.
Two fundamental requirements for the generation of support using incomplete and imprecise information are the ability to measure the compatibility of discriminatory information with domain knowledge and the ability to fuse information obtained from disparate sources. A generic architecture utilizing the generalized fuzzy relational database model has been developed to empirically investigate the support generation capabilities of various compatibility measures and aggregation operators. This paper examines the effectiveness of combinations of compatibility measures from the set-theoretic, geometric distance, and logic- based classes paired with t-norm and generalized mean families of aggregation operators.
This paper presents a modular, unsupervised neural network architecture for clustering and classification of complex data sets. The adaptive fuzzy leader clustering (AFLC) architecture is a hybrid neural-fuzzy system that learns on-line in a stable and efficient manner. It uses a conventional fuzzy K-means clustering algorithm as a learning rule embedded within a control structure similar to that of the adaptive resonance theory (ART-1) network. AFLC adaptively clusters analog inputs into classes without a priori knowledge of the entire data set or of the number of clusters present in the data. Classification of an input takes place in a two-stage process: a simple competitive stage followed by a distance metric comparison stage. It is shown that the definition of the distance metric can be adjusted as necessary to fit the characteristics of the input data. The AFLC algorithm with two different distance definitions is discussed, and its operating characteristics are described. The performance of the algorithm is demonstrated by clustering computer-generated normally distributed data, the Anderson-Fisher iris data, and data generated from projections of 3-D objects in constrained motion.
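The embedded clustering rule can be sketched as plain fuzzy c-means. The deterministic initialization below is an assumption of the sketch, and AFLC's on-line ART-like control logic is not reproduced:

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, iters=100):
    """Batch fuzzy c-means. X: (n, d) data; c: number of clusters;
    m: fuzzifier. Returns cluster centers and the membership matrix u."""
    # spread initial centers over the data (a simple deterministic choice)
    centers = X[:: max(1, len(X) // c)][:c].astype(float).copy()
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1) + 1e-12
        u = d2 ** (-1.0 / (m - 1.0))            # unnormalized memberships
        u /= u.sum(axis=1, keepdims=True)       # rows sum to 1
        um = u ** m
        centers = (um.T @ X) / um.T.sum(axis=1, keepdims=True)
    return centers, u
```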
A hardware accelerator that performs fuzzy learning, fuzzy inference, and defuzzification strategy computations is presented in this paper. The hardware is based on two-valued logic. A universal space of 25 elements with five levels each is supported. To achieve a high processing rate for real-time applications, the basic units of the accelerator are connected in a four-level pipeline. The accelerator can receive two parallel fuzzy data as inputs. At a clock rate of 20 MHz, the accelerator can perform 800,000 fuzzy logic inferences per second on multidimensional fuzzy data.
There are many methods of processing a digitized image. Some are local, such as edge finding, and some are global, such as contrast enhancement. There are frequency domain methods (using the Fourier transform) and spatial domain methods that process the pixel values directly. Among all these methods, the maximum entropy (ME) method claims to do the best job, though sometimes at a high computational burden; adaptive Kalman filters are claimed to be as good as ME and computationally more attractive. Outside the analytic world, neural networks and fuzzy set theory have been applied with remarkable results. This paper presents a new methodology based on possibility theory that is in some ways analogous to the ME method of probability theory.
In this paper, we introduce new hard and fuzzy clustering algorithms called the c-quadric shells (CQS) algorithms. These algorithms are specifically designed to seek clusters that can be described by segments of second-degree curves, or more generally by segments of shells of hyperquadrics. Previous shell clustering algorithms have considered clusters of specific shapes such as circles (the fuzzy c-shells algorithm) or ellipses (the fuzzy c-ellipsoids algorithm). The advantage of our algorithm lies in the fact that it can be used to cluster mixtures of all types of hyperquadrics such as hyperspheres, hyperellipsoids, hyperparaboloids, hyperhyperboloids, and even hyperplanes. Several examples of clustering in the two-dimensional case are shown.
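The algebraic core of fitting one quadric shell to 2-D points can be sketched as a least-squares null-space problem. This is a single fit under a unit-norm constraint, not the full fuzzy alternating optimization of the CQS algorithms:

```python
import numpy as np

def fit_quadric_2d(pts):
    """Least-squares algebraic fit of a general conic
    a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0 to 2-D points.
    The unit-norm constraint on the coefficient vector makes the
    solution the right singular vector of the design matrix with
    the smallest singular value."""
    x, y = pts[:, 0], pts[:, 1]
    D = np.column_stack([x**2, x * y, y**2, x, y, np.ones_like(x)])
    _, _, Vt = np.linalg.svd(D)
    return Vt[-1]          # coefficients (a, b, c, d, e, f), up to scale
```

For points sampled on the unit circle, the recovered conic is x^2 + y^2 - 1 = 0 up to scale.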
In designing systems which analyze temporal sequences of images, it is necessary to provide mechanisms which identify, track, and release objects over time. There is considerable uncertainty in the definition of objects in a natural scene as evidenced, for example, in forward looking infrared (FLIR) imagery. In this paper, we present a methodology, based on the theory of fuzzy sets, which can handle this problem. It contains a feature driven fuzzy correlator which integrates current and past information to update object histories, to detect new objects, and to determine when objects leave the scene. The intention is to use such a system in a surveillance mode, where there is reasonable time for computation. Examples are given from an automatic target recognition application.
Whereas gray-scale morphology has been formally interpreted in the context of fuzzy sets, heretofore a truly fuzzy mathematical morphology has not been developed. Specifically, mathematical morphology is based on the notion of fitting, and rather than simply characterize standard morphological fitting in fuzzy terms, a true fuzzy morphology must characterize fuzzy fittings; moreover, it should preserve the nuances of both mathematical morphology and fuzzy sets. In the present paper, we introduce a framework that satisfies these criteria. In contrast to the usual binary or gray-scale morphology, erosion here measures the degree to which one image lies beneath another (a subset-type relation), and it does so by employing an index of set inclusion. The result is a quite different `fitting' paradigm. Based on this new fitting approach, we define erosion, dilation, opening, and closing. The true fuzziness of the theory can be seen in a number of ways, one being that dilation does not commute with union. (This commutativity lies at the heart of nonfuzzy, lattice-based mathematical morphology.) Nevertheless, we arrive at a counterpart of Matheron's representation theorem for increasing translation-invariant mappings.
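A 1-D sketch of inclusion-based fuzzy erosion, using the Lukasiewicz implication min(1, 1 - s + f) as the inclusion index. This is one common choice of index; the paper's framework admits a family of them:

```python
import numpy as np

def fuzzy_erode(f, s):
    """Fuzzy erosion of a 1-D fuzzy image f by structuring element s:
    at each position x, score how far the translate of s fits beneath f
    by taking the infimum of the Lukasiewicz implication over s's
    support. Outside the image, f is taken to be 0 (an assumption)."""
    n, k = len(f), len(s)
    out = np.zeros(n)
    for x in range(n):
        vals = []
        for y in range(k):
            if 0 <= x + y < n:
                vals.append(min(1.0, 1.0 - s[y] + f[x + y]))
            else:
                vals.append(1.0 - s[y])   # f = 0 beyond the image border
        out[x] = min(vals)
    return out
```

With crisp (0/1) inputs this reduces to ordinary binary erosion, which is the sanity check below.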
A main goal of robotics is to impart to machines an ability to interact intelligently with their environments. In this study we present a system (and an implementation of it) that evaluates the compatibility between objects and specific goal-oriented action requirements for carrying out hand-actions. The system extends, in the framework of fuzzy sets and possibility theory, our model of functional recognition of objects. It operates on a hierarchical description of objects and action requirements which are mapped into a functional compatibility domain. We show that which computations the algorithm actually carries out depends on the application. When the system is used for inspection, it typically knows the identity of the object, and its task is only to verify whether the object is in a state that allows its use in the prototypical action for which it was designed. When the system is used for categorization, it needs to identify features compatible with the action requirements and to verify that no features preventing the action can be identified. This latter case is particularly interesting and computationally challenging, since the system is asked to compute the auxiliary, or possible, functions of an object in addition to its prototypical function; the computation of the auxiliary functions relies on the description of the object's parts and subparts.
Software-Orientated Techniques for Autonomous and Robotic Systems
The ability to act flexibly in an uncertain and dynamic environment is one of the key objectives of robotics. In previous work, we described an approach to this problem which we called the planner-reactor approach. This paper reviews that approach and presents our current implementation in depth.
A technique for improving stepped-frequency inverse synthetic aperture radar (ISAR) imagery via entropy minimization is presented. Image improvement is achieved in the frequency domain where the echo phase can be adjusted to compensate for radial motion. The computational algorithm which determines motion parameters that reduce entropy to an absolute minimum is based on the golden-section-search method and operates in a closed-loop mode. Using this technique one can efficiently focus the image generated by a fully automated high-resolution-radar system that evaluates its own performance in terms of image quality.
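The golden-section search at the core of the focusing loop can be sketched directly; the entropy evaluation over the ISAR image is not reproduced here, so any unimodal objective stands in for it:

```python
import math

def golden_min(f, a, b, tol=1e-6):
    """Golden-section search for the minimizer of a unimodal f on [a, b].
    Each iteration shrinks the bracket by the inverse golden ratio."""
    invphi = (math.sqrt(5) - 1) / 2
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c                    # minimum lies in [a, d]
            c = b - invphi * (b - a)
        else:
            a, c = c, d                    # minimum lies in [c, b]
            d = a + invphi * (b - a)
    return (a + b) / 2
```

In the paper's setting, f would map a candidate radial-motion parameter to the entropy of the refocused image.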
An object-oriented robot independent programming environment (RIPE) developed at Sandia National Laboratories is being used for rapid design and implementation of a variety of intelligent machine applications. A system architecture based on hierarchies of distributed multiprocessors provides the computing platform for a layered programming structure that models work cell tasks as a set of software objects. These objects are designed to support model-based automated planning and programming, real-time sensor-based activity, and robust communication. The object-oriented paradigm provides mechanisms such as inheritance and polymorphism which allow the implementation of the system to satisfy the goals of software reusability, extensibility, reliability, and portability. By designing a hierarchy of generic parent classes and device-specific subclasses which inherit the same interface, a robot independent programming language (RIPL) is realized. Prototype systems for handling nuclear waste shipping casks, underground storage tank cleanup, nuclear weapons disassembly, and glove box access are successfully implemented using this object-oriented software environment.
As a result of the proliferation of powerful, low-cost computer systems, the traditional tools of manufacturing are now embedded in an environment of automated subsystems. The focus is now on using computer technology for computer integrated manufacturing and for managing these integrated subsystems.
Monitoring the state of health of systems is an increasingly difficult problem as system complexity grows. This paper presents an algorithmic approach, based on a variation of multidimensional scaling, that integrates multiple sensor values into composite values for system state monitoring. Because the number of composite values is much smaller than the number of raw sensor values, the task of a human state monitor is facilitated. Configural display techniques used to present the resulting composite values to the operator are also described.
This paper explores alternative models of the interpretation tree (IT), whose search is one of the dominant paradigms for object recognition. Recurrence relations for the unpruned size of eight different types of search tree are introduced. Since exhaustive search of the IT in most recognition systems is impractical, pruning of various types is employed. It is therefore useful to see how much of the IT will be explored in a typical recognition problem. Probabilistic models of the search process have been proposed in the literature and used as a basis for theoretical bounds on search tree size, but experiments on a large number of images suggest that for 3-D object recognition from range data, the error probabilities (assumed to be constant) display significant variation. Hence, the theoretical bounds on the interpretation tree's size can serve only as rough estimates of the computational burden incurred during object recognition.
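One standard model of the unpruned interpretation tree (not necessarily one of the paper's eight variants) can make the size recurrence concrete: each of d data features is assigned to one of m model features or to a wildcard, giving branching factor m + 1, so the node count satisfies N(d) = 1 + (m + 1) N(d - 1) with N(0) = 1.

```cpp
#include <cassert>
#include <cstdint>

// Unpruned node count for an interpretation tree of depth d (data
// features) with m model features plus a wildcard branch at each level.
// This is a common textbook model, assumed here for illustration.
std::uint64_t itNodes(int m, int d) {
    if (d == 0) return 1;                         // just the root
    return 1 + std::uint64_t(m + 1) * itNodes(m, d - 1);
}
```

For m = 2, d = 2 this gives 1 + 3 + 9 = 13 nodes; the exponential growth in d is exactly why exhaustive search of the IT is impractical and pruning is needed.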
Image understanding is a broad field of image processing where the goal is to classify the elements of a scene. In this paper we describe an approach to image understanding based on the matching of structure graphs. The structure graph of the input image is composed of `nodes' (primitives extracted from the image, e.g., regions, line segments) and `edges' (relationships between primitives in the image). The goal of our algorithm is to find the best match between this graph and a prototype graph, representing the knowledge about the expected scene. We formulate the graph matching problem as a consistent labeling problem, where the nodes of the prototype graph are considered labels. We then search for a labeling of the input structure graph that is optimal in the sense that the nodes and edges of the input graph are consistent with the labels and relationships represented in the prototype graph. A `quality of fit' measurement is derived for the matching, and a genetic algorithm is used to find the optimal solution. The advantages of this method of inexact (or fuzzy) matching include its graceful degradation (robustness) in the presence of noise and image deformation, its parallelism, and its adaptability to a variety of domains. We conclude with a discussion of experimental results.
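A minimal sketch of such a quality-of-fit measure, assuming a simple form the paper does not specify: the fraction of input-graph edges whose assigned label pair is also a relationship in the prototype graph. The paper's actual measure and its genetic-algorithm search are not reproduced; this shows only the consistency-scoring idea.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// labeling[i] gives the prototype node assigned to input node i.
// Score: fraction of input edges preserved under the labeling.
double qualityOfFit(const std::vector<std::pair<int,int>>& inputEdges,
                    const std::set<std::pair<int,int>>& protoEdges,
                    const std::vector<int>& labeling) {
    if (inputEdges.empty()) return 0.0;
    int consistent = 0;
    for (auto [u, v] : inputEdges) {
        int a = labeling[u], b = labeling[v];
        // An edge is consistent if its label pair is a prototype relationship.
        if (protoEdges.count({a, b}) || protoEdges.count({b, a}))
            ++consistent;
    }
    return double(consistent) / double(inputEdges.size());
}
```

A genetic algorithm would then evolve candidate `labeling` vectors, using this score (or a refinement of it) as the fitness function.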
Constraints are mathematical mapping functions which transform from an attribute or feature space onto a score or measure of plausibility. The term plausible is used because this paper assumes one is looking to support a hypothesis rather than refute it. In this paper, a system is described which allows the algorithm developer to easily incorporate domain knowledge into an interpretation process through the graphical creation and editing of constraints. These constraints can be applied to multiple sets of data through the use of application programs. Groupings or spatial relationships such as collinearity or nearness are also attributes which may be constrained in an attempt to interpret image data. Model matches may likewise be written as constraint mappings. Primitive constraints may be combined to form compound constraints, and differing compounding weights may be assigned to primitive constraints. If these weights are written as functions dependent upon other information, a system developed with this process can be made adaptive.
A model-based object recognition system, predicated on logic as a method of modeling, describing, and identifying objects, is proposed. Users supply the object recognition system with models of known object classes in the form of production rules. The system describes each instance of an object found within an image scene as a collection of facts. The modeled rules act upon these facts in a Prolog environment to obtain an interpretation of the original image scene. Since users supply the object models and the Prolog environment supplies the inference mechanism for interpretation, the primary task of the object recognition system is the description process. For each object instance identified within the image scene, declarative statements are formulated which represent observed components, features, or attributes of that object. The description of an object instance is restricted to its geometric components, which are derived from a skeleton or stick-figure representation of a 2-D silhouette portrayal of an object found in the original image scene. Therefore, object classes can be modeled in a general way so that the model is independent of the object's size and orientation. Sample results are presented.
Motivated by problems in computer vision, in this paper we introduce a new class of problems in system theory, which we call perspective problems in system theory. The word perspective is derived from vision problems wherein feature points on an object are assumed to be projected perspectively onto a screen during the process of imaging. Consequently, the basic problem of perspective system theory that we consider in this paper is to observe the initial condition or to identify the parameters of a dynamical system with the aid of a perspective observation function. Based on this concept, we present a new approach for solving the problem from point and line correspondences.
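The perspective observation function at the heart of this formulation can be written directly: a 3-D feature point (x, y, z) is observed only up to its perspective projection (x/z, y/z). Unit focal length is an assumption made here for clarity.

```cpp
#include <array>
#include <cassert>

// Perspective observation: a 3-D point is seen only through its image
// coordinates (x/z, y/z). Focal length 1 is assumed for illustration.
std::array<double, 2> perspective(double x, double y, double z) {
    return {x / z, y / z};
}
```

Observing the initial condition of a dynamical system through this map is harder than classical observation precisely because depth (the scale z) is lost at every measurement.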
This paper presents a high-level vision expert system designed to identify objects found in 2-D vertically photographed images. The goal of the system is to identify the image contents by assigning interpretations to detected objects. The system assumes that the objects have already been detected by a low-level vision system. Each object's location and boundaries are then identified. The interpretation process starts by segmenting the 2-D object into its components. The descriptive (geometrical) features for each region are then defined. The confidence factors for each feature are estimated using Bayesian probability principles. The relational predicates for the object's model regions are then defined. Finally, the region knowledge is represented in a frame-like representation. The system uses the object's knowledge and the model's knowledge, both represented as frames, to infer the type of the unidentified object. The models are stored in classified databases relating the object models, confidence values, and feature ranges. The inference process matches the object's features to those of the models and generates plausible hypotheses about the models that coarsely identify the object. Forward reasoning is used in the plausible-hypothesis phase and backward reasoning in the hypothesis-verification phase. The system is composed of an analysis part, an inference engine, a supervisor, short- and long-term memories (STM and LTM, respectively), and an input/output section.
Intelligent Materials Handling and Vision Systems I
The problem of learning correct decision rules to minimize the probability of misclassification is a problem of supervised learning in pattern recognition. The problem of learning such an optimal discriminant function is considered for the class of problems where little is known about the statistical properties of the pattern classes. This paper describes the application of a machine learning technique called the genetic learning algorithm to the problem of learning the optimal discriminant function. Several variations of the algorithm are investigated to determine which generates the best solution. Simulation results and examples are presented. The main advantages offered by the genetic algorithm are generality and fast learning.
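The fitness such a genetic algorithm would optimize can be sketched for the simplest case, a linear discriminant sign(w0 + w1 x + w2 y): count the misclassifications on labeled data. The GA itself (selection, crossover, mutation over candidate weight vectors) is omitted; the toy data and weights are assumptions for illustration.

```cpp
#include <cassert>
#include <vector>

struct Point { double x, y; int label; };  // label is +1 or -1

// Number of misclassifications of the linear discriminant
// sign(w0 + w1*x + w2*y); the GA would minimize this count.
int misclassified(const std::vector<Point>& data,
                  double w0, double w1, double w2) {
    int errors = 0;
    for (const auto& p : data) {
        int predicted = (w0 + w1 * p.x + w2 * p.y >= 0.0) ? +1 : -1;
        if (predicted != p.label) ++errors;
    }
    return errors;
}
```

Because this fitness needs no distributional assumptions, the same evaluation works when nothing is known about the pattern-class statistics, which is the generality claimed above.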
The task of 3-D object recognition can be viewed as consisting of four modules: extraction of structural descriptions, hypothesis generation, pose estimation, and hypothesis verification. The recognition time is determined by the efficiency of each of the four modules, and particularly by the hypothesis generation module, which determines how many pose estimates and verifications must be performed to recognize the object. In this paper, a set of high-order perspective-invariant relations is defined which can be used with a neural network algorithm to obtain a high-quality set of model-image matches between a model and an image of a robot workstation. Using these matches, the number of hypotheses which must be generated to find a correct pose is greatly reduced.
A decision support system is developed for personnel scheduling in a multiple-warehouse environment. The system incorporates the current manpower level, historical data on workers used, empirical load distributions, and performance standards to generate manpower requirements for a specified planning horizon. The software has been developed to be easily adaptable to varying situational details and is therefore widely applicable in different warehouse settings. The system offers personnel managers a valuable tool for evaluating alternative schedules and making intelligent decisions regarding personnel scheduling in warehouses.
A new set of collision checking and obstacle avoidance algorithms has been developed and implemented in both hardware and software. The method allows for unlimited vector checks against an unlimited set of objects. Depending on the application, the single-card hardware performance ranges from 1 million line sorts per second to hundreds of millions. Because of the high algorithm speed, overall system performance is limited only by the choice of processor and the speed of the interface. The hardware is presently configured to process large blocks of objects and data (8K), sorting one point against eight objects at a rate of 200 million points per second. The trade-off between choice of algorithm and performance is discussed.
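One vector-against-object check of the kind such hardware performs in bulk can be sketched with the standard slab method, here in 2-D: a line segment is tested against an axis-aligned box by clipping its parametric range against each coordinate slab. The hardware's actual algorithm is not described in enough detail to reproduce; this shows only the geometric core.

```cpp
#include <algorithm>
#include <cassert>

// Slab test: does the segment (x0,y0)-(x1,y1) intersect the axis-aligned
// box [bxMin,bxMax] x [byMin,byMax]?
bool segmentHitsBox(double x0, double y0, double x1, double y1,
                    double bxMin, double byMin, double bxMax, double byMax) {
    double tMin = 0.0, tMax = 1.0;            // parametric range of segment
    double d[2]  = {x1 - x0, y1 - y0};
    double o[2]  = {x0, y0};
    double lo[2] = {bxMin, byMin}, hi[2] = {bxMax, byMax};
    for (int i = 0; i < 2; ++i) {
        if (d[i] == 0.0) {                    // segment parallel to this slab
            if (o[i] < lo[i] || o[i] > hi[i]) return false;
        } else {
            double t1 = (lo[i] - o[i]) / d[i];
            double t2 = (hi[i] - o[i]) / d[i];
            if (t1 > t2) std::swap(t1, t2);
            tMin = std::max(tMin, t1);
            tMax = std::min(tMax, t2);
            if (tMin > tMax) return false;    // clipped range is empty
        }
    }
    return true;
}
```

The per-check work is a handful of compares and divides, which is why a pipelined hardware implementation can sustain the throughput figures quoted above.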
The Laboratory for Space Teleoperation and Robotics is developing a neutrally buoyant robot for research into the automatic and teleoperated (remote human) control of unmanned robotic vehicles for use in space. The goal of this project is to develop a remote robot with maneuverability and dexterity comparable to that of a space-suited astronaut with a manned maneuvering unit, able to assume many of the tasks currently planned for astronauts during extravehicular activity (EVA). Such a robot would spare the great expense and hazards associated with human EVA and make possible much less expensive scientific and industrial exploitation of orbit. Both autonomous and teleoperated control experiments will require the vehicle to control its position and orientation automatically. The laboratory is developing a vision-based vehicle navigation system that works by tracking features in video images from cameras mounted on the vehicle and trained on a special target fixed in the environment. The methods are adaptable to a variety of video-based tracking systems and are based on a linearized vision model that receives image feature coordinates as inputs at each time step. This paper includes a description of the underwater vehicle and the vision system.
Intelligent Materials Handling and Vision Systems II
A redundant manipulator can be defined as a manipulator that has more degrees of freedom than necessary to determine the position and orientation of the end-effector. Such a manipulator has dexterity, flexibility, and the ability to maneuver in the presence of obstacles. This paper presents a solution to the inverse kinematics problem for a redundant manipulator based on an artificial neural network (ANN). The ANN used is of the supervised type: a multilayer feedforward neural network with a back error propagation (BEP) training algorithm. The training set for the ANN is obtained by sampling the joint-space trajectory of the redundant manipulator arm or from the joint-angle encoders of the manipulator. The corresponding sampled values of the task-space trajectory (end-effector coordinates) are used as the command input vectors to the ANN. By presenting the network with this training set cyclically during training, the BEP algorithm adjusts the learning parameters of the ANN so that the sum of the squared differences between the actual joint coordinates and the desired output vectors is minimized.
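Generating one sample of such a training set can be sketched with a planar 2-link arm as a stand-in (the paper's manipulator is redundant and more complex): sampled joint angles map through forward kinematics to the end-effector coordinates that serve as the network's command input vector, with the joint angles as the desired output.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Forward kinematics of a planar 2-link arm with link lengths l1, l2 and
// joint angles t1, t2 (radians). Returns end-effector (x, y).
// The 2-link arm is an illustrative assumption, not the paper's robot.
std::pair<double, double> forwardKinematics(double l1, double l2,
                                            double t1, double t2) {
    return { l1 * std::cos(t1) + l2 * std::cos(t1 + t2),
             l1 * std::sin(t1) + l2 * std::sin(t1 + t2) };
}
```

Each (end-effector input, joint-angle target) pair produced this way is one element of the training set cycled through the BEP algorithm.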
The processing necessary for high-level vision is considered here. A major component is the manipulation of symbolic entities, each with values, properties, and constrained relationships to other symbolic entities. To perform recognition, the system operates in a guided global reasoning framework. Further, concepts for a vision system centered on deduction are inherently distributed and well suited to an asynchronous systolic array architecture. The concepts presented in this paper, which are normally inaccessible, ease the tasks of processing spatial knowledge and modeling a modularly structured system.
This paper presents a vision-guided control system for an industrial robot capable of picking up an object, moving it to a goal, and placing it there. Tasks given to the control system are based on imperfect knowledge about the environment. The control system corrects the task parameters by matching them against range information gained from the environment. The control system is part of a larger system that includes a high-level goal-oriented planner. The planner consists of hierarchically organized planning-executing-monitoring triplets, which execute given tasks by dividing them into subtasks, sending the subtasks either to other triplets or to the control system described in this paper, and monitoring the execution of the subtasks. The planner sees the robot and the control system as an intelligent robot capable of executing pick-and-place tasks in a dynamic, partly unknown environment. This paper presents the results of testing the control system with an industrial 6-axis robot and a structured-light range sensor. The principle of calibrating the robot and the sensor is also presented.
The reliability of a materials handling process involving automated stacking of packages on a pallet or automated sorting of packages in a distribution system depends mainly on the design of the package and the material used for it. Many problems can be eliminated, resulting in higher system utilization, if the package is designed not only for the product and its requirements but also for an automated handling system with different types of grasping devices. A decision support system is being developed to help the package designer select the most appropriate material and design to satisfy the requirements of the automated materials handling process. The decision support system is programmed in C++, which gives the flexibility and portability needed for this type of system. The user interface uses graphics to ease the understanding of the different design options during the selection process.
A new algorithm is presented for applying Marill's minimum standard deviation of angles (MSDA) principle for interpreting line drawings without models. Even though no explicit models or additional heuristics are included, the algorithm tends to reach the same 3-D interpretations of 2-D line drawings that humans do. Marill's original algorithm repeatedly generated a set of interpretations and chose the one with the lowest standard deviation of angles (SDA). The algorithm presented here explicitly calculates the partial derivatives of SDA with respect to all adjustable parameters, and follows this gradient to minimize SDA. For a picture with lines meeting at m points forming n angles, the gradient descent algorithm requires O(n) time to adjust all the points, while the original algorithm required O(mn) time to do so. For the pictures described by Marill, this gradient descent algorithm running on a Macintosh II was found to be one to two orders of magnitude faster than the original algorithm running on a Symbolics, while still giving comparable results. Once the 3-D interpretation of the line drawing has been found, the 3-D object can be reduced to a description string using the Universal 3-D Array Grammar. This is a general grammar which allows any connected object represented as a 3-D array of pixels to be reduced to a description string. The algorithm based on this grammar is well suited to parallel computation, and could run efficiently on parallel hardware. This paper describes both the MSDA gradient descent algorithm and the Universal 3-D Array Grammar algorithm. Together, they transform a 2-D line drawing represented as a list of line segments into a string describing the 3-D object pictured. The strings could then be used for object recognition, learning, or storage for later manipulation.
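The objective that the gradient descent minimizes can be stated directly: the standard deviation of the n angles formed at the junctions of the tentative 3-D interpretation. The mapping from the adjustable depth parameters to the angles is omitted here; this sketch shows only the SDA itself.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Standard deviation of angles (SDA): the quantity Marill's principle
// minimizes over the 3-D interpretations of a line drawing.
double sda(const std::vector<double>& angles) {
    double mean = 0.0;
    for (double a : angles) mean += a;
    mean /= angles.size();
    double var = 0.0;
    for (double a : angles) var += (a - mean) * (a - mean);
    return std::sqrt(var / angles.size());
}
```

A cube interpreted correctly has all junction angles equal to 90 degrees, giving SDA = 0, which is why gradient descent on this objective tends toward the human-preferred interpretation.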
A telerobotic system for power-line maintenance is equipped with a laser range-finder allowing it to estimate its workspace occupancy. This paper describes the model used for describing space occupancy and the 3-D computer vision method developed for extracting this information. It consists of building an octree of the scene from several range images taken from various points of view. A calibration is performed using the first image in order to find the initial position of the camera relative to the scene. Successive 3-D images are used to complete the model information until a satisfactory knowledge of the space occupancy is reached.
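The elementary octree-building step can be sketched as follows, under the usual occupancy-octree assumption (the paper's exact construction is not reproduced): each 3-D point measured by the range-finder selects one of eight children of a node by comparing each coordinate with the node's center, and descending to a fixed depth marks a leaf occupied.

```cpp
#include <cassert>

// Child index (0..7) of the octant containing point (x,y,z) relative to
// a node centered at (cx,cy,cz): one bit per axis.
int childIndex(double x, double y, double z,
               double cx, double cy, double cz) {
    return (x >= cx ? 1 : 0) | (y >= cy ? 2 : 0) | (z >= cz ? 4 : 0);
}
```

Repeating this descent for every range point from every viewpoint accumulates the occupancy information the telerobot needs about its workspace.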
Random selection of operands for image understanding systems often requires extensive region modification and local object re-selection, based on comparison of their clues, to achieve success. The deductions and processing required are quite time-intensive and significantly degrade the system's performance. In this paper, a priority scheme is proposed which produces a prioritized set of local objects and their combined (or single) regions as operands for the autonomous, task-specifically interacting subsystems of an image understanding system.
In this paper, an implementation of the analytic Hough transform (AHT) for exact digital line detection is developed that employs a new, efficient data structure. This structure eliminates the need to represent each digital line parameter region developed during the analysis of the image by employing a region-divider representation. A relative storage scheme is employed that permits reconstruction of region-occupancy information during the search for digital line support. Furthermore, it is shown that all values in the AHT data structure may be stored as rational numbers with fixed and finite numerator and denominator ranges defined by the image resolution. As a result, all floating-point computations are replaced by faster, fixed word-size integer operations.
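The rational-number storage can be sketched generically: parameter values are kept as integer fractions p/q, so equality and ordering become exact integer operations instead of floating point. The reduction to lowest terms via gcd is an implementation assumption, not necessarily the paper's scheme.

```cpp
#include <cassert>
#include <numeric>  // std::gcd

// Exact rational value p/q with q > 0, reduced to lowest terms.
// Bounded numerators/denominators (set by the image resolution) keep
// all arithmetic within fixed word-size integers.
struct Rational {
    long long p, q;
    Rational(long long num, long long den) {
        long long g = std::gcd(num < 0 ? -num : num, den);
        p = num / g;
        q = den / g;
    }
    bool operator==(const Rational& o) const { return p == o.p && q == o.q; }
    // Cross-multiplication compares without any floating point.
    bool operator<(const Rational& o) const { return p * o.q < o.p * q; }
};
```

Because the AHT bounds both numerator and denominator by the image resolution, the cross-multiplications above never overflow a fixed word size, which is the source of the claimed speedup.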
Robots operating in unstructured environments require continuous use of sensors and intelligence to adapt to changing situations. In this paper, a control method to achieve this goal is described and preliminary experiments are discussed. The control scheme is based on a hierarchically organized set of planning-executing-monitoring (PEM) cycles. Every PEM cycle is a goal-oriented module consisting of three generic activities (planning, executing, and monitoring) and a separate meta-control mechanism that controls the generic activities inside the cycle. We present our design experiments, beginning with the development of a PEM-based logical model for an autonomous machine and continuing to the development of an implementation model for a loading-manipulator control system. The laboratory implementations in two industrial robot environments are also described, along with plans for a PEM-control implementation for a heavy-duty manipulator designed for loading paper rolls at harbor sites.