Quality assessment methods are classified into three types depending on the availability of the reference image or video:
full-reference (FR), reduced-reference (RR), or no-reference (NR). This paper proposes efficient RR visual quality
metrics, called motion vector histogram based quality metrics (MVHQMs). In assessing the visual quality of a video, the overall impression across the whole sequence tends to be taken as the perceived quality of the video. To compare two motion vectors
(MVs) extracted from reference and distorted videos, we define the one-dimensional (horizontal and vertical) MV
histograms as features, which are computed by counting the number of occurrences of MVs over all frames of a video.
For testing the similarity between MV histograms, two different MVHQMs using the histogram intersection and
histogram difference are proposed. We evaluate the effectiveness of the two proposed MVHQMs by comparing their
results with differential mean opinion score (DMOS) data for 46 video clips of common intermediate format
(CIF)/quarter CIF (QCIF) that are coded under varying bit rates/frame rates with H.263. We compare the performance of
the proposed metrics and conventional quality measures. Experimental results with various test video sequences show
that the proposed MVHQMs outperform the conventional methods in several respects, including prediction performance,
stability, and feature data size.
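As a rough illustration of the histogram comparison step, the sketch below builds the one-dimensional MV histograms and evaluates both an intersection-based and a difference-based similarity; the bin count, MV range, and function names are assumptions, not the paper's exact settings.

```python
import numpy as np

def mv_histograms(mvs, nbins=64, mv_range=(-32, 32)):
    """Build 1-D horizontal and vertical MV histograms over all frames.

    mvs: array of shape (N, 2) holding (dx, dy) motion vectors pooled
    from every frame of the video (names and binning are illustrative).
    """
    hx, _ = np.histogram(mvs[:, 0], bins=nbins, range=mv_range)
    hy, _ = np.histogram(mvs[:, 1], bins=nbins, range=mv_range)
    return hx, hy

def mvhqm_intersection(h_ref, h_dist):
    """Histogram-intersection similarity, normalized to [0, 1]."""
    return np.minimum(h_ref, h_dist).sum() / max(h_ref.sum(), 1)

def mvhqm_difference(h_ref, h_dist):
    """Sum of absolute bin differences (lower means more similar)."""
    return np.abs(h_ref.astype(float) - h_dist.astype(float)).sum()
```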
This paper presents a shot boundary detection (SBD) method that finds boundaries between shots using changes in visual content elements
such as objects, actors, and background. Our work is based on the property that these features do not change significantly within a shot, whereas they change substantially across a shot boundary. Noticing this characteristic of shot boundaries, we propose an SBD algorithm using scale- and rotation-invariant local image descriptors. To obtain information about the content elements, we employ the scale invariant feature transform (SIFT), which has been widely used in object recognition. The number of matched points is large within a shot, whereas few or no matched points are detected at a shot boundary because all the elements of the previous shot change abruptly in the next shot. Thus the number of matched points indicates the existence of a shot boundary. We identify two types of shot boundaries (hard cuts and gradual transitions such as tilting, panning, and fade in/out) with an adjustable frame distance between the compared frames. Experimental results with four test videos show the effectiveness of the proposed SBD algorithm using scale-invariant feature matching.
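A minimal sketch of the frame-to-frame matching idea, using OpenCV's SIFT and Lowe's ratio test as stand-ins for the paper's matcher; the ratio and threshold values are illustrative, and SIFT requires an OpenCV build that ships it (>= 4.4).

```python
import cv2

def matched_point_count(frame_a, frame_b, ratio=0.75):
    """Count SIFT matches between two grayscale frames using Lowe's ratio test."""
    sift = cv2.SIFT_create()  # assumes OpenCV with SIFT support
    _, des_a = sift.detectAndCompute(frame_a, None)
    _, des_b = sift.detectAndCompute(frame_b, None)
    if des_a is None or des_b is None:
        return 0
    pairs = cv2.BFMatcher().knnMatch(des_a, des_b, k=2)
    good = [p for p in pairs if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    return len(good)

def is_shot_boundary(frame_a, frame_b, threshold=5):
    """Declare a boundary when the match count falls below a small threshold."""
    return matched_point_count(frame_a, frame_b) < threshold
```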
With advances in digital image processing techniques, interest in high-quality still images has increased. This paper proposes an effective discrete wavelet transform (DWT)-based algorithm that efficiently interpolates a high-resolution (HR) still image from a low-resolution (LR) one. The decay and persistence properties of DWT coefficients are utilized for HR image reconstruction. The DWT coefficients are first coarsely estimated from the decay of the Lipschitz exponent and the similarity of DWT coefficients across resolution scales; they are then refined by iteratively minimizing the mean squared error. The proposed DWT-based interpolation algorithm yields
better performance than the conventional methods in terms of the peak signal-to-noise ratio and subjective image quality.
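The synthesis step can be sketched with PyWavelets by treating the LR image as the approximation band of the HR image's DWT. The Lipschitz-exponent-based detail estimation and the iterative MSE refinement are omitted, so under that assumption the sketch reduces to plain wavelet zero-padding.

```python
import numpy as np
import pywt

def dwt_interpolate(lr_image, wavelet="haar"):
    """Synthesize a roughly 2x-larger image by treating the LR input as the
    approximation band of the HR image's single-level DWT.

    The detail bands, which the paper estimates from the coefficient decay
    across scales, are set to zero here (illustrative simplification).
    """
    a = lr_image.astype(float)
    zero = np.zeros_like(a)
    return pywt.idwt2((a, (zero, zero, zero)), wavelet)
```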
KEYWORDS: Edge detection, Sensors, Image processing, Gaussian filters, Digital filtering, Image filtering, Image segmentation, Detection and tracking algorithms, Electronic filtering, Signal to noise ratio
We present an edge detection method based on a water-flow model and gradient information. The gradient magnitude image emphasizes object edges, from which well-defined and connected edges with large gradient values are effectively extracted. The proposed method can be classified as a locally adaptive thresholding method. To show the effectiveness of the proposed method, its simulation results for various noise-free and noisy synthetic and real images are compared with those of conventional methods. In addition, edge evaluation results for five edge detection methods are reported quantitatively.
KEYWORDS: Reconstruction algorithms, Image processing, Super resolution, Image enhancement, Motion models, Digital imaging, Image quality, Control systems
In this paper, we propose a super-resolution (SR) algorithm for reconstructing a high-resolution (HR) image from
multiple low-resolution (LR) sequence images, in which the projection onto convex sets (POCS) algorithm is employed
with artifact reduction constraints. In real dynamic sequences containing object-motion occlusion or relatively large
motions, artifact reduction as well as image-quality enhancement is needed. The proposed POCS-based algorithm
reduces the degradation caused by the motion compensation (MC) error, where the motion confidence map is used to
find the LR pixel with a relatively small MC error. The decision bound in the motion confidence map corresponds to the
bound in residual computation in the conventional POCS algorithm. Also, the proposed algorithm utilizes the directional
information at edges in reconstructing the SR image, in which a four-directional projection function is used to estimate
the desired SR pixel. Finally, over-compensation is avoided in the iteration process by adding an appropriate
constraint. Experimental results with several test sequences show the effectiveness of the proposed algorithm.
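A minimal sketch of the data-consistency projection underlying POCS-based SR, without motion compensation, the motion confidence map, or the directional and over-compensation constraints; the box-average observation model, bound value, and names are assumptions.

```python
import numpy as np

def pocs_projection(hr, lr, scale=2, bound=3.0):
    """One POCS data-consistency pass over a single LR observation.

    For each LR pixel, the value simulated from the current HR estimate
    (simple box-average downsampling here) is compared with the observation,
    and the residual exceeding `bound` is back-projected onto the HR block.
    """
    hr = hr.astype(float).copy()
    H, W = lr.shape
    for i in range(H):
        for j in range(W):
            block = hr[i * scale:(i + 1) * scale, j * scale:(j + 1) * scale]
            residual = lr[i, j] - block.mean()
            if abs(residual) > bound:
                # Distribute the excess residual uniformly over the HR block.
                block += np.sign(residual) * (abs(residual) - bound)
    return hr
```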
In this paper, we present a frame rate up-conversion method for ultrasound image enhancement. The inherent flexibility of ultrasound imaging, its moderate cost, and the absence of known bio-effects give ultrasound a vital role in the diagnostic process compared with other modalities. The conventional mechanical scan method for multi-planar images has a slow frame rate. In the proposed frame rate up-conversion method, new interpolated frames are inserted between two input frames, giving smooth renditions to human eyes. Existing methods employing blockwise motion estimation, in which motion vectors are estimated using a block-matching algorithm (BMA), show block artifacts. We propose an optical flow based method that finds pixelwise intensity changes and yields more accurate motion estimates for frame interpolation. Consequently, the proposed method can provide detailed and improved images without block artifacts. Interpolated frames may contain hole or overlap regions due to covered or uncovered areas in motion compensation. Those regions can easily be eliminated by post-processing, in which the similarity of pixel intensities is employed with a ray-casting based method. Experimental results with several sets of ultrasound image sequences show the effectiveness of the proposed method.
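The core interpolation step might look like the sketch below, which uses OpenCV's Farneback dense flow as an illustrative stand-in for the paper's optical-flow estimator and omits the hole/overlap post-processing.

```python
import cv2
import numpy as np

def interpolate_midframe(f0, f1):
    """Insert one frame halfway between two grayscale frames by half-vector
    warping along a dense optical-flow field (Farneback used here)."""
    flow = cv2.calcOpticalFlowFarneback(f0, f1, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = f0.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    # Sample f0 half a vector "behind" and f1 half a vector "ahead".
    map_x0 = (grid_x - 0.5 * flow[..., 0]).astype(np.float32)
    map_y0 = (grid_y - 0.5 * flow[..., 1]).astype(np.float32)
    map_x1 = (grid_x + 0.5 * flow[..., 0]).astype(np.float32)
    map_y1 = (grid_y + 0.5 * flow[..., 1]).astype(np.float32)
    from0 = cv2.remap(f0, map_x0, map_y0, cv2.INTER_LINEAR)
    from1 = cv2.remap(f1, map_x1, map_y1, cv2.INTER_LINEAR)
    return ((from0.astype(float) + from1.astype(float)) / 2).astype(f0.dtype)
```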
KEYWORDS: Ultrasonography, 3D image processing, 3D acquisition, Digital filtering, Motion estimation, 3D image enhancement, Linear filtering, Data acquisition, Computer aided diagnosis and therapy, Fetus
In this paper, we present a motion compensated frame rate up-conversion method for real-time three-dimensional (3-D) ultrasound fetal image enhancement. The conventional mechanical scan method with one-dimensional (1-D) array converters used for 3-D volume data acquisition has a slow frame rate of multi-planar images. This drawback is not an issue for stationary objects; however, in ultrasound images showing a fetus of more than about 25 weeks, abrupt changes due to fast motions are perceived. To compensate for this defect, we propose a frame rate up-conversion method by which new interpolated frames are inserted between two input frames, giving smooth renditions to human eyes. More natural motions can be obtained by frame rate up-conversion. In the proposed algorithm, we employ forward motion estimation (ME), in which motion vectors (MVs) are estimated using a block matching algorithm (BMA). To smooth MVs over neighboring blocks, vector median filtering is performed. Using these smoothed MVs, interpolated frames are reconstructed by motion compensation (MC). The undesirable blocking artifacts due to blockwise processing are reduced by block boundary filtering using a Gaussian low pass filter (LPF). The proposed method can be used in computer aided diagnosis (CAD), where more natural 3-D ultrasound images are displayed in real time. Simulation results with several real test sequences show the effectiveness of the proposed algorithm.
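A sketch of the vector median smoothing applied to the block-wise MV field; the 3x3 neighborhood and names are assumptions.

```python
import numpy as np

def vector_median_filter(mv_field):
    """Smooth a block-wise motion-vector field with a 3x3 vector median.

    For each block, the output MV is the neighboring vector that minimizes
    the sum of Euclidean distances to all vectors in the 3x3 neighborhood.
    mv_field has shape (rows, cols, 2).
    """
    rows, cols, _ = mv_field.shape
    out = mv_field.copy()
    for r in range(rows):
        for c in range(cols):
            r0, r1 = max(r - 1, 0), min(r + 2, rows)
            c0, c1 = max(c - 1, 0), min(c + 2, cols)
            nb = mv_field[r0:r1, c0:c1].reshape(-1, 2).astype(float)
            dists = np.linalg.norm(nb[:, None, :] - nb[None, :, :], axis=2).sum(axis=1)
            out[r, c] = nb[np.argmin(dists)]
    return out
```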
A two-class classification method for image patterns using principal component analysis (PCA) is proposed, in which classification is performed in the two-dimensional (2-D) space constructed by the reconstruction errors. The reconstruction error is computed using PCA for each assumed class. Training data sets are used to compute eigenvectors with which PCA reduces the dimensionality of the input vector space and reconstructs an input vector in the reduced space. A line equation with two parameters is defined as the linear decision boundary, and these parameters are estimated by a probabilistic approach. Its application to face detection is also demonstrated experimentally.
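A minimal sketch of the per-class reconstruction error and the linear boundary in the 2-D error space; the boundary parameters (w, b), which the paper estimates probabilistically, are placeholders here.

```python
import numpy as np

def pca_basis(X, k):
    """Return the mean and top-k eigenvectors of a training set X (n, d)."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k]

def reconstruction_error(x, mu, basis):
    """Squared error after projecting x onto the class subspace and back."""
    y = basis @ (x - mu)
    x_hat = mu + basis.T @ y
    return float(np.sum((x - x_hat) ** 2))

def classify(x, model_a, model_b, w=1.0, b=0.0):
    """Decide the class from the 2-D point (e_a, e_b) of reconstruction
    errors using the linear boundary e_b = w * e_a + b."""
    e_a = reconstruction_error(x, *model_a)
    e_b = reconstruction_error(x, *model_b)
    return "A" if e_b > w * e_a + b else "B"
```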
To manipulate large video databases, effective video indexing and retrieval are required. While most algorithms for video retrieval can be used for frame-wise user queries or video content queries, video sequence matching has received little attention. In this paper, we propose an efficient algorithm that matches video sequences using the Cauchy function of histograms between successive frames and the modified Hausdorff distance. To match the video sequences effectively and to reduce the computational complexity, we use key frames extracted by the cumulative measure and compare the sets of key frames using the modified Hausdorff distance. Experimental results show that the proposed video sequence matching algorithms using the Cauchy function and the modified Hausdorff distance yield higher accuracy and better performance than conventional algorithms such as the histogram difference and directed divergence methods.
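The set-to-set comparison can be sketched as the modified Hausdorff distance between two sets of key-frame feature vectors (e.g., frame histograms); the Cauchy-function weighting used for key-frame extraction is outside this sketch.

```python
import numpy as np

def modified_hausdorff(A, B):
    """Modified Hausdorff distance between two sets of key-frame feature
    vectors A (m, d) and B (n, d): the larger of the two mean
    nearest-neighbour distances."""
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    d_ab = D.min(axis=1).mean()   # mean distance from A to B
    d_ba = D.min(axis=0).mean()   # mean distance from B to A
    return max(d_ab, d_ba)
```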
KEYWORDS: Image retrieval, Data hiding, Image compression, Databases, Medical imaging, Digital watermarking, Image storage, Image encryption, Brain, Internet
With the widespread use of the Internet, image databases holding large amounts of information are used in various applications such as product briefings and personal information services. Retrieving information from these image databases in real time is difficult, whereas text-based databases can be searched relatively quickly. In this paper, an image retrieval method is proposed in which images are retrieved using hidden text information embedded in them. Through invisible modification of the images, they can be retrieved without any headers or separate files linked to them. This paper presents a robust data hiding method that separates the image into edge and non-edge regions. The sum of each 8 x 8 image block is quantized by adding pseudo noise depending on the data bit. The amount of pseudo noise is controlled adaptively based on the gradient magnitude of the image block. Real-time extraction of the extra data by the proposed algorithm is desirable for a practical image retrieval system. Experiments with various test image sets show that the proposed data embedding algorithm is robust to Joint Photographic Experts Group (JPEG) compression.
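A heavily simplified sketch of block-sum embedding in the spirit described above, using generic quantization-index modulation in place of the paper's pseudo-noise scheme; the step size, which the original adapts to the block's gradient magnitude, is fixed here.

```python
import numpy as np

def embed_bit(block, bit, step=64.0):
    """Embed one bit in an 8x8 block by quantizing its pixel sum to an even
    (bit 0) or odd (bit 1) multiple of `step`, spreading the change
    uniformly over the block (illustrative stand-in for the paper's method)."""
    s = block.sum()
    q = np.round(s / step)
    if int(q) % 2 != bit:
        q += 1 if (q * step) <= s else -1
    return block + (q * step - s) / block.size

def extract_bit(block, step=64.0):
    """Recover the embedded bit from the parity of the quantized block sum."""
    return int(np.round(block.sum() / step)) % 2
```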
KEYWORDS: Computer programming, Video, Video coding, Video compression, Optical engineering, Error analysis, Quantization, Video processing, Electronics engineering, Standards development
We propose a moving edge extraction method using the concepts of entropy and cross-entropy, in which the cross-entropy concept is applied to dynamic scene analysis. The cross-entropy concept enhances detection of dynamically changed areas. We combine the results of cross-entropy in the difference picture (DP) with those of entropy in the current frame so that moving edges can be extracted effectively. We also propose a moving edge extraction method that combines the results of cross-entropy with those of the Laplacian of Gaussian (LoG).
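For reference, one common form of the cross-entropy (directed divergence) between two normalized histograms is sketched below; the exact definition used in the paper may differ.

```python
import numpy as np

def directed_divergence(p, q, eps=1e-12):
    """Directed divergence (Kullback-Leibler form) between two histograms,
    each normalized to sum to one; larger values indicate a stronger change,
    e.g. between the difference-picture histogram and a reference."""
    p = p.astype(float) / p.sum()
    q = q.astype(float) / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))
```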
This paper presents a modified Hough transform (HT) that formulates the conjugate pair for line detection. By considering the conjugate pair of the HT, a fast computing algorithm can be derived. The concept of conjugation is applied to the Radon transform, a generalization of the HT, and to the Gabor transform. A formulation of the conjugate pairs of the 3D Hough transform is also presented.
To manipulate large video databases, effective video indexing and retrieval are required. While most algorithms for video retrieval can be used for frame-wise user queries or video content queries, video sequence matching has received little attention. In this paper, we propose an efficient algorithm that matches video sequences using the modified Hausdorff distance, and a video indexing method using the directed divergence of histograms between successive frames. To match the video sequences effectively and to reduce the computational complexity, we use key frames extracted by the cumulative directed divergence and compare the sets of key frames using the Hausdorff distance. Experimental results show that the proposed video sequence matching and video indexing algorithms using the Hausdorff distance and the directed divergence yield remarkably high accuracy and performance compared with conventional algorithms such as the histogram difference and histogram intersection methods.
In the real world, many objects consist of curved surfaces, so recognition of 3D objects with curved surfaces needs to be investigated. In this paper, we present a shape-histogram-based algorithm for recognizing and grouping objects, in which the cross entropy between shape histograms is employed. Computer simulations with various synthetic and real images are presented to show the effectiveness of the proposed algorithm.
In this paper, a fast motion compensation algorithm is proposed that improves coding efficiency for video sequences with brightness variations. We also propose a cross entropy measure between histograms of two frames to detect brightness variations. The framewise brightness variation parameters, a multiplier and an offset field for image intensity, are estimated and compensated. Simulation results show that the proposed method yields a higher peak signal to noise ratio compared with the conventional method, with a greatly reduced computational load, when the video scene contains illumination changes.
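The frame-wise brightness parameters can be sketched as a least-squares fit of a multiplier and offset between co-located intensities of the reference and current frames; this is a generic formulation, not necessarily the paper's exact estimator.

```python
import numpy as np

def estimate_brightness_params(cur, ref):
    """Least-squares estimate of the frame-wise multiplier a and offset b
    in cur ~ a * ref + b, the two brightness-variation parameters."""
    x = ref.astype(float).ravel()
    y = cur.astype(float).ravel()
    A = np.stack([x, np.ones_like(x)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return a, b

def compensate(ref, a, b):
    """Apply the estimated gain and offset to the reference frame."""
    return a * ref.astype(float) + b
```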
This paper proposes a relaxation-based algorithm for detection of facial features such as face outline, eyes, and mouth. At first, a number of candidates for each facial feature are detected. To select a correct set of facial features from the candidates, probabilities and geometric relationships of each candidate location are considered, in which a relaxation algorithm is used for implementation. Simulation results with various test images are presented.
The accuracy of the iterative closest point (ICP) algorithm, which is widely employed in image registration, depends on the complexity of the shape of the object under registration. Objects with complex features yield higher reliability in estimating registration parameters. For objects with rotation symmetry, a cylinder for example, rotation along the center axis cannot be distinguished. We derive the sensitivity of the rotation error of the ICP algorithm from the curvature of the error function near the minimum-error position. We approximate the defined error function by a second-order polynomial and show that the coefficient of the second-order term is related to the reliability of the estimated rotation angle. The coefficient is also related to the shape of the object. In the known-correspondence case, the reliability can be expressed by the second moment of the input image. Finally, we apply the sensitivity formula to a simple synthetic object and ellipses, and verify that the predicted orientation variance of the ICP algorithm is in good agreement with computer simulations.
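The second-order model described above can be written compactly as follows (a hedged sketch; the exact constants depend on the paper's definitions of the error function and noise):

```latex
E(\theta) \approx E(\theta_0) + \tfrac{1}{2}\, c\, (\theta - \theta_0)^2,
\qquad
\operatorname{Var}\!\left(\hat{\theta}\right) \propto \frac{\sigma^2}{c}
```

A larger curvature c, which grows with the richness of the object's shape (its second moment in the known-correspondence case), thus yields a smaller orientation variance.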
A two-stage high-precision algorithm for detecting the orientation and position of the surface mount device (SMD) is described. In the preprocessing step, a coarse orientation of the SMD is obtained by line fitting. A high-precision fuzzy Hough transform (FHT) is applied to the corner points to estimate precisely the orientation of the device, with its position determined by using four detected corner points. The FHT employed has a real-valued accumulator over the limited range of angles that is determined in the preprocessing step. Computer simulation with a number of test images shows that the parameters obtained by the presented algorithm are more accurate than those by conventional methods such as the moment method, projection method, and Hough transform methods. It can be applied to fast and accurate automatic inspection and placement systems.
This paper investigates motion estimation and compensation in object-oriented analysis-synthesis coding. Object-oriented coding employs a mapping parameter technique for estimating motion information in each object. The mapping parameter technique using gradient operators requires high computational complexity. The main objective of this paper is to propose a hybrid mapping parameter estimation method using the hierarchical structure in object-oriented coding. The hierarchical structure employed constructs a low-resolution image. Then six mapping parameters for each object are estimated from the low-resolution image and these parameter values are verified based on the displaced frame difference (DFD). If the verification test succeeds, the parameters and object boundaries are coded. Otherwise, eight mapping parameters are estimated in the low-resolution image and the verification test is again applied to an image reconstructed from the estimated parameters. If it succeeds, the parameters and object boundaries are coded; otherwise, the regions are coded by second-order polynomial approximation. Theoretical analysis and computer simulation show that the peak signal to noise ratio (PSNR) of the image reconstructed by the proposed method lies between those of images reconstructed by the conventional 6- and 8-parameter estimation methods, with a reduction of the computation time by a factor of about four.
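For reference, the six-parameter mapping commonly used in object-oriented analysis-synthesis coding is the affine model below; the eight-parameter model adds two further terms, with the exact form depending on the coder.

```latex
x' = a_1 + a_2\, x + a_3\, y, \qquad y' = a_4 + a_5\, x + a_6\, y
```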
In transmission of image sequences over ATM networks, channel sharing efficiency under packet loss conditions is important. As one possible approach, two-layer video coding methods have been proposed. These methods transmit video information over the network with different levels of protection against packet loss. In this paper, a two-layer coding method using a pyramid structure is proposed; several realizations of two-layer video coding methods are presented and their performances are compared.
We propose an effective segmentation and recognition algorithm for range images. The proposed recognition system based on the hidden Markov model (HMM) and back-propagation (BP) algorithm consists of three parts: segmentation, feature extraction, and object recognition. For object classification using the BP algorithm we use 3D moments, and for surface matching using the HMM we employ 3D features such as surface area, surface type and line lengths. Computer simulation results show that the proposed system can be successfully applied to segmentation and recognition of range images.
The structure extraction task is analyzed. Co-occurrence matrices (CMs) are a popular basis for this goal. We show that binary preparation of an arbitrary texture preserves its structure. This transformation decreases the computation time of the analysis and the required memory by dozens of times. A number of features for detecting displacement vectors on binarized images are compared. We suggest using CM elements jointly as a unified feature for this goal. We show that it is a stable detector for noisy images and simpler than the well-known χ² and κ statistics.
In this paper, we propose an efficient stereo matching algorithm using morphological filtering and fingerprints in scale space. We propose a morphological filter using a Gaussian structuring element, which has lower computational complexity than conventional Gaussian filtering with similar performance. For stereo matching, we propose a coarse-to-fine feature-based method that minimizes the effect of mismatching and noise through scale changes. In the proposed stereo matching algorithm, we use the loci of zero-crossing points in the left and right images as robust matching features, and dynamic programming for feature correspondence. Computer simulation results with several test images show the effectiveness of the proposed feature-based stereo matching algorithm using fingerprints in scale space.
In this paper, we propose a block matching algorithm (BMA) using a genetic algorithm. The genetic algorithm is inspired by an information processing scheme found in nature. To use the genetic algorithm in 2D block matching, we encode the phenotype representing a motion vector based on a quad-tree structure, i.e., the genotype is represented by four symbol strings. The probability of mutation is set differently for each position in a symbol string. Computer simulation results show that the proposed genetic-algorithm-based BMA achieves a peak signal to noise ratio (PSNR) comparable to that of the three-step search (TSS) or full search (FS) when the number of search points is varied.
We propose a two-layer sequence image coding algorithm based on residual block matching using fractal approximation. First, the motion compensation (MC) error signal is encoded by the discrete cosine transform (DCT). The motion vector and DCT coefficients are transmitted as the first layer, and the residual signal of MC/DCT is encoded by fractal approximation and transmitted as the second layer. The second layer is encoded using the matching block selected from a dynamic residual pool. The reconstructed MC error image is used as the dynamic residual signal, which corresponds to the domain pool in conventional fractal coding. Computer simulation of the proposed method and DCT-based methods shows that the performance improvement achieved by the proposed method is significant.
We propose an efficient algorithm that recognizes handwritten Korean and English characters in a low-resolution document. For a user-friendly input system for low-resolution documents consisting of two different sets of characters obtained by facsimile or scanner, we propose a document-recognition algorithm utilizing several effective features (partial projections, the number of cross points, and distance features) and the membership function of fuzzy set theory. Via computer simulation with several test documents, we show the effectiveness of the proposed recognition algorithm for both printed and handwritten Korean and English characters in a low-resolution document.
A multiresolution edge detection algorithm for speckle images is proposed. Due to the signal dependence of speckle noise, the variance of a speckle image depends on the local average intensity; thus an edge detection method independent of the local average intensity is desirable for correct extraction of real, significant changes in the original signal. In the proposed method, each area having a different resolution is first classified according to the statistical properties of a speckle image, namely a discontinuity measure such as the ratio of variance to mean square or the maximum difference between the real and theoretical cumulative distribution functions. Then the real edges are extracted in a multiresolution environment. Computer simulation with several test images shows that the proposed method significantly reduces false edges in relatively homogeneous areas while detecting fine details properly. Also, simulation results of conventional edge detection methods for speckle images are compared with those of the proposed method.
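The first of the two discontinuity measures mentioned above, the ratio of local variance to squared local mean, can be sketched as follows; the window size and threshold are left to the caller.

```python
import numpy as np

def discontinuity_ratio(window):
    """Ratio of local variance to squared local mean over a window.

    In homogeneous fully developed speckle this ratio stays roughly constant,
    and it rises at real edges, making it insensitive to the local average
    intensity."""
    w = window.astype(float)
    mu = w.mean()
    return w.var() / (mu * mu + 1e-12)
```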
The recognition of raised alphanumeric markings on rubber tires for their automatic classification is presented. Raised alphanumeric markings on rubber tires show different characteristics from those of printed characters. In the preprocessing step of the proposed method, we first determine the slope of an arc, along which alphanumerics are marked, using the Hough transform, and align them horizontally. Then we separate each character using vertical and horizontal projections. In the recognition step, to recognize characters hierarchically we use several effective features, such as width of a character, number of cross points, partial projections, and distance features. Computer simulation results show that the proposed system can be successfully applied to automatic classification of rubber tires.
A technique for recovering an elevation map by stereo modeling of real aerial image sequences is presented. An area-based stereo matching method is proposed and the parameter values are chosen experimentally. In the depth extraction step, the depth is determined by solving a vector equation suitable for stereo modeling of aerial images that do not satisfy the epipolar constraint. The performance of the conventional feature-based method is also compared via computer simulations. Finally, techniques for analyzing the accuracy of the recovered elevation map (REM) are discussed. The experimental results based on error performance show the efficiency of the proposed method.
Conventional polygonal approximation techniques have a fixed parameter, which makes it difficult to extract the minimum number of critical points that faithfully represent complex contours having various details. We propose multistep polygonal approximation algorithms that integrate line segments detected at different scales or resolutions by means of scale or resolution partitioning of contour patterns. Via computer simulation, we show that the proposed multistep methods approximate the contours better than conventional methods having a fixed parameter.
An efficient algorithm is proposed that recognizes a mixed document consisting of printed Korean/alphanumeric text and graphic images. In the preprocessing step, an input document is skew-normalized, if necessary, by rotating it by an angle detected with the Hough transform. Then we separate the graphic image parts from the text parts by considering chain codes of connected components. We further separate each character using vertical and horizontal projections. In the recognition step, a mixed text consisting of two different sets of characters, e.g., Korean and alphanumeric characters, is recognized. Korean and alphanumeric characters are classified and each is recognized hierarchically using several effective features. The output is obtained by combining the recognized characters and the separated graphic parts. The performance of the proposed analysis algorithm for mixed documents is demonstrated via computer simulation.
This PDF contains the communication for "Orientation and position detection of surface-mounted devices and printed circuit boards using the high-precision fuzzy Hough transform."
In this paper, a high-resolution algorithm for detecting the orientation and position of an IC, and an algorithm for compensating the position and skew angle of a PCB, are proposed. The proposed algorithm for the first topic consists of two parts. The first part is a preprocessing step, in which corner points of an IC are detected and separated into two groups; the coarse angle of the principal axis is then obtained by line fitting. The second part is the main processing step, in which the Hough transform over a limited range of angles is applied to the corner points to detect precisely the orientation of an IC or a surface mounting device (SMD). The position of an IC or SMD is determined using its four corner points. The proposed algorithm for the second topic detects the rotation angle and translation parameters of a PCB using a template matching method, and the PCB is compensated using the detected parameters. Computer simulation shows that the parameters obtained by the proposed algorithms are more accurate than those obtained by the several conventional methods considered. The proposed algorithms can be applied to fast and accurate automatic inspection systems.
A segmentation-based video coding technique is presented. A change detector is used to find the regions of interest, i.e., moving areas, and then the moving areas are segmented based on the motion vector information. For the regions where the prediction error, defined by the difference between the original image and the motion-compensated image reconstructed by the motion vector, is large, the frame difference signal is used to supplement the segmentation results. Depending on the region characteristics, each region is represented by the motion vector and the frame difference signal. After postprocessing, the region information and the region boundaries are transmitted. Computer simulation with video sequences shows that the proposed algorithm reduces the trailing effect and gives better performance than the conventional method.
A simple method to recognize printed alphanumerics is discussed. The proposed method is a simple rule-based structural method that recognizes printed alphanumerics in image scanner data based on the thinning operation. This paper also presents major achievements made toward the development of a fast hierarchical recognition scheme for printed and handwritten facsimile data. Conventional thinning techniques give good results for high-resolution image scanner data but suffer drawbacks for low-resolution data. Our scheme recognizes 55 characters per second in an IBM PC/386 environment with a recognition rate of 98%.