We present an approach to perform automatic detection of small targets from coregistered visual, thermal, and range images, using five features of value for target discrimination: Brightness, Texture, Temperature, Surface Planarity, and Height. For each feature, we propose a set of operations to extract targets from the images, using inherent target properties that differentiate them from clutter. Each target extractor yields a `Target Measure' image based on a specific feature. These, when combined appropriately, yield better results than those obtained by individual, single-image detectors. Two methods are presented to perform information fusion on the target measure images: Binary Combination and Fuzzy Combination. Experimental results using both combination methods on synthetic and real imagery are given, with very satisfactory results. A morphological operation called `erosion of strength n' is introduced and utilized as a powerful tool for removing spurious information from binary images.
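The abstract does not specify how the fuzzy combination is computed; one plausible reading, sketched below under that assumption, fuses the target-measure images as a weighted average followed by thresholding. The function name, the equal default weights, and the 0.5 threshold are illustrative choices, not the paper's actual parameters.

```python
import numpy as np

def fuse_target_measures(measures, weights=None, threshold=0.5):
    """Hypothetical fuzzy-combination sketch: each entry of `measures` is a
    target-measure image with values in [0, 1]; fuse them as a weighted
    average and threshold the result into a binary detection map."""
    stack = np.stack([np.clip(m, 0.0, 1.0) for m in measures])
    if weights is None:
        # equal weighting is an illustrative default
        weights = np.ones(len(measures)) / len(measures)
    fused = np.tensordot(np.asarray(weights, dtype=float), stack, axes=1)
    return fused >= threshold
```

The binary-combination method mentioned in the abstract would correspond to thresholding each measure individually and combining the resulting masks with logical operators instead.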
KEYWORDS: Performance modeling, Infrared search and track, Data modeling, 3D modeling, Process modeling, Systems modeling, Image processing, Data processing, Algorithm development, Detection and tracking algorithms
This study develops and evaluates a new VHDL-based performance modeling capability for multiprocessor systems. The framework for this methodology involved modeling the following system aspects: processor characterization, task modeling, network characterization, and data set size. Initially, all aspects are specified at an abstract level, and they eventually become specified at a detailed level through the process of verification and refinement of design assumptions. Processor characterization involves modeling the processor's speed, instruction set, and memory hierarchy. Task modeling is concerned with the execution time and instruction mix of software tasks within the system. Network characterization models bus protocols, topology, and bandwidths. Data set size refers to how much data is represented by the tokens used in the models. In this study, we applied and evaluated this methodology using both 2D and 3D IR search and track (IRST) algorithms. Two candidate processors were investigated: the IBM PowerPC 604 and the Texas Instruments TMS320C80. For the 2D IRST algorithm, abstract and detailed performance modeling results were obtained for both processors using partitioned-data and pipelined algorithmic approaches. For the 3D IRST algorithm, abstract performance models for pipelined and parallelized implementations on the PowerPC were developed. These models examined the feasibility of the implementations, identified the potential risk areas, and laid the groundwork for detailed performance modeling.
A method to automatically detect targets from sets of pixel-registered visual, thermal, and range images is outlined. It uses operations specifically designed for the different kinds of images to exploit the information given by each of them. Five features are used to distinguish the targets from the clutter: texture, brightness, temperature, surface planarity, and height. We describe two different schemes. The first logically combines the results of individual detectors, producing binary detection information as output. A morphological operation called `erosion of strength n' is introduced and utilized as a powerful tool for removal of spurious information. The second scheme utilizes the concept of `images of interest' to combine the information provided by the different images. A simple linear combination of these images yields excellent detection results. The success of this scheme supports its suitability for other ATR (Automatic Target Recognition) problems.
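The abstract does not define `erosion of strength n' beyond its role of removing spurious pixels. One plausible reading, sketched here under that assumption, keeps a set pixel only when at least n of its 8 neighbours are also set, so isolated noise pixels vanish while solid target blobs survive.

```python
import numpy as np

def erosion_of_strength_n(binary, n):
    """Hypothetical sketch of `erosion of strength n': a set pixel survives
    only if at least n of its 8 neighbours are also set. This interpretation
    is an assumption; the paper does not spell out the operation."""
    padded = np.pad(binary.astype(int), 1)
    # count set neighbours (8-connectivity) for every pixel
    counts = sum(np.roll(np.roll(padded, dy, axis=0), dx, axis=1)
                 for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                 if (dy, dx) != (0, 0))[1:-1, 1:-1]
    return binary.astype(bool) & (counts >= n)
```

With n = 1 this already removes fully isolated pixels; larger n prunes thin spurs more aggressively, at the cost of eroding blob boundaries.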
We propose a design framework for perfect-reconstruction time-varying linear-phase paraunitary filter banks using a novel adaptive lapped transform (ALT). The ALT is based on the Generalized Lapped Orthogonal Transform (GenLOT) proposed by Queiroz. A time-varying filter bank is constructed through the factorization of the GenLOT into cascaded matrix stages. Variable-length lapped transforms are subsequently generated by cascading a number of these matrix stages to build filters of specific lengths. Several design constraints ensure perfect reconstruction and a fast implementation. An embedded ALT image codec is presented, and the application of the ALT to the H.263 video coding standard is discussed. Preliminary results show that the ALT-based embedded image codec achieves increases of 1.61-2.35 dB and 2.37-4.04 dB in peak signal-to-noise ratio (PSNR) over the JPEG image coding standard for the Lenna and Barbara test images, respectively.
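The perfect-reconstruction property of such a cascade can be illustrated in the degenerate, non-overlapping case: a transform built by cascading orthogonal (Givens-rotation) stages remains orthogonal, so its transpose inverts it exactly. The sketch below is illustrative only; the stage structure and angles are assumptions, not the GenLOT factorization itself.

```python
import numpy as np

def rotation_stage(n, pairs_angles):
    """One orthogonal stage: plane (Givens) rotations on disjoint index pairs."""
    Q = np.eye(n)
    for (i, j), theta in pairs_angles:
        G = np.eye(n)
        c, s = np.cos(theta), np.sin(theta)
        G[i, i], G[i, j], G[j, i], G[j, j] = c, -s, s, c
        Q = G @ Q
    return Q

def cascade(n, stage_count, rng):
    """Cascade several rotation stages; the product of orthogonal matrices
    stays orthogonal, so T.T inverts T (perfect reconstruction)."""
    T = np.eye(n)
    for _ in range(stage_count):
        angles = [((2 * k, 2 * k + 1), rng.uniform(0, np.pi))
                  for k in range(n // 2)]
        T = rotation_stage(n, angles) @ T
    return T
```

In the actual lapped-transform setting the stages overlap adjacent blocks, and paraunitarity (orthogonality of the polyphase matrix on the unit circle) plays the role that plain orthogonality plays here.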
KEYWORDS: Head, Video coding, Position sensors, Image segmentation, Low bit rate video, Computer programming, Analog electronics, Video compression, Video, Image quality
In this paper, we investigate the performance of a segmentation-based motion compensation algorithm for videoconferencing that uses motion information obtained from a set of analog position sensors. The position sensors provide high-resolution motion information, which is used to perform affine-transformation-based motion compensation. Simulation results are presented as the improvement in displaced frame difference (DFD) variance over the usual block-matching algorithm. The proposed algorithm has the following advantages: (1) only a very small amount of motion information needs to be transmitted, (2) the error energy is concentrated mainly in the boundary regions, which makes it less noticeable, (3) different regions of the frame can be transmitted with different quality, and (4) the computational load is smaller than that of advanced techniques.
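A minimal sketch of affine motion compensation as described, assuming a grayscale reference frame and nearest-neighbour sampling; the fallback to the co-located reference pixel for out-of-frame samples is an illustrative choice, not the paper's method.

```python
import numpy as np

def affine_compensate(ref, A, t):
    """Predict the current frame by warping the reference frame with the
    affine map p' = A @ p + t (p = (row, col)). Nearest-neighbour sampling;
    out-of-frame samples fall back to the co-located reference pixel."""
    h, w = ref.shape
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys.ravel(), xs.ravel()])          # (2, h*w) pixel grid
    src = np.rint(A @ coords + t[:, None]).astype(int)   # warped source positions
    valid = (src[0] >= 0) & (src[0] < h) & (src[1] >= 0) & (src[1] < w)
    pred = ref.ravel().copy()                            # fallback: co-located pixel
    idx = np.where(valid)[0]
    pred[idx] = ref[src[0, idx], src[1, idx]]
    return pred.reshape(h, w)
```

The DFD the abstract refers to would then be the difference between the actual current frame and this prediction, with its variance measuring compensation quality.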
In this paper we investigate the advantages of lossy encoding of the motion vectors (MVs). This is done in the context of accommodating the larger amount of MV information that results when a smaller macroblock (MB) size is used. After estimating the improvement in displaced frame difference (DFD) variance associated with decreasing the MB size to 8 for the Salesman QCIF sequence, we examine lossy DCT performance in terms of the rate-distortion trade-off. A theoretical argument in favor of the Walsh-Hadamard transform (WHT) is made, and the experimental results back up the assumption that it performs better than the DCT. The influence of the kernel size is also investigated. A combination of the WHT with lossy techniques embedded in the MV estimation process, which improve the redundancy of the MV field, proves to perform best for the lossy transform case. We then examine VQ encoding of the MVs, and the results are compared with the transform case in terms of rate-distortion performance and flexibility.
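For reference, the fast Walsh-Hadamard transform that the abstract compares against the DCT can be sketched as follows (unnormalized, natural/Hadamard ordering). This is the standard butterfly algorithm, not the paper's specific implementation; the WHT is self-inverse up to a factor of the block length, which is part of what makes it cheap for MV-field coding.

```python
import numpy as np

def wht(x):
    """Fast Walsh-Hadamard transform, unnormalized, natural ordering.
    Input length must be a power of two. Applying it twice returns
    len(x) * x, since H @ H = N * I for the unnormalized Hadamard matrix."""
    x = np.asarray(x, dtype=float).copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b          # butterfly: sums
            x[i + h:i + 2 * h] = a - b  # butterfly: differences
        h *= 2
    return x
```

Unlike the DCT, every butterfly here is an addition or subtraction, so the transform needs no multiplications at all.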
KEYWORDS: Video compression, Video, Image compression, Video processing, Video coding, Distortion, Computer engineering, Digital signal processing, Telecommunications, Control systems
Videoconferencing is an application of video compression that is rapidly expanding in terms of market acceptance and importance. A number of recent papers discuss methods for very low bit rate videoconferencing at 10-20 kbps. In addition, telephone lines with state-of-the-art data modems are capable of transmitting data at 20-28 kbps. Since videoconferencing requires the transmission and reception of both audio and video, the video component is likely to be restricted to about 20 kbps when using telephone lines. While these recent papers indicate acceptable quality at these very low bit rates for sequences such as `Miss America', they also indicate that higher data rates are required for other sequences characterized by excessive movement and/or detail. In this paper, we introduce a framework in which sequences that fall into this category are processed so that they better match the characteristics of the coder. However, this cannot be accomplished without reducing the amount of information in the sequence. The key is to do this in such a way as to produce the least subjectively perceptible distortion. Our experiments have shown that the subjective quality of the resulting sequence after active coding and decoding is better than that resulting from standard coding and decoding.