The quality of VDUs, as experienced by workstation operators, should be assessable by methods that have
found general approval. Therefore, methods of assessment and their mutual consistency are investigated. This paper contains a
comparison of results obtained by five different methods, luminance contrast being the independent variable. Fixation duration
and saccade length of the eye movements during search are compared with search velocity and scaled reading comfort. Reaction
times for word identification are also measured. Luminance contrast is varied over a range of about two decades, using both
positive and negative polarity. A considerable variation in the variables was found, and a high correlation between the subjective
and objective parameters was observed. Variations in fixation duration and in reaction times, reflecting changes in processing
time, are consistent. The relative effectiveness and the ease of execution of the methods will be discussed. Search velocity and
scaled comfort seem to be suitable methods.
Two general characteristics of full-color display systems which are known to impact image quality include the ability of the
display system to transfer modulation (chromatic as well as achromatic) and the degree to which the display system adds noise
(chromatic and achromatic) to the signal. This paper describes a model of human spatial-chromatic vision and a corresponding
procedure for using the model to evaluate color display systems. Together the proposed model and procedure constitute a
color image quality metric which is responsive to the modulation transfer and noise generating characteristics of a display
system.
The proposed human vision model employs processing stages which simulate blurring by the optics of the eye, linear spectral
absorption by three classes of cone, addition of internal noise, nonlinear transduction by retinal mechanisms, derivation of
opponent-color images, and calculation of the responses of linear spatial mechanisms with finite spatial frequency and
orientation bandwidth. A summary of the modulation detection, discrimination, and suprathreshold contrast perception
performance of the model is presented and compared with human performance data from the visual science literature. A
procedure for evaluating display systems using the model is described and the results of several analyses of display systems are
presented.
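The stage sequence described above can be sketched in outline. This is an illustrative pipeline only: the blur kernel, internal-noise level, semi-saturation constant, and opponent weights below are placeholder values, not the model's fitted parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian blur standing in for the optics of the eye."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, tmp)

# Hypothetical cone-catch image: H x W x 3 (stand-ins for L, M, S signals).
spectral = rng.random((32, 32, 3))

# 1. Optical blur, applied to each cone plane.
blurred = np.stack([gaussian_blur(spectral[..., i]) for i in range(3)], axis=-1)

# 2. Additive internal noise.
noisy = blurred + rng.normal(0.0, 0.01, blurred.shape)

# 3. Compressive (Naka-Rushton style) nonlinear transduction.
transduced = noisy / (np.abs(noisy) + 0.2)

# 4. Opponent recombination: luminance, red-green, blue-yellow images.
L, M, S = transduced[..., 0], transduced[..., 1], transduced[..., 2]
lum, rg, by = L + M, L - M, S - 0.5 * (L + M)
print(lum.shape, rg.shape, by.shape)
```

The oriented spatial-mechanism stage is omitted here; it would filter each opponent image with a bank of orientation- and frequency-tuned kernels.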
High correlations between predictions made by the model and the results of image quality studies from the display design
literature have been obtained with no free parameters in the model. The results of the validation studies conducted so far
suggest that the proposed method for evaluating color display systems is viable and warrants critical examination.
Variable width (often called proportionally-spaced) fonts pack more characters, and hence more information,
into a line of text than do fixed width fonts. They are thus preferred by typographers, who use them as a
means of fitting more text on fewer pages. Does this higher density result in faster or slower reading speeds?
We compared maximum reading speeds on a CRT using identical characters under three conditions of
pitch: 1) fixed width (FW), each character centered in a constant horizontal space, 2) variable width (VW),
characters occupying only the space required to eliminate overlap, and 3) modified variable width (MVW),
average text density equated to that of the FW condition through the addition of inter-word microspace.
For small characters (close to the acuity limit), FW produced the fastest reading, with MVW yielding better
performance than VW pitch, indicating two kinds of "crowding" effects: one interfering with individual
character recognition and one interfering with word recognition. For medium and large characters (~0.25 to
6 deg height), performance was best with VW pitch, slowest with MVW pitch, and intermediate with FW pitch.
Hence dense text packing may improve performance with all but the smallest characters. Control experiments
using rapid serial visual presentation of text show that the higher text density and lower eye movement
requirements of VW text are responsible for its superiority at medium and large character sizes.
Performance was measured on an editing task which required counting the number
of occurrences of an assigned letter in a paragraph of random letters. The task
was presented in three different display modes: (a) a video display (VDT) with
white characters on a black background, (b) a white-on-black photograph of the
VDT display, and (c) a black-on-white photograph of the VDT task display. The
viewing conditions for the three display modes were matched. Defocus was
introduced by cylindrical lenses (simulated astigmatism) and by plus lenses.
Performance was measured by time and accuracy in completing the counting task.
There were 19 normally-sighted young adult subjects tested with the task in the
three display modes under 6 levels of defocus.
For the hard copy displays, performance was significantly faster (on average by
6.6%) for black characters on a white background. Performance with the black
background photographs was consistently, but marginally (0.9%), faster than with
the VDT displays. Cylindrical defocus of 1.50 diopters substantially impaired
efficiency, but low-power plus lenses did not affect performance.
A current problem in perceptual image quality assessment is the evaluation of the visible effects of digital
image coding on the perceptual quality of images displayed on video screens. These effects are anticipated
to be too small to be assessed by the widely employed method of rating on a category scale consisting
of adjectives. A possible solution to this problem is to enhance the flexibility of category scaling by
using numbers instead of adjectives. In that case the category scale can, in principle, adapt to any given
quality range. In this paper experiments are described in which numerical category scaling has been used
to assess impairment of perceptual image quality due to quantization errors in scale-space coding. The
results show that (1) direct numerical category scaling is an efficient method for assessing slight effects
like the ones usually encountered in digitally coded images, (2) direct category scaling and a scaling
procedure in accordance with functional measurement theory end in the same functional relationship
between impairment and degree of quantization, and (3) unrelated impairments add up to form the
overall impression of impairment.
Two investigations were set up into the dependence of the human sharpness impression in complex scenes
on velocity and resolution. Sequences, in which a portrait was moving horizontally at a constant speed,
were presented on a high-resolution monitor (velocities between 0.5 and 35 deg/s). Both a maximum
resolution version and bandwidth-reduced versions were presented. Subjects assessed the (subjective)
sharpness of the stimuli on a categorical scale. The results show that for the maximum resolution
images, there is hardly any change in perceived sharpness as a function of velocity. Furthermore, for the
low resolution images, we find an increase in sharpness with velocity, which implies that the perceived
sharpness range is compressed at those velocities.
Visual Models: Spatial Vision and Spatiotemporal Interactions
This study is an attempt to understand the major functions of early vision by considering how
various components cooperate in preparing visual information to organize our perceptions of the
world around us. To this end, and with the cooperation of SRI's Machine Vision Group, we have
assembled some of these functions in a computational working model, which can graphically
display the spatial structure of the information at a given stage of the visual process, in the form of a
two-dimensional intensity array (or "image"). The development of such a capability should facilitate
the study and comparison of retinal and cortical inputs and outputs of spatial information.
The individual components of the model are well known, and the relations among them are based
on available data from the literature. However, two aspects of this study seem novel. One is the
exploitation of powerful, state-of-the-art tools of computational vision, such as Symbolics
3600-series LISP machines, to create and display our results. These tools were developed primarily
for artificial intelligence purposes; they have rarely been used for basic studies in human vision.
Another important novelty is the combination, into a single, integrated emulation, of the following
properties of the visual process:
- Inhomogeneous filtering by retinal receptive fields.
- Re-mapping of visual space by the retinocortical projection.
- Image analysis by receptive fields of the striate cortex.
- Multiple fixations of a single scene.
Each of these mechanisms has been studied in detail previously, but they have scarcely been
interrelated. Taking a different approach, we use a relatively broad-brush description of each to
study how they could all behave in concert.
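As a rough illustration of one of these components, the retinocortical re-mapping can be emulated by sampling an image on a log-polar grid, so that equal cortical steps cover exponentially larger retinal eccentricities. The grid sizes and minimum eccentricity below are arbitrary choices, not values from the model.

```python
import numpy as np

def log_polar_sample(img, n_ecc=32, n_ang=64, r_min=1.0):
    """Resample a square image onto a log-polar (eccentricity x angle) grid."""
    h, w = img.shape
    cy, cx = h / 2, w / 2
    r_max = min(cy, cx) - 1
    # Eccentricities spaced logarithmically, angles uniformly.
    radii = np.geomspace(r_min, r_max, n_ecc)
    angles = np.linspace(0, 2 * np.pi, n_ang, endpoint=False)
    rr, aa = np.meshgrid(radii, angles, indexing="ij")
    ys = np.clip((cy + rr * np.sin(aa)).astype(int), 0, h - 1)
    xs = np.clip((cx + rr * np.cos(aa)).astype(int), 0, w - 1)
    return img[ys, xs]          # shape (n_ecc, n_ang): the "cortical image"

img = np.outer(np.linspace(0, 1, 64), np.linspace(0, 1, 64))  # toy gradient
cortical = log_polar_sample(img)
print(cortical.shape)   # (32, 64)
```

Nearest-neighbour sampling is used for brevity; a receptive-field model would pool over a Gaussian whose size grows with eccentricity.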
A computer simulation of a static model of bipolar cells (BC) of the human
central fovea is presented. Data from human observers and primate experiments
are incorporated when available. Resolution (two-point discrimination) is
quantified in the space and frequency domains. Nonpreference for orientation,
i.e., resolution does not change with orientation of two bars separated by a
variable gap, is optimal for a specific fixed cone matrix and a BC receptive
field organization. Resolution increases asymptotically as width, length, or gap
between two bars increases. There is a critical size of the two-bar stimulus
above which resolution is independent of cone matrix and BC receptive field
organization. Resolution changes systematically with color and intensity
contrasts. There is a good correlation between resolutions determined in the
space and frequency domains. The computer simulation is used to determine the
parameters for optimal resolution of symbology such as the alphabetic
characters.
An algorithm is described for learning image interpolation functions for sensor arrays
whose sensor positions are somewhat disordered. The learning is based on failures of
translation invariance, so it does not require knowledge of the images being presented to the
visual system. Previously reported implementations of the method assumed the visual system
to have precise knowledge of the translations. We demonstrate here that translation estimates
computed from the imperfectly interpolated images can have enough accuracy to allow the
learning process to converge to a correct interpolation.
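A minimal sketch of the translation-estimation step, under the assumption that shifts are circular and integer-valued: the shift between two imperfect copies of a signal is taken from the peak of their FFT-based cross-correlation (the signal, shift, and noise level below are illustrative).

```python
import numpy as np

def estimate_shift(a, b):
    """Circular integer shift s such that b is approximately np.roll(a, s)."""
    a0, b0 = a - a.mean(), b - b.mean()   # remove DC so the peak stands out
    corr = np.fft.ifft(np.fft.fft(b0) * np.conj(np.fft.fft(a0))).real
    return int(np.argmax(corr))

rng = np.random.default_rng(1)
a = rng.random(64)
b = np.roll(a, 5) + rng.normal(0.0, 0.05, 64)   # shifted, imperfect copy
print(estimate_shift(a, b))   # 5
```

This is the flavor of estimate the paper argues the learning rule can tolerate: even with interpolation error treated here as additive noise, the correlation peak still identifies the translation.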
An object-oriented model for the brightness perception of static images is presented. The philosophy
behind the model is that the visual system aims at a representation of brightness that displays object
properties and is consequently insensitive to variations in light source and viewing conditions. The model
assumes an ensemble of neural units that differ in receptive field size, that is, identical operations are
performed at a number of spatial scales. The operating characteristics reflect the behaviour of typical
cells in the visual pathway, such as ganglion cells, and are robust against variations in light level. The
brightness at each retinal position is the weighted sum of neural activities that exist at this position
in the different scales. The weighting function is such that the brightness impression is robust against
variation in viewing distance.
In this paper it is shown that the model is able to unify different aspects of brightness perception such
as brightness induction, brightness assimilation and the perception of different brightness illusions.
Empirical evidence from both psychology and physiology stresses the importance of inherently
two-dimensional signals and corresponding operations in vision. Examples of this are the existence of
"bug-detectors", hypercomplex and dot-responsive cells, the occurrence of contour illusions, and interactions of
patterns with clearly separated orientations. These phenomena cannot be described, and have been largely
ignored, by common theories of size and orientation selective channels. The reason for this is shown to be
located at the heart of the theory of linear systems: their one-dimensional eigenfunctions and the "or"-like
character of the superposition principle. Consequently, a nonlinear theory is needed. We present a first
approach towards a general framework for the description of 2D-signals and 2D-cells in biological vision.
Human vision is able to discriminate chromatic from achromatic changes under a wide variety of
spatiotemporal conditions. Objectively, we distinguish physically between a light whose spectral
energy distribution does not vary in space or time (except for a constant factor), and a light whose
spectral distribution changes shape significantly, in either space or time or both. On the subjective
level, we can visually discriminate spatiotemporal changes in hue from spatiotemporal changes in
lightness. The general question of how to relate these subjective discriminations between chromatic
and achromatic variations to the objective ones is still one of the basic problems at the core of visual
science. Zone theories have generated opponent-color models that provide a useful framework to
analyze these problems. According to those models, post-receptor processing is separated into
independent channels, two color-opponent and one achromatic or luminance channel. Physiological
and psychophysical data on spatiotemporal properties of chromatic and achromatic channels
challenge the notion of chromatic-achromatic independence at an early stage, implying at least one
intermediate process where the two types of information are intermixed. This intermediate stage can
be modeled as an opponent multiplexing process. The multiplex model suggests decoding operations
to separate chromatic and achromatic information at more central stages; and the implementation of
those operations entails the generation of orientation selectivity. It is concluded that chromatic-achromatic
independence, a primal characteristic of human vision, must be implemented at stages
located more centrally than previously thought. More generally, opponent multiplexing and the
decoding algorithm are valid principles for any number of dimensions, which suggests that
information processing other than vision could be studied from this perspective.
One of the biggest challenges confronting color matrix display (CMD) image quality is the presence
of spatial quantization or aliasing artifacts that manifest themselves as stair-steps, roping, and/or
tessellations. A considerable amount of research has been carried out investigating techniques,
frequently referred to as anti-aliasing algorithms, which attempt to minimize the appearance of these
annoying artifacts through the use of gray-scale. While some of this work has focused on the
minimum number of gray levels sufficient for yielding acceptable image quality in CMDs, little, if any,
empirical research has been done on determining the optimum luminance interval or ramp function to
use for gray-scale anti-aliasing algorithms in these displays.
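To make the design question concrete, the two obvious candidate ramps can be compared numerically. The display gamma of 2.2 below is an assumed value, not a measurement from any particular CMD.

```python
import numpy as np

gamma = 2.2                                    # assumed display gamma
counts = np.linspace(0.0, 1.0, 9)              # 8 equal steps in digital count

# Ramp 1: equal steps in digital count. On a gamma display its luminance
# steps bunch up at the dark end and stretch out at the bright end.
lum_of_counts = counts ** gamma

# Ramp 2: the counts needed to produce 8 equal steps in *luminance*.
equal_lum_counts = np.linspace(0.0, 1.0, 9) ** (1.0 / gamma)

print(np.round(lum_of_counts, 3))
print(np.round(equal_lum_counts, 3))
```

The gap between the two ramps is exactly the degree of freedom the abstract says has not been studied empirically: which luminance interval function makes gray-scale anti-aliasing look best.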
Current imaging display systems are capable of digitizing an image to eight
bits of gray scale (256 levels). For demanding imaging applications such as
X-ray images or satellite images more levels of gray scale may be required
to extract fine details in the complex images. Obviously the digitization
process should not result in the loss of critical information. Conversely the
extension of gray scale capability beyond eight bits is associated with a
marked increase in system expense and development resources. The
desire, then, is to provide all the visual information that the human
observer is capable of detecting without over-designing the system beyond
the capacity of human vision.
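A back-of-envelope version of this trade-off: if successive gray levels are separated by a constant Weber fraction, the number of distinguishable levels over a luminance range follows directly. The Weber fraction and luminance range below are assumed round numbers, not measured values.

```python
import math

def distinguishable_levels(l_min, l_max, weber=0.02):
    """Levels separated by a fixed Weber fraction w from l_min to l_max:
    N = ln(l_max / l_min) / ln(1 + w)."""
    return math.log(l_max / l_min) / math.log(1 + weber)

# Assumed: 2% Weber fraction, 100:1 display luminance range.
n = distinguishable_levels(l_min=1.0, l_max=100.0, weber=0.02)
bits = math.log2(n)
print(round(n), round(bits, 2))   # 233 7.86
```

Under these assumptions the eye can use slightly less than 8 bits over a 100:1 range, which is why the question of whether to extend beyond 8 bits hinges so strongly on the luminance range and viewing conditions of the application.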
The information gathering capacity of the visual system can be specified in units of bits/mm2. The fall-off in
sensitivity of the human visual system at high spatial frequencies allows a reduction in the bits/mm2 needed to specify an
image. A variety of compression schemes attempt to achieve a further reduction in the number of bits/mm2 while
maintaining perceptual losslessness. This paper makes the point that whenever one reports the results of an image
compression study, two numbers should be provided. The first is the number of bits/mm2 that can be achieved using
properties of the human visual system, but ignoring the redundancy of the image (entropy coding). The second number is
the bits/mm2 including the effects of entropy coding. The first number depends mainly on the properties of the visual
system; the second includes, in addition, the properties of the image. The Discrete Cosine Transform (DCT)
compression method is used to determine the first number. It is shown that the DCT requires between 16 and 24
bits/mm2 for perceptually lossless encoding of images, depending on the size of the blocks into which the image is
subdivided. In addition, the efficiency of DCT compression is found to be limited by its susceptibility to interference from
adjacent maskers. The present analysis suggests that the visual system requires many more bits/mm2 than reported by
other researchers, who find that 0.5 bits/mm2 is sufficient to represent an image without perceptible loss.
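A minimal sketch of the block-DCT step underlying the analysis: transform an 8x8 block with an orthonormal DCT-II matrix, quantize uniformly (the step size here is arbitrary, not a visually derived value), and reconstruct.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix: rows are the cosine basis vectors."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1 / n)
    return c

C = dct_matrix(8)
block = np.outer(np.linspace(0, 1, 8), np.ones(8))   # toy ramp block

coeffs = C @ block @ C.T            # 2-D DCT-II of the block
q = 0.05                            # assumed uniform quantization step
quantized = np.round(coeffs / q) * q
recon = C.T @ quantized @ C         # inverse transform (C is orthonormal)
print(float(np.abs(recon - block).max()) < q)   # True: error below step size
```

A perceptual coder would replace the uniform step `q` with a per-coefficient step derived from contrast sensitivity; the bits/mm2 figures in the paper come from counting the bits needed for the quantized coefficients at visually lossless step sizes.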
Humans can discriminate between certain elementary stimulus features in parallel, i.e.,
simultaneously over the visual field. I present evidence that, in man, vernier misalignments in
the hyperacuity-range, i.e., below the photoreceptor diameter, can also be detected in
parallel. This indicates that the visual system performs some form of spatial interpolation
beyond the photoreceptor spacing simultaneously over the visual field. Vernier offsets are
detected in parallel even when orientation cues are masked: deviation from straightness is
an elementary feature of visual perception. However, the identification process that
classifies each vernier in a stimulus as being offset to the right (versus to the left) is serial
and has to scan the visual field sequentially if orientation cues are masked. Therefore,
reaction times and thresholds in vernier acuity tasks increase with the number of verniers
presented simultaneously if classification of different features is required. Furthermore, when
approaching vernier threshold, simple vernier detection is no longer parallel but becomes
partially serial, or semi-parallel.
An approach to generate a pulse density modulation (PDM) on rastered media is described.
The procedure combines techniques for generating a PDM on spatially nonquantized
media with the standard error diffusion algorithm.
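The abstract does not spell out which error diffusion variant is meant; the most common "standard" form is Floyd-Steinberg, sketched here for the bi-level case.

```python
import numpy as np

def error_diffuse(img):
    """Floyd-Steinberg binarization of a grayscale image in [0, 1]."""
    out = img.astype(float).copy()
    h, w = out.shape
    for y in range(h):
        for x in range(w):
            old = out[y, x]
            new = 1.0 if old >= 0.5 else 0.0
            out[y, x] = new
            err = old - new
            # Push the residual onto unprocessed neighbours (7,3,5,1)/16.
            if x + 1 < w:               out[y, x + 1]     += err * 7 / 16
            if y + 1 < h and x > 0:     out[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:               out[y + 1, x]     += err * 5 / 16
            if y + 1 < h and x + 1 < w: out[y + 1, x + 1] += err * 1 / 16
    return out

gray = np.full((16, 16), 0.25)                 # flat 25% gray patch
halftone = error_diffuse(gray)
print(abs(float(halftone.mean()) - 0.25) < 0.05)   # local mean tracks input
```

Because the quantization error is diffused rather than discarded, the average pulse density of the output matches the input gray level, which is the property the PDM construction builds on.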
This paper presents a hybrid color dithering scheme suitable for rendering continuous
tone color images on a CRT display with a small number (on the order of 16-256) of distinct
colors. Monochrome (especially bi-level) dithering techniques are well studied. Which
of these techniques extend naturally to color? We look at four classes of monochrome
dithering techniques and attempt to generalize each one, first to multiple gray-level (but
still monochrome) inks and then to a multiple color palette. In the monochrome case,
we discover that texture introduced by the dithering process can significantly affect the
appearance of the image. We develop a scheme by which the user can control these texture
effects. The primary tradeoff is between very fine grained textures which depend critically
on the local gray level and relatively coarser, more obvious, textures which appear uniform
across the entire image. In the color case, we have the further complication of choosing a
color palette. We deal primarily with the case where there are a small number of available
colors, and where the color palette is not optimized separately for each image.
Algorithms are investigated for the printing or display of color images at near original image quality with a
minimum number of output colors. Each algorithm consists of a quantizer possibly used in conjunction with halftoning.
We consider both image independent and image dependent quantizers implemented in RGB or in the uniform
color space L*u*v*. The halftoning techniques that we use are multilevel extensions of error diffusion and ordered
dither. Image quality resulting from use of these algorithms is measured by subjective evaluation.
A method is described for reducing the visibility of artifacts arising in the display of
quantized color images on CRT displays. The method is based on the differential spatial sensitivity
of the human visual system to chromatic and achromatic modulations. Because the
visual system has the highest spatial and temporal acuity for the luminance component of an
image, we seek a technique which will reduce luminance artifacts at the expense of introducing
high-frequency chromatic errors. In this paper we explore a method based on controlling
the correlations between the quantization errors in the individual phosphor images. The luminance
component of the error is greatest when the phosphor errors are positively correlated, and is minimized
when the phosphor errors are negatively correlated. The greatest effect of the correlation
is obtained when the intensity quantization step sizes of the individual phosphors have equal
luminances. For the ordered dither algorithm, a version of the method can be implemented by
simply inverting the matrix of thresholds for one of the color components.
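The threshold-inversion trick can be demonstrated directly. The 4x4 Bayer matrix and the choice of which component receives the inverted matrix are illustrative; on a flat mid-gray patch the error correlations come out with the predicted signs.

```python
import numpy as np

# 4x4 Bayer matrix, normalized to thresholds in (0, 1).
BAYER4 = (np.array([[ 0,  8,  2, 10],
                    [12,  4, 14,  6],
                    [ 3, 11,  1,  9],
                    [15,  7, 13,  5]]) + 0.5) / 16.0

def ordered_dither(plane, thresholds):
    """Bi-level ordered dither of one color plane against a tiled matrix."""
    h, w = plane.shape
    tiled = np.tile(thresholds, (h // 4 + 1, w // 4 + 1))[:h, :w]
    return (plane >= tiled).astype(float)

rgb = np.full((16, 16, 3), 0.5)                # flat mid-gray test patch
r = ordered_dither(rgb[..., 0], BAYER4)
g = ordered_dither(rgb[..., 1], BAYER4)        # same thresholds as R
b = ordered_dither(rgb[..., 2], 1.0 - BAYER4)  # inverted matrix for one plane

err_r = r - rgb[..., 0]
err_g = g - rgb[..., 1]
err_b = b - rgb[..., 2]
# Same matrix -> positively correlated errors; inverted -> negative.
print(np.mean(err_r * err_g), np.mean(err_r * err_b))   # 0.25 -0.25
```

Pushing the error correlation negative for one component shifts quantization noise out of the luminance channel and into chromatic channels, where spatial acuity is lower.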
The utilization of digital image sequences is becoming increasingly important in modern imaging applications, including
HDTV, interactive video, teleconferencing, telerobotics, and medical imaging. Due to the immense amount
of data in image sequences, high compression coding methods are critical for efficient transmission and storage.
Often, compression ratios exceeding 200:1 are required. Because these ratios are near the limits of conventional
coding methods, we are investigating alternative methods which take human visual perception into account. In this
way, sequences can be coded such that only the most perceptually important information is retained. Techniques of
this type, known as "second generation" coding methods, have proven very successful for the compression of single
images. In this paper, we show that these methods are also effective for image sequence coding, and that they are
capable of delivering the high compression ratios required for present and future applications. Five different sequence
coding methods, following this basic philosophy, are discussed: coding via 3-D split-and-merge, edge-based coding,
segmentation-based coding using Gibbs-Markov random fields, the application of the Gabor decomposition to coding,
and the use of polar separable quadrature mirror filters.
In this paper we present a sub-band coder for true color images that uses an
empirically derived perceptual masking model to set the allowable quantization noise level
not only for each sub-band but also for each pixel in a given sub-band. The input
image is converted into YIQ space and each channel is passed through a separable
Generalized Quadrature Mirror Filterbank (GQMF). This separates the image's
frequency content into 4 equal-width bands in both the horizontal and vertical
dimensions, resulting in a representation consisting of 16 sub-bands for each channel.
Using this representation, a perceptual masking model is derived for each channel.
The model incorporates spatial-frequency sensitivity, contrast sensitivity, and texture
masking. Based on the image dependent information in each sub-band and the
perceptual masking model, noise-level targets are computed for each point in a subband.
These noise-level targets are used to set the quantization levels in a DPCM
quantizer. The output from the DPCM quantizer is then encoded, using an entropybased
coding scheme, in either lxi , 1x2, or 2x2 pixel parts, based on the the statistics
in each 4x4 sub-block of a particular sub-band. One set of codebooks, consisting of
100,000 entries, is used for all images. A block elimination algorithm takes
advantage of the peaky spatial energy distribution of sub-bands to avoid using bits for
quiescent parts of a given sub-band. The resultant bitrate depends on the complexity
of the input image. For the images we use, high quality output requires bitrates from
0.25 to 1 .25 bits/pixel, while nearly transparent quality requires 0.5 to 2.5 bits/pixel.
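The coupling between the masking model and the quantizer can be sketched as follows: treating the per-pixel noise-level target as a bound on the allowed quantization error, a DPCM quantizer can choose its step size at every pixel from that target. This is a minimal illustration in the spirit of the abstract, not the authors' implementation; the previous-pixel predictor and the step rule (step = twice the target) are our simplifying assumptions.

```python
import numpy as np

def dpcm_quantize_with_targets(subband, noise_target):
    """DPCM-quantize a sub-band, choosing the quantizer step at each
    pixel from the perceptual noise-level target for that pixel.
    A mid-tread quantizer with step = 2 * target keeps the
    reconstruction error within the target everywhere."""
    residuals = np.zeros(subband.shape)
    recon = np.zeros(subband.shape)
    for r in range(subband.shape[0]):
        pred = 0.0                         # simple previous-pixel predictor
        for c in range(subband.shape[1]):
            step = 2.0 * noise_target[r, c]
            resid = subband[r, c] - pred   # prediction residual
            q = np.round(resid / step) * step
            recon[r, c] = pred + q         # decoder-side reconstruction
            residuals[r, c] = q
            pred = recon[r, c]             # predict from the reconstruction
    return residuals, recon
```

Because the predictor uses the reconstructed (not the original) previous pixel, quantization errors do not accumulate along a row, and the per-pixel error stays within the noise-level target.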
A common property of many contemporary image coding algorithms is that they code an image at a
number of resolutions. The different algorithms provide alternative solutions to how a low-resolution
image should be updated to a high-resolution image with a minimum of additional information. In most
available image coding techniques such as subband and transformation coding, both the low-resolution
image and the high-resolution information are derived by filtering and subsampling the original image.
The available coding algorithms differ mostly in how they accomplish this splitting of the original image
into different components, since similar quantization techniques are used in all cases to reduce the data
rate of these components. In this paper we present an alternative technique for coding the high-resolution
components. We argue that a low-resolution image deviates from the original image because it has to
satisfy additional local (symmetry) constraints. In general, all high-resolution components are required to
restore the local asymmetry of the original image. However, if the image is neither completely symmetrical
nor asymmetrical, as is often the case, then fewer components may be sufficient to restore the original
image. We find that the performance of a coding algorithm is mainly determined by how often the local
symmetry constraints fail and high-resolution information must be added. In the majority of the cases,
one high-resolution coefficient is sufficient to restore the original image.
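The argument can be made concrete with a one-dimensional toy version: if the low-resolution sample of a pair of pixels is their mean, the high-resolution coefficient is exactly the local asymmetry of the pair, and symmetric pairs need no coefficient at all. This Haar-style sketch uses our own (much simpler) filters, not those discussed in the paper.

```python
def split_by_symmetry(signal):
    """Split a 1-D signal into low-resolution samples plus only those
    detail coefficients where local symmetry fails."""
    lows, details = [], []
    for i in range(0, len(signal) - 1, 2):
        a, b = signal[i], signal[i + 1]
        lows.append((a + b) / 2.0)   # low-resolution sample (pair mean)
        d = (a - b) / 2.0            # local asymmetry of the pair
        if d != 0.0:                 # symmetric pairs need no detail
            details.append((i // 2, d))
    return lows, details

def merge(lows, details):
    """Restore the original signal from the low-resolution samples and
    the sparse detail coefficients."""
    dmap = dict(details)
    out = []
    for k, m in enumerate(lows):
        d = dmap.get(k, 0.0)
        out.extend([m + d, m - d])
    return out
```

The more symmetric the signal is locally, the fewer detail coefficients have to be stored, which mirrors the claim that coding performance is governed by how often the local symmetry constraints fail.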
Image decomposition via even- and odd-symmetric, size- and orientation-selective band-pass filters, as
suggested by the receptive field properties of visual cortical neurons, is well suited to image coding purposes.
Interpretation of the even/odd filter outputs as a complex ("analytic") signal offers the alternative of a polar
signal representation by a local amplitude and a local phase component. This is also indicated by our
measurement of a rotationally symmetric shape of the two-dimensional probability density function (pdf) of the
even/odd filter outputs.
Our investigations into the properties of such a representation show that it provides an interesting separation of
the "amount of signal variation" (local amplitude) vs. the "type of signal variation" (local phase). Furthermore,
an efficient vector quantization procedure can be applied to the two-dimensional amplitude/phase vector. This
procedure divides the 2D signal space of the analytic signal into polar separable patches. Since phase
quantization errors are more tolerable at small amplitude levels, local phase is quantized depending on the
amplitude level. While typical pdf-optimized quantizers produce increasingly higher amplitude resolution
towards very small amplitudes, human vision allows the application of an appropriate threshold, which leads to an
"irrelevance zone" in which no phase information has to be coded at all. Using this coding scheme, good
image quality can be obtained at about 0.8 bit/pixel.
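A minimal sketch of such amplitude-dependent polar quantization follows. The amplitude levels, phase bit allocations, and threshold are illustrative placeholders, not the pdf-optimized values of the paper; only the structure (coarser phase at low amplitude, no phase below the visibility threshold) is taken from the text.

```python
import numpy as np

def polar_quantize(analytic, amp_levels, phase_bits, threshold):
    """Quantize an analytic (complex) band-pass signal in polar form.
    Amplitude is mapped to the nearest of a few levels; the number of
    phase bins grows with the amplitude level; samples below the
    visibility threshold fall into the "irrelevance zone" and carry no
    phase information at all."""
    out = np.zeros(len(analytic), dtype=complex)
    for i, z in enumerate(analytic):
        a = abs(z)
        if a < threshold:
            continue                               # irrelevance zone
        level = int(np.argmin(np.abs(np.asarray(amp_levels) - a)))
        nbins = 2 ** phase_bits[level]             # coarser phase at low amplitude
        step = 2.0 * np.pi / nbins
        qphase = np.round(np.angle(z) / step) * step
        out[i] = amp_levels[level] * np.exp(1j * qphase)
    return out
```

This divides the 2D space of the analytic signal into polar separable patches: rings of quantized amplitude, each sliced into its own number of phase sectors.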
Through a series of experiments with one-dimensional signals, this paper explores the effects of
various degrees of phase degradation, for different lengths of sequences and for different choices of the set of
sampling frequencies, on the quality (as measured by normalised mean-square error, NMSE) of
reconstructed signals. Details of these experiments, the results from them, and some theoretical
understanding are presented. The quality (NMSE) of a signal reconstructed from noisy Fourier transform
(FT) phases is found to depend on, amongst other factors, the length of the sequence and on the choice of the set
of frequencies used for sampling the FT phase.
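A minimal version of one such experiment can be run numerically: keep the true FT magnitude, perturb the FT phase with noise, invert, and measure the NMSE. The Gaussian noise model and the parameter names here are our own; the paper additionally varies the set of frequencies at which the phase is sampled.

```python
import numpy as np

def nmse(x, y):
    """Normalised mean-square error between two real sequences."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    return float(np.sum((x - y) ** 2) / np.sum(x ** 2))

def reconstruct_from_noisy_phase(signal, phase_noise_std, rng):
    """Combine the true FT magnitude with a noise-perturbed FT phase
    and invert the transform (real part taken, since independent phase
    noise breaks Hermitian symmetry)."""
    spec = np.fft.fft(signal)
    noisy_phase = np.angle(spec) + rng.normal(0.0, phase_noise_std, len(spec))
    return np.fft.ifft(np.abs(spec) * np.exp(1j * noisy_phase)).real
```

Sweeping `phase_noise_std` and the sequence length then reproduces the kind of NMSE-versus-degradation curves the experiments describe.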
Exploiting human visual limitations in image reconstruction significantly reduces computational complexity. Based
on a multiresolution pyramid image representation, direct and indirect exploitation of these limitations are attainable. In
this study, direct exploitation of the variable acuity feature of the human visual system is achieved through tracking the
viewer's fovea. Multiresolution images are reconstructed such that high resolution is assigned to a rectangular region,
centered at the fovea, with spatial resolution dropping gradually with eccentricity. Indirect exploitation makes use of the
human visual sensitivity to abrupt intensity changes (edges) in the image. Accordingly, high resolution need only be
preserved within a 2x2 pixel neighborhood around the detected edges, while low resolution is assigned elsewhere. The
amount of savings in the number of pixels rendered can be as high as 98% for the direct exploitation and may exceed
50% (depending on image edge density) for the indirect application.
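The resolution assignment for the direct (gaze-contingent) case can be sketched as a per-pixel pyramid-level map: level 0 inside a rectangle centred on the tracked fovea, and coarser levels as eccentricity grows. The window size and fall-off rate below are illustrative values, not those of the study.

```python
import numpy as np

def acuity_level_map(shape, fovea, halfwidth, falloff):
    """Pyramid level per pixel: 0 (full resolution) within `halfwidth`
    of the fovea, increasing by one for every `falloff` pixels of
    additional eccentricity. Chebyshev distance yields the rectangular
    high-resolution region described in the text."""
    rows, cols = np.indices(shape)
    ecc = np.maximum(np.abs(rows - fovea[0]), np.abs(cols - fovea[1]))
    return np.maximum((ecc - halfwidth) // falloff + 1, 0).astype(int)

def pixel_savings(levels):
    """Fraction of pixel renderings saved: a pixel at pyramid level L
    is drawn at 4**-L of full-resolution density."""
    return 1.0 - float(np.mean(4.0 ** (-levels.astype(float))))
```

With a small foveal window in a large image, nearly all renderings are avoided, which is the source of the savings figures quoted above.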
If we try to predict flicker on visual displays on the basis of
classical data using classical linear models of flicker, we find
several inconsistencies or even contradictions: 1. A considerable
fraction of the population of our observers has critical flicker-
fusion frequencies (CFF) well above the equivalent classical
data. 2. The slope of the high-frequency roll-off of the temporal
modulation function (derived from the CFF distributions for two
different display modulations) is clearly less steep than the
slope of classical data. 3. The results of measurements of CFF
as a function of field size, which were done both for displays and
light-boxes (with uniform field modulation), differ considerably.
4. CFF may increase even if the temporal modulation is decreased.
This experiment may thus be considered as the "experimentum
crucis" against the classical linear models of flicker of Kelly
or De Lange. 5. For decreasing deflection speeds of the electron
beam (with field size and luminance held constant), increasing
CFFs are found. All of the above findings may be reconciled
if correlation detector cells, which receive signal input from
spatially separated receptive fields, are added to the neural
pathway.
The interpretation of line drawings is known to be very difficult, and has a long history in vision research.
However, for certain restricted but important types of drawings we have been able to produce good 3-D
interpretations quite efficiently using only local image-plane computations. The types of drawings we can
handle are line drawings of 3-D space curves, for instance, a drawing of the 3-D path followed by a butterfly
or a line drawing of a potato chip.
Such line drawings are, of course, intrinsically ambiguous: there is simply not enough information in
the 2-D image to arrive at a unique 3-D interpretation. Despite this difficulty, the fact remains that
for any given image all people see essentially the same 3-D interpretation (or sometimes a small
number of interpretations). People, therefore, must be bringing additional knowledge or assumptions to
the problem.
In this paper we show that by picking the smoothest 3-D space curve that is consistent with the image
data we can obtain a 3-D interpretation which is very similar to people's interpretation. The teleological
motivation for selecting the smoothest 3-D space curve is that it is the most stable 3-D interpretation, and
thus in one sense the most likely 3-D interpretation. The process of computing the smoothest 3-D space
curve is carried out by simple, local processing that can be implemented by a neural network.
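A toy version of the local computation: fix the endpoints of the recovered depth profile and repeatedly replace every interior depth by the mean of its neighbours. This relaxation drives the curve toward the smoothest (here, straight-line) interpolation consistent with the constrained values; it stands in for the paper's network, whose exact update rule and smoothness functional we do not reproduce.

```python
def relax_depth(z, iterations=2000):
    """Jacobi-style relaxation of the interior depth values toward the
    smoothest curve, with the first and last values held fixed."""
    z = list(z)
    for _ in range(iterations):
        nxt = z[:]
        for i in range(1, len(z) - 1):
            nxt[i] = 0.5 * (z[i - 1] + z[i + 1])  # local averaging update
        z = nxt
    return z
```

Each update uses only a point's immediate neighbours, which is what makes this kind of smoothing attractive for a simple, local, neural-network-style implementation.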
The present paper describes a color classification method that partitions color image data into
a set of uniform color regions. The ability to classify spatial regions of the measured image into a
small number of uniform regions can be useful for several problems including image segmentation
and image representation. First, the input image data are mapped from device coordinates into an
approximately uniform perceptual color space. Colors are classified by means of cluster detection
in the uniform color space. The process is composed of two stages of basic classification and
reclassification. The basic classification is based on histogram analysis to detect color clusters
sequentially. The principal components of the color data are extracted for effective discrimination of
clusters. At the reclassification stage, the representative colors extracted by the basic classification
are reclassified based on color distance. The performance of the method is demonstrated in an experiment
using a picture of paper objects.
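The two stages can be sketched with a principal-component histogram for the basic classification and a nearest-representative rule for the reclassification. We histogram in the input space directly; the paper first maps the data into an approximately uniform perceptual color space, and its sequential cluster-detection procedure is more elaborate than this sketch.

```python
import numpy as np

def classify_colors(pixels, n_bins=32, min_count=5):
    """Basic classification: histogram the colors along their first
    principal component and take each well-populated bin's mean as a
    representative color. Reclassification: assign every pixel to its
    nearest representative by color distance."""
    X = np.asarray(pixels, float)
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    proj = (X - mean) @ vt[0]                  # first principal component
    hist, edges = np.histogram(proj, bins=n_bins)
    reps = []
    for b in np.argsort(hist)[::-1]:           # most populated bins first
        if hist[b] < min_count:
            break
        members = X[(proj >= edges[b]) & (proj <= edges[b + 1])]
        reps.append(members.mean(axis=0))
    reps = np.array(reps)
    dist = np.linalg.norm(X[:, None, :] - reps[None, :, :], axis=2)
    return reps, dist.argmin(axis=1)           # reclassified labels
```

Projecting onto the principal component before histogramming is what gives the "effective discrimination of clusters" the abstract mentions: well-separated colors land in distant bins.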
We present a new iconographic technique that can be used to create integrated displays of multiparameter
spatial distributions. One specific approach uses color coding to integrate up to three parameters.
Geometric codes provide for the display of more parameters than three, with potentially many different
forms of coding. Development of these displays requires an appropriately comprehensive and flexible
environment, which is provided by the Exvis system we have been developing. Three examples of
multiparameter distributions integrated with geometric coding are presented. The complexity and
challenge of conducting research to find effective iconographic codings are discussed, and a strategy
is suggested.
A number of observer-based objective measures of image sharpness have been proposed: Higgins, Granger, Cohen &
Carlson, and Barten. Many of these measures, seemingly different in concept, are highly correlated with the
psychovisual perception of sharpness. We re-examine the basis for these sharpness measures, and will show that:
The signal power spectra of a number of real scenes exhibit a 1/u² dependence. The corresponding amplitude spectra can
be interpreted as the frequency distribution of spatial-frequency content in average scenes. The relationship of these
results to simple models of image structure will be discussed.
The log spatial integration used in all the measures is equivalent to performing a linear spatial-frequency integration
with a weighting function (1/u) given by the frequency of occurrence of information in real scenes.
While the forms of the visual MTF are substantially different, the measures belong to a class of
generalized log-frequency sharpness measures that, for equivalent viewing conditions, provide equivalent objective
measures of sharpness.
The linear dependence of subjective sharpness on the generalized SQF makes this class of sharpness measures a useful
tool for product development in both the photographic and display industries.
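The claimed equivalence has a simple computational form: integrating the system MTF against d(log u) is the same as a linear-frequency integral weighted by 1/u. A sketch of such a generalized measure follows; the band limits and units are illustrative, not those of any particular published SQF.

```python
import numpy as np

def log_frequency_sharpness(mtf, freqs, band=(0.5, 20.0)):
    """Generalized log-frequency sharpness: integrate the cascaded
    system MTF over log spatial frequency within a visually relevant
    band, i.e. a linear-frequency integral with 1/u weighting."""
    u = np.asarray(freqs, float)
    m = np.asarray(mtf, float)
    keep = (u >= band[0]) & (u <= band[1])
    x = np.log(u[keep])
    y = m[keep]
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))  # trapezoid rule
```

An ideal system (MTF identically 1) scores log(band_max/band_min), and any loss of modulation transfer lowers the score, which is the monotone relationship to subjective sharpness exploited in product development.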
We report on several experiments that we designed to study the relative strength of visual attributes in the perception of
texture. Our stimuli are composed of microelements that are arranged regularly in two-dimensional space. Each
microelement can be characterized by the conjunction of several attributes (shape, size, color, etc.). Although the spatial
regularity results in a "homogeneous" texture, we can manipulate the arrangement of selected attributes so as to break this
homogeneity by forming textural patterns, whose discriminability can be tested experimentally.
The stimuli allow flexibility in choosing the ways in which different attributes can be matched spatially to generate
patterns. We have used these stimuli to study the roles of color, orientation, luminance and polarity in forming textures.
Preliminary results indicate an inherent similarity between the mechanisms subserving texture perception and those
mediating the perception of motion; the latter were studied with a similar type of stimulus. Finally, we report on a method
for isolating the role of specific texture mechanisms by comparing the results of carefully selected experiments.
This paper studies the computation of surface orientation by analyzing the responses of multiple spatio-spectrally localized
channel filters. Images containing textures that encode information about local surface orientation are decomposed into
narrowband sub-images possessing characteristic radial frequency and orientation properties. By analyzing the spatial variation in
the filter responses, information about the spatial variation in the pattern/texture can be elucidated and subsequently used to
estimate surface orientation. The channel filters used are Gabor functions, which have previously been applied successfully to
related problems in texture analysis, segmentation, and characterization. The Gabor functions are plausible approximations to
the responses of the highly oriented simple cells in mammalian striate cortex. They also possess important properties for the
local isolation and characterization of textures. In our approach, texture gradients are modeled as giving rise to pattern frequency
gradients that can be extracted on a highly localized basis. A variational optimization procedure for estimating the pattern
frequency variation is implemented via a discrete relaxation procedure that is suitable for a massively parallel computation. The
result of the optimization procedure is a stable dense map describing the localized image frequency content. The computed image
frequency characteristics are then used to define a texture density measure used in a planar-surface approximation procedure,
yielding slant/tilt estimates describing the surface orientation. Experimental results support the theoretical derivations.
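The core step, recovering a dense local-frequency estimate from channel responses, can be illustrated in one dimension: convolve with a complex Gabor filter and read the local frequency from the spatial derivative of the response phase. The paper's variational/relaxation machinery and the 2-D slant/tilt estimation are not reproduced here; this is only the measurement that feeds them.

```python
import numpy as np

def gabor_response(row, center_freq, sigma):
    """Response of a 1-D complex Gabor channel filter (Gaussian window
    times a complex carrier at `center_freq` cycles/sample)."""
    t = np.arange(-3 * int(sigma), 3 * int(sigma) + 1)
    g = np.exp(-t**2 / (2.0 * sigma**2)) * np.exp(1j * 2 * np.pi * center_freq * t)
    return np.convolve(row, g, mode='same')

def local_frequency(row, center_freq, sigma):
    """Dense local-frequency estimate: spatial gradient of the unwrapped
    phase of the channel response, in cycles/sample."""
    resp = gabor_response(row, center_freq, sigma)
    phase = np.unwrap(np.angle(resp))
    return np.gradient(phase) / (2.0 * np.pi)
```

On a textured surface seen in perspective, this estimate varies across the image; that pattern-frequency gradient is exactly the quantity the surface-orientation procedure fits.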
We have identified a class of stimuli which seem to tap into the basic human ability to identify and
name shapes. Using computer-generated stimuli, we found that patterns with low-fractal-dimension
contours evoked the perception of nameable objects, and that this proportion increased when a preattentive
criterion for examining the patterns is used. Furthermore, this result holds whether the patterns
are filled-in shapes (e.g., cloud patterns) or simple edge contours.
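The fractal dimension of a contour, the quantity the stimuli vary, can be estimated by box counting: cover the point set with boxes of decreasing size s and fit the slope of log N(s) against log(1/s). This is a standard estimator, not the specific generator used to produce the stimuli.

```python
import numpy as np

def box_counting_dimension(points, box_sizes):
    """Estimate the fractal dimension of a planar point set as the
    slope of log N(s) versus log(1/s), where N(s) is the number of
    s-sized boxes the set touches."""
    pts = np.asarray(points, float)
    counts = [len(set(map(tuple, np.floor(pts / s).astype(int).tolist())))
              for s in box_sizes]
    inv = np.log(1.0 / np.asarray(box_sizes))
    slope, _ = np.polyfit(inv, np.log(counts), 1)
    return float(slope)
```

A smooth curve scores near 1, while increasingly wiggly contours push the estimate toward 2; "low fractal dimension" in the text refers to the lower end of that range.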
In digital halftone printers, image artifacts (unwanted texture)
are produced by laser writer position errors. Examples of this type of error
are errors induced by imperfections in M-sided rotating polygon mirrors and
errors induced by pitch errors in lead screw positioners. The determination of
the tolerable position error requires a calculational method for estimating the
visibility of the position-error-induced texture. The method for estimating the
visibility of halftone dot texture proposed by Nasanen [1] is applied in this
paper to position-error-induced artifacts. In order to simplify the analysis, the
response characteristics of the material are considered to be binary. This
assumption greatly simplifies the mathematical description of the halftone dot
(or "pel" as it is sometimes called), since the transmissivity is now a constant
over the written region and writing only occurs when a threshold is exceeded.
(This is the model used by Melnychuck and Shaw [2].)
In Section II the contrast detection model proposed by Quick [3] and applied
to digital halftones by Nasanen [1] is discussed. In order to compare
different halftone patterns, the dot visibility calculation is used to select a
repeat size ("spatial period") so that the halftone dots are not detectable
when there is no positioning error. In Section III, the same model is then
applied to calculate the visibility of sinusoidal position errors. For raster-scanned,
continuous-tone images, Bestenreiner et al. [4] and Schubert [5] have
shown that the visibility of the error depends on the frequency of the
position error and on the ratio of the magnitude of the error to the repeat size.
The results presented below indicate that, for digital halftone printing, the
visibility of the error depends also on the pattern used if the dot size exceeds
the step size (or the distance between addressable dots). Thus, a maximum
acceptable position error can be calculated for given printing conditions and
in some cases shown to be less than that calculated for raster-scanned
(contone) images [4],[5].
For position errors that are described by a spectrum rather than a single
frequency, some of the simplifying assumptions used by Nasanen [1] have to be
reexamined. An extension of the contrast response model to the case in which
the position error is described by a frequency spectrum or a power spectral
density is proposed in Section IV.
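The Section IV extension amounts to weighting the position-error spectrum by a contrast sensitivity function and pooling. As a stand-in for Nasanen's model we use the well-known Mannos-Sakrison CSF shape and simple root-sum-of-squares pooling; both choices are ours, for illustration only.

```python
import numpy as np

def csf(u):
    """Mannos-Sakrison contrast sensitivity function (u in cycles/degree),
    used here as an illustrative stand-in for Nasanen's model."""
    return 2.6 * (0.0192 + 0.114 * u) * np.exp(-(0.114 * u) ** 1.1)

def error_visibility(power, freqs):
    """Visibility of a position-error disturbance described by a power
    spectrum: CSF-squared weighting of the power at each frequency,
    pooled by root-sum-of-squares."""
    w = csf(np.asarray(freqs, float)) ** 2
    return float(np.sqrt(np.sum(w * np.asarray(power, float))))
```

The same amount of position-error energy is thus far more objectionable near the CSF peak (a few cycles/degree at typical viewing distances) than at high spatial frequencies, which is why the tolerable error depends on where the error spectrum sits.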