This PDF file contains the front matter associated with SPIE
Proceedings Volume 6777, including the Title Page, Copyright
information, Table of Contents, and the Conference Committee listing.
In this paper, a wireless channel is viewed as a heterogeneous network in the time domain, and an adaptive video transmission scheme for H.264 scalable video over wireless channels modeled as finite-state Markov chain processes is presented. In order to investigate the robustness of adaptive video transmission for H.264 scalable video over wireless channels, statistical channel models can be employed to characterize the error and loss behavior of the video transmission. Among various statistical channel models, a finite-state Markov model has been considered suitable both for wireless links modeled as Rayleigh fading channels and for wireless local area networks exhibiting a combination of bit errors and packet losses. H.264 scalable video coding enables rate-adaptive source coding, and the feedback of channel parameters facilitates adaptive channel coding based on the dynamics of the channel behavior. As a result, we are able to develop a truly adaptive joint source and channel coding scheme based on instantaneous channel estimation feedback. Preliminary experimental results demonstrate that the estimation of the finite-state Markov channel can be quite accurate and that adaptive video transmission based on channel estimation performs significantly better than a scheme using a simple channel model in which only the average bit error rate is used for joint source and channel coding design.
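As a concrete illustration of the kind of channel model the abstract builds on, the sketch below simulates a two-state (Gilbert-Elliott style) finite-state Markov channel and re-estimates its transition probabilities from the observed state sequence. The state labels, error rates, and transition probabilities are illustrative assumptions, not parameters from the paper.

```python
import random

# Illustrative two-state Markov channel:
# state 0 = "good" (low bit error rate), state 1 = "bad" (high BER).
P = [[0.95, 0.05],    # transition probabilities out of state 0
     [0.20, 0.80]]    # transition probabilities out of state 1
BER = [1e-4, 5e-2]    # assumed per-state bit error rates

def simulate(n, p=P, ber=BER, seed=1):
    """Simulate n channel uses; return the state sequence and bit error count."""
    rng = random.Random(seed)
    s, states, errors = 0, [], 0
    for _ in range(n):
        states.append(s)
        if rng.random() < ber[s]:
            errors += 1
        s = 0 if rng.random() < p[s][0] else 1
    return states, errors

def estimate_transitions(states):
    """Maximum-likelihood estimate of the transition matrix from observed states."""
    counts = [[0, 0], [0, 0]]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    return [[c / max(sum(row), 1) for c in row] for row in counts]

states, errors = simulate(100_000)
print(estimate_transitions(states))   # should be close to P
print(errors / 100_000)               # overall error rate depends on state occupancy
```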
This paper presents an efficient approach for supporting fully interactive wireless video services. One of the main goals of wireless video multicast services is to provide prioritized delivery, including dedicated bandwidth, controlled jitter (required by some real-time and interactive traffic), and improved loss characteristics. The proposed method is based on storing multiple differently encoded versions of the normal/interactive video streams at the server. The corresponding video streams are obtained by encoding the original uncompressed video file as a sequence of I-P(I) frames and I-P(M) frames using different GOP (Group Of Pictures) patterns. Mechanisms for controlling the normal/interactive requests are also presented, and their effectiveness is assessed through extensive simulations. Wireless normal/interactive video services are supported with considerably reduced additional delay and acceptable visual quality at the wireless client end.
In this paper, we propose an optimal routing discovery algorithm for ad hoc multimedia networks whose resources keep changing. First, we use stochastic models to measure network resource availability, based on information about the location and moving pattern of the nodes, as well as the link conditions between neighboring nodes. Then, for a certain multimedia packet flow to be transmitted from a source to a destination, we formulate the optimal soft-QoS provisioning problem as finding the route that maximizes the probability of satisfying the desired QoS requirements in terms of maximum delay constraints. Based on the stochastic network resource model, we develop three approaches to solve the formulated problem: a centralized approach serving as the theoretical reference, a distributed approach that is more suitable for practical real-time deployment, and a distributed dynamic approach that utilizes updated timing information to optimize the routing for each individual packet. Numerical examples demonstrate that, by using the routes discovered by our distributed algorithm in a changing network environment, multimedia applications can achieve statistically better QoS.
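To make the formulation concrete, the following sketch (a centralized reference, in the spirit of the first approach) assumes each link independently satisfies the delay constraint with a known probability and finds the source-destination route maximizing the product of those probabilities by running Dijkstra's algorithm on -log(p) link weights. The graph topology and probabilities are made-up examples, not data from the paper.

```python
import heapq
import math

def best_route(links, src, dst):
    """links: {(u, v): p} where p is the assumed probability that the link
    meets the delay constraint; returns (max probability, route)."""
    graph = {}
    for (u, v), p in links.items():
        graph.setdefault(u, []).append((v, -math.log(p)))
        graph.setdefault(v, []).append((u, -math.log(p)))
    # Dijkstra on additive -log(p) weights == maximizing the product of p.
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    route, node = [dst], dst
    while node != src:
        node = prev[node]
        route.append(node)
    return math.exp(-dist[dst]), route[::-1]

links = {("A", "B"): 0.9, ("B", "D"): 0.8, ("A", "C"): 0.99, ("C", "D"): 0.7}
print(best_route(links, "A", "D"))   # A-B-D wins with probability 0.72
```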
This paper presents an effective IPTV channel management method using SVC (scalable video coding) that concurrently considers both channel zapping time and network utilization. A broadcasting channel is encoded into a two-layered bitstream (a base-layer channel and an enhancement-layer channel) to serve heterogeneous environments. The proposed algorithm locates only a limited number of base-layer channels close to users to reduce the network-delay component of channel zapping time, and adjusts the GOP (group of pictures) length of each base-layer channel to decrease the video-decoding-delay component of channel zapping time; both adjustments are performed based on users' channel preference information. Finally, experimental results are provided to show the performance of the proposed schemes.
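A minimal sketch of the placement idea, under the assumption that channel popularity is known: only the k most-preferred channels keep their base layer at the edge node (with a short GOP for fast zapping), while the rest stay at the core (with a longer GOP for better utilization). The channel names, preference counts, k, and GOP lengths are hypothetical.

```python
def place_base_layers(preferences, k, short_gop=8, long_gop=32):
    """preferences: {channel: viewing count}. Returns per-channel placement
    and GOP length: the k most popular base-layer channels go to the edge
    with a short GOP; the remainder stay in the core with a long GOP."""
    ranked = sorted(preferences, key=preferences.get, reverse=True)
    plan = {}
    for i, ch in enumerate(ranked):
        if i < k:
            plan[ch] = {"location": "edge", "gop": short_gop}
        else:
            plan[ch] = {"location": "core", "gop": long_gop}
    return plan

prefs = {"news": 120, "sports": 300, "movies": 210, "kids": 40}
print(place_base_layers(prefs, k=2))
```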
In this paper, we consider the delivery of layered video from parallel heterogeneous servers within a video-on-demand infrastructure. A parallel server architecture enables the service of requests by more than one server, thus reducing load at individual servers and dispersing network load. Serving requests for a single video through all or a subset of servers in the system reduces the probability of server overload brought about by a large number of requests for popular content; more clients may also be admitted for the retrieval of video data. Delivery through multiple servers requires that the video data be partitioned. Ideally, the data should be partitioned such that multiple-server retrieval provides the same download and access time performance possible when retrieving from a single server of the same total bandwidth. We design and analyse play-while-retrieve strategies that involve streaming layers from different servers and show how access time can be reduced through these strategies. While system-wide data striping can completely remove the problem of hotspotting, the method does not scale well, and problems may be encountered when the system grows in size or when heterogeneous disks have to be used. Since our proposed scheme takes into consideration heterogeneous upload bandwidth and layer bitrates, it may be suitable for a peer-to-peer network where peer upload bandwidth is limited and varied.
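As a rough sketch of how layered data might be spread over heterogeneous servers, the code below greedily assigns each layer (with its bitrate) to the server with the largest remaining upload bandwidth, so that no single server is overloaded. The server capacities and layer bitrates are illustrative assumptions, not values from the paper.

```python
def assign_layers(server_bw, layer_rates):
    """Greedy assignment of layer bitrates (kbps) to servers so that the
    remaining upload bandwidth stays as balanced as possible."""
    remaining = dict(server_bw)
    assignment = {}
    # Place the largest layers first.
    for layer, rate in sorted(layer_rates.items(), key=lambda x: -x[1]):
        server = max(remaining, key=remaining.get)
        if remaining[server] < rate:
            raise ValueError("not enough aggregate bandwidth")
        assignment[layer] = server
        remaining[server] -= rate
    return assignment

servers = {"s1": 1500, "s2": 800, "s3": 400}    # kbps upload capacity
layers = {"base": 400, "enh1": 600, "enh2": 900}
print(assign_layers(servers, layers))
```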
In advanced collaboration environments (ACEs), where remote participants cooperate through natural interaction, the display service plays the most significant role. With enhanced wide-screen display support, the quality of experience (QoE) improves, since participants can share visual content as if they were having a face-to-face meeting. It is also important to visualize diverse types of media over various display devices with different resolutions and sizes. Thus, there have been several existing efforts on interactive and networked display support for advanced collaboration. In this paper, extending these efforts, we propose a clustered networked display system for SMeet (Smart Meeting Space for ACE), dubbed SMeet One Display. The proposed SMeet One Display system has several key features, such as support for heterogeneous display devices as one virtual display and enhanced efficiency in terms of system resource management.
As the use of streaming media applications increased dramatically in recent years, streaming media security
becomes an important presumption, protecting the privacy. This paper proposes a new encryption scheme in view of
characteristics of streaming media and the disadvantage of the living method: encrypt the control message in the
streaming media with the high security lever and permute and confuse the data which is non control message according
to the corresponding control message. Here the so-called control message refers to the key data of the streaming media,
including the streaming media header and the header of the video frame, and the seed key. We encrypt the control
message using the public key encryption algorithm which can provide high security lever, such as RSA. At the same time
we make use of the seed key to generate key stream, from which the permutation list P responding to GOP (group of
picture) is derived. The plain text of the non-control message XORs the key stream and gets the middle cipher text. And
then obtained one is permutated according to P. In contrast the decryption process is the inverse process of the above.
We have set up a testbed for the above scheme and found our scheme is six to eight times faster than the conventional
method. It can be applied not only between PCs but also between handheld devices.
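A toy sketch of the non-control-message scrambling described above: a keystream derived from the seed key is XORed with the payload, and the result is permuted with a list P derived from the same keystream. The keystream generator (SHA-256 in counter mode) and the permutation derivation are assumptions chosen for illustration; the control message itself would be protected separately with a public-key algorithm such as RSA.

```python
import hashlib

def keystream(seed: bytes, n: int) -> bytes:
    """Assumed keystream generator: SHA-256 in counter mode over the seed key."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def permutation_from_key(seed: bytes, n: int):
    # Derive the permutation list P for one GOP from the keystream (sort-by-key shuffle).
    ks = keystream(seed + b"perm", n)
    return sorted(range(n), key=lambda i: ks[i])

def encrypt_gop(data: bytes, seed: bytes) -> bytes:
    ks = keystream(seed, len(data))
    middle = bytes(d ^ k for d, k in zip(data, ks))    # XOR stage
    perm = permutation_from_key(seed, len(data))
    return bytes(middle[i] for i in perm)              # permutation stage

def decrypt_gop(cipher: bytes, seed: bytes) -> bytes:
    perm = permutation_from_key(seed, len(cipher))
    middle = bytearray(len(cipher))
    for j, i in enumerate(perm):
        middle[i] = cipher[j]                          # undo permutation
    ks = keystream(seed, len(cipher))
    return bytes(m ^ k for m, k in zip(middle, ks))    # undo XOR

payload = b"non-control GOP payload bytes"
assert decrypt_gop(encrypt_gop(payload, b"seed-key"), b"seed-key") == payload
```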
In this paper, we design and implement a two-way real-time communication system for audio over cable television
(CATV) networks to provide an audio-based interaction between the CATV broadcasting station and CATV subscribers.
The two-way real-time communication system consists of a real-time audio encoding/decoding module, a payload
formatter based on a transmission control protocol/Internet protocol (TCP/IP), and a cable network. At the broadcasting
station, audio signals from a microphone are encoded by an audio codec that is implemented using a digital signal
processor (DSP), where MPEG-2 Layer II is used as the audio codec and a TMS320C6416 as the DSP. Next, a payload formatter constructs a TCP/IP packet from an audio bitstream for transmission to a cable modem. Another payload formatter at the subscriber unpacks the TCP/IP packet decoded by the cable modem into an audio bitstream. This bitstream is decoded by the MPEG-2 Layer II audio decoder. Finally, the decoded audio signals are played out through the speaker. We confirmed that the system worked in real-time, with a measured delay of around 150 ms
including the algorithmic and processing time delays.
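The payload formatter can be imagined as a thin framing layer on top of a TCP socket; the sketch below prepends a sequence number and length to each encoded audio frame so the receiver can reassemble the bitstream. The header layout is an assumption made for illustration, not the format used in the system above.

```python
import struct

HEADER = struct.Struct("!IH")   # assumed header: 32-bit sequence number, 16-bit length

def pack_frame(seq: int, audio_frame: bytes) -> bytes:
    """Build one payload for transmission over the TCP connection."""
    return HEADER.pack(seq, len(audio_frame)) + audio_frame

def unpack_stream(buffer: bytearray):
    """Yield (seq, frame) tuples from a receive buffer, keeping any partial tail."""
    while len(buffer) >= HEADER.size:
        seq, length = HEADER.unpack_from(buffer)
        if len(buffer) < HEADER.size + length:
            break
        frame = bytes(buffer[HEADER.size:HEADER.size + length])
        del buffer[:HEADER.size + length]
        yield seq, frame

buf = bytearray(pack_frame(0, b"\x01\x02") + pack_frame(1, b"\x03\x04\x05"))
print(list(unpack_stream(buf)))
```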
This paper assesses the media synchronization quality of preventive control schemes employed at media sources
and media destinations for voice and video over a network. Preventive control is required to try to avoid asynchrony (i.e., loss of synchronization). We here deal with two preventive control techniques employed at
sources: Advancement of transmission timing of media units (MUs), each of which is the information unit for
media synchronization (e.g., a video picture), with network delay estimation and temporal resolution control
of video. We also handle three preventive control techniques employed at destinations: Change of buffering
time with network delay estimation, preventive pausing, and preventive shortening of output duration. By
experiment, we make a performance comparison among preventive control schemes which employ the preventive
control techniques at sources and destinations. We also clarify the relations between subjective and objective assessment results.
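One of the destination-side techniques above, changing the buffering time from a network delay estimate, can be sketched roughly as follows: the destination keeps running estimates of the mean delay and its variation, and sets the buffering time to the mean plus a safety margin. The smoothing constants and margin are assumptions in the style of RTP jitter estimation, not the parameters used in the paper.

```python
class BufferingTimeEstimator:
    """Adjusts the destination buffering time from observed network delays
    (exponentially weighted mean plus a multiple of the mean deviation)."""

    def __init__(self, alpha=0.125, beta=0.25, margin=4.0):
        self.alpha, self.beta, self.margin = alpha, beta, margin
        self.mean = None
        self.dev = 0.0

    def update(self, delay_ms: float) -> float:
        if self.mean is None:
            self.mean = delay_ms
        else:
            self.dev = (1 - self.beta) * self.dev + self.beta * abs(delay_ms - self.mean)
            self.mean = (1 - self.alpha) * self.mean + self.alpha * delay_ms
        return self.mean + self.margin * self.dev   # target buffering time (ms)

est = BufferingTimeEstimator()
for d in (40, 42, 55, 38, 90, 41):
    print(round(est.update(d), 1))
```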
Precise inter-media synchronization of transported media among multiple sessions is an important issue in high-quality media-based collaboration environments. In particular, uncompressed HD media requires extremely tight inter-media synchronization. In this paper, by employing localized clock synchronization, we propose a precise inter-media synchronization scheme for multiple-session-based uncompressed HD media transport. To avoid delayed clock synchronization messages, minimum offset selection and filtering schemes are used in the proposed scheme. Moreover, to reduce the receiver's buffer burden, multiple senders share the receiver's buffer overhead by adjusting their frame-send intervals. Together, these two approaches enable network-adaptive, precise inter-media synchronization. From the simulation results, we verify that the proposed synchronization scheme satisfies the synchronization requirements with high accuracy.
This paper proposes an adaptive switching control scheme which dynamically switches between high-bit-rate and low-bit-rate video files according to the network load for the playback of MPEG video stored on iSCSI disks.
The scheme adaptively switches read requests between the two video files by estimating the throughput from the
size of transmitted video data and the transmission time. By subjective assessment and objective assessment, we demonstrate the effectiveness of the scheme.
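The switching rule can be pictured roughly as follows: the client estimates throughput as transmitted bytes divided by transmission time and compares it with the bit rate of the high-rate file, with a small hysteresis factor to avoid oscillating between the two files. The thresholds and rates are illustrative assumptions.

```python
def choose_file(bytes_rx: int, seconds: float, current: str,
                high_rate=6e6, hysteresis=1.1) -> str:
    """Return 'high' or 'low' depending on the estimated throughput (bit/s).
    The hysteresis factor prevents switching back and forth on small changes."""
    throughput = 8 * bytes_rx / seconds
    if current == "low" and throughput > high_rate * hysteresis:
        return "high"
    if current == "high" and throughput < high_rate:
        return "low"
    return current

print(choose_file(900_000, 1.0, "low"))    # ~7.2 Mbit/s -> switch to 'high'
print(choose_file(500_000, 1.0, "high"))   # ~4.0 Mbit/s -> fall back to 'low'
```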
This paper examines the influences of network latency and packet loss on the playback quality of MPEG video on
iSCSI disks. The paper also explains the relationships between the video-playback quality and the transmission
speed, average bit rate of video, file-reading block size, and receive window (RWIN) size. Experimental results
indicate that the additional delay at which the average MU rate suddenly drops increases as the RWIN size
becomes larger up to the amount of read-response data plus the number of bytes of the iSCSI response header.
In addition, we can reduce the deterioration in the video-playback quality by enlarging the file-reading block size.
The MPEG-4 BSAC (Bit Sliced Arithmetic Coding) is a fine-grain scalable codec with a layered structure consisting of a single base layer and several enhancement layers. The scalable functionality allows subsets of a full bitstream to be decoded and audio content to be delivered adaptively under heterogeneous network and device conditions and user interaction. This bitrate scalability is provided at the cost of high-frequency components: the decoded output of BSAC sounds increasingly muffled as fewer layers are transmitted due to degraded network and device conditions. The goal of the proposed technology is to compensate for the missing high-frequency components while maintaining the fine-grain scalability of BSAC. This paper describes the integration of the SBR (Spectral Band Replication) tool into the existing MPEG-4 BSAC codec. Listening test results show that the sound quality of BSAC is improved when the full bitstream is truncated to lower bitrates, and that this quality is comparable to that of BSAC using the SBR tool without truncation at the same bitrate.
In this paper, we develop a 3D audio reproduction scheme for the purpose of delivering audio over IP networks. In this
scheme, audio streams constructed at a server are composed of the TCP/IP header followed by multi-channel audio data
compressed by MPEG advanced audio coding (AAC), and the decoded audio signals are played out on a stereo
loudspeaker system at the client. Since the audio source is recorded by a multi-channel microphone but the playout is
dedicated to stereo speakers, the quality mismatch between the multi-channel and the stereo system should be overcome.
As a potential solution, we first investigate the effect of 3D audio processing on the audio quality at the client by
applying a head-related transfer function (HRTF). Next, a crosstalk cancellation process is applied to the audio with 3D
effects in order to improve the immersion of the processed 3D effects on a stereo loudspeaker system. Finally, we
evaluate the performance of the 3D audio reproduction system in terms of the identification of an audio source and
quality comparison before and after applying the crosstalk cancellation technique.
In this paper we investigate the use of variable frame rate (VFR) analysis in automatic speech recognition (ASR). First,
we review the VFR technique and analyze its behavior. It is experimentally shown that VFR improves ASR performance for
signals with low signal-to-noise ratios since it generates improved acoustic models and substantially reduces insertion
and substitution errors although it may increase deletion errors. It is also underlined that the match between the average
frame rate and the number of hidden Markov model states is critical in implementing VFR. Secondly, we analyze an
effective VFR method that uses a cumulative, weighted cepstral-distance criterion for frame selection and present a
revision for it. Lastly, the revised VFR method is combined with spectral- and cepstral-domain enhancement methods
including the minimum statistics noise estimation (MSNE) based spectral subtraction and the cepstral mean subtraction,
variance normalization and ARMA filtering (MVA) process. Experiments on the Aurora 2 database justify that VFR is
highly complementary to the enhancement methods. Enhancement of speech both facilitates the frame selection in VFR
and provides de-noised speech for recognition.
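The frame-selection idea can be illustrated with a minimal sketch: a frame is kept whenever the accumulated (weighted) cepstral distance since the last kept frame crosses a threshold, so quasi-stationary regions are thinned out while fast spectral changes retain more frames. The distance weighting and threshold are assumptions, not the exact criterion used in the paper.

```python
import numpy as np

def select_frames(cepstra: np.ndarray, threshold: float, weight: float = 1.0):
    """cepstra: (num_frames, num_coeffs) array of cepstral vectors.
    Returns indices of retained frames under a cumulative-distance criterion."""
    kept = [0]
    accum = 0.0
    for t in range(1, len(cepstra)):
        accum += weight * np.linalg.norm(cepstra[t] - cepstra[t - 1])
        if accum >= threshold:
            kept.append(t)
            accum = 0.0
    return kept

rng = np.random.default_rng(0)
dummy = np.cumsum(rng.normal(scale=0.1, size=(50, 13)), axis=0)  # slowly varying cepstra
print(select_frames(dummy, threshold=1.5))
```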
This paper presents a speaker diarization system developed at the Institute for Infocomm Research (I2R) for the NIST Rich Transcription 2007 (RT-07) evaluation task. We describe in detail our primary approaches for speaker diarization under the Multiple Distant Microphones (MDM) conditions in the conference room scenario. Our proposed system consists of six modules: 1) a normalized least-mean-square (NLMS) adaptive filter for speaker direction estimation via Time Difference of Arrival (TDOA), 2) an initial speaker clustering via a two-stage TDOA histogram distribution quantization approach, 3) multiple-microphone speaker data alignment via GCC-PHAT Time Delay Estimation (TDE) among all the distant microphone channel signals, 4) a speaker clustering algorithm based on a GMM modeling approach, 5) non-speech removal via a speech/non-speech verification mechanism, and 6) silence removal via a "Double-Layer Windowing" (DLW) method. We achieve an error rate of 31.02% on the 2006 Spring (RT-06s) MDM evaluation task and a competitive overall error rate of 15.32% for the NIST Rich Transcription 2007 (RT-07) MDM evaluation task.
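For the TDOA step, a standard GCC-PHAT estimate between two distant-microphone channels looks roughly like the sketch below (a textbook implementation, not the authors' code); the signals, sampling rate, and synthetic delay are made up.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the time delay of `sig` relative to `ref` using GCC-PHAT."""
    n = 2 * max(len(sig), len(ref))
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12          # PHAT weighting
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2 if max_tau is None else min(int(fs * max_tau), n // 2)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / fs

fs = 16000
rng = np.random.default_rng(0)
ref = rng.normal(size=fs)                   # 1 s of noise at the reference mic
delay_samples = 40
sig = np.concatenate((np.zeros(delay_samples), ref))[:fs]   # same signal, delayed
print(gcc_phat(sig, ref, fs))               # ~ 40 / 16000 = 0.0025 s
```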
In this paper, the conformance of the hypothetical reference decoder (HRD) is addressed for H.264 when there is jitter in the transmission of packets over a channel with no packet loss but variable transmission delay. The sending rate is decoupled from the coding rate. Both the jitter and the total size of the coded bitstream are taken into consideration such that the values of buffer size and initial buffer delay are minimized, especially when the sending rate is greater than the coding rate. Sufficient conditions are derived for the conformance of a coded bitstream to an HRD at constant delay. These conditions are then used to design iterative algorithms that determine a minimal buffer size and a minimal initial buffer delay for the decoder. A novel interpolation method is also presented so that the approach is suitable for a wide range of sending rates.
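The flavor of the buffer analysis can be shown with a small leaky-bucket style simulation: given per-frame coded sizes, a constant sending rate, and a frame rate, it scans candidate initial buffer delays and reports the smallest delay (and the resulting peak buffer fullness) for which the decoder never underflows. All numbers are invented; the paper's iterative algorithms and interpolation method are not reproduced here.

```python
def min_delay_and_buffer(frame_bits, send_rate, frame_rate, step=0.001):
    """Find (approximately) the smallest initial buffer delay and the resulting
    peak buffer fullness so that decoding at `frame_rate` never underflows."""
    delay = 0.0
    while True:
        peak, ok = 0.0, True
        for i, bits in enumerate(frame_bits):
            # Bits that have arrived by the decoding time of frame i.
            arrived = send_rate * (delay + i / frame_rate)
            consumed = sum(frame_bits[:i])
            fullness = arrived - consumed
            peak = max(peak, fullness)
            if fullness < bits:            # underflow: frame i not fully received
                ok = False
                break
        if ok:
            return delay, peak
        delay += step

frames = [120000, 30000, 25000, 200000, 40000, 35000]   # bits per frame (assumed)
print(min_delay_and_buffer(frames, send_rate=1_000_000, frame_rate=25))
```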
In order to achieve high computational performance and low power consumption, modern microprocessors are usually equipped with special multimedia instructions, multi-threading, and/or multi-core processing capabilities. Therefore, parallelizing the H.264/AVC algorithm is crucial to implementing a real-time encoder on a multi-threaded (or multi-core) processor. There is also a significant need to investigate complexity reduction algorithms such as fast inter mode selection. A multi-core system makes it possible to distribute the H.264/AVC workload uniformly over a number of slower and simpler processor cores instead of a single high-performance processor. Therefore, in this paper, we propose a new adaptive slice size selection technique for efficient slice-level parallelism of the H.264/AVC encoder on a multi-core (or multi-threaded) processor, using fast inter mode selection as a pre-processing step. The simulation results show that the proposed adaptive slice-level parallelism achieves good parallel performance compared to fixed-slice-size parallelism. The experimental methods and results can be applied to many multi-processor systems for real-time H.264 video encoding.
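The adaptive slice-size idea can be approximated by a simple greedy partitioner: given a per-macroblock-row cost estimate (for example, from the fast inter-mode pre-processing), consecutive rows are grouped into as many slices as there are cores so that the predicted per-slice workloads are roughly equal. The cost values and core count are illustrative, not the paper's measurements.

```python
def partition_rows(row_costs, num_slices):
    """Group consecutive macroblock rows into `num_slices` slices with
    approximately equal total cost (greedy, preserves row order)."""
    total = sum(row_costs)
    target = total / num_slices
    slices, current, acc = [], [], 0.0
    for i, cost in enumerate(row_costs):
        current.append(i)
        acc += cost
        rows_left = len(row_costs) - i - 1
        slices_left = num_slices - len(slices) - 1
        if acc >= target and rows_left >= slices_left and slices_left > 0:
            slices.append(current)
            current, acc = [], 0.0
    slices.append(current)
    return slices

costs = [5, 9, 4, 4, 8, 2, 7, 6, 3]     # predicted cost per MB row (assumed)
print(partition_rows(costs, num_slices=3))
```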
In this paper, we describe a fine granularity scalable (FGS) video coding scheme that refines both residue and motion information in the quality layers. Significant gains can be achieved when each enhancement layer undergoes the motion-compensated prediction process with its own motion vector field (MVF). However, a motion-refined FGS scheme involves a motion estimation process for each enhancement layer of the scalable video. Given the high computational cost of motion estimation in H.264, such encoders can be computationally expensive to implement. Our proposed scheme carries out a simplified motion refinement for the enhancement layers, exploiting the correlation of motion information between successive layers through macroblock (MB) type refinement. By restricting the MB type of an FGS-layer MB according to the MB type of the corresponding base-layer MB, the time required for encoding FGS layers can be reduced. Through controlling the macroblock modes in both the base and the enhancement layers, the encoding time can be substantially reduced with minimal impact on coding efficiency. The encoder optimization scheme we describe is especially effective when encoding a video with a low-bitrate base layer and a large range of extractable bitrates.
In spatial error concealment (SEC), methods like bilinear interpolation (BI) and directional interpolation (DI) are commonly used to estimate the missing pixel values resulting from losses occurring in video streams. Despite being able to preserve spatial smoothness, BI produces a blurring effect and is unable to preserve structural information. DI produces spurious edges in regions with no strong edges, resulting in visible artefacts. In this paper, we propose a SEC algorithm that addresses the above drawbacks by formulating a weighted sum of candidate macroblocks produced from DI and BI, with weights derived adaptively through local information. We demonstrate that the proposed algorithm offers visual improvements over both DI based algorithms and the SEC algorithm based on BI in the H.264/AVC reference software JM 12.0. Most importantly, this unique approach preserves edge information and spatial smoothness in the error concealed macroblock due to the integration of both BI and DI.
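The combination step can be sketched as follows: given the two candidate reconstructions of a lost macroblock (one from bilinear interpolation, one from directional interpolation), the final block is their weighted sum, with the DI weight growing with the edge activity measured on the surrounding pixels. The simple gradient-based weight below is an illustrative stand-in for the adaptively derived weights in the paper.

```python
import numpy as np

def combine_candidates(bi_block, di_block, surround, k=0.1):
    """Weighted sum of the BI and DI candidate macroblocks. The DI weight
    increases with the local edge activity of the surrounding pixels
    (simple gradient measure; an assumed stand-in for the paper's weights)."""
    g = np.abs(np.diff(surround.astype(float), axis=0)).mean() \
        + np.abs(np.diff(surround.astype(float), axis=1)).mean()
    w_di = g / (g + 1.0 / k)          # maps edge activity to a weight in [0, 1)
    return (w_di * di_block + (1 - w_di) * bi_block).astype(np.uint8)

rng = np.random.default_rng(0)
bi = np.full((16, 16), 120, dtype=np.uint8)                       # smooth BI candidate
di = np.tile(np.linspace(60, 180, 16, dtype=np.uint8), (16, 1))   # edge-preserving DI candidate
surround = rng.integers(0, 255, size=(20, 20), dtype=np.uint8)    # neighboring pixels
print(combine_candidates(bi, di, surround)[0, :4])
```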
In this work, we developed and implemented an image capturing and processing system that is equipped with the capability of capturing images from an input video in real time. The input video can come from a PC, a video camcorder, or a DVD player. We developed two modes of operation in the system. In the first mode, an input image from the PC is processed on the processing board (a development platform with a digital signal processor) and is displayed on the PC. In the second mode, the currently captured image from the video camcorder (or from the DVD player) is processed on the board but is displayed on the LCD monitor. The major difference between our system and other existing conventional systems is that image-processing functions are performed on the board instead of the PC (so that the functions can be used for further developments on the board). The user can control the operations of the board through the Graphical User Interface (GUI) provided on the PC. In order to have a smooth image data transfer between the PC and the board, we employed Real Time Data Transfer (RTDX) technology to create a link between them. For image processing, we developed three main groups of functions: (1) Point Processing, (2) Filtering, and (3) 'Others'. Point Processing includes rotation, negation, and mirroring. The Filtering category provides median, adaptive, smoothing, and sharpening filtering in the time domain. In the 'Others' category, auto-contrast adjustment, edge detection, segmentation, and sepia color are provided; these functions either add effects to the image or enhance it. We developed and implemented our system using the C/C# programming languages on a TMS320DM642 (DM642) board from Texas Instruments (TI). The system was showcased at the College of Engineering (CoE) exhibition 2006 at Nanyang Technological University (NTU), where more than 40 users tried it. It was demonstrated that our system is adequate for real-time image capturing. Our system can be used or applied in applications such as medical imaging, video surveillance, etc.
JPEG has been a widely recognized image compression standard for many years. Nevertheless, it faces its own limitations, as compressed image quality degrades significantly at lower bit rates. This limitation has been addressed in JPEG2000, which also tends to replace JPEG, especially in storage and retrieval applications. To efficiently and practically index and retrieve compressed-domain images from a database, several image features can be extracted directly in the compressed domain without having to fully decompress the JPEG2000 images. JPEG2000 utilizes the wavelet transform, which is widely used to analyze and describe the texture patterns of an image. Another advantage of the wavelet transform is that one can analyze textures at multiple resolutions and classify directional texture pattern information into each directional subband: the HL subband carries horizontal frequency information, the LH subband vertical frequency information, and the HH subband diagonal frequency information. Nevertheless, many wavelet-based image retrieval approaches do not make good use of the directional subband information obtained by wavelet transforms for efficient directional texture pattern classification of retrieved images. This paper proposes a novel image retrieval technique in the JPEG2000 compressed domain that uses the image significance map to compute an image context in order to construct an image index. Experimental results indicate that the proposed method can effectively differentiate and
categorize images with different texture directional information. In addition, an integration of the proposed features with
wavelet autocorrelogram also showed improvement in retrieval performance using ANMRR (Average Normalized
Modified Retrieval Rank) compared to other known methods.
We usually accept as fact that one person's left and right hands are symmetric to some degree, but we do not know exactly how similar they are. In this paper, we designed an experiment to quantify the observation that two palms from one person are more similar than two palms from two different persons. This similarity may enable a user to register on a palmprint verification system with one hand and pass through the system with the other hand. We also designed another interesting experiment to show that, when performing personal verification by looking at palmprint pictures, human beings cannot be 100 percent correct, as is often assumed. Under certain circumstances, the machine can do a better job than a person.
In this paper, we propose a new online palmprint verification method that provides a more convenient way for users to have their palmprint images captured. Unlike other palmprint verification methods, which use positioning pods to fix the location of a palm, we suggest capturing a palmprint without using any device to locate the palm. This makes users more comfortable but requires a better positioning algorithm to locate the palmprint automatically. Here we propose an inscribed-circle-based palmprint positioning and area-of-interest extraction algorithm to deal with the palmprint position variation introduced during capturing. We also suggest using histogram stretching to eliminate the impact of variation in environmental lighting. We use the Niblack method to extract the principal lines on a palm and propose a bi-directional matching method for similarity measurement. The experimental results demonstrate the effectiveness of our method.
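The Niblack step mentioned above computes, for each pixel, a threshold from the local mean and standard deviation (T = m + k*s); a compact sketch using only NumPy is given below, with the window size and k as assumed parameters rather than the values used in the paper.

```python
import numpy as np

def niblack_threshold(img: np.ndarray, window: int = 15, k: float = -0.2) -> np.ndarray:
    """Binarize a grayscale image with Niblack's local threshold T = mean + k*std."""
    img = img.astype(float)
    pad = window // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros(img.shape, dtype=np.uint8)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            patch = padded[y:y + window, x:x + window]
            t = patch.mean() + k * patch.std()
            out[y, x] = 255 if img[y, x] >= t else 0
    return out

# Tiny synthetic example: a dark 'principal line' on a brighter palm background.
palm = np.full((32, 32), 180, dtype=np.uint8)
palm[14:17, :] = 90
print(niblack_threshold(palm)[15, :8])    # the line pixels come out as 0
```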
As ubiquitous computing technologies rapidly evolve in the area of collaboration, user interactions are also expanding to a wide variety of embedded computing systems. In order to keep up with this trend, we propose a unique design of an Interaction Manager (IM) for Smart Meeting Space (SMeet), our prototype system for the Advanced Collaborative Environment (ACE), to support advanced user interactions. In this paper, the IM provides generic interfaces and managing methods for heterogeneous interaction devices. Furthermore, these interaction activities can be effectively shared with remote-party collaborators through the IM. The preliminary version of the proposed IM is verified through experiments between two separately located SMeet nodes.
This paper proposes a group (or inter-destination) synchronization scheme with prediction for haptic media in
a remote drawing system. The scheme aims to improve the group synchronization control, which adjusts the
output timing among multiple destinations, while keeping the interactivity high. The scheme outputs position information by predicting the position a fixed amount of time ahead of the received position information. It also advances the output time of position information at the local terminal by the same amount of time. By experiment, we
demonstrate the effectiveness of the scheme.
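The prediction used in the scheme can be pictured with simple linear extrapolation: from the two most recently received positions, the destination estimates the position a fixed lead time ahead and outputs that instead of the (older) received value. The lead time and the two-point extrapolator are illustrative assumptions, not the paper's exact predictor.

```python
def predict_position(p_prev, t_prev, p_last, t_last, lead):
    """Linearly extrapolate the pen position `lead` seconds beyond the most
    recently received sample (2D positions as (x, y) tuples, times in seconds)."""
    dt = t_last - t_prev
    vx = (p_last[0] - p_prev[0]) / dt
    vy = (p_last[1] - p_prev[1]) / dt
    return (p_last[0] + vx * lead, p_last[1] + vy * lead)

# Received samples 10 ms apart; output the position 30 ms into the future.
print(predict_position((0.0, 0.0), 0.000, (1.0, 0.5), 0.010, lead=0.030))
# -> (4.0, 2.0)
```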
This research studies Virtual Reality simulation for collaborative interaction, so that different people in different places can interact with one object concurrently. Our focus is the real-time handling of inputs from multiple users, where the object's behavior is determined by the combination of the multiple inputs. The issues addressed in this research are: 1) the effects of using haptics on collaborative interaction, and 2) the possibilities of collaboration between users in different environments. We conducted user tests on our system in several cases: 1) comparison between non-haptic and haptic collaborative interaction over a LAN, 2) comparison between non-haptic and haptic collaborative interaction over the Internet, and 3) analysis of collaborative interaction between non-immersive and immersive display environments. We consider two case studies: collaborative authoring of a 3D model by two users, and collaborative haptic interaction by multiple users. In Virtual Dollhouse, users can observe the laws of physics while constructing a dollhouse from existing building blocks under gravity effects. In Virtual Stretcher, multiple users can collaborate on moving a stretcher together while feeling each other's haptic motions.
This paper proposes a 3D multimedia presentation tool that allows the user to operate it intuitively using only voice input and gesture input, without a standard keyboard or mouse. The authors developed this system as a presentation tool to be used in a presentation room equipped with a large screen, such as an exhibition room in a museum, because in such an environment it is better to use voice commands and gesture-based pointing than a keyboard or mouse. The system was developed using IntelligentBox, a component-based 3D graphics software development system. IntelligentBox already provides various types of 3D visible, reactive functional components called boxes, e.g., a voice input component and various multimedia handling components. IntelligentBox also provides a dynamic data linkage mechanism called slot-connection that allows the user to develop 3D graphics applications by combining existing boxes through direct manipulation on a computer screen. Using IntelligentBox, the 3D multimedia presentation tool proposed in this paper was likewise developed as combined components purely through direct manipulation on a computer screen. The authors have previously proposed a 3D multimedia presentation tool using a stage metaphor and its voice input interface. This time, we extended the system to accept gesture input in addition to voice commands. This paper explains the details of the proposed 3D multimedia presentation tool and, in particular, describes its component-based voice and gesture input interfaces.
We present a system to realize an on-line instruction environment among physically separated participants based on a multi-modal communication strategy. In addition to visual and acoustic information, the communication modalities commonly used in network environments, our system provides a haptic channel to intuitively convey a partner's sense of touch. The human touch sensation, however, is very sensitive to delay and jitter in networked virtual reality (NVR) systems. Therefore, a method to compensate for such negative factors needs to be provided. We present an NVR architecture that implements a basic framework that can be shared by various applications and effectively deals with these problems. We take a hybrid approach, achieving data consistency through a client-server model and scalability through a peer-to-peer model. As an application system built on the proposed architecture, a remote instruction system targeted at teaching handwritten characters and line patterns over a Korea-Japan high-speed research network is also described.
This paper deals with a remote haptic painting lesson system by which a teacher trains a student how to paint pictures or figures while conveying the sense of force interactively through a network. In the system, we introduce media synchronization control in order to achieve a high quality of haptic transmission. We make a quality
comparison of four media synchronization schemes (Virtual-Time Rendering (VTR), VTR with prediction, fixed buffering with prediction, and Skipping) by subjective assessment.
Network delay in haptic-based CVEs (collaborative virtual environments) severely deteriorates the quality of haptic interaction (e.g., larger force feedback than in reality). In order to compensate for this delay effect, existing studies dynamically change spring and damper coefficients according to the network delay. However, it is difficult to choose proper coefficients that offset the delay effect while precisely reflecting the characteristics of the virtual object. In this paper, a new delay-compensation scheme based on force feedback prediction is proposed to improve the force feedback experience. By predicting the virtual object movements and force feedback, the proposed scheme on the client side provides timely force feedback to the user. It then gradually converges to the real (but delayed) information from the server in order to maintain the consistency of the virtual environment. According to the experimental results, the proposed scheme improves the haptic interaction quality by providing more realistic force feedback, similar to that experienced with no network delay.
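The client-side behavior described above can be sketched as a small state update: each frame, the client advances a locally predicted object position using its last known velocity, and simultaneously pulls that prediction toward the most recent (delayed) server position by a small blending factor so the two eventually agree. The values and the blending factor are assumptions for illustration, not the paper's algorithm.

```python
def client_step(pred_pos, velocity, server_pos, dt, blend=0.1):
    """One client-side update: extrapolate the predicted position with the
    current velocity, then converge gradually toward the delayed server state."""
    extrapolated = pred_pos + velocity * dt
    return extrapolated + blend * (server_pos - extrapolated)

pos, vel = 0.0, 1.0          # 1D position (m) and velocity (m/s) for simplicity
server = 0.0                 # last state received from the server (delayed)
for frame in range(5):
    pos = client_step(pos, vel, server, dt=0.01)
    print(round(pos, 4))
```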
This paper deals with two kinds of work in which two users manipulate haptic interface devices that differ from each other. Although a number of haptic interface devices are currently available, the devices have different
specifications from each other. By experiment, we investigate the influences of difference in workspace size
between two devices (PHANToM Omni and PHANToM Desktop) on the efficiency of networked collaborative
work and the outcome of networked competitive work (i.e., a networked real-time game).