Parsed and fixed block representations of visual information for image retrieval

Soo Hyun Bae; Biing-Hwang Juang

doi:10.1117/12.811572

10 February 2009

Proceedings Volume 7240, Human Vision and Electronic Imaging XIV; 724017 (2009) https://doi.org/10.1117/12.811572
Event: IS&T/SPIE Electronic Imaging, 2009, San Jose, California, United States

Abstract

The theory of linguistics teaches us the existence of a hierarchical structure in linguistic expressions, from letter to word root, and on to words and sentences. By applying syntax and semantics beyond words, one can further recognize the grammatical relationships among words and the meaning of a sequence of words. This layered view of a spoken language is useful for effective analysis and automated processing. Thus, it is interesting to ask whether a similar hierarchy of representation exists for visual information. A class of techniques similar in nature to linguistic parsing is found in the Lempel-Ziv incremental parsing scheme. Based on a new class of multidimensional incremental parsing algorithms extended from Lempel-Ziv incremental parsing, a new framework for image retrieval, which takes advantage of the source characterization property of the incremental parsing algorithm, was proposed recently. With the incremental parsing technique, a given image is decomposed into a number of patches, called a parsed representation. This representation can be thought of as a morphological interface between elementary pixels and a higher-level representation. In this work, we examine the properties of the two-dimensional parsed representation in the context of imagery information retrieval and in contrast to vector quantization, i.e., fixed square-block representations and minimum average distortion criteria. We implemented four image retrieval systems for the comparative study: three, called IPSILON image retrieval systems, use parsed representations with different perceptual distortion thresholds, and one uses conventional vector quantization for visual pattern analysis. We observe that different perceptual distortion thresholds in visual pattern matching do not seriously affect retrieval precision, although allowing looser perceptual thresholds in image compression results in poor reconstruction fidelity. We compare the effectiveness of these representations, as constructed under the latent semantic analysis (LSA) paradigm, to investigate their varying capabilities in capturing semantic concepts. The results clearly demonstrate the superiority of the parsed representation.
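
For readers unfamiliar with the Lempel-Ziv incremental parsing scheme mentioned in the abstract, the following is a minimal sketch of the classical one-dimensional (LZ78-style) version only. It is not the multidimensional algorithm of the paper and involves no perceptual distortion threshold; the function name `lz78_parse` is hypothetical and chosen for illustration. Each new phrase is the shortest prefix of the remaining input that has not yet been seen, which is the sense in which the parsing builds up a dictionary of progressively longer patterns.

```python
def lz78_parse(sequence):
    """Split `sequence` into phrases by classical 1-D incremental parsing.

    Illustrative only: the paper's method extends this idea to two
    dimensions with approximate (perceptual) matching, which is NOT
    modeled here.
    """
    seen = {""}      # dictionary of phrases encountered so far
    phrases = []     # the parsed representation, in order
    current = ""     # longest match against the dictionary so far
    for symbol in sequence:
        candidate = current + symbol
        if candidate in seen:
            # Still matching a known phrase; keep extending it.
            current = candidate
        else:
            # Shortest unseen phrase found: record it and start over.
            seen.add(candidate)
            phrases.append(candidate)
            current = ""
    if current:
        # Input ended in the middle of an already-known phrase.
        phrases.append(current)
    return phrases


if __name__ == "__main__":
    print(lz78_parse("aababcabcd"))  # ['a', 'ab', 'abc', 'abcd']
```

Concatenating the returned phrases reproduces the input, so the parse is a lossless decomposition into variable-length units, in contrast to the fixed-size blocks used by vector quantization.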

Citation

Soo Hyun Bae and Biing-Hwang Juang "Parsed and fixed block representations of visual information for image retrieval", Proc. SPIE 7240, Human Vision and Electronic Imaging XIV, 724017 (10 February 2009); https://doi.org/10.1117/12.811572
