We propose a novel unsupervised learning algorithm that makes use of image fusion to efficiently cluster remote sensing data. Exploiting nonlinear structures in multimodal data, we devise a clustering algorithm based on a random walk in a fused feature space. Constructing the random walk on the fused space enforces that pixels are considered close only if they are close in both sensing modalities. The structure learned by this random walk is combined with density estimation to label all pixels. Spatial information may also be used to regularize the resulting clusterings. We compare the proposed method with several spectral methods for image fusion on both synthetic and real data.
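The idea that pixels count as close only if they are close in both modalities can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian kernels, the elementwise product used to fuse them, and the names `gaussian_kernel`, `fused_random_walk`, `sigma1`, and `sigma2` are assumptions made for illustration only.

```python
import numpy as np

def gaussian_kernel(X, sigma):
    """Pairwise Gaussian affinities between the rows of X (assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values from rounding
    return np.exp(-d2 / sigma ** 2)

def fused_random_walk(X1, X2, sigma1=1.0, sigma2=1.0):
    """Row-stochastic transition matrix on a fused space.

    The product of the two kernels is large only when two pixels are
    close in BOTH modalities (hypothetical fusion rule for this sketch).
    """
    W = gaussian_kernel(X1, sigma1) * gaussian_kernel(X2, sigma2)
    return W / W.sum(axis=1, keepdims=True)

# Toy data: three "pixels" observed in two modalities.
# Pixel 2 is close to pixel 0 in modality 1 but far away in modality 2.
X1 = np.array([[0.0], [0.1], [0.2]])
X2 = np.array([[0.0], [0.1], [5.0]])
P = fused_random_walk(X1, X2)
```

In this toy example the walk started at pixel 0 moves to pixel 1 far more readily than to pixel 2, because pixel 2 disagrees with pixel 0 in the second modality.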
We present an efficient algorithm and theory for Geometric Multi-Resolution Analysis (GMRA), a procedure for dictionary learning. Sparse dictionary learning provides the necessary complexity reduction for the critical applications of compression, regression, and classification in high-dimensional data analysis. As such, it is a critical technique in data science and it is important to have techniques that admit both efficient implementation and strong theory for large classes of theoretical models. By construction, GMRA is computationally efficient and in this paper we describe how the GMRA correctly approximates a large class of plausible models (namely, the noisy manifolds).
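The multiscale approximation behind this kind of dictionary can be sketched as piecewise-affine fits on a recursive partition of the data. The code below is a simplified illustration under stated assumptions, not the GMRA algorithm itself: the median split along the first principal direction and the names `local_pca`, `residual_sse`, and `piecewise_affine_sse` are hypothetical, and the encoding of differences between scales is omitted.

```python
import numpy as np

def local_pca(X, d):
    """Center and top-d principal directions of the points in X."""
    c = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - c, full_matrices=False)
    return c, Vt[:d]

def residual_sse(X, c, V):
    """Sum of squared distances from X to the affine plane c + span(V)."""
    Y = X - c
    R = Y - Y @ V.T @ V
    return float(np.sum(R ** 2))

def piecewise_affine_sse(X, d, depth):
    """Error of approximating X by local d-planes on a binary partition tree."""
    c, V = local_pca(X, d)
    if depth == 0 or len(X) <= 2 * (d + 1):
        return residual_sse(X, c, V)
    # Assumed splitting rule: cut at the median along the first principal direction.
    t = (X - c) @ V[0]
    left = t <= np.median(t)
    if left.all() or not left.any():
        return residual_sse(X, c, V)
    return (piecewise_affine_sse(X[left], d, depth - 1)
            + piecewise_affine_sse(X[~left], d, depth - 1))

# Toy data: a noisy circle, i.e. a 1-dimensional manifold in the plane.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 400)
X = np.column_stack([np.cos(theta), np.sin(theta)])
X = X + 0.01 * rng.standard_normal(X.shape)
errors = [piecewise_affine_sse(X, d=1, depth=k) for k in range(4)]
```

Because each child cell refits its own best affine plane, the total error cannot increase as the partition is refined. On the noisy circle it drops quickly until it approaches the noise level.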
Mapping images to a high-dimensional feature space, either by considering patches of images or other features, has led to state-of-the-art results in signal processing tasks such as image denoising and inpainting, and in various machine learning and computer vision tasks on images. Understanding the geometry of the embedding of images into high-dimensional feature space is a challenging problem. Finding efficient representations and learning dictionaries for such embeddings is also problematic, often leading to expensive optimization algorithms. Many such algorithms scale poorly with the dimension of the feature space, for example with the size of patches of images if these are chosen as features. This is at odds with the need for a multi-scale approach in the analysis of images, as details at multiple scales are crucial in image understanding, as well as in many signal processing tasks. Here we exploit a recent dictionary learning algorithm based on Geometric Wavelets, and we extend it to perform multi-scale dictionary learning on image patches, with efficient algorithms for both the learning of the dictionary and the computation of coefficients onto that dictionary. We also discuss how invariances in images may be introduced in the dictionary learning phase, by generalizing the construction of such dictionaries to non-Euclidean spaces.
KEYWORDS: Diffusion, Sensors, Signal processing, Filtering (signal processing), Electrodes, Image filtering, Wavelets, Data processing, Data fusion, Data modeling
We describe signal processing tools to extract structure and information from arbitrary digital data sets. In particular, heterogeneous multi-sensor measurements that involve corrupt data, either noisy or with missing entries, present formidable challenges. We sketch methodologies for using the network of inferences and similarities between the data points to create robust nonlinear estimators for missing or noisy entries. These methods enable coherent fusion of data from a multiplicity of sources, generalizing signal processing to a nonlinear setting. Since they provide empirical data models, they could also potentially extend analog-to-digital conversion schemes like "sigma delta".
We apply a unique micro-optoelectromechanical tuned light source and new algorithms to the hyper-spectral microscopic analysis of human colon biopsies. The tuned light prototype (Plain Sight Systems Inc.) transmits any combination of light frequencies in the range 440 nm to 700 nm, trans-illuminating H and E stained tissue sections of normal (N), benign adenoma (B) and malignant carcinoma (M) colon biopsies, through a Nikon Biophot microscope. Hyper-spectral photomicrographs, randomly collected at 400X magnification, are obtained with a CCD camera (Sensovation) from 59 different patient biopsies (20 N, 19 B, 20 M) mounted as a microarray on a single glass slide. The spectra of each pixel are normalized and analyzed to discriminate among tissue features: gland nuclei, gland cytoplasm and lamina propria/lumens. Spectral features permit the automatic extraction of 3298 nuclei with classification as N, B or M. When nuclei are extracted from each of the 59 biopsies, the average classification among N, B and M nuclei is 97.1%; classification of the biopsies, based on the average nuclei classification, is 100%. However, when the nuclei are extracted from a subset of biopsies, and the prediction is made on nuclei in the remaining biopsies, there is a marked decrement in performance to 60% across the 3 classes. Similarly, the biopsy classification drops to 54%. In spite of these classification differences, which we believe are due to instrument and biopsy normalization issues, hyper-spectral analysis has the potential to achieve the diagnostic efficiency needed for objective microscopic diagnosis.
Classically, analysis on manifolds and graphs has been based on the study of the eigenfunctions of the Laplacian and its generalizations. These objects from differential geometry and analysis on manifolds have proven useful in applications to partial differential equations, and their discrete counterparts have been applied to optimization problems, learning, clustering, routing and many other algorithms [1-7]. The eigenfunctions of the Laplacian are in general global: their support often coincides with the whole manifold, and they are affected by global properties of the manifold (for example certain global topological invariants). Recently a framework for building natural multiresolution structures on manifolds and graphs was introduced that greatly generalizes, among other things, the construction of wavelets and wavelet packets in Euclidean spaces [8,9]. This allows the study of the manifold and of functions on it at different scales, which are naturally induced by the geometry of the manifold. This construction proceeds bottom-up, from the finest scale to the coarsest scale, using powers of a diffusion operator as dilations and a numerical rank constraint to critically sample the multiresolution subspaces. In this paper we introduce a novel multiscale construction, based on a top-down recursive partitioning induced by the eigenfunctions of the Laplacian. This yields associated local cosine packets on manifolds, generalizing local cosines in Euclidean spaces [10]. We discuss some of the connections with the construction of diffusion wavelets. These constructions have direct applications to the approximation, denoising, compression and learning of functions on a manifold and are promising in view of applications to problems in manifold approximation, learning, and dimensionality reduction.
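A top-down recursive partitioning driven by Laplacian eigenfunctions can be sketched on a graph as follows. This is a minimal illustration, not the construction of the paper: splitting by the sign of the second eigenvector of the unnormalized graph Laplacian is an assumed rule, the names `fiedler_split` and `recursive_partition` are hypothetical, and the local cosine packets built on the resulting pieces are not reproduced.

```python
import numpy as np

def fiedler_split(W, idx):
    """Split the nodes idx by the sign of the second Laplacian eigenvector.

    Assumes the subgraph induced by idx is connected; otherwise the
    second eigenvector is not uniquely defined.
    """
    sub = W[np.ix_(idx, idx)]
    L = np.diag(sub.sum(axis=1)) - sub  # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)         # eigenvalues in ascending order
    f = vecs[:, 1]
    return idx[f >= 0], idx[f < 0]

def recursive_partition(W, idx, depth):
    """Top-down partition: keep bisecting each piece until depth is reached."""
    if depth == 0 or len(idx) < 2:
        return [idx]
    a, b = fiedler_split(W, idx)
    if len(a) == 0 or len(b) == 0:
        return [idx]
    return recursive_partition(W, a, depth - 1) + recursive_partition(W, b, depth - 1)

# Toy graph: a path on 8 nodes with unit edge weights.
n = 8
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

parts = recursive_partition(W, np.arange(n), depth=2)
pieces = sorted(sorted(int(i) for i in p) for p in parts)
```

On the path graph the second eigenvector is monotone along the path, so each bisection cuts a piece in the middle and two levels of recursion give four consecutive pairs of nodes.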
Recent work by some of the authors presented a novel construction of a multiresolution analysis on manifolds and graphs, acted upon by a given symmetric Markov semigroup $\{T^t\}_{t \ge 0}$, for which $T^t$ has low rank for large $t$. This includes important classes of diffusion-like operators, in any dimension, on manifolds, graphs, and in nonhomogeneous media. The dyadic powers of an operator are used to induce a multiresolution analysis, analogous to classical Littlewood-Paley and wavelet theory, while associated wavelet packets can also be constructed. This extends multiscale function and operator analysis and signal processing to a large class of spaces, such as manifolds and graphs, with efficient algorithms. Powers and functions of $T$ (notably its Green's function) are efficiently computed, represented and compressed. This construction is related to and generalizes certain Fast Multipole Methods, the wavelet representation of Calderon-Zygmund and pseudo-differential operators, and also relates to algebraic multigrid techniques. The original diffusion wavelet construction yields orthonormal bases for multiresolution spaces $\{V_j\}$. The orthogonality requirement has some advantages from the numerical perspective, but several drawbacks in terms of the space and frequency localization of the basis functions. Here we show how to relax this requirement in order to construct biorthogonal bases of diffusion scaling functions and wavelets. This yields more compact representations of the powers of the operator and better localized basis functions. This new construction also applies to non-self-adjoint semigroups, which arise in many applications.
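The role of dyadic powers with low numerical rank can be seen in a small numerical experiment. The sketch below is only an illustration under stated assumptions: the lazy random walk on a cycle stands in for a symmetric Markov operator, the threshold `eps` and the names `lazy_walk_on_cycle`, `numerical_rank`, and `dyadic_ranks` are hypothetical, and the actual diffusion wavelet construction (which compresses each power onto a reduced basis) is not implemented.

```python
import numpy as np

def lazy_walk_on_cycle(n):
    """Symmetric Markov matrix: stay with prob. 1/2, step left or right with prob. 1/4."""
    T = 0.5 * np.eye(n)
    for i in range(n):
        T[i, (i + 1) % n] += 0.25
        T[i, (i - 1) % n] += 0.25
    return T

def numerical_rank(A, eps):
    """Number of singular values of A above the threshold eps."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s > eps))

def dyadic_ranks(T, levels, eps=1e-6):
    """Numerical ranks of T, T^2, T^4, ..., computed by repeated squaring."""
    ranks = []
    P = T.copy()
    for _ in range(levels):
        ranks.append(numerical_rank(P, eps))
        P = P @ P
    return ranks

T = lazy_walk_on_cycle(64)
ranks = dyadic_ranks(T, levels=7)  # powers T^(2^j) for j = 0, ..., 6
```

Since the eigenvalues of this operator lie in [0, 1], raising it to higher dyadic powers pushes more of them below the threshold. The numerical rank therefore shrinks with the scale, which is what allows coarser multiresolution subspaces to be represented with fewer basis functions.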