1. Introduction

Multisensor image fusion is the process of combining two or more images of a scene to create a single image that is more informative than any of the input images.1 Image-fusion technology is employed in numerous applications, including visual interpretation, image drawing, geographical information gathering, and military target reconnaissance and surveillance. In particular, research into techniques for image fusion by contrast reversal in local image regions has important theoretical and practical significance.1 Image-fusion methods are classified as spatial- or transform-domain techniques. Spatial-domain methods are simple but generally result in images with insufficient detail. Transform-domain strategies based on image-fusion arithmetic and wavelet transforms (WTs) represent the current state of the art. Wavelets can be used to resolve an original image into a series of subimages with different spatial resolutions and frequency-domain characteristics. This representation fully reflects local variations in the original image. In addition, WTs can effect multiresolution analysis,2,3 perfect reconstruction, and orthogonality.4 Image-fusion arithmetic based on WT coefficients can flexibly resolve multidimensional low-frequency and high-frequency image components. Wavelet transforms can also realize multisensor image fusion using rules that emphasize critical features of the scene.5,6 Traditional convolution-based WT methods for multiresolution analysis have been widely applied to image fusion for images with a large number of pixels, but the memory and computational requirements of these techniques, and of their Fourier-domain equivalents, can be substantial. Attempts to create more efficient algorithms in the transform domain have employed the lifting wavelet transform (LWT).7–9 Also known as the second-generation WT,10 the LWT does not depend on the Fourier transform. Rather, all operations are carried out in the spatial domain.
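As a concrete illustration of spatial-domain lifting, the sketch below implements one level of the simple Haar lifting step (split, predict, update) and its inverse. This is our own minimal example, not the paper's implementation; the spline lifting wavelet used later follows the same pattern with different predict and update filters.

```python
import numpy as np

def lwt_haar(x):
    """One level of the Haar lifting wavelet transform (LWT).

    All operations stay in the spatial domain: split into even/odd
    samples, predict the odd samples, then update the even samples.
    """
    even, odd = x[0::2].astype(float), x[1::2].astype(float)
    d = odd - even          # predict step: detail (high-frequency) coefficients
    s = even + d / 2        # update step: approximation (low-frequency) coefficients
    return s, d

def ilwt_haar(s, d):
    """Inverse LWT: reverse the order and the signs of the lifting steps."""
    even = s - d / 2
    odd = d + even
    x = np.empty(even.size + odd.size)
    x[0::2], x[1::2] = even, odd
    return x

x = np.array([4, 6, 10, 12, 8, 8, 2, 0])
s, d = lwt_haar(x)
assert np.allclose(ilwt_haar(s, d), x)   # perfect reconstruction
```

Because inversion only reverses the calculation order and signs, no Fourier-domain machinery is needed, which is the source of the LWT's memory and speed advantage.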
Image reconstruction is achieved by simply reversing the calculation order and signs of the decomposition process,11 thereby reducing two-dimensional image data computation by half and data storage to about 75%. One important motivation for the use of WTs in image processing is their ability to segregate the low-frequency content that is critical for interpretation. Traditional image-fusion methods are based on selecting these significant wavelet decomposition coefficients.12–14 Even with the effective separation and processing of low-frequency components afforded by WT decomposition, such an approach fails to take full account of the relationships among multiple input images. The result can be adverse fusion effects. Significant information can be lost when the local-area variance corresponding to pixels across images is small.8,9 Other algorithms use principal component analysis (PCA) to estimate the wavelet coefficients. This method works well in low-noise environments, but PCA breaks down when corruption is severe, even if only very few of the observations are affected.15 For example, consider the two PCA simulation results shown in Fig. 1. Suppose that the light line in Fig. 1(a) represents an object in an image, and that the scattered markers represent samples of that object that have been corrupted by low-level Gaussian noise. The reconstruction of the object from the samples using the classical PCA approach is shown as a heavy line. The results of a similar experiment are shown in Fig. 1(b), where the PCA reconstruction is seriously in error as the result of a single noise outlier in the sampling process. To remedy shortcomings in the current methods, this paper presents an improved image-fusion algorithm based on the LWT. For the low-frequency image components represented in the LWT decomposition, scale coefficients are determined through matrix completion16 instead of PCA.
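The failure mode illustrated in Fig. 1 is easy to reproduce numerically. The hedged sketch below (our own construction, not the paper's simulation) estimates the principal direction of noisy samples of a line by classical PCA; a single gross outlier is enough to drive the estimate far from the true direction.

```python
import numpy as np

rng = np.random.default_rng(0)

def pca_direction(pts):
    """First principal component (unit vector) of a 2-D point cloud."""
    centered = pts - pts.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

# samples of the line y = x, corrupted by low-level Gaussian noise
t = np.linspace(0, 1, 50)
pts = np.column_stack([t, t]) + 0.01 * rng.standard_normal((50, 2))
true_dir = np.array([1.0, 1.0]) / np.sqrt(2)

clean_err = 1 - abs(pca_direction(pts) @ true_dir)

# add a single gross outlier far off the line
pts_bad = np.vstack([pts, [0.5, 50.0]])
outlier_err = 1 - abs(pca_direction(pts_bad) @ true_dir)

assert clean_err < 1e-2    # low-level noise: PCA reconstruction is accurate
assert outlier_err > 0.1   # one outlier: the estimate is seriously in error
```

The second assertion mirrors Fig. 1(b): least-squares subspace fitting has no mechanism for rejecting even one grossly corrupted observation, which is the motivation for the robust methods of Sec. 2.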
For the high-frequency detail and edge information, the LWT coefficients are chosen through self-adaptive regional variance estimation.

2. Matrix Completion and Robust Principal Component Analysis

2.1 Overview

The matrix completion problem has been the subject of intense research in recent years. Candès et al.17 verify that the ℓ1-norm optimization problem is equivalent to the ℓ0-norm optimization problem under a restricted isometry property. Candès and Recht16 demonstrate exact matrix completion using convex optimization. The nuclear norm ‖A‖* = Σ_k σ_k(A) of a matrix A, in which σ_k(A) denotes the k'th largest singular value, can be used to approximate the matrix rank, rank(A). The method yields a convex minimization problem for which there are numerous efficient solutions. Candès and Recht16 prove that if the number m of sampled entries of an n × n rank-r matrix obeys m ≥ C n^1.2 r log n for some positive constant C, then the matrix can be perfectly recovered with high probability by solving a simple convex optimization problem.

Lin and Ma15 report a fast, scalable algorithm for solving the robust PCA (RPCA) problem. The method is based on recovering a low-rank matrix with an unknown fraction of corrupted entries. The mathematical model for estimating the low-dimensional subspace is to find a low-rank matrix A. The algorithm proceeds as follows: given an observation matrix D with rank r ≪ min(m, n), the rank r is the target dimension of the subspace. The observation matrix is modeled as D = P_Ω(A + E), in which P_Ω is a subsampling projection operator and E represents a matrix of unmodeled perturbations that is assumed sparse relative to A.

2.2 Matrix Completion

The objective of matrix completion is to recover, in the low-dimensional subspace, the truly low-rank matrix A from D, under the working assumption that E is zero. That is, we seek

min ‖A‖*  subject to  P_Ω(A) = P_Ω(D).

It has been shown that the solution to this convex relaxation represents an exact recovery of the matrix under quite general conditions.16 Further, the recovery is robust to noise with small magnitude bounds; that is, when the elements of E are small and bounded.
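The completion step can be illustrated with a toy example. For brevity, the sketch below uses a hard rank-projection heuristic (alternating between a truncated SVD and re-imposing the observed entries) rather than the nuclear-norm convex program of the text; the function name and parameters are ours, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)

def complete_low_rank(D, mask, r=1, iters=300):
    """Fill in the unobserved entries of D (mask == True marks observed
    entries) by alternating a rank-r SVD projection with restoration
    of the known samples. A heuristic stand-in for nuclear-norm
    minimization, adequate for easy low-rank problems."""
    X = np.where(mask, D, 0.0)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X = (U[:, :r] * s[:r]) @ Vt[:r]   # project onto rank-r matrices
        X[mask] = D[mask]                 # re-impose the observed entries
    return X

# rank-1 ground truth, half of the entries observed uniformly at random
M = np.outer(rng.standard_normal(20), rng.standard_normal(20))
mask = rng.random(M.shape) < 0.5
X = complete_low_rank(M, mask)
assert np.linalg.norm(X - M) / np.linalg.norm(M) < 0.1   # near-exact recovery
```

Even with half the entries missing, the low-rank structure pins down the remainder, which is exactly the property the fusion algorithm exploits for missing or broken pixels.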
For example, if E is a white noise matrix whose Frobenius norm is bounded by a small δ, then the recovered matrix will lie in a small neighborhood of the true A with high probability.18

2.3 Robust Principal Component Analysis

Conventional PCA is often used to estimate a low-dimensional subspace via a constrained optimization problem: in the observation model of Eq. (5), minimize the difference between the matrices D and A by solving

min ‖D − A‖_F  subject to  rank(A) ≤ r,

where r is the target dimension of the subspace, and the use of the Frobenius norm represents an assumption that the matrix elements are corrupted by additive i.i.d. Gaussian noise. PCA works well in practice as long as the magnitude of the noise is small. To use PCA, the singular value decomposition (SVD) of D is used to project the columns of D onto the subspace spanned by the r principal left singular vectors of D.

RPCA employs an identity sampling operator and a sparse matrix E that differ from those in the matrix completion and PCA approaches. Wright et al.19 and Candès et al.20 have shown that, for a sufficiently sparse error matrix, a low-rank matrix A can be recovered exactly from the observation matrix D = A + E by solving the following convex optimization problem:

min ‖A‖* + λ‖E‖_1  subject to  D = A + E,

where λ is a positive weighting parameter. RPCA has been used for background modeling, removing shadows from face images, alignment of the human face, and video denoising.21,22

In the present paper, RPCA is coupled with the inexact augmented Lagrange multiplier (IALM)15 method to determine the low-frequency LWT coefficients for fusion of corrupted images. The IALM method is described in Sec. 3.2 after the general procedure is introduced.

3. Frequency-Domain Fusion Rules

3.1 Overview

By adopting separate fusion strategies for the high- and low-frequency components, the WT can differentially preserve the critical features carried in these separate bands. The procedure that exploits this property is shown in Fig. 2. The source images are converted to frequency-domain coefficients by the LWT.
Frequency-band-dependent fusion rules are applied to the low- and high-frequency components of each image. The inverse lifting wavelet transform (ILWT) is used to reconstruct the fused image.

3.2 Low-Frequency Fusion Based on the Inexact Augmented Lagrange Multiplier

Weighted average coefficients are often employed to fuse low-frequency wavelet coefficients. This method is effective when the coefficients of the fused images are similar. However, when contrast reversal occurs in local regions of an image, this procedure results in a loss of image detail in the fused image due to reduced contrast. Further, erroneous or missing regions of corrupted images strongly affect PCA results. These inadequacies of the weighted average method and of PCA motivate the use of RPCA to determine the weighting of the low-frequency coefficients. There is ordinarily little difference in the low-frequency coefficient values extracted by the LWT from different images of the same scene. RPCA coefficients are used to represent the low-frequency content in an attempt to preserve fidelity and coherency between the subbands. Algorithms have been developed in this research to solve the RPCA problem that is the basis for the recovery of the low-rank matrix A and the estimation of the sparse matrix E from the observation matrix D. We employ the IALM method to compute the low-frequency subband coefficients. The method is sketched as follows. Let a set of corrupted images of the same scene, acquired from multiple sensors, be given, and let the corresponding low-frequency subimages be computed using a multilevel LWT. For simplicity, we assume square subimages. Stack all columns of each low-frequency subimage into a single vector, then use these vectors as the columns of an observation matrix D. After normalizing the data, the cumulative low-frequency subimage matrix is modeled similarly to Eq. (3), in which A denotes the noise-free and integrated low-frequency subimage sequence matrix, and E denotes the sparse error matrix from which high-frequency content has been attenuated by the selection of LWT coefficients. The low-frequency LWT coefficients are similar across multiple subimages of the same scene. According to the model, A is noise-free and will ideally, therefore, consist of identical columns. Accordingly, A will be of low rank as required by the matrix completion procedure. Thus, A can be estimated via matrix completion and RPCA by solving the RPCA program, for which the augmented Lagrangian is

L(A, E, Y, μ) = ‖A‖* + λ‖E‖_1 + tr[Yᵀ(D − A − E)] + (μ/2)‖D − A − E‖_F².

In this equation, λ is an estimated positive weighting parameter representing the proportion of the sparse matrix E in the low-rank matrix A; a common default is λ = 1/√max(m, n) for an m × n observation matrix. μ is a positive tuning parameter balancing accuracy and computational effort, tr[Yᵀ(D − A − E)] is the trace of the indicated matrix product, and Y is the iterated Lagrange multiplier.

A flowchart of the IALM algorithm is shown in Fig. 3. Definitions of the notation used in the flowchart appear in Table 1. The algorithm is recursive, with a superscript indicating the iteration number. The recovered low-rank matrix is taken at a sufficiently large iteration number. A reasonable strategy for transforming the resulting A to the final low-frequency subimage is to unwrap its first column to form the original image structure.

Table 1. Notation used in the IALM algorithm.
In this process, the multiplier matrix Y is initialized from the normalized observation matrix D; the sparse matrix E is initialized to the zero matrix of the same size as D; μ is given a small positive initial value that depends on the column size of D; the tolerance for the stopping criterion is fixed at a small value; and the iteration counter is set to zero for the loop computation.

3.3 High-Frequency Fusion Based on Self-Adapting Regional Variance Estimation

Processing of the high-frequency wavelet coefficients has a direct effect on the salient details that determine the overall clarity of the image. As the variance of a subimage characterizes the degree of gray-level change in the corresponding image region, the variance is a key indicator in the processing of high-frequency components. In addition, there is generally a strong correlation among adjacent pixels in a local area, so that a significant amount of information is shared among neighboring pixels. When the variances in corresponding local regions across subimages differ widely, a high-frequency fusion rule that selects the source image of greatest variance has been shown to be effective at preserving image features.8,9 However, if the local variances of two source images are similar, this method can result in the loss of information by discarding subtle variations among the different subimages. An empirical procedure has been developed in which a thresholding procedure is used to segregate local areas that have sufficiently large variance differences. This allows such an area to be represented by the single maximum-variance set member. The selection of this difference threshold is discussed below. Let us return to the original set of source images, and consider the gray-scale value at each pixel of each image. With each image, associate a matrix whose elements contain the normalized sample variance of the window of pixels centered on the corresponding pixel; normalization places all variance values in the interval [0, 1]. Without loss of generality, we select two of the images with which to describe the steps of the high-frequency fusion algorithm.
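The regional-variance rule described above can be sketched as follows. This is our own illustration of the two branches (maximum-variance selection when local variances differ strongly, variance-weighted averaging otherwise); the window size and threshold values are assumptions, not the paper's tuned settings.

```python
import numpy as np

def local_variance(img, w=3):
    """Normalized sample variance over a w-by-w window centered at each
    pixel; values are scaled into the interval [0, 1]."""
    pad = w // 2
    p = np.pad(img.astype(float), pad, mode='reflect')
    v = np.empty(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            v[i, j] = p[i:i + w, j:j + w].var()
    vmax = v.max()
    return v / vmax if vmax > 0 else v

def fuse_high(h1, h2, T=0.1):
    """Self-adapting regional variance rule for two high-frequency
    subbands: pick the larger-variance source where local variances
    differ by more than T; otherwise take a variance-weighted average."""
    v1, v2 = local_variance(h1), local_variance(h2)
    pick = np.abs(v1 - v2) > T                  # strongly differing regions
    w1 = v1 / (v1 + v2 + 1e-12)
    fused = w1 * h1 + (1 - w1) * h2             # weighted-average branch
    fused[pick] = np.where(v1 > v2, h1, h2)[pick]   # max-variance branch
    return fused

h1 = np.zeros((6, 6)); h1[2:4, 2:4] = 10.0      # subband with a strong detail
f = fuse_high(h1, np.zeros((6, 6)))
assert f[2, 2] == 10.0                          # detail region: taken from h1
assert f[0, 0] == 0.0                           # flat region: averaged
```

The threshold T thus decides, per pixel, whether the two subbands disagree enough to warrant a hard selection, which preserves strong edges without discarding subtle shared variations.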
In summary, IALM is used to determine the low-frequency component to be fused, and self-adapting regional variance estimation is employed to determine the high-frequency contribution. The fused wavelet coefficients are combined by the ILWT to create the final result.

4. Experimental Results and Analysis

4.1 Comparison of Robust Principal Component Analysis Algorithms

To validate the new procedure, four groups of experiments are reported. The objective of the first is to compare the performance of other RPCA algorithms with that of IALM. The results are shown in Table 2. Two mainstream algorithms, singular value thresholding (SVT) and the accelerated proximal gradient (APG) method, are compared with IALM.

Table 2. Comparison of RPCA algorithms.
In this table, the input dataset, the observation matrix of Eq. (6), is square with randomly missing or broken pixels. For a fair comparison, we fix the rank of the recovered matrix and define a normalized mean squared error (NMSE) for the recovery. In Table 2, the column labeled #SVD indicates the number of iterations. The "time" column displays the number of seconds needed to run the algorithm. The oversampling rate is six relative to the number of degrees of freedom in the set of n × n rank-r matrices, d_r = r(2n − r); the corresponding number of elements is sampled uniformly to form the known entries of the observation matrix.16 Among the three algorithms, IALM exhibits superior performance in all three measures. The results indicate that the run time increases proportionately with the matrix dimension. Note, however, that #SVD is not dependent upon the dimension.

4.2 Fusion of Clean Images

For convenience, we refer to the method simply as the new algorithm. The next two groups of experiments involve the processing of left-focus/right-focus images and visible-light/infrared images, comparing several image-fusion algorithms with the new algorithm. The source images are not corrupted by noise or errors. The spline wavelet basis23 was selected for the LWT process; through factorization, the equivalent lifting wavelet was obtained. The experimental results are shown in Figs. 4 and 5. The first group of source images involves eccentric focus, and the second contains visible-light and infrared images. Figure 4(a) shows a left-focused source image, whereas Fig. 4(b) is right-focused; Fig. 5(a) is a visible-light source image, while Fig. 5(b) uses an infrared source. Figures 4(c)–4(f) and 5(c)–5(f) show, respectively, the fusion results of the weighted average over low frequencies with the absolute-value maximum over high frequencies (WA_AM); the weighted average over low frequencies with the local-area maximum over high frequencies; the improved pulse-coupled neural network (PCNN) method;24,25 PCA weighting over low frequencies with self-adaptive regional variance estimation over high frequencies; and the algorithm developed in this paper. The processed images empirically suggest that a clearer fused image is obtained with the new algorithm. More detailed information is evident, e.g., in Figs. 4(e) and 4(f), in which the image information on the left edge of the large alarm clock is apparently richer than the same feature in the other fused images. This also suggests that the new algorithm is at least as effective as the PCA-based variant, while recovering more detailed information (Table 2). Furthermore, the new algorithm achieves a fusion result with finer detail. For example, the barbed wire in Fig. 5(d) is more clearly visible than the same feature in Fig. 5(c). In Fig. 5, the person in Fig. 5(c) is better defined than in Fig. 5(d), while in Figs. 5(e) and 5(f), the barbed wire, the person, and even the smoke in the upper-right corner of the image are easier to identify than in the others. This enhanced clarity admits more effective subsequent processing. Several objective criteria, including mutual information (MI) and average gradient (AG), were evaluated.
Tables 3 and 4 report the objective performance evaluation measures for the fusion algorithms.

Table 3. Objective evaluation measures for Fig. 4.
Table 4. Evaluation comparison for Fig. 5.
Relative to the other algorithms, the new algorithm obtains the largest MI and AG values for the fused images, suggesting that it can provide fused images with higher information content and better clarity. The objective indicators of fidelity to the source image also favor the performance of the IALM and self-adaptive regional variance estimation algorithm.

4.3 Fusion of Corrupted Images

To assess whether the new algorithm is robust to missing data and image corruption, we continue with the multifocus clock images. At a 0.15 error rate, 15% of the pixels of the original image are corrupted, and an additional 15% are missing (gray-level values set to zero). This implies an effective data corruption rate of 30%. The results of the test of the four algorithms are shown in Fig. 6. Figures 6(a) and 6(b) show, respectively, Figs. 4(a) and 4(b) with errors. Figure 6(c) shows the result of fusing without a denoising filter, while Fig. 6(d) shows the corresponding result with an adaptive median filter. The result of using PCNN with an adaptive median filter appears in Fig. 6(e). To achieve this outcome, we use the adaptive median filtering strategy proposed by Chen and Wu27 to identify pixels corrupted by impulsive noise and to replace each damaged pixel by the median of its neighborhood. The adaptive median filter can employ varying window sizes to accommodate different noise conditions and to reduce distortions such as excessive thinning or thickening of object boundaries. Figure 6(f) shows the result of the new algorithm without denoising. The clarity of the result in Fig. 6(f) relative to those in Figs. 6(c), 6(d), and 6(e) is quite apparent. The empirical image quality tracks the improvement in PSNR reported in the captions. Figures 6(g) and 6(h) show 400% enlargements of portions of Figs. 6(e) and 6(f). These results demonstrate the ability of the new algorithm to recover the missing or erroneous data while preserving image detail in both corrupted and clean images.
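The adaptive median filtering stage used for the comparison methods in Fig. 6 can be sketched as follows. This is the textbook adaptive median filter in the spirit of the Chen and Wu detector, with our own window-size choices; it is not the authors' exact implementation.

```python
import numpy as np

def adaptive_median(img, smax=7):
    """Adaptive median filter sketch: at each pixel, grow the window until
    the window median is not itself an impulse, then replace the pixel
    only if the pixel looks like an impulse (equals a window extreme)."""
    out = img.astype(float).copy()
    pad = smax // 2
    p = np.pad(img.astype(float), pad, mode='reflect')
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            for w in range(3, smax + 1, 2):
                h = w // 2
                win = p[i + pad - h:i + pad + h + 1,
                        j + pad - h:j + pad + h + 1]
                zmin, zmed, zmax = win.min(), np.median(win), win.max()
                if zmin < zmed < zmax:                 # median is not an impulse
                    if not (zmin < img[i, j] < zmax):  # pixel is an impulse
                        out[i, j] = zmed
                    break
            else:
                out[i, j] = zmed   # window reached smax: fall back to median
    return out

img = np.full((8, 8), 5.0)
img[3, 3] = 255.0                 # a single salt (impulse) pixel
f = adaptive_median(img)
assert f[3, 3] == 5.0             # impulse replaced by neighborhood median
assert f[0, 0] == 5.0             # uncorrupted pixels pass through
```

Growing the window before committing to the median is what limits the thinning and thickening distortions mentioned above: clean fine detail satisfies the zmin < zmed < zmax test early and is left untouched.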
5. Conclusions

Traditional convolution-based wavelet transform processing for image fusion has shortcomings, including large memory requirements and high computational complexity. The approach to fusion taken in this research uses different fusion rules for the low-frequency and high-frequency decomposition components represented on a lifting wavelet basis set. Low-frequency components are characterized by matrix completion and RPCA via the IALM method, whereas the high-frequency components critical for image detail are represented by taking into account the variance differences among proximal neighborhoods. Furthermore, the strong correlation between pixels in a local area is captured by a self-adaptive regional variance assessment. Experimental results show that the new algorithm not only improves the amount of information and the correlation between the fused and source images, but also reduces the level of distortion. Significant clarity improvement relative to state-of-the-art methods is also demonstrated for corrupted images.

Acknowledgments

This research was supported in part by the National Natural Science Foundation of China (Grant No. 30970780) and by the General Program of Science and Technology Development Project of Beijing Municipal Education Commission of China (Grant No. KM201110005033). J.D. and D.B. efforts were supported in part by the U.S. National Science Foundation under Cooperative Agreement DBI-0939454. Any opinions, conclusions, or recommendations expressed are those of the authors and do not necessarily reflect the views of the NSF. This work was undertaken in part while Z.W. was a visiting research scholar at Michigan State University. The authors thank the Beijing University of Technology's Multimedia Information Processing Lab for assistance.

References

1. B. Khaleghi et al., "Multisensor data fusion: a review of the state-of-the-art," Inf. Fusion 14, 28–44 (2013). http://dx.doi.org/10.1016/j.inffus.2011.08.001
2. G. Piella, "A general framework for multiresolution image fusion: from pixels to regions," Inf. Fusion 4(4), 259–280 (2003). http://dx.doi.org/10.1016/S1566-2535(03)00046-0
3. Y. Chai, H. Li, and Z. Li, "Multifocus image fusion scheme using focused region detection and multiresolution," Opt. Commun. 284(19), 4376–4389 (2011). http://dx.doi.org/10.1016/j.optcom.2011.05.046
4. R. K. Sharma and M. Pavel, Probabilistic Model-Based Multisensor Image Fusion, pp. 1–35, Oregon Graduate Institute of Science and Technology (1999).
5. Y. Zheng, "An orientation-based fusion algorithm for multisensor image fusion," Proc. SPIE 7710, 77100K (2010). http://dx.doi.org/10.1117/12.849656
6. R. Nava, B. Escalante-Ramírez, and G. Cristóbal, "A novel multi-focus image fusion algorithm based on feature extraction and wavelets," Proc. SPIE 7000, 700028 (2008). http://dx.doi.org/10.1117/12.781403
7. C. Ramesh and T. Ranjith, "Fusion performance measures and a lifting wavelet transform based algorithm for image fusion," Inf. Fusion 1, 317–320 (2002). http://dx.doi.org/10.1109/ICIF.2002.1021168
8. G. Liu and C. Liu, "A novel algorithm for image fusion based on wavelet multi-resolution decomposition," J. Optoelectron. 15, 334–347 (2004).
9. Z. Qiang and J. Peng, "Remote sensing image fusion based on small wavelet transform's local variance," J. Huazhong Univ. Sci. Technol. 6, 89–91 (2003).
10. W. Sweldens, "The lifting scheme: a construction of second generation wavelets," SIAM J. Math. Anal. 29(2), 511–546 (1998). http://dx.doi.org/10.1137/S0036141095289051
11. M. Chen and H. Di, "Study on optimal wavelet decomposition level for multi-focus image fusion," Opto-Electron. Eng. 31, 64–67 (2004).
12. Q. Lin and F. Gui, "A novel image fusion algorithm based on wavelet transforms," Proc. SPIE 7001, 70010M (2008). http://dx.doi.org/10.1117/12.780162
13. S. Arivazhagan, L. Ganesan, and T. Kumar, "A modified statistical approach for image fusion using wavelet transform," Signal Image Video Process. 3(2), 137–144 (2009). http://dx.doi.org/10.1007/s11760-008-0065-4
14. S. El-Khamy et al., "Regularized super-resolution reconstruction of images using wavelet fusion," Opt. Eng. 44(9), 097001 (2005). http://dx.doi.org/10.1117/1.2042947
15. Z. Lin and Y. Ma, "The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices," (2011).
16. E. J. Candès and B. Recht, "Exact matrix completion via convex optimization," Found. Comput. Math. 9, 717–772 (2009). http://dx.doi.org/10.1007/s10208-009-9045-5
17. E. J. Candès, J. K. Romberg, and T. Tao, "Stable signal recovery from incomplete and inaccurate measurements," Commun. Pure Appl. Math. 59, 1207–1223 (2006). http://dx.doi.org/10.1002/cpa.20124
18. E. J. Candès and Y. Plan, "Matrix completion with noise," Proc. IEEE 98, 925–936 (2010). http://dx.doi.org/10.1109/JPROC.2009.2035722
19. J. Wright et al., "Robust principal component analysis: exact recovery of corrupted low-rank matrices via convex optimization," Proc. Neural Inf. Process. Syst. 3, 1–9 (2009).
20. E. J. Candès et al., "Robust principal component analysis?," J. ACM 58, 11 (2011).
21. W. Tan, G. Cheung, and Y. Ma, "Face recovery in conference video streaming using robust principal component analysis," in Proc. IEEE Int. Conf. on Image Processing, pp. 3225–3228 (2011).
22. H. Ji et al., "Robust video denoising using low rank matrix completion," in Proc. IEEE Int. Conf. on Computer Vision and Pattern Recognition, pp. 1791–1798 (2010).
23. A. Z. Averbuch and V. A. Zheludev, "Image compression using spline based wavelet transforms," Wavelets Signal Image Anal. 19, 341–376 (2001).
24. X. Qu et al., "Image fusion algorithm based on spatial frequency-motivated pulse coupled neural networks in nonsubsampled contourlet transform domain," Acta Autom. Sin. 34, 1508–1514 (2008). http://dx.doi.org/10.1016/S1874-1029(08)60174-3
25. Y. Chai, H. F. Li, and M. Y. Guo, "Multifocus image fusion scheme based on features of multiscale products and PCNN in lifting stationary wavelet domain," Opt. Commun. 284, 1146–1158 (2011). http://dx.doi.org/10.1016/j.optcom.2010.10.056
26. C. S. Xydeas and V. Petrovic, "Objective image fusion performance measure," Electron. Lett. 36, 308–309 (2000). http://dx.doi.org/10.1049/el:20000267
27. T. Chen and H. Wu, "Adaptive impulse detection using center-weighted median filters," IEEE Signal Process. Lett. 8, 1–3 (2001). http://dx.doi.org/10.1109/97.889633
Biography

Zhuozheng Wang is an associate professor at Beijing University of Technology and a visiting scholar at Michigan State University sponsored by the China Scholarship Council. He received his MS and PhD degrees in electronic engineering from Beijing University of Technology in 2005 and 2013. He is the first author of more than 10 academic papers and has written one book chapter. His current research interests include image processing, electroencephalography, and virtual reality technology. He has been a reviewer and is a member of SPIE.

J. R. Deller Jr. is an IEEE fellow and professor of electrical and computer engineering at Michigan State University, where he received the distinguished faculty award in 2004. He received a PhD in biomedical engineering in 1979, an MS degree in electrical and computer engineering in 1976, and an MS degree in biomedical engineering in 1975 from the University of Michigan, and his BS degree in electrical engineering (summa cum laude) in 1974 from Ohio State University. His research interests include statistical signal processing with applications to speech and hearing, genomics, and other aspects of biomedicine.

Blair D. Fleet received her BS degree (summa cum laude) from Morgan State University, Baltimore, MD, in 2010, and her MS degree from Michigan State University in 2012, both in electrical engineering. She is a National Science Foundation graduate research fellowship award recipient, as well as a GEM (National Consortium for Graduate Degrees for Minorities in Engineering and Science, Inc.) fellow. She is currently pursuing her PhD in electrical engineering at Michigan State University. Her research interests include merging signal/image processing with evolutionary computation to solve challenging engineering processing problems, especially in the biomedical domain.