Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325301 (2024) https://doi.org/10.1117/12.3050050
This PDF file contains the front matter associated with SPIE Proceedings Volume 13253, including the Title Page, Copyright information, Table of Contents, and Conference Committee information.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks.
You are receiving this notice because your organization may not have SPIE eBooks access.*
*Shibboleth/Open Athens users: please sign in to access your institution's subscriptions.
To obtain this item, you may purchase the complete book in print or electronic format on SPIE.org.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325302 (2024) https://doi.org/10.1117/12.3041668
Recent Multimodal Large Language Models (MLLMs) have demonstrated remarkable capabilities in vision-language tasks, such as image captioning and question answering. However, they lack an essential perception ability, namely object detection. In this work, we focus on detecting prohibited items and discuss the possibility of integrating multimodal LLMs into the detection process. Our method first performs image captioning on the x-ray prohibited item image, then creates instructions that prompt the multimodal LLM to identify the prohibited item. Our approach leverages the contextual understanding and language processing strengths of MLLMs. While current real-time object detection methods achieve high accuracy, they often require extensive training on large datasets specific to the prohibited items. In contrast, MLLMs can understand and generate detailed descriptions, which can be advantageous in scenarios where prohibited items are not well represented in training data or vary significantly in appearance. Our results suggest that MLLMs can complement traditional methods by providing a more nuanced understanding of prohibited items through their ability to interpret and respond to complex queries, potentially improving detection rates in challenging environments.
Xuan Liang, Guoqing Zhou, Qiaobo Cao, Jiasheng Xu, Sikai Su, Zhou Tian
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325303 (2024) https://doi.org/10.1117/12.3042440
Water surface glare severely impacts bathymetric inversion based on multispectral remote sensing images. This paper compares and analyzes three glint removal methods (Hedley, Lyzenga, Goodman) to enhance bathymetric inversion accuracy from multispectral satellite imagery. Experimental validation is conducted using Gaofen-1 satellite images and Sentinel-2A satellite images, with the western sea area of Weizhou Island serving as the designated test area. The results of the experiments show that the Hedley method is most suitable for bathymetric inversion, followed by the Goodman algorithm, and Lyzenga method as the least effective. However, all methods improve bathymetric inversion compared to no glint removal.
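As a rough illustration of the Hedley-style correction named in this abstract, the sketch below regresses a visible band against the near-infrared (NIR) band over a deep-water sample and subtracts the glint-correlated part. The function names, the flat-list image layout, and the choice of sample pixels are assumptions made for the example; the paper's actual processing chain is not given in the abstract.

```python
def linear_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var


def hedley_deglint(band, nir, sample_idx):
    """Deglint one visible band (hypothetical helper, not the paper's code).

    band, nir  : flat lists of pixel values for the visible and NIR bands
    sample_idx : indices of pixels in an optically deep, glint-affected sample
    """
    sample_band = [band[i] for i in sample_idx]
    sample_nir = [nir[i] for i in sample_idx]
    slope = linear_slope(sample_nir, sample_band)  # glint sensitivity of this band
    min_nir = min(sample_nir)                      # NIR level with no glint
    # R'_i = R_i - b_i * (R_NIR - Min_NIR)
    return [v - slope * (n - min_nir) for v, n in zip(band, nir)]


if __name__ == "__main__":
    nir = [0.01, 0.03, 0.05, 0.02]
    green = [0.10 + 2.0 * (n - 0.01) for n in nir]  # synthetic glint added to 0.10
    print(hedley_deglint(green, nir, range(len(nir))))
```

On this synthetic input the glint term is exactly linear in NIR, so every corrected pixel returns to the underlying 0.10 reflectance.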
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325304 (2024) https://doi.org/10.1117/12.3041876
To address the limited generalization capability of deep learning-based aerial image dehazing methods across different levels of haze degradation, an image dehazing method based on visual prompt learning is proposed. The algorithm uses a multi-scale encoder-decoder architecture based on U-Net and introduces prompt learning in the decoding stage to further enhance the generalization performance of image dehazing. First, soft prompt techniques are used to generate learnable parameters, which adaptively adjust the weight values according to the features, thereby encoding discriminative information for various types of haze degradation. Second, by letting the prompt components interact with the main feature extraction network, the algorithm dynamically guides the image reconstruction process using information about different degradation levels. Experimental results show that the proposed algorithm achieves higher dehazing performance and better visual restoration quality under three levels of haze on the benchmark dataset SateHaze1k.
Qingdong Wu, Zhiqiang Wang, Yao Xu, Jun Yan, Zhaohui Liu
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325305 (2024) https://doi.org/10.1117/12.3040970
The coded target is an indispensable part of machine vision-based measurement technology, and each target encodes a unique value, which makes it important in image recognition. Because the center circle of a circular coded target provides a precise positioning coordinate, this paper studies a combined preprocessing algorithm for detecting the center circle in circular coded target images, in order to improve the efficiency and accuracy of center circle recognition for Autonomous Rail Rapid Transit (ART). First, the color images captured in field experiments are converted to grayscale and smoothed by Gaussian filtering. Second, Canny edge detection is performed on the images to accurately extract the edge features of the target. Finally, an improved Hough transform method is used to identify the center circle contour from the extracted edges. The experimental results show that the algorithm not only improves recognition efficiency but also ensures recognition accuracy.
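The final step of this pipeline can be pictured with a plain Hough vote for a circle of known radius. This is a minimal sketch of the standard transform, not the improved variant the paper proposes (which the abstract does not describe); the function name, the fixed radius, and the angular resolution are illustrative assumptions, and the edge points are taken as already extracted.

```python
import math
from collections import Counter


def hough_circle_center(edge_points, radius, n_angles=360):
    """Vote for the center of a circle with a known radius.

    Each edge point (x, y) votes for every integer cell that lies at
    distance `radius` from it; the true center collects the most votes.
    """
    votes = Counter()
    for x, y in edge_points:
        for k in range(n_angles):
            theta = 2.0 * math.pi * k / n_angles
            cx = round(x - radius * math.cos(theta))
            cy = round(y - radius * math.sin(theta))
            votes[(cx, cy)] += 1
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Synthetic edge points on a circle centered at (20, 15) with radius 10.
    points = [
        (20 + 10 * math.cos(2 * math.pi * j / 36),
         15 + 10 * math.sin(2 * math.pi * j / 36))
        for j in range(36)
    ]
    print(hough_circle_center(points, radius=10))
```

In practice the radius is usually unknown, so the accumulator gains a third dimension; restricting the radius range is one common way to keep the vote cheap.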
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325306 (2024) https://doi.org/10.1117/12.3040982
With the continuous development of sensor technology, interface throughput has become a bottleneck for sensor performance. The emergence of the CoaXPress2.0 protocol helps cameras achieve the full performance of their sensors. This study uses Microchip's FPGA and PHY chips to construct a set of CoaXPress2.0 protocol communication equipment, integrated with a Sony IMX455 sensor, to output 62.39MP images at 17.9fps, reaching the theoretical maximum frame rate of the sensor. This is three times the frame rate of USB3.0. The experimental equipment achieved a maximum communication rate of 12.5Gbps per cable under the CoaXPress2.0 protocol, with a single cable bandwidth reaching 10.9Gbps for image data. Additionally, a method is proposed to configure sensor registers using soft check, eliminating the limitation of requiring a specific host computer to open the camera, thereby enhancing the versatility of industrial cameras.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325307 (2024) https://doi.org/10.1117/12.3041063
In this paper, we propose a novel method for 3-D localization utilizing 1-D angle of arrival (AOA) measurements, also referred to as space angles (SAs), acquired from linear arrays. Our method hinges on the semidefinite relaxation (SDR) framework. Initially, we employ the SDR method to derive an initial solution. Recognizing the inherent limitations of this initial solution, we incorporate a pioneering randomization technique to enhance precision. Our simulation results illustrate the exceptional performance of our proposed method when contrasted with the present SDR-based method.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325308 (2024) https://doi.org/10.1117/12.3040985
To meet the needs of target detection in complex underwater scenes, an image registration method based on sensitive regions is proposed, considering the blurred target boundaries, poor contrast, and low overall gray values of forward-looking sonar (FLS) images. In this method, gradient enhancement is used to sharpen the boundary of the target region in the FLS image, and morphological operations are used to remove the large number of noise points in the image background and to locate the coordinates and extent of the sensitive regions. For the extracted sensitive regions, which contain rich gray-level change information, particle swarm optimization (PSO) and mutual information (MI) are used for initial registration. On this basis, a secondary mutual information registration step is used to reduce the registration error and improve registration accuracy. Experiments show that the method can accurately locate sensitive regions, has a short computation time, and achieves high registration accuracy.
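The mutual information measure that drives both registration stages can be computed from a joint gray-level histogram. The sketch below is a generic implementation under assumed conventions (flat lists of 8-bit values, 16 histogram bins, result in bits); it does not reproduce the paper's PSO search or its region extraction.

```python
import math
from collections import Counter


def mutual_information(img_a, img_b, bins=16, max_value=256):
    """Mutual information (in bits) of two equally sized grayscale images.

    img_a, img_b: flat lists of integer gray values in [0, max_value).
    """
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same number of pixels")
    width = max_value / bins
    qa = [min(int(v / width), bins - 1) for v in img_a]  # quantize to bins
    qb = [min(int(v / width), bins - 1) for v in img_b]
    n = len(qa)
    joint = Counter(zip(qa, qb))
    marg_a = Counter(qa)
    marg_b = Counter(qb)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * math.log2(p_ab * n * n / (marg_a[a] * marg_b[b]))
    return mi


if __name__ == "__main__":
    reference = [0, 64, 128, 192] * 4
    print(mutual_information(reference, reference))        # identical images
    print(mutual_information([0, 0, 128, 128], [0, 128, 0, 128]))  # independent
```

A registration loop would evaluate this score for each candidate transform of the floating image and keep the transform with the highest value.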
Ruixiang Li, Guoqing Zhou, Lin Li, Yangleijing Li, Ying Yao
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325309 (2024) https://doi.org/10.1117/12.3041524
To address the difficulty of visualizing massive point cloud data, this paper proposes a multilevel adaptive binomial tree cropping visualization method. First, an octree is used to spatially organize and manage the point cloud data and build a multilevel data model. Then the tree structure is trimmed using viewpoint-adaptive binary tree trimming to optimize the overall tree. Finally, a multiresolution level-of-detail (LOD) model is built to achieve fast visualization of the point cloud. Experiments using the dataset published by ETH Zurich, Switzerland, show that, compared with the traditional visualization algorithm, the visualization frame rate of this method is improved by about 2.9 times.
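The octree organization and viewpoint-dependent level of detail mentioned here can be sketched in a few lines. This is only a generic illustration: the class and function names, the one-point-per-leaf default, and the size-to-distance ratio test are assumptions, and the paper's binary tree trimming step is not reproduced.

```python
import math


class OctreeNode:
    def __init__(self, center, half_size):
        self.center = center        # (x, y, z) of the cube center
        self.half_size = half_size  # half of the cube edge length
        self.points = []            # filled only for leaf nodes
        self.children = []          # non-empty only for interior nodes


def build_octree(points, center, half_size, max_points=1, max_depth=8):
    """Recursively split a cube into octants until leaves are small enough."""
    node = OctreeNode(center, half_size)
    if len(points) <= max_points or max_depth == 0:
        node.points = list(points)
        return node
    buckets = [[] for _ in range(8)]
    for p in points:
        # Bit `axis` of the octant index is set when p lies on the upper side.
        index = sum(1 << axis for axis in range(3) if p[axis] >= center[axis])
        buckets[index].append(p)
    quarter = half_size / 2
    for index, bucket in enumerate(buckets):
        if bucket:
            child_center = tuple(
                center[axis] + (quarter if (index >> axis) & 1 else -quarter)
                for axis in range(3)
            )
            node.children.append(
                build_octree(bucket, child_center, quarter, max_points, max_depth - 1)
            )
    return node


def select_lod(node, viewpoint, min_ratio=0.1):
    """Return the points to draw: far-away nodes collapse to their center."""
    if not node.children:
        return list(node.points)
    distance = math.dist(node.center, viewpoint)
    if node.half_size < min_ratio * distance:
        return [node.center]  # coarse proxy for everything inside this cube
    selected = []
    for child in node.children:
        selected.extend(select_lod(child, viewpoint, min_ratio))
    return selected


if __name__ == "__main__":
    cloud = [(x, y, z) for x in (0.25, 0.75) for y in (0.25, 0.75) for z in (0.25, 0.75)]
    root = build_octree(cloud, center=(0.5, 0.5, 0.5), half_size=0.5)
    print(len(select_lod(root, viewpoint=(0.5, 0.5, 2.0))))    # near: full detail
    print(len(select_lod(root, viewpoint=(100.0, 0.5, 0.5))))  # far: one proxy
```

Using the cube center as the coarse proxy keeps the example short; a renderer would more likely store a subsampled set of real points at each interior node.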
Yangleijing Li, Guoqing Zhou, Lin Li, Ruixiang Li, Ying Yao
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530A (2024) https://doi.org/10.1117/12.3041547
Newly designed satellite-mounted single-photon radar can gather high-precision three-dimensional data, but it is vulnerable to noise, which introduces measurement error. This paper offers a parallelogram denoising kernel approach based on multi-feature adaptation to handle irregular background noise and the challenges of signal extraction in steep-slope areas. In contrast to conventional circular or elliptical denoising kernels, this approach better matches the properties of single-photon point cloud data. Using a variety of characteristics, including slope and spatial density, it can recognize signals adaptively. The approach presented in this work solves the signal extraction problem well.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530B (2024) https://doi.org/10.1117/12.3042552
Image dehazing is a classic low-level vision task. Applying dehazing algorithms can improve image quality. Many dehazing algorithms have achieved good results on synthetic datasets, but real-world non-homogeneous dehazing is still a challenge. This paper proposes a residual-and-attention-based dehazing network (RAD) with a generative adversarial network training framework. The generator adopts an encoder-decoder network structure, including encoder, bottleneck, and decoder, with an adaptive mixup operation between the encoder and decoder. Using the first three residual layers of a pre-trained ResNet model as the encoder provides the network with strong feature extraction capabilities. Attention modules combining channel attention and pixel attention are used in the bottleneck layer and decoder to enhance the dehazing network's effectiveness against non-homogeneous haze. The adaptive mixup operation connects different feature maps and helps the network better preserve shallow image features. Experimental results show that the proposed RAD dehazing algorithm achieves superior performance on NH-HAZE.
Tongyu Li, Jie Chen, Mengyue Zhang, Jiangning Yang, Dianji Jia, Shangqing Li, Yuan Li
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530C (2024) https://doi.org/10.1117/12.3041926
During the enhancement of airborne flare-containing marine remote sensing images, detailed wave texture, especially in low-illumination areas, is prone to loss because of the excessive brightness difference within the image. In this paper, a novel method based on maximum interclass variance (OTSU), non-subsampled contourlet transform (NSCT), and retinex is presented. The flare-containing image is segmented into a shaded area and a non-shaded area by using mean filtering and the OTSU method. The contrast limited adaptive histogram equalization (CLAHE) algorithm is applied to suppress the brightness of the non-shaded area. Then, the shaded area is decomposed into a series of low-frequency and high-frequency subband coefficients by using NSCT. The retinex method is used to process the low-frequency component, while the Laplace operator is applied to the high-frequency components. The processed shaded area is obtained by taking the inverse NSCT of the processed coefficients. Finally, the processed shaded and non-shaded areas are stitched together using a linear parameter to obtain the processed flare-containing image. Experimental results show that the proposed method performs better in terms of contrast, brightness, and detailed texture enhancement.
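The maximum interclass variance (OTSU) step used for the shaded/non-shaded split can be written directly from its definition. The sketch below assumes 8-bit gray values in a flat list and omits the mean filtering that the abstract applies first; the function name is illustrative.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class, the rest the other.
    """
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    weight0 = 0      # number of pixels in the lower class
    sum0 = 0.0       # sum of gray values in the lower class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        if weight0 == 0:
            continue
        weight1 = total - weight0
        if weight1 == 0:
            break
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between_var = weight0 * weight1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


if __name__ == "__main__":
    image = [10] * 50 + [200] * 50          # dark and bright halves
    t = otsu_threshold(image)
    shaded_mask = [v <= t for v in image]   # True where the pixel is "shaded"
    print(t, sum(shaded_mask))
```

Because the between-class variance is identical for every threshold between the two gray levels in this toy image, the first maximizer is returned.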
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530D (2024) https://doi.org/10.1117/12.3042012
In recent years, enhancing layer interaction has become an important aspect of CNNs, as it can improve the information flow and representation capability of models. With the introduction of attention mechanisms, methods for enhancing layer interaction have also become increasingly complex. In this paper, we show that existing layer interaction models exhibit high time complexity and long inference time, which reduces the training efficiency of the model. To address this issue, we propose an efficient layer interaction model named ELINet, which achieves enhanced layer interaction while maintaining low time complexity and inference time. We evaluated our proposed method on multiple datasets, including CIFAR-10, CIFAR-100, and ImageNet-1K. By comparing the Top-1 accuracy and actual runtime, we demonstrated the effectiveness and efficiency of our proposed model.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530E (2024) https://doi.org/10.1117/12.3041944
In this letter, we propose a multi-task learning (MTL) method for the efficient design of sparse conformal arrays with weighting optimization. The problem is to find a set of optimal weights, using the minimum number of dipole antennas, that generates the beam pattern radiated by the uniform conformal array. Simulation results show that the MTL method yields a sparse conformal antenna array design with favorable pattern matching.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530F (2024) https://doi.org/10.1117/12.3041212
Multi-focus image fusion aims to synthesize key features of different images to enhance visual performance and produce high-quality images with higher resolution and finer details to enrich overall visual perception. However, existing methods mainly focus on feature extraction without paying much attention to the feature reconstruction process. In this paper, we propose a fusion method based on edge feature interaction and global reconstruction. By combining spatial attention and channel attention mechanisms, image features are effectively extracted at the encoder stage. Edge information is detected using the Canny operator, and an interactive edge feature fusion block is designed to fuse the edge information detected in the two images. In the decoder stage, a self-attention mechanism is used for image reconstruction to further improve the quality of image fusion. Extensive experiments show that our model achieves state-of-the-art performance in both subjective perception and objective evaluation.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530G (2024) https://doi.org/10.1117/12.3041021
This paper proposes an image stitching algorithm based on improved ORB (Oriented FAST and Rotated BRIEF) features, aiming to enhance the efficiency and accuracy of PCB image registration. Firstly, feature points are extracted using the ORB algorithm, and BEBLID descriptors are introduced to improve description accuracy. Then, the NNDR algorithm is employed for coarse matching, followed by the construction of optimal geometric constraints based on feature point voting to further optimize the feature points. Subsequently, the RANSAC algorithm is utilized to compute a high-precision transformation matrix. Finally, the improved gradual fade-weighted fusion algorithm is applied to achieve image stitching, reducing the impact of overlap area shadows and obtaining high-quality stitched images. Experimental results demonstrate that the proposed algorithm exhibits good robustness and accuracy in handling PCB image stitching tasks, providing an effective solution for large-scale quality assessment and defect detection.
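The NNDR (nearest-neighbor distance ratio) coarse matching stage can be illustrated on binary descriptors such as those produced by ORB or BEBLID. In this sketch descriptors are plain Python integers compared by Hamming distance; the 0.8 ratio and the function names are assumptions, and the voting-based geometric constraints and RANSAC stages described in the abstract are not shown.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def nndr_match(desc_a, desc_b, ratio=0.8):
    """Keep a match only when the best candidate clearly beats the second best.

    Returns a list of (index_in_a, index_in_b, distance) tuples.
    """
    matches = []
    for i, da in enumerate(desc_a):
        candidates = sorted((hamming(da, db), j) for j, db in enumerate(desc_b))
        if len(candidates) < 2:
            continue  # the ratio test needs two neighbors
        (best_dist, best_j), (second_dist, _) = candidates[0], candidates[1]
        if best_dist < ratio * second_dist:
            matches.append((i, best_j, best_dist))
    return matches


if __name__ == "__main__":
    left = [0b11110000, 0b10101010]
    right = [0b11110001, 0b00001111, 0b01010101]
    # The first descriptor has one clear neighbor; the second is ambiguous
    # (distances 4 and 5) and is rejected by the ratio test.
    print(nndr_match(left, right))
```

Rejecting ambiguous matches at this stage reduces the number of outliers that the later geometric verification has to handle.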
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530H (2024) https://doi.org/10.1117/12.3041036
Lanthanum bromide scintillation crystal gamma spectrometer detectors, compared to more traditional sodium iodide scintillation crystal gamma spectrometer detectors, have better energy resolution and detection efficiency. They are considered as a potential replacement for traditional sodium iodide detectors in the field of in-situ automatic monitoring of the ocean. However, they have drawbacks such as self-radioactivity, which can affect the measurement spectrum lines and subsequently impact the calculation of characteristic peak areas of the analyte nuclides and the results of activity analysis. This paper selects appropriate background subtraction methods for lanthanum bromide detectors' self-radioactivity and studies gamma spectral analysis methods for seawater, including spectrum smoothing, peak searching, and peak area fitting. Through simulated seawater gamma spectral data analysis experiments, it is demonstrated that the established spectral analysis methods can achieve accurate qualitative and quantitative analysis of radioactive nuclides in seawater, meeting the requirements for real-time and effective monitoring of the marine radioactive environment.
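The three analysis stages listed in this abstract (spectrum smoothing, peak searching, peak area calculation) can be sketched with very simple stand-ins: a moving average, a local-maximum test, and a linear-baseline net area. These are generic textbook choices assumed for illustration; the abstract does not say which smoothing, search, or fitting methods the paper selects, and the subtraction of the detector's self-radioactivity background is not modeled here.

```python
def smooth(counts, window=3):
    """Moving-average smoothing of channel counts (window should be odd)."""
    half = window // 2
    out = []
    for i in range(len(counts)):
        lo = max(0, i - half)
        hi = min(len(counts), i + half + 1)
        out.append(sum(counts[lo:hi]) / (hi - lo))
    return out


def find_peaks(spectrum, min_height):
    """Channels that are local maxima and at least `min_height` high."""
    peaks = []
    for i in range(1, len(spectrum) - 1):
        if (spectrum[i] >= min_height
                and spectrum[i] > spectrum[i - 1]
                and spectrum[i] >= spectrum[i + 1]):
            peaks.append(i)
    return peaks


def net_peak_area(counts, left, right):
    """Gross counts in [left, right] minus a straight-line baseline."""
    gross = sum(counts[left:right + 1])
    n_channels = right - left + 1
    baseline = (counts[left] + counts[right]) * n_channels / 2
    return gross - baseline


if __name__ == "__main__":
    counts = [10] * 20                       # flat synthetic background
    for channel, extra in zip(range(8, 13), (20, 50, 100, 50, 20)):
        counts[channel] += extra             # synthetic peak centered at channel 10
    smoothed = smooth(counts, window=3)
    print(find_peaks(smoothed, min_height=30))
    print(net_peak_area(counts, left=7, right=13))
```

For the synthetic spectrum the net area equals the 240 counts that were added on top of the flat background.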
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530I (2024) https://doi.org/10.1117/12.3041933
The non-local GBT model (NLGBT) shows excellent performance in terms of speed in depth map denoising. However, this model suffers from more artifacts and difficulty in recovering image details. Therefore, in this paper, we propose an improved NLGBT model. Specifically, we utilize an improved noise iteration update parameter supplemented with a non-fixed number of iterations, sacrificing a small amount of denoising speed to greatly improve the denoising accuracy. Experiments show that the improved NLGBT model not only denoises quickly, but also suppresses artifacts while preserving the local structure of the image.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530J (2024) https://doi.org/10.1117/12.3041280
Building on traditional image stitching techniques, improved RANSAC and NISwGSP algorithms are proposed, which both increase the number of matched points and significantly improve the efficiency of the algorithm. The improvements include secondary matching of feature points based on grid partitioning, multi-threaded concurrency, and parameter optimization to reduce computational complexity. Finally, after incorporating a seam estimation strategy, the ghosting phenomenon in large-disparity image stitching is alleviated.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530K (2024) https://doi.org/10.1117/12.3041588
In the hyperspectral imaging field, denoising is not only a fundamental issue in image processing but also an essential preprocessing step. In recent years, denoising models have introduced various spatial single-factor regularizations to characterize their spatial priors. However, these models do not fully leverage the commonalities and spectral continuity across different bands of hyperspectral images (HSI). To address this, we propose a low-rank tensor decomposition algorithm incorporating a two-factor regularization constraint on the dimensionality reduction factor. This model captures the global low-rank nature of the spectrum, while weighted group sparse constraints are imposed on the spatial factors to enhance the group sparsity of the HSI. Additionally, continuity constraints on the spectral factors are introduced to promote the spectral continuity of the HSI. The model is further refined by employing a logarithmic low-rank function to constrain the coefficient tensor. Moreover, we develop a proximal alternating minimization (PAM) algorithm and use the alternating direction method of multipliers (ADMM) to solve this model. Extensive experiments demonstrate that our method surpasses other existing HSI denoising techniques and exhibits superior performance in mixed noise removal.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530L (2024) https://doi.org/10.1117/12.3041001
An algorithm for dehazing images is presented in this study, which combines fusion transmittance and light field depth information to address issues such as color distortion and loss of image details that commonly occur in traditional dehazing algorithms relying on atmospheric scattering models with imprecise parameter estimation. This algorithm utilizes light field multi-view technology to estimate depth, gather depth information of foggy scenes, outline the sky region within the scene, and segment the image into foreground and sky regions. Transmittance values are calculated separately for each region, and then a confidence fusion process is applied to determine the final transmittance value. The obscured image is then enhanced using the atmospheric scattering model to achieve dehazing results. Experimental results demonstrate that this algorithm surpasses current single-image dehazing algorithms in various fog density scenarios by producing superior dehazing results, effectively reducing color distortion and halo effects while also recovering more image details.
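The atmospheric scattering model that this abstract relies on is commonly written as I = J·t + A·(1 − t), with transmittance t = exp(−β·d) for scene depth d. The sketch below inverts that model for a single pixel value. The scattering coefficient, the lower bound on t, and the function names are assumptions for illustration; the paper's region-wise transmittance estimation and confidence fusion are not reproduced.

```python
import math


def transmission_from_depth(depth, beta=1.0):
    """Transmittance along a path of length `depth` (beta is an assumed constant)."""
    return math.exp(-beta * depth)


def recover_radiance(hazy, transmission, airlight, t_min=0.1):
    """Invert I = J*t + A*(1 - t) for the haze-free value J.

    t is clamped from below so that distant or sky pixels are not amplified
    into noise.
    """
    t = max(transmission, t_min)
    return (hazy - airlight) / t + airlight


if __name__ == "__main__":
    true_radiance, airlight, depth, beta = 0.2, 0.9, 1.0, 0.5
    t = transmission_from_depth(depth, beta)
    hazy = true_radiance * t + airlight * (1.0 - t)   # forward model
    print(recover_radiance(hazy, t, airlight))        # close to 0.2
```

The clamp `t_min` is the usual guard against dividing by a near-zero transmittance in sky regions, which is also where halo and color-distortion artifacts tend to appear.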
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530M (2024) https://doi.org/10.1117/12.3041277
The distribution of haze in UAV aerial images is usually inhomogeneous, and traditional dehazing algorithms may leave haze residue or dehaze excessively when processing them. To address this, this paper presents a polarization dehazing method based on the calculation of haze density. First, the minimum RGB color channel map is calculated to establish the haze density distribution map, the dynamic scattering coefficient is constructed on the basis of the haze density, and the transmittance is estimated. Second, the local atmospheric polarization degree is automatically estimated by combining the polarization degree and the transmittance, so as to accurately separate the atmospheric scattered light. Then, for images that contain highlight regions, a two-scale correction method based on highlight region detection is applied to the transmittance. Finally, the clear image is obtained by inverting the atmospheric scattering model. Experimental results show that the method effectively removes the haze in UAV aerial images and achieves excellent performance in both subjective evaluation and objective indices.
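The first step, the minimum RGB color channel map, is easy to state concretely: each pixel is replaced by the smallest of its three channel values, so bright, colorless (hazy) areas stay high while saturated surfaces drop toward zero. The sketch assumes 8-bit RGB tuples in nested lists and normalizes to [0, 1]; how the paper maps this to a scattering coefficient is not given in the abstract.

```python
def min_channel_map(image):
    """Per-pixel minimum over R, G, B, scaled to [0, 1].

    image: list of rows, each row a list of (r, g, b) tuples in [0, 255].
    """
    return [[min(pixel) / 255.0 for pixel in row] for row in image]


if __name__ == "__main__":
    image = [
        [(255, 255, 255), (200, 10, 30)],   # dense haze, saturated red object
        [(0, 0, 0), (120, 130, 125)],       # shadow, moderate gray haze
    ]
    for row in min_channel_map(image):
        print(row)
```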
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530N (2024) https://doi.org/10.1117/12.3040990
Semi-supervised classification methods generate pseudo-labels from unlabeled data, and the precision of those pseudo-labels is vital to the classification outcome. To address inaccurate pseudo-label categories during training, this paper introduces a semi-supervised classification strategy for PolSAR images based on the Vision Transformer (ViT). The Simple Linear Iterative Clustering (SLIC) method is applied to create superpixel blocks with distinct boundaries and organized structures, facilitating the generation of pseudo-labels with a correctness rate of 85.9%. The ViT network's multi-head attention mechanism enhances global image information extraction while also attending to local details during pseudo-label training. On the Flevoland dataset, pseudo-label category accuracy increased by 7.09% and 5.47%; on the Wuhan Tongshun River dataset, it improved by 2.44% and 3.41%, thereby elevating the precision of PolSAR semi-supervised classification.
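One simple way superpixels can yield pseudo-labels is to spread the majority label of the labeled pixels in each superpixel to its unlabeled pixels. The sketch below shows that idea only; the majority-vote rule, the flat-list layout, and the `-1` marker for unlabeled pixels are assumptions, and the paper's actual procedure may differ.

```python
from collections import Counter


def superpixel_pseudo_labels(segments, labels, unlabeled=-1):
    """segments[i] is the superpixel id of pixel i; labels[i] is its class or `unlabeled`."""
    votes = {}
    for segment, label in zip(segments, labels):
        if label != unlabeled:
            votes.setdefault(segment, Counter())[label] += 1
    # Majority class per superpixel; superpixels with no labeled pixel stay unlabeled.
    majority = {segment: counts.most_common(1)[0][0] for segment, counts in votes.items()}
    return [majority.get(segment, unlabeled) for segment in segments]


example_segments = [0, 0, 0, 1, 1, 2]
example_labels = [3, -1, 3, -1, 5, -1]
example_pseudo = superpixel_pseudo_labels(example_segments, example_labels)
```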
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530O (2024) https://doi.org/10.1117/12.3042032
The objective of this study is to enhance the precision of automatically categorizing fetal heart rate (FHR) deceleration using a novel approach. The proposed Fourier transform-based network integrates SGE (Spatial Group-wise Enhance) and Fca (Fully Connected Attention) modules. This combination leads to a substantial enhancement in the accuracy of deceleration event classification. This method is designed to assist clinicians in evaluating the oxygenation status of the fetus. The model's performance evaluation indicates an accuracy of 88.52%, a Matthews correlation coefficient of 83.85%, a recall rate of 84.03%, a precision of 78.74%, and an F1 score of 80.84%. These findings confirm the efficacy of this approach in automated deceleration classification, particularly in addressing the intricate correlation between FHR decelerations and uterine contractions. This approach also demonstrates its capacity to diminish the subjective errors and burden of clinicians.
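For readers unfamiliar with the reported metrics, the sketch below computes accuracy, precision, recall, F1, and the Matthews correlation coefficient from binary confusion counts. It uses the standard binary definitions; the paper's task may be multi-class with averaged metrics, so this is not a reconstruction of its evaluation, and the example counts are made up.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "mcc": ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example_metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, so it stays informative when the classes are imbalanced.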
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530P (2024) https://doi.org/10.1117/12.3041604
Synthetic Aperture Radar (SAR) is an active earth observation system. Images acquired by SAR systems exhibit periodic stripe noise that degrades image quality and seriously limits SAR image interpretation and information extraction. Oblique band noise usually exists in system-level geometrically corrected remote sensing products obtained after geometric positioning, map projection, and resampling, and such products have higher scientific application value and practical applicability. Therefore, this paper proposes a coherence-diffusion-based method for removing oblique band noise from remote sensing images. The method is applied to images with simulated band noise and to images with real band noise, and it is compared qualitatively and quantitatively with existing band removal methods to verify its feasibility, effectiveness, and practicality. The experimental results show that the method can be applied to stripe noise at any angle and that its results are clearly better than those of the existing methods.
Information Recognition Method and Detection Modeling
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530Q (2024) https://doi.org/10.1117/12.3041160
Detection of tunnel surface structural defects is crucial to ensuring safe tunnel operations. However, the tunnel defect samples collected in practice are characterized by complex background noise and small sample sizes. To reduce the negative effects of these problems, a novel few-shot tunnel defect detector (FSTDD) is proposed in this paper. The proposed FSTDD consists of two stages. In the first stage, numerous richly annotated base classes are used to train a base detector, which incorporates an attention module that reduces background noise. In the second stage, part of the detector's parameters are fine-tuned using a few examples of novel classes, and the testing set is estimated using an offline prototype calibration. Extensive experiments show that our FSTDD detects rare tunnel defects in the 10-shot setting with 30.96% mAP50 on our tunnel defect dataset, significantly outperforming existing approaches.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530R (2024) https://doi.org/10.1117/12.3040986
In target detection, remote sensing images are characterized by complex backgrounds, dense target distributions, and small targets, which lead to poor detection results, missed detections, and false detections. This paper presents ABSYOLOv5s, a target detection method based on YOLOv5s. To overcome the difficulty of detecting small targets and to prevent missed and false detections, a self-attention mechanism with position information encoding is introduced to strengthen the fusion of feature information within the same scale. BiFPN is used in the neck network to better integrate low-level and high-level feature information. In addition, the ShapeIoU loss function proposed by Zhang et al. is applied so that the model concentrates more on the shape and scale of the bounding box, improving detection accuracy. A complete experiment was conducted on the remote sensing vehicle dataset COWC. It shows that all metrics of the improved model increase: accuracy by 0.4%, and recall, mAP, and mAP@0.5/0.95 by 1.1%, 0.7%, and 0.4%, respectively.
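The mAP figures above rest on Intersection over Union (IoU): under the usual convention, a predicted box counts as a correct detection at mAP@0.5 when its IoU with a ground-truth box is at least 0.5. The sketch below computes plain IoU for axis-aligned boxes. It is not the ShapeIoU loss, whose additional shape and scale terms are not described in the abstract, and the helper names are hypothetical.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_true_positive(predicted, ground_truth, threshold=0.5):
    """Match rule used at a single IoU threshold, e.g. 0.5 for mAP@0.5."""
    return iou(predicted, ground_truth) >= threshold
```

For small targets, a shift of a few pixels changes IoU sharply, which is one reason small-object detection is sensitive to the choice of box regression loss.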
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530S (2024) https://doi.org/10.1117/12.3041134
Traditional manual inspection is labor-intensive and inefficient, while traditional pixel segmentation methods suffer from algorithm incompatibility, strict requirements on consistent lighting, and a lack of robustness against interference. To address these problems, this paper introduces a rail surface defect detection algorithm based on the U-Net convolutional neural network. First, multiple contraction blocks are employed to learn complex structures for feature extraction. Next, two 3×3 convolutional layers and one 2×2 convolutional layer form the bottleneck section and generate feature vectors. Then, two 3×3 convolutional layers and a 2×2 upsampling layer form an expansion section that converts features into images. Finally, the required number of segmentation features is output through a 3×3 convolutional layer to produce a single-channel map. Experiments demonstrate that, compared to traditional image segmentation methods, MIoU, FWIoU, PA, MPA, and BCE improve by 26.7%, 8.7%, 8.4%, 11.7%, and 93.8%, respectively. The introduction of skip connections in this method effectively automates the segmentation of surface defects in steel rail images, offering faster recognition, improved accuracy, and enhanced robustness. This approach meets the automation needs of non-destructive inspection and holds promise for engineering practice.
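Two of the reported segmentation metrics, pixel accuracy (PA) and mean IoU (MIoU), can be computed from a class confusion matrix as sketched below. The matrix layout (rows are ground truth, columns are predictions) and the example counts are assumptions for illustration and are not taken from the paper.

```python
def pixel_accuracy(confusion):
    """Fraction of pixels whose predicted class equals the true class."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def mean_iou(confusion):
    """Average per-class IoU; classes that never occur are skipped."""
    size = len(confusion)
    ious = []
    for i in range(size):
        true_positive = confusion[i][i]
        false_negative = sum(confusion[i]) - true_positive
        false_positive = sum(confusion[r][i] for r in range(size)) - true_positive
        denominator = true_positive + false_positive + false_negative
        if denominator:
            ious.append(true_positive / denominator)
    return sum(ious) / len(ious)


# Hypothetical 2-class matrix: background vs. defect, 100 pixels in total.
example_confusion = [[50, 10], [5, 35]]
```

Defect pixels are usually a small share of a rail image, so MIoU is typically more telling than PA, which a mostly-background prediction can inflate.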
Chenzhen Zhao, Shuning Zhang, Lingzhi Zhu, Weihao Hu, Chenyu Sun
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530T (2024) https://doi.org/10.1117/12.3041502
Traditional human motion recognition using optical devices such as video cameras is susceptible to light variations, occlusion, and privacy leakage. To address these problems, a human motion recognition scheme based on the Range-Doppler Map (RDM) is proposed. First, the echo data of human motions are collected using a 77 GHz FMCW millimetre wave radar, and Micro-Doppler Maps are built frame by frame by projecting the RDM onto the velocity dimension, forming the human motion recognition dataset. Second, an improved ConvNeXt network is proposed for recognizing and classifying different human motions; a channel attention mechanism introduced into the underlying network enhances ConvNeXt's ability to mine potential key features. The experimental results show that the scheme, based on Micro-Doppler Maps and the ConvNeXt network with the attention mechanism, achieves a recognition accuracy of up to 97% for seven human motions, namely boxing, squatting, running, stepping in place, walking, waving, and swinging the arm in place.
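The frame-by-frame construction described above can be sketched as follows: each Range-Doppler Map is collapsed onto its velocity (Doppler) axis, and the resulting profiles are stacked over time. Summing magnitudes over range bins is an assumed projection rule (a maximum would be another plausible choice), and the nested-list layout `rdm[range_bin][doppler_bin]` is hypothetical.

```python
def doppler_profile(rdm):
    """Collapse one RDM onto the Doppler axis by summing over range bins."""
    doppler_bins = len(rdm[0])
    return [sum(row[d] for row in rdm) for d in range(doppler_bins)]


def micro_doppler_map(rdm_frames):
    """Stack per-frame profiles so that result[doppler_bin][frame] is a time-velocity image."""
    profiles = [doppler_profile(frame) for frame in rdm_frames]
    return [list(column) for column in zip(*profiles)]


# Two tiny frames, each with 2 range bins and 3 Doppler bins.
example_frames = [
    [[1, 2, 3], [4, 5, 6]],
    [[0, 1, 0], [1, 0, 1]],
]
example_map = micro_doppler_map(example_frames)
```

The resulting image has velocity on one axis and time on the other, which is the form an image classifier such as ConvNeXt can consume directly.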
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530U (2024) https://doi.org/10.1117/12.3041972
Unmanned Aerial Vehicles (UAVs) are increasingly vital in diagnosing insulator defects, which is essential for ensuring the stability and safety of electrical transmission lines. We propose a lightweight insulator defect detection model suitable for edge computing environments. First, we introduce the yolov8-C2f-RepGhost model, which features a streamlined backbone network. The conventional bottleneck in the C2f module of the backbone network is replaced with a RepGhost bottleneck, which uses structural re-parameterization to enhance efficiency; the resulting module is named C2f-RepGhost. The RepGhost module significantly boosts detection speed. Furthermore, we employ GridMask data augmentation to expand and diversify the dataset, improving its utility in training and enhancing the model’s generalization capabilities. Our experimental results demonstrate that the yolov8-C2f-RepGhost model achieves notable gains in both speed and accuracy when trained on this augmented dataset.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530V (2024) https://doi.org/10.1117/12.3041067
To cope with the wide variety of crop diseases and insect pests, their fast propagation, and the long cycles of traditional detection, and from the perspective of ecological environment protection, we propose a corn pest and disease identification method based on an improved You Only Look Once (yolov8) convolutional neural network model. By imitating the structure and working principle of the human brain's neural network, such a model can learn and extract features from massive data to identify and analyze complex problems automatically. To improve the recognition rate of corn diseases, we used a multi-channel convolution module to improve the C2f module of yolov8, then collected a large number of images of multiple types of corn diseases and pests and manually annotated the resulting dataset. The improved model is applied to the detection and identification of corn pests and diseases. The results show that this method can accurately identify 6 common corn diseases and provide effective technical support for the prevention and control of corn diseases.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530W (2024) https://doi.org/10.1117/12.3041248
Forest fires cause significant damage to the natural environment as well as to people's health and safety. The earlier a fire is detected, the more effectively it can be controlled and the damage reduced. To detect forest fires quickly and efficiently, this paper proposes an improved algorithm based on YOLOv7-tiny that meets both accuracy and real-time requirements. SPD-Conv replaces the traditional strided convolution and pooling layers in YOLOv7-tiny, reducing computational complexity while slightly improving accuracy. GSConv is used to perform the Slim-Neck operation on the feature tensor in the Neck part of the model, which minimizes the computational cost while preserving accuracy. In addition, to exclude the influence of fog on images captured in real environments, we combine the dark channel prior algorithm with histogram equalization in the preprocessing stage to defog the images. The experimental data show that, compared with the original YOLOv7-tiny model, the improved model achieves a 3.5% improvement in correctness, a 4.2% improvement in recall, and a 6.7% improvement in FPS, giving a better balance between real-time performance and correctness than other models.
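SPD-Conv is usually described as a space-to-depth rearrangement followed by a non-strided convolution. The sketch below shows only the space-to-depth step on a single-channel feature map: the spatial size is divided by `scale` in each direction while every value is kept in one of `scale * scale` output channels. The nested-list layout and the channel ordering are assumptions for illustration.

```python
def space_to_depth(feature, scale=2):
    """Split an H x W map into scale*scale sub-maps of size (H/scale) x (W/scale)."""
    height, width = len(feature), len(feature[0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    return [
        [row[dx::scale] for row in feature[dy::scale]]
        for dy in range(scale)
        for dx in range(scale)
    ]


example_feature = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
example_channels = space_to_depth(example_feature)
```

Because no value is discarded, this kind of downsampling keeps fine detail that a strided convolution or pooling layer would drop, which matters for small or distant flames.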
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530X (2024) https://doi.org/10.1117/12.3040979
Cyclic codes play an important role in the theory of error-correcting codes and are widely used in communication, storage, and other fields. In this paper, the authors construct two kinds of optimal ternary cyclic codes with minimum distance 4.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530Y (2024) https://doi.org/10.1117/12.3040996
For apple recognition in natural orchard environments, the YOLOv8 algorithm offers the advantages of accurate detection and high speed. This paper introduces an enhanced model derived from the original YOLOv8. First, the CReToNeXt structure replaces the C2f module in the head section of the original YOLOv8 model, reducing computational complexity and enlarging the model's receptive field; by fully exploiting multi-scale feature information, it improves object detection accuracy. Then, the Shuffle-Attention (SA) module is introduced, enabling the algorithm to integrate deeper features with larger receptive fields, reducing the impact of uneven annotation quality in the training samples, improving the precision of predicted bounding boxes, and enhancing the detection of small objects. The study demonstrates that the improved YOLOv8 model attains a distinct enhancement over the original model: Precision (P) increases by 1.1%, Recall (R) increases by 0.9%, and mean Average Precision (mAP) increases by 1.5%.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132530Z (2024) https://doi.org/10.1117/12.3041112
To boost the mean average precision (mAP) of number plate detection in challenging scenarios characterized by uneven lighting, tilted number plates, varying angles, and other interfering elements, an enhanced target detection algorithm is introduced by refining the YOLOv5 network architecture. First, the lightweight network ShuffleNet v2 replaces the Backbone of YOLOv5, reducing the number of network parameters and increasing computational speed. Second, the Stemblock module replaces the header convolutional layer to enhance the quality and diversity of extracted features. Finally, comparative experiments demonstrate that the YOLOv5s and the original YOLOv5n models achieve an average detection accuracy of 94.8% for number plates, whereas the improved model reaches 99.5%, a noticeable increase in overall accuracy.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325310 (2024) https://doi.org/10.1117/12.3041043
Video object tracking is an important research direction in the field of computer vision, with applications spanning a wide range of fields such as video surveillance, intelligent transportation, and human-computer interaction. With the rapid development of technology, a large number of new algorithms have emerged in the field, significantly improving accuracy, speed and robustness.
This review article delves into the evolution of algorithms, tracing their progression from early tracking algorithms to correlation filter tracking and deep learning tracking algorithms. It introduces the definition of video object tracking algorithms and expounds upon their historical background. By comparing and analyzing these algorithms, their distinct performance, characteristics, advantages, disadvantages, applicable scenarios, and potential areas for improvement become more evident. The article further explores technological advancements, challenges encountered, evaluation methods for algorithms, and research progress, while offering perspectives on future trends.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325311 (2024) https://doi.org/10.1117/12.3041045
To address the problem of high-precision recognition of satellite signals, this paper proposes a Doppler-shift-oriented, dynamically optimised LSTM method for satellite signal recognition. The method exploits the effect of Doppler shift on satellite communication signals from different orbits and combines it with the differences in the modulation methods of communication signals of different regimes to identify satellite signals. A satellite signal recognition model is constructed using a combined CNN-LSTM neural network, and its recognition accuracy reaches 99% in the presence of Doppler shift. In simulations based on signal samples collected over the air interface, the recognition accuracy for satellite communication signals of typical regimes such as BPSK, QPSK, and QAM reaches 99.02%.
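The orbit dependence mentioned above follows from the first-order Doppler relation f_d = f_c · v_r / c, where f_c is the carrier frequency, v_r the radial velocity between satellite and receiver, and c the speed of light. The sketch below evaluates it; the 2 GHz carrier and 7.5 km/s radial velocity are illustrative assumptions for a fast low-orbit pass and do not come from the paper.

```python
SPEED_OF_LIGHT_MPS = 299_792_458.0


def doppler_shift_hz(carrier_hz, radial_velocity_mps):
    """First-order Doppler shift; positive velocity means the satellite is approaching."""
    return carrier_hz * radial_velocity_mps / SPEED_OF_LIGHT_MPS


# Assumed example values: 2 GHz carrier, 7.5 km/s closing speed.
example_shift = doppler_shift_hz(2.0e9, 7_500.0)
```

A geostationary satellite has a radial velocity near zero and therefore almost no shift, so the size and rate of change of the shift carry information about the orbit type.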
Haoxiang Yang, Chengguo Yuan, Yabin Zhu, Lan Chen, Xiao Wang, Futian Wang
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325312 (2024) https://doi.org/10.1117/12.3041069
Mainstream human activity recognition (HAR) algorithms are developed for RGB cameras, which are easily affected by low-quality images (e.g., low illumination, motion blur). Meanwhile, the privacy issues raised by ultra-high definition (HD) RGB cameras have attracted growing attention. Inspired by the success of event cameras, which offer high dynamic range, no motion blur, and low energy consumption, we propose to recognize human actions from the event stream. We propose a lightweight, uncertainty-aware, information-propagation-based Mobile-Former network for efficient pattern recognition, which effectively aggregates the MobileNet and Transformer networks. Specifically, we first embed the event images into feature representations using a stem network, then feed them into uncertainty-aware Mobile-Former blocks for local and global feature learning and fusion. Finally, the features from the MobileNet and Transformer branches are concatenated for pattern recognition. Extensive experiments on multiple event-based recognition datasets fully validate the effectiveness of our model. The source code of this work will be released after acceptance.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325313 (2024) https://doi.org/10.1117/12.3041827
Depression is a severe mental disorder characterized primarily by significant and persistent mood disturbance and anhedonia. Currently, its diagnosis relies mainly on clinician interviews, leading to a high rate of misdiagnosis. Deep learning has shown promising results in areas such as image processing and speech recognition and offers a way to address the high misdiagnosis rate of depression in clinical practice. This study collected resting-state and reward/punishment task-related brain functional information from 22 depression patients and 15 healthy controls, and identified depression by integrating deep learning methods with this information. To address issues related to identification features based on task-related brain function and its task stimuli, this study used locally consistent brain functional imaging data from depression patients and healthy controls during reward/punishment tasks as features for depression identification. Experimental results were compared among reward tasks, punishment tasks, and both tasks combined. Classification under reward task stimuli was superior, achieving recognition accuracy, sensitivity, specificity, precision, and F1 score of 88.08%, 88.50%, 87.83%, 88.18%, and 0.88, respectively.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325314 (2024) https://doi.org/10.1117/12.3040994
Road defect detection is an important task for ensuring road safety and preventing vehicle damage. However, identifying road potholes is difficult when the detection environment is not ideal. In this paper, a prune-HRNet model is proposed that optimises the HR-Net model through pruning to maintain high accuracy at reduced computational cost. First, 70% of the dataset is randomly selected as the training set and 30% as the test set to train and test the proposed model. The same experiments are then repeated with LeNet and ResNet models for comparison. The results of the road defect recognition experiments show that the test-set accuracy of the proposed model is 0.1068 higher than that of the traditional model and that its recall rate is 0.1226 higher, which verifies the generalisation ability of the model. Therefore, the proposed model can be used for real-time monitoring of road potholes to ensure road safety and reduce road maintenance costs, which is of guiding significance for the promotion of intelligent transport systems.
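The random 70/30 partition described above can be sketched as follows. The function name, the fixed seed, and the use of rounding at the cut point are assumptions for illustration; the paper does not say how its split was implemented.

```python
import random


def split_dataset(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and cut it into training and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


example_train, example_test = split_dataset(range(10))
```

Using the same seed for every compared model keeps the training and test sets identical across runs, so differences in accuracy reflect the models rather than the split.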
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325315 (2024) https://doi.org/10.1117/12.3041086
In the current research landscape, human pose recognition technology is increasingly gaining attention, particularly in the areas of interactive design, health monitoring, and security applications. However, traditional computer vision-based methods for human pose recognition rely heavily on lighting conditions and face limitations when dealing with complex environments. To overcome these constraints, this paper explores a method of human pose recognition based on the Channel State Information (CSI) of wireless signals. This method, unlike traditional computer vision technologies, does not depend on lighting conditions, making it more suitable for complex environments and offering better protection of user privacy. Traditional CSI-based methods for human pose recognition mainly employ Long Short-Term Memory (LSTM) neural networks, but their computational efficiency is limited because they parallelize poorly. To address this, the paper introduces a novel network architecture named “CSI Transformer,” which combines Temporal Convolutional Networks (TCN) and the Transformer architecture for efficient processing of CSI data. First, the data undergoes preliminary feature extraction using temporal convolutional networks, followed by deep feature analysis and pose prediction through the Vision Transformer architecture. Experimental results indicate that, compared to traditional LSTM-based methods, the CSI Transformer can recognize human poses more efficiently while ensuring accuracy and safeguarding data privacy. This study offers a new perspective and method for the application of wireless signals in human pose recognition, highlighting its significance in advancing related technologies and applications.
Moreover, the introduction of the CSI Transformer provides an effective solution to issues such as low computational efficiency and privacy protection, suggesting a broader application prospect for wireless signal-based human pose recognition technology in the future.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325316 (2024) https://doi.org/10.1117/12.3041018
Perceiving ground target vibration signals is key to target recognition. This work targets the vibration signals generated by personnel movements on the ground, such as walking, jumping, and running. First, experimental tests were conducted to build a signal sample database that reflects the characteristic information of target attributes. Then, time-domain and frequency-domain analyses were performed on the target vibration signals to recognize targets in different motion states. The experiments showed that these two methods are simple, easy to implement, and achieve good recognition results.
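As a rough illustration of what time-domain and frequency-domain analysis of a vibration signal can look like, the sketch below computes an RMS amplitude, a zero-crossing rate, and the dominant frequency of a sampled signal. The feature choice, the function names, and the synthetic 5 Hz test tone are assumptions made for illustration; the abstract does not say which features the authors use.

```python
import cmath
import math


def time_domain_features(signal):
    """Return RMS amplitude and zero-crossing rate of a sampled signal."""
    n = len(signal)
    rms = math.sqrt(sum(x * x for x in signal) / n)
    # Count sign changes between consecutive samples.
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0)
    )
    return {"rms": rms, "zero_crossing_rate": crossings / (n - 1)}


def dominant_frequency(signal, sample_rate_hz):
    """Return the frequency (Hz) of the strongest non-DC DFT bin."""
    n = len(signal)
    best_bin, best_mag = 1, 0.0
    # Naive O(n^2) DFT over the positive-frequency bins; fine for short windows.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(signal)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate_hz / n


# Synthetic stand-in for a periodic vibration: a 5 Hz tone sampled at 100 Hz.
fs = 100.0
demo = [math.sin(2 * math.pi * 5.0 * i / fs) for i in range(200)]
features = time_domain_features(demo)
peak_hz = dominant_frequency(demo, fs)
```

In practice an FFT routine would replace the naive DFT loop, and features like these would be computed per window and compared across motion states.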
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325317 (2024) https://doi.org/10.1117/12.3041508
An improved YOLOv5m fighting behavior detection algorithm is proposed to address the insufficient accuracy and real-time performance of behavior detection algorithms for aerial viewpoints. Three UAVs located at different heights and angles are used to capture footage and construct the fight behavior dataset. After replacing the convolution module and introducing maximum feature pooling, a lightweight network is designed and substituted for the original backbone network to make the model lightweight. An attention mechanism is added to the detection head to improve detection accuracy, and the effects of two attention mechanisms on the model's detection accuracy are tested. A bi-directional feature pyramid structure is added to the neck to strengthen feature extraction. The results show that, compared with the original network, the detection accuracy improves from 88.3% to 93.4%, the number of parameters is reduced by 83.37%, the model weights shrink from 17.02 MB to 11.39 MB, and the detection speed reaches 48 frames per second, which meets the requirements of lightweight and real-time detection.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325318 (2024) https://doi.org/10.1117/12.3041890
In the mobile driving scenario, insufficient data has become a major challenge for applying scene text recognition models. Active learning is one way to reduce the cost of data annotation: it improves model performance by selecting for annotation the data with the largest information entropy. However, modern convolutional neural networks are poorly calibrated, so their confidence does not accurately reflect the true reliability of their predictions. To address these problems, a scene text recognition framework based on active learning is proposed. The framework introduces a strategy for generating identically distributed heterogeneous data based on an anti-aliasing operation, and a confidence evaluation method based on prediction invariance is proposed. Combined with active learning, the method evaluates and corrects the confidence of sample predictions. In addition, a text recognition dataset for mobile driving scenarios is established. The method was tested with the SAR, SVTR, and RobustScanner models on the DSO dataset, yielding accuracy improvements of 11.28%, 15.12%, and 11.66%, respectively. Compared with experiments using randomly annotated data, the accuracy gains were 5.32%, 5.88%, and 6.53%, respectively. The results confirm that this method significantly reduces manual annotation costs while enhancing model performance and robustness.
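One way to picture a prediction-invariance confidence inside an active learning loop is sketched below: a sample's confidence is taken as the share of perturbed copies whose predicted text matches the prediction on the original image, and the least stable samples are sent for annotation first. The agreement ratio, the entropy tie-break, and all names and data are assumptions for illustration, not the scoring rule from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy (nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def invariance_confidence(original_text, perturbed_texts):
    """Fraction of perturbed-view predictions that match the original one."""
    if not perturbed_texts:
        return 0.0
    agree = sum(1 for t in perturbed_texts if t == original_text)
    return agree / len(perturbed_texts)


def select_for_annotation(samples, budget):
    """Pick the `budget` samples with the lowest invariance confidence.

    Ties are broken by higher entropy of the model's output distribution.
    """
    ranked = sorted(
        samples,
        key=lambda s: (
            invariance_confidence(s["pred"], s["perturbed_preds"]),
            -entropy(s["probs"]),
        ),
    )
    return [s["id"] for s in ranked[:budget]]


# Hypothetical pool: prediction on the original image, predictions on
# perturbed copies, and a toy output distribution for each sample.
pool = [
    {"id": "a", "pred": "EXIT", "perturbed_preds": ["EXIT", "EXIT", "EXIT"], "probs": [0.9, 0.1]},
    {"id": "b", "pred": "STOP", "perturbed_preds": ["STOP", "ST0P", "SHOP"], "probs": [0.5, 0.5]},
    {"id": "c", "pred": "P4RK", "perturbed_preds": ["PARK", "PARK", "P4RK"], "probs": [0.6, 0.4]},
]
chosen = select_for_annotation(pool, budget=2)
```

Sample "a" is stable under perturbation and is left unlabeled, while "b" and "c" are queued for annotation.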
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325319 (2024) https://doi.org/10.1117/12.3040971
Schools concentrate large numbers of students, and because of the students' young age, dangerous behaviors are more likely there than in other settings. In primary and secondary schools in particular, younger students are prone to pushing, falling, and other behaviors that can cause personal injury. Existing behavior detection methods need powerful CPU or GPU hardware, which brings drawbacks such as high power consumption and large hardware size. To address these issues, this project proposes an FPGA-accelerated solution for detecting dangerous behavior in schools. First, a YOLOv4-Tiny backbone network improved with depthwise separable convolutions is deployed on an FPGA platform. Second, the structure of the convolutional blocks is optimized for interface parallelism to accelerate reads and writes. Finally, the weights of different layers are quantized as 16-bit fixed-point numbers with different fractional bit widths. Experimental results on a self-made school dangerous behavior dataset show that the proposed method achieves a detection speed of 291 ms on the FPGA platform with an mAP of 94.76%, a power consumption of 4.01 W on the hardware platform, and an energy efficiency ratio of about 5.81 GOPS/W, an improvement of 3.72 and 1.85 times over the energy efficiency ratios of the CPU and GPU, respectively.
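The per-layer quantization step can be sketched as follows: for each layer, pick the largest fractional bit width whose signed 16-bit range still covers the layer's largest weight, then round each weight to that grid. The selection rule, the function names, and the example weights are assumptions for illustration; the abstract only states that layers use 16-bit fixed point with different fractional widths.

```python
def fractional_bits_for(weights, total_bits=16):
    """Choose the largest fractional width whose range still covers the weights."""
    max_abs = max(abs(w) for w in weights)
    for frac_bits in range(total_bits - 1, -1, -1):
        # Largest positive value representable in signed fixed point
        # with `frac_bits` fractional bits.
        max_repr = (2 ** (total_bits - 1) - 1) / (2 ** frac_bits)
        if max_abs <= max_repr:
            return frac_bits
    raise ValueError("weights exceed the representable range")


def quantize(value, frac_bits, total_bits=16):
    """Round to the nearest signed fixed-point code, saturating on overflow."""
    scale = 2 ** frac_bits
    lo, hi = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    return max(lo, min(hi, round(value * scale)))


def dequantize(code, frac_bits):
    """Map an integer fixed-point code back to a real value."""
    return code / (2 ** frac_bits)


# Hypothetical layers with different dynamic ranges.
layers = {"conv1": [0.42, -0.91, 0.13], "conv5": [3.7, -6.2, 1.05]}
plan = {name: fractional_bits_for(w) for name, w in layers.items()}
```

Here the small-range layer gets 15 fractional bits and the wide-range layer gets 12, which is the trade-off between resolution and range that per-layer widths are meant to exploit.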
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531A (2024) https://doi.org/10.1117/12.3041044
Polarization imaging technology is widely utilized in imaging scattering media because it can capture more dimensional information compared to intensity imaging. In this study, we propose an unsupervised polarization dehazing network to tackle the challenge of low performance exhibited by dehazing networks trained on synthetic datasets when applied in real-world scenarios. The network parameters are updated solely using foggy images. Based on the atmospheric scattering model, foggy images are degraded in two dimensions: increasing distance and multiplying attenuation. The dependency relationship between the atmospheric scattering model parameters before and after image degradation is derived, and the network parameters are updated accordingly. The atmospheric scattering model is then inverted to restore the fogless image based on the atmospheric light at infinity and transmission map estimated by the network. The experimental results demonstrate that this algorithm surpasses existing algorithms in various no-reference image evaluation metrics and visual performance.
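The atmospheric scattering model referred to above is commonly written as I = J·t + A·(1 − t), where I is the observed foggy intensity, J the fog-free scene, t the transmission, and A the atmospheric light at infinity. The sketch below applies the model and its inversion to a toy list of pixel values; in the paper t and A are estimated by the network, whereas here they are simply assumed known, and the `t_min` floor is an illustrative safeguard.

```python
def add_haze(scene, transmission, airlight):
    """Forward model: I = J * t + A * (1 - t), applied per pixel."""
    return [j * t + airlight * (1.0 - t) for j, t in zip(scene, transmission)]


def remove_haze(hazy, transmission, airlight, t_min=0.05):
    """Inverse model: J = (I - A) / max(t, t_min) + A."""
    return [
        (i - airlight) / max(t, t_min) + airlight
        for i, t in zip(hazy, transmission)
    ]


# Toy 1-D "image": three pixel radiances with different transmissions.
scene = [0.20, 0.55, 0.80]
transmission = [0.9, 0.6, 0.3]
airlight = 0.95  # atmospheric light at infinity (assumed known here)
hazy = add_haze(scene, transmission, airlight)
restored = remove_haze(hazy, transmission, airlight)
```

The floor on t keeps the division stable where the estimated transmission approaches zero.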
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531B (2024) https://doi.org/10.1117/12.3041563
Human pose estimation serves as a critical foundation for various subsequent tasks such as human-computer interaction, motion analysis, and action recognition. Current human pose estimation networks often demand more parameters and computational resources to keep improving estimation accuracy, which makes them hard to deploy on low-power edge computing devices. Through experiments, this study identified a prevalent redundancy among the convolution channels of existing human pose estimation models: the feature maps output by different channels are strongly similar. In response, this paper introduces a lightweight yet accurate human pose estimation network designed to run on most edge computing devices while maintaining high estimation accuracy. The proposed model improves the ordinary convolutions in the network by employing a reparameterizable partial convolution that reduces redundancy while enriching the diversity of the extracted features. Furthermore, an effective multiscale cross-attention mechanism is designed to fuse features from different stages of the backbone network. This approach enhances accuracy while mitigating the severe drop in inference speed associated with excessive multiscale fusion. Through these design strategies, the proposed model balances accuracy and speed with a smaller computational and parameter footprint. Experimental validation on the COCO and MPII datasets verifies the effectiveness of the proposed method.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531C (2024) https://doi.org/10.1117/12.3041273
DETR (Detection Transformer), based on the Transformer architecture, has shown great potential for end-to-end object detection. DETR uses the Hungarian algorithm to match predictions to targets. However, because Hungarian matching is unstable, the model faces inconsistent optimization targets during early training, which slows convergence. To address this challenge, this paper proposes a hybrid task assignment algorithm. The algorithm increases the number of positive sample queries through one-to-many matching, allowing queries to predict multiple aspects of a single target and improving matching stability. Additionally, targets are grouped by the size of their ground-truth boxes, and queries are grouped correspondingly. Each query group is responsible for matching targets of a specific size, which significantly enhances matching stability and accelerates convergence. Experimental results on the COCO dataset demonstrate the effectiveness of this approach: a single-scale DETR model with a ResNet-50 backbone achieves an average precision (AP) of 38.8% within 12 training epochs, a 3.2% AP improvement over baseline models with similar settings.
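A toy version of size-grouped one-to-many assignment is sketched below. Targets are bucketed by box area using the small/medium/large thresholds of the COCO convention, and each target takes the k lowest-cost queries from the query group responsible for its bucket. The greedy top-k rule, the thresholds, and every name here are assumptions for illustration; the paper's hybrid assignment algorithm is not specified in the abstract and may differ.

```python
def size_group(box, small=32 ** 2, large=96 ** 2):
    """Bucket a (x, y, w, h) box by area; thresholds follow the COCO convention."""
    area = box[2] * box[3]
    if area < small:
        return "small"
    return "medium" if area < large else "large"


def one_to_many_match(cost, targets, query_groups, k=2):
    """Assign up to k lowest-cost queries to each target within its size group.

    cost[q][t] is the matching cost between query q and target t.
    query_groups[q] names the size group that query q is responsible for.
    """
    assignments = {}
    used = set()
    for t, box in enumerate(targets):
        group = size_group(box)
        candidates = sorted(
            (q for q in range(len(cost))
             if query_groups[q] == group and q not in used),
            key=lambda q: cost[q][t],
        )
        assignments[t] = candidates[:k]
        used.update(candidates[:k])
    return assignments


# Hypothetical example: 4 queries, 2 targets (one small box, one large box).
targets = [(10, 10, 20, 20), (50, 40, 200, 150)]
query_groups = ["small", "small", "large", "large"]
cost = [
    [0.2, 0.9],
    [0.4, 0.8],
    [0.9, 0.3],
    [0.7, 0.1],
]
matches = one_to_many_match(cost, targets, query_groups, k=2)
```

Restricting each target to one query group shrinks the candidate set, which is one plausible reason grouping would make the assignment less prone to flipping between training steps.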
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531D (2024) https://doi.org/10.1117/12.3042460
Recognizing human actions in low-resolution videos is crucial for security monitoring and privacy protection. Previous methods usually work in a two-stage manner combined with super-resolution: the super-resolution module is first trained with paired high-resolution (HR) and low-resolution (LR) action videos, and its output videos are then used to train the action classification module. However, this practice steers super-resolution enhancement toward visual quality rather than action recognition accuracy. Moreover, only low-resolution videos are usually available in the real world. We propose OSRO, an end-to-end framework for low-resolution video human action recognition that is trained in a one-stage pipeline with only low-resolution videos. Specifically, the super-resolution enhancement module and the action classification module are cascaded to form a single-stream model, jointly optimized via our newly designed comprehensive loss L_OSRO. Extensive experiments demonstrate that OSRO achieves an accuracy of 80.62% on the low-resolution UCF101 dataset, surpassing the previous best method (Prog. DVSR) by 10.07%.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531E (2024) https://doi.org/10.1117/12.3041034
The YOLOv7 algorithm is widely used in fruit target detection, but it suffers from a complex structure, high computational cost, and low precision. To address these issues, EDEM-YOLO, an enhanced algorithm based on YOLOv7, is proposed to detect the maturity of two common tomato varieties, big tomato and cherry tomato. The original backbone network of YOLOv7 is replaced with a lightweight feature extraction network, Efficient Vision Transformer, to simplify the model. In addition, to optimize the deviation of the bounding box width and height, the loss function of the improved model is replaced by MPDIoU. To enhance the model's information extraction ability for small objects, EMA (Efficient Multi-Scale Attention) and Dynamic Head, an attention-based detection head, are fused in the head network layer. The experimental results show that EDEM-YOLO achieves high accuracy and detection efficiency. Without using the original pre-training weights of YOLOv7, the mean Average Precision (mAP) of maturity detection is 81.7%, which is 1.5% higher than that of the original YOLOv7, while the amount of computation is reduced by 62.3% and the number of parameters by 24.1%. The EDEM-YOLOv7 model shows excellent detection ability among similar object detection models and can be applied to various complex mechanical automatic detection tasks.
Wireless Communication Design and Signal Reception Processing
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531F (2024) https://doi.org/10.1117/12.3040972
Interrupted sampling repeater jamming (ISRJ) is highly coherent with radar signals and can produce a significant false target jamming effect, which severely affects radar detection. To reconstruct and suppress the jamming signal, this paper proposes a data-driven parameter estimation method based on feature analysis of the jamming signal. The method uses a neural network based on gated recurrent units (GRU) to extract the time-domain and frequency-domain characteristics of the received radar signal contaminated by jamming, and combines them with the jamming envelope to estimate the jamming sampling pulse signal. Simulation analysis validates the effectiveness and the expected estimation accuracy of the proposed method.
Zhichao Xu, Jie Chen, Mengyue Zhang, Suqin Xu, Liming Yuan
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531G (2024) https://doi.org/10.1117/12.3042467
This paper presents a synthetic aperture radar (SAR) image-based onboard detection method for internal ocean waves, designed to address the challenges of automation and real-time capability in existing algorithms. The method preprocesses SAR images through geographic registration, land masking, filtering, and quality control, and then extracts spatial features such as distribution, orientation, and scale to provide precise detection criteria for internal ocean waves. The proposed algorithm achieves high-precision detection and significantly improves efficiency, overcoming the limitations of resource-constrained satellite platforms that hinder the deployment of complex machine learning models. The paper describes the algorithm, data processing procedure, experimental results, and conclusions, demonstrating the advantages of the proposed method in detection accuracy and efficiency and providing a new technical means for research on internal ocean waves.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531H (2024) https://doi.org/10.1117/12.3041282
Traditionally, GNSS receivers in smartphones are primarily used for position localization. However, with the release of Android 7.0 by Google, smartphones now support the provision of raw GNSS data, which can be utilized for velocity estimation. To explore the feasibility and performance of GNSS velocity estimation on smartphones, this study employs raw GNSS data from smartphones and conducts velocity estimation based on both the TDCP (Time-Differenced Carrier Phase) method and the Doppler method. The results demonstrate that velocity estimation on smartphones is feasible, with the TDCP-based method achieving higher accuracy.
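The two observables compared in this abstract can each be turned into a per-satellite range rate, as sketched below: the Doppler measurement gives an instantaneous value, while TDCP gives the average over the interval between two carrier-phase epochs. The sign convention (positive Doppler means a closing range), the GPS L1 wavelength, and the synthetic numbers are assumptions for illustration. A full velocity solution would additionally need satellite velocities, line-of-sight vectors, and a receiver clock drift term solved by least squares, none of which is shown here.

```python
C = 299_792_458.0          # speed of light, m/s
L1_HZ = 1_575.42e6         # GPS L1 carrier frequency
WAVELENGTH_M = C / L1_HZ   # about 0.19 m


def range_rate_from_doppler(doppler_hz, wavelength_m=WAVELENGTH_M):
    """Instantaneous range rate; positive Doppler is taken as a closing range."""
    return -wavelength_m * doppler_hz


def range_rate_from_tdcp(phase_prev_cycles, phase_curr_cycles, dt_s,
                         wavelength_m=WAVELENGTH_M):
    """Average range rate over dt from the change in carrier phase (cycles)."""
    return wavelength_m * (phase_curr_cycles - phase_prev_cycles) / dt_s


# Synthetic check: a range growing at a constant 100 m/s.
true_rate = 100.0
doppler = -true_rate / WAVELENGTH_M              # Hz
phase_0 = 1_000_000.0                            # cycles, arbitrary start
phase_1 = phase_0 + true_rate * 1.0 / WAVELENGTH_M
v_doppler = range_rate_from_doppler(doppler)
v_tdcp = range_rate_from_tdcp(phase_0, phase_1, dt_s=1.0)
```

With noise-free inputs both routes agree; on real measurements they differ because carrier phase is typically much less noisy than Doppler, although it is vulnerable to cycle slips.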
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531I (2024) https://doi.org/10.1117/12.3041674
Oceanic internal waves typically develop in stratified water and are highly stochastic. In SAR images, internal waves usually appear as irregular bright and dark stripes. In this paper, a convolutional neural network based on the SqueezeNet model is used to detect internal waves in SAR images. For this purpose, 1388 samples from spaceborne (ENVISAT, ERS-2) SAR images are used to train and test the network. The experimental results show an overall accuracy of 83.5%, with a detection rate of 75% and a false alarm rate of 8% for internal waves, and an average speed of 0.0131 s/image. The accuracy of the algorithm is satisfactory for processing a complete SAR image, and the algorithm can also be applied to images from other remote sensing platforms. In addition, the lightweight convolutional neural network adopted as the basic framework makes the detector potentially portable to remote sensing platforms with limited computational resources, such as spaceborne and airborne platforms.
Chonghao Wu, Yawen Wang, Yanan Zhou, Mingming Ren, Cao Huang, Feiteng Luo
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531J (2024) https://doi.org/10.1117/12.3040976
To improve the efficiency of the return link resources of GEO satellites and meet the QoS requirements of real-time services, this paper proposes a Best Fit Utility Function Timeslot Allocation algorithm (BFUF-TA) to allocate timeslots in the MF-TDMA frame structure. Simulation results show that, compared with the traditional Flexible Timeslot Allocation algorithm, the BFUF-TA algorithm increases throughput by 8%, reduces the delay rate of real-time services by 5%, and increases the resource slot utilization rate by approximately 6.5%, reaching 98.785%.
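The general shape of a best-fit, utility-ordered slot allocator can be sketched as follows: requests are served in order of decreasing utility, and each one goes to the carrier that would be left with the fewest spare slots. The utility values, the data layout, and the rejection rule are hypothetical; the abstract does not define the BFUF-TA utility function, so this should be read as a generic best-fit sketch and not as the paper's algorithm.

```python
def best_fit_allocate(requests, carriers):
    """Place each request on the carrier that leaves the fewest spare slots.

    requests: list of (terminal_id, slots_needed, utility); higher utility first.
    carriers: dict carrier_id -> free slots in the current frame.
    Returns (allocation, rejected).
    """
    free = dict(carriers)
    allocation, rejected = {}, []
    for terminal, need, _utility in sorted(requests, key=lambda r: -r[2]):
        # Carriers that can still hold the whole request.
        fitting = [c for c, slots in free.items() if slots >= need]
        if not fitting:
            rejected.append(terminal)
            continue
        best = min(fitting, key=lambda c: free[c] - need)
        free[best] -= need
        allocation[terminal] = best
    return allocation, rejected


# Hypothetical frame: two carriers and four terminals.
carriers = {"c1": 8, "c2": 5}
requests = [("voice", 5, 0.9), ("video", 6, 0.8), ("web", 2, 0.4), ("bulk", 4, 0.1)]
allocation, rejected = best_fit_allocate(requests, carriers)
```

In this toy frame all 13 slots end up used and only the lowest-utility request is deferred, which is the kind of behavior a high slot utilization figure reflects.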
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531K (2024) https://doi.org/10.1117/12.3042422
Low-orbit communications satellites now broadcast a new type of navigation signal. In contrast to GNSS signals, these new navigation signals are designed around Doppler measurements and share the transmission channel with the communication signal. This paper simulates and analyzes the Doppler frequency shift of the Iridium signal received on the ground and analyzes the Iridium STL signal structure. Delay calibration of the new navigation signal transmission is an important task in satellite design. This paper proposes a delay calibration method for the new navigation signal and analyzes the delay calibration error.
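To give a sense of scale for the Doppler shift of a low-orbit signal, the sketch below evaluates a simplified geometry: a circular orbit, a stationary ground user lying in the orbital plane, and no Earth rotation. Under those assumptions the radial speed is v·(R/r)·cos(elevation). The 780 km altitude and 1626 MHz carrier are illustrative values chosen to resemble an Iridium-like system and are not taken from the paper.

```python
import math

C = 299_792_458.0        # speed of light, m/s
GM = 3.986004418e14      # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6_371_000.0    # mean Earth radius, m


def doppler_shift_hz(elevation_deg, altitude_m, carrier_hz):
    """Doppler shift seen by a static ground user lying in the orbital plane.

    Assumes a circular orbit and ignores Earth rotation.
    Returns the magnitude for an approaching satellite.
    """
    r = R_EARTH + altitude_m
    v = math.sqrt(GM / r)  # circular orbital speed
    # Radial speed from the geometry of an overhead pass:
    # v_r = v * (R / r) * cos(elevation).
    v_radial = v * (R_EARTH / r) * math.cos(math.radians(elevation_deg))
    return v_radial / C * carrier_hz


# Assumed, illustrative values: about 780 km altitude and a 1626 MHz carrier.
peak = doppler_shift_hz(0.0, 780e3, 1626e6)
overhead = doppler_shift_hz(90.0, 780e3, 1626e6)
```

The shift is largest near the horizon (tens of kilohertz here) and passes through zero when the satellite is directly overhead, which is the rapid variation that Doppler-based positioning relies on.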
Lin Li, Guoqing Zhou, Ruixiang Li, Yangleijing Li, Shuaiguang Zhu
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531L (2024) https://doi.org/10.1117/12.3041258
In lidar-based obstacle detection for intelligent vehicles, obstacle point clouds are often clustered inaccurately, which leads to false and missed detections. This paper therefore proposes an ellipsoidal neighborhood obstacle point cloud clustering algorithm (ENC), which first analyzes the spatial distribution of the vehicle lidar point cloud, then selects sampling points and compares the distances between them and the surrounding points, then determines the long and short axes of the ellipsoid, and finally performs the clustering. The KITTI point cloud dataset is used to validate the algorithm and compare its results with those of the DBSCAN and Euclidean clustering algorithms. The experiments show that the ENC algorithm achieves the highest positive detection rate, 95.82%, and that its running time meets the real-time requirements of detection.
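The core idea of an ellipsoidal neighborhood can be sketched with a plain region-growing loop in which two points are neighbors when one falls inside an axis-aligned ellipsoid centered on the other. In the paper the axes are derived from the point cloud's spatial distribution; here they are fixed, hypothetical values, and the brute-force O(n²) neighbor search is only suitable for small examples.

```python
from collections import deque


def in_ellipsoid(p, q, axes):
    """True if q lies inside the axis-aligned ellipsoid centered on p."""
    return sum(((a - b) / s) ** 2 for a, b, s in zip(p, q, axes)) <= 1.0


def ellipsoid_cluster(points, axes):
    """Region-growing clustering with an ellipsoidal neighborhood.

    points: list of (x, y, z); axes: semi-axis lengths (ax, ay, az).
    Returns one cluster label per point.
    """
    labels = [-1] * len(points)
    cluster = 0
    for seed in range(len(points)):
        if labels[seed] != -1:
            continue
        labels[seed] = cluster
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            for j, q in enumerate(points):
                if labels[j] == -1 and in_ellipsoid(points[i], q, axes):
                    labels[j] = cluster
                    queue.append(j)
        cluster += 1
    return labels


# Toy scene: a tall, thin object and a separate object 0.6 m to its side.
points = [
    (0.0, 0.0, 0.0), (0.0, 0.0, 0.8), (0.0, 0.0, 1.6),   # vertical spacing 0.8 m
    (0.6, 0.0, 0.0), (0.6, 0.0, 0.8),
]
labels = ellipsoid_cluster(points, axes=(0.3, 0.3, 1.0))
```

A spherical neighborhood large enough to bridge the 0.8 m vertical gaps would also bridge the 0.6 m horizontal gap and merge the two objects; the elongated vertical axis avoids that.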
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531M (2024) https://doi.org/10.1117/12.3041303
When LiDAR is used to measure underwater terrain, anomalous data are generated by the complex environment, such as fish and aquatic plants in the water. To obtain more accurate bathymetric data, these anomalous values must be eliminated. This paper proposes a method based on constructing an equivalent weight function. Simulated data experiments are used to compare and analyze the boundary values of different weight functions, and airborne bathymetric LiDAR echo signal data measured at Saipem Tallow Beach, Guilin, are used in validation experiments on detecting and rejecting outliers. The third-order trend surface fitting method, the robust M-estimation method, and the proposed equivalent weight function method are compared on rejecting anomalous data in the actual bathymetric point cloud. The experimental results show that the third-order trend surface detects 1206 outlier points, the robust M-estimation method detects 1479, and the proposed algorithm rejects 1634, increases of 35.5% and 10.5% in the number of rejected outliers, respectively. Comparative analysis of the three methods in different terrain environments shows that the proposed algorithm achieves better rejection results in all terrain types of the measured point cloud, with a better rejection rate and accuracy than the comparison methods.
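As a simplified picture of outlier rejection with an equivalent weight function, the sketch below uses a three-segment weight of the IGG-III type, which is widely used in robust estimation: full weight for small standardized residuals, a smoothly decreasing weight in a transition zone, and zero weight beyond it. The abstract does not say which function the paper constructs, so the IGG-III form, the thresholds k0 and k1, the MAD-based scale, and the flat-bottom test data are all assumptions.

```python
import statistics


def equivalent_weight(std_residual, k0=1.5, k1=3.0):
    """IGG-III-style weight: keep, down-weight, or reject by residual size."""
    v = abs(std_residual)
    if v <= k0:
        return 1.0
    if v <= k1:
        return (k0 / v) * ((k1 - v) / (k1 - k0)) ** 2
    return 0.0


def robust_mean(depths, iterations=10):
    """Iteratively reweighted mean; returns (estimate, indices of rejected points)."""
    estimate = statistics.median(depths)
    # Robust scale from the median absolute deviation (MAD).
    mad = statistics.median(abs(d - estimate) for d in depths)
    scale = 1.4826 * mad or 1.0
    weights = [1.0] * len(depths)
    for _ in range(iterations):
        weights = [equivalent_weight((d - estimate) / scale) for d in depths]
        estimate = sum(w * d for w, d in zip(weights, depths)) / sum(weights)
    rejected = [i for i, w in enumerate(weights) if w == 0.0]
    return estimate, rejected


# Synthetic soundings (meters): a flat bottom near 5 m plus two spurious returns.
depths = [5.02, 4.98, 5.01, 4.99, 5.00, 5.03, 4.97, 2.10, 5.01, 7.80]
estimate, rejected = robust_mean(depths)
```

On real terrain the constant mean would be replaced by a fitted surface, with the same weights applied to the residuals from that surface.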
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531N (2024) https://doi.org/10.1117/12.3041472
To address the problem that existing point cloud network models do not make sufficient use of millimeter-wave radar point cloud information, a millimeter-wave radar human fall detection method based on an improved point transformer is proposed. The method uses the improved point transformer network to extract multidimensional feature information from the point cloud for classification and recognition, achieving accurate fall detection. Human action data from different environments and different individuals were collected to construct a human posture point cloud dataset. Comparative experiments show that the proposed method makes full use of the millimeter-wave radar point cloud information, achieves up to 99.37% detection accuracy, and shows good generalization ability.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531O (2024) https://doi.org/10.1117/12.3041699
This paper proposes a miniaturized ±45° dual-polarized base station antenna for low-frequency 5G. The two original resonant modes are gradually separated by chamfering the inflected right-angled edges of the radiating arms of the square ring dipole, which achieves both miniaturization and wide bandwidth. To induce a new resonant mode at higher frequencies, a compact square ring dipole is placed inside the larger square ring dipole. Fusing the multiple resonant modes yields a wider bandwidth. The results show that the antenna has S11 and S22 below -15 dB over 616-992 MHz, port isolation (S21) better than 33 dB, gain of at least 8.03 dBi, and a half-power beamwidth (HPBW) of 69±4°, with a size of only 0.38λ0 × 0.38λ0 (λ0 is the free-space wavelength at the center frequency f0).
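The reported figures can be converted into a fractional bandwidth and a physical size with a few lines of arithmetic, as sketched below. The calculation assumes that f0 is the arithmetic midpoint of the 616-992 MHz band, which the abstract does not state explicitly, so the resulting edge length is indicative only.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def band_summary(f_low_hz, f_high_hz, size_in_wavelengths):
    """Derive center frequency, fractional bandwidth, and physical edge length."""
    f0 = (f_low_hz + f_high_hz) / 2.0
    wavelength0 = C / f0
    return {
        "f0_mhz": f0 / 1e6,
        "fractional_bandwidth": (f_high_hz - f_low_hz) / f0,
        "edge_m": size_in_wavelengths * wavelength0,
    }


summary = band_summary(616e6, 992e6, 0.38)
```

Under that assumption the band is centered at 804 MHz, the fractional bandwidth is roughly 47%, and each edge of the radiator is about 14 cm.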
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531P (2024) https://doi.org/10.1117/12.3041663
This paper presents the design of a voltage-controlled oscillator. The LC-type oscillator is designed in a 180 nm process to operate in the Wi-Fi and Bluetooth application bands. The test results show a static power dissipation of 12 mW at 27°C and a tuning range of 2.32-2.58 GHz, with a phase noise of -117 dBc/Hz at a 1 MHz offset.
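For an ideal LC tank the oscillation frequency is f = 1/(2π√(LC)), so the reported tuning range fixes the ratio between the largest and smallest tank capacitance regardless of the inductor value. The sketch below works this out; the 2 nH inductance is a hypothetical value picked only to give concrete capacitances, and parasitics are ignored.

```python
import math


def lc_resonance_hz(inductance_h, capacitance_f):
    """Resonant frequency of an ideal LC tank: f = 1 / (2 * pi * sqrt(L * C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def tank_capacitance_f(inductance_h, frequency_hz):
    """Capacitance needed for an ideal tank to resonate at frequency_hz."""
    return 1.0 / (inductance_h * (2.0 * math.pi * frequency_hz) ** 2)


F_LOW, F_HIGH = 2.32e9, 2.58e9   # tuning range reported in the abstract
L_TANK = 2e-9                    # hypothetical 2 nH inductor, not from the paper

c_max = tank_capacitance_f(L_TANK, F_LOW)    # largest capacitance, lowest frequency
c_min = tank_capacitance_f(L_TANK, F_HIGH)
cap_ratio = c_max / c_min                    # equals (F_HIGH / F_LOW) ** 2
tuning_range = (F_HIGH - F_LOW) / ((F_HIGH + F_LOW) / 2.0)
```

The band corresponds to a fractional tuning range of about 10.6% and a total capacitance ratio of about 1.24, which a varactor would have to cover after fixed parasitic capacitance is accounted for.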
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531Q (2024) https://doi.org/10.1117/12.3041934
With the continuous development of radar jamming methods and styles, radar systems face increasingly complex electromagnetic environments, and spatial anti-jamming measures alone offer limited suppression performance. To solve this problem, this paper proposes an anti-jamming method based on joint spatial-polarimetric processing for Frequency Diverse Array Multiple-Input Multiple-Output (FDA-MIMO) systems. First, the signal model of Polarimetric FDA-MIMO (PFDA-MIMO) is established. Based on this model, a joint spatial-polarimetric adaptive filtering method is proposed, and experiments are carried out to demonstrate the advantages of joint spatial-polarimetric processing in anti-jamming. Compared with filtering that uses only spatial information, this method performs better against multidimensional jamming and improves the ability of FDA-MIMO to resist main-lobe jamming.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531R (2024) https://doi.org/10.1117/12.3041773
Space debris tracking is a crucial aspect of addressing threats posed by space debris, ensuring the safety and reliability of space missions. Traditional methods based on correlation filters rely on image features and prior knowledge of the target, while deep learning-based approaches require complex designs of header networks and loss functions, making them difficult to train. In this paper, we propose a space debris tracking algorithm that models the continuous motion of targets. This approach transforms the problem from generating target position information on a single image to generating target position information on a sequence of continuous images. When generating target sequences, the model no longer relies solely on visual features from the current search image but also takes the target's historical motion trajectory as spatiotemporal cues input to the decoder. This enables the model to extract motion patterns of the target, allowing it to track the target even when temporarily occluded in the image sequence. Experimental results on a real open-source space debris tracking dataset demonstrate the effectiveness of the proposed algorithm compared to six traditional methods based on correlation filters and deep learning.
Chuang Weng, Dengchao Peng, Danrui Zhang, Xiwen Gong
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531S (2024) https://doi.org/10.1117/12.3041746
To address the technical challenges of data transmission efficiency in the software design phase of a CAN-Ethernet gateway, an optimization scheme for user-space protocol stacks based on DPDK (Data Plane Development Kit) technology is proposed. Traditional kernel-based packet processing methods incur significant performance overhead due to frequent interrupts. With DPDK, packet processing is done directly in user space, avoiding the performance losses caused by interrupts. This study designs a DPDK user-space protocol stack based on the Loongson 2k1000LA development board and the LoongOS operating system, implementing support for the ARP, ICMP, and UDP protocols. Performance comparison tests against traditional kernel protocol stacks show that this solution significantly improves gateway data throughput, reduces data processing latency, and lowers CPU utilization, thereby enhancing the overall performance of the CAN-Ethernet gateway. In addition, KNI (Kernel NIC Interface) technology is used to handle special packets, ensuring system flexibility and scalability.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531T (2024) https://doi.org/10.1117/12.3041698
The current focus in the development of space-based synthetic aperture radar (SAR) is on compact payloads weighing less than 100 kg that can achieve resolutions finer than 1 m. This focus is driven by the need to launch multiple satellites on a single rocket and to reduce cost. This paper describes the development of a Ku-band SAR payload for the Taijing-4(03) Satellite, which was launched on January 23, 2024 alongside four other satellites. The SAR payload was designed to fit the specifications of a micro-nano satellite platform, resulting in a lightweight, compact layout weighing under 80 kg that is integrated with the plate-shaped satellite platform. The paper also presents a strategy for optimizing the beams of the phased array SAR antenna, which significantly enhances the performance of the SAR system. The SAR payload offers several operational modes, including slide-spot, strip, scan 1, and scan 2, with a best achievable resolution finer than 1 m. Extensive in-orbit testing of the payload yielded numerous high-quality SAR images that can be used for emergency disaster response, ecosystem preservation, forest monitoring, crop management, sea ice tracking, and other applications.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531U (2024) https://doi.org/10.1117/12.3040984
In recent years, research on direct position determination (DPD) for non-circular signals using multiple arrays has attracted significant interest. This paper presents a novel DPD algorithm that achieves high accuracy with low complexity. The proposed algorithm first performs virtual aperture expansion on the non-circular signals received at multiple stations, which enhances performance. It then applies a unitary transformation to convert complex-valued calculations into real-valued ones, substantially reducing computational complexity. Finally, by combining subspace data fusion with the Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT) algorithm, the source positions are estimated accurately. Simulation results indicate that the algorithm attains high localization accuracy while maintaining low complexity.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531V (2024) https://doi.org/10.1117/12.3041852
To improve the accuracy, stability and reliability of the broadcasting timing service, the quality of the broadcast timing signal needs to be monitored. This paper therefore studies a receiving and monitoring algorithm for frequency modulation (FM) broadcast timing signals, and designs and implements a timing monitoring system. The basic principles of FM broadcast timing and timing monitoring are described. Because the power of the Subsidiary Communications Authorization (SCA) channel is only 1/10 that of the audio channel, the demodulated signal is completely distorted if traditional demodulation methods are used, so different demodulation algorithms for FM broadcast timing signals are studied. Simulation results show that combining matched filtering with cross-correlation improves the signal-to-noise ratio (SNR) by about 15 dB. The signal receiving and processing flow of the monitoring system is designed, and an FM broadcast timing monitoring system is built to collect timing data and evaluate the broadcast timing performance. The experimental results show that the monitoring measurement resolution can reach 100 ns, the demodulation accuracy of broadcast signal reception is better than 50 μs, and the broadcast timing accuracy can reach the sub-millisecond level.
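The general idea of recovering a timing mark by correlating the received signal against a known template can be sketched in Python with NumPy. This is a generic illustration, not the paper's algorithm: the ±1 code, its length, the delay and the noise level are all made-up values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical known timing template: a random +/-1 code of 127 chips.
template = rng.choice([-1.0, 1.0], size=127)

# Build a noisy received record containing the template at an unknown delay.
true_delay = 300
received = np.zeros(1024)
received[true_delay:true_delay + template.size] += template
received += rng.normal(scale=1.0, size=received.size)  # per-sample SNR of 0 dB

# Cross-correlate with the template; for a known waveform in white noise this
# is equivalent to matched filtering. Index k of the output corresponds to
# aligning the template with received[k : k + len(template)].
corr = np.correlate(received, template, mode="valid")
estimated_delay = int(np.argmax(corr))

# Ideal coherent integration gain for an N-chip template in white noise.
processing_gain_db = 10.0 * np.log10(template.size)

print(f"estimated delay = {estimated_delay} samples")
print(f"ideal processing gain = {processing_gain_db:.1f} dB")
```

The correlation peak stands out because the template energy adds coherently over all chips while the noise adds incoherently, which is the source of the SNR improvement that correlation-based receivers rely on.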
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531W (2024) https://doi.org/10.1117/12.3041229
Based on the characteristics of synthetic aperture radar (SAR) images, this paper designs a lightweight generative adversarial network (GAN) model that uses a dual-channel mode to read two SAR images from different angles simultaneously and generate new images containing multi-angle information, thereby achieving data expansion. The model uses an attention mechanism module to enhance the expressiveness of key features and improve the quality of the generated images. Images synthesized by this method incorporate information from multiple viewing angles and contain more detailed target features. Experiments show that the fused image obtained by this method has high information entropy and is highly similar to the source images. Additionally, the effectiveness of the data expansion is demonstrated using the single shot multibox detection (SSD) framework.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531X (2024) https://doi.org/10.1117/12.3041592
Noise can seriously affect the intelligent decoding and recognition of underground pipelines in Ground-Penetrating Radar (GPR) data. This paper therefore proposes a deep unfolding network (GPR-DUNet) for denoising GPR underground pipeline data. First, in the gradient descent part, a gated gradient descent module is designed to unfold the proximal gradient descent algorithm, which enables the network to be trained even when the degradation matrix is unknown. Second, in the denoiser part, a denoising proximal mapping module is constructed to obtain features of the GPR image at different scales, using a group-normalized channel attention mechanism and a simplified locally enhanced feed-forward network to substantially improve denoising performance. Finally, a cross-stage feature fusion submodule is designed to address information loss between stages. Experimental results on real and simulated GPR images show that the method denoises GPR images effectively.
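For orientation, the classical proximal gradient descent iteration that deep unfolding networks take as their starting point can be sketched as follows. This is the textbook ISTA form with a known matrix `A` and a soft-thresholding proximal operator; both are assumptions made for illustration, whereas the network described above replaces these fixed steps with learned modules and does not require the degradation matrix.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||x||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def objective(A, y, x, lam):
    """0.5 * ||A x - y||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x))


def ista(A, y, lam, n_iter=50):
    """Proximal gradient descent: a gradient step followed by a proximal step."""
    # A step of 1/L, with L the squared spectral norm of A, guarantees that
    # the objective never increases from one iteration to the next.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)                             # data-fidelity gradient
        x = soft_threshold(x - step * grad, step * lam)      # proximal mapping
    return x


# Toy problem with synthetic data (all values are illustrative).
rng = np.random.default_rng(0)
A = rng.normal(size=(40, 80))
x_true = np.zeros(80)
x_true[[5, 20, 60]] = [1.5, -2.0, 1.0]
y = A @ x_true + 0.05 * rng.normal(size=40)
lam = 0.5

x_5 = ista(A, y, lam, n_iter=5)
x_50 = ista(A, y, lam, n_iter=50)
print(objective(A, y, x_5, lam), objective(A, y, x_50, lam))
```

In an unfolded network each loop iteration becomes one stage, with the step size, the gradient operator and the proximal mapping made trainable.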
Sihang Zhang, Zhi Yang, Kuanjun Zhu, Hongbin Xie, Bin Liu, Lixian Zhou, Junhui Li
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531Y (2024) https://doi.org/10.1117/12.3040968
Green Plastic Cover (GPC) around power transmission towers is one of the main external hazards to power transmission lines, and remote sensing identification of GPC is of great significance for managing these hazards. Existing remote sensing identification studies mainly focus on plastic greenhouses and plastic mulches, with relatively few studies on GPC, especially in areas around power transmission towers. This research therefore selected four areas along transmission corridors in Jiangsu Province as a case study and improved GPC mapping performance by integrating the focal loss function into the TransUNet model using Sentinel-1 and Sentinel-2 data. Comparative experiments with the U-Net and Deeplabv3+ models were then conducted to validate the superiority of the selected model. The experimental results demonstrate that the TransUNet model performs well in extracting GPC, with Precision, Recall, IoU and F1 values reaching 82.24%, 92.38%, 77.01% and 0.87, respectively. Using this model to identify GPC along transmission corridors is feasible and effective, and can provide a decision-making basis for the comprehensive management of external damage risk.
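The reported metrics can be cross-checked against each other, since F1 and IoU are both determined by Precision and Recall. The Python sketch below does this, and also shows the standard per-pixel binary focal loss for reference. The `gamma` and `alpha` defaults are commonly used values and are assumptions here; the abstract does not give the settings used in the study.

```python
import math


def f1_from_pr(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def iou_from_pr(precision, recall):
    """IoU = TP / (TP + FP + FN), rewritten in terms of precision and recall."""
    return precision * recall / (precision + recall - precision * recall)


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one pixel: -alpha_t * (1 - p_t)^gamma * log(p_t).

    p is the predicted probability of the positive class, y is the 0/1 label.
    gamma and alpha are assumed defaults, not values from the paper.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


precision, recall = 0.8224, 0.9238          # values from the abstract
f1 = f1_from_pr(precision, recall)
iou = iou_from_pr(precision, recall)
print(f"F1 = {f1:.4f}, IoU = {iou:.4f}")

# Well-classified pixels contribute far less loss than hard ones.
print(binary_focal_loss(0.9, 1), binary_focal_loss(0.6, 1))
```

The derived F1 and IoU agree with the reported 0.87 and 77.01% to within rounding, so the four figures are mutually consistent.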
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531Z (2024) https://doi.org/10.1117/12.3041203
This article proposes a small broadband dual-polarized antenna for base stations. The antenna consists of vertically positioned ±45° dipole elements with a metallic reflector and uses a W-shaped feeding structure, resulting in a simple antenna structure. Miniaturization is achieved by introducing slots at the center of the patch and adding two rectangular patches at the ends of the dipole arms above the substrate. To stabilize the radiation pattern, four baffles are added along the edges of the square reflector. The antenna element size is only 0.28λL×0.28λL×0.21λL, where λL denotes the maximum wavelength within the operating frequency band of the antenna. The simulation results show that the antenna achieves a return loss of less than -15 dB over 1.7-2.83 GHz, isolation between the two ports of less than -30 dB, a 3 dB beamwidth of 64±8°, and a consistent gain of 8.7±0.9 dBi. The antenna has broad application prospects in base stations.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325320 (2024) https://doi.org/10.1117/12.3042532
In the context of active sonar, automatic tracking faces challenges in strong reverberation and high background noise environments. This paper addresses the difficulties in automatic tracking by proposing new algorithms for track initiation and track maintenance, incorporating normalized information gain and existing multiple hypothesis correlation tracking algorithms. This approach partially mitigates the issue of excessive false alarms in active target tracking under complex conditions. The effectiveness of this improved algorithm is validated through simulation and experimentation.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325321 (2024) https://doi.org/10.1117/12.3041909
In order to improve the efficiency of spatial data transmission and meet different service requirements, this study effectively manages multiple data streams through virtual channel multiplexing. In particular, for asynchronous data streams, this study employs an innovative Zebra Optimization Algorithm for scheduling. The algorithm takes into account key factors such as the priority of each service, the delay tolerance and the urgency of the remaining amount of frame data when deciding the scheduling order of the virtual channel. As verified by simulation experiments, this optimized scheduling strategy for asynchronous channels achieves significant improvement in scheduling efficiency, especially in reducing the average scheduling delay and reducing the frame data residual.
Proceedings Volume Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 1325322 (2024) https://doi.org/10.1117/12.3041037
Enhancing ocean phenomena in SAR images is an important prerequisite for using multi-pol SAR images for ocean phenomenon detection. This paper proposes a polarization decomposition method based on the characteristics of multi-pol SAR images and the scale differences between the background waves and the ocean phenomenon. The images are first processed by polarization decomposition, and then the small-scale resonance scattering component and the breaking-wave component are extracted and fused. The experimental results show that the proposed method effectively preserves information and enhances the details of the ocean phenomenon.