Three-dimensional reconstruction of bronchopulmonary segments from computed tomography (CT) is critical for lesion and lung cancer localization and for surgical resection. However, there is currently no fast and accurate method for three-dimensional reconstruction of pulmonary segments, and labeling pulmonary segments requires additional information such as bronchi and blood vessels, which consumes considerable time and mental effort from physicians. In this paper, based on the principle of pulmonary segment division, we propose a fast two-stage pulmonary segment division method based on segmental bronchi. Specifically, for a CT image, we employ two well-trained nnUNet models in the first stage to accurately segment the 5 lobes and the 18 segmental bronchi, respectively. This design reflects that each pulmonary segment should encompass its corresponding segmental bronchus, while lung lobe boundaries are more distinct than those of pulmonary segments. In the second stage, we consider the distance from each voxel to the segmental bronchi of the pulmonary segments within its lobe and further divide each lobe to obtain the final 18 segments. Finally, we visually validated the results using the principle that pulmonary veins serve as demarcations between pulmonary segments.
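A minimal sketch of the second-stage assignment, assuming the lobe and bronchus labels are available as NumPy volumes; the function name and arguments are illustrative, not the paper's implementation:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def divide_lobe(lobe_mask, bronchi_labels, segment_ids):
    """Assign every voxel of one lobe to the pulmonary segment whose
    segmental bronchus is closest (Euclidean distance)."""
    # Distance from every voxel to each segmental bronchus of this lobe.
    dists = np.stack([distance_transform_edt(bronchi_labels != s)
                      for s in segment_ids])            # (S, D, H, W)
    nearest = np.argmin(dists, axis=0)                  # index of the closest bronchus
    segments = np.zeros_like(lobe_mask, dtype=np.int16)
    inside = lobe_mask > 0
    segments[inside] = np.asarray(segment_ids)[nearest[inside]]
    return segments
```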
Accurately segmenting complete teeth from CBCT images is not only crucial in the field of orthodontics but also of significant importance in forensic science. However, fully automated teeth segmentation is challenging due to the tight interconnection of teeth and the complexity of their arrangement, as well as the difficulty of distinguishing them from the surrounding alveolar bone, which has a similar density. U-Net-based approaches have demonstrated remarkable success across a spectrum of medical image processing tasks, particularly segmentation. This work compares several U-Net-based segmentation methods (U-Net, U-Net++, U2-Net, nnU-Net, and TransUNet) on clinical teeth segmentation. We assess the enhancements these networks introduce over the original U-Net and validate their performance on the same dataset. Experimental results, both qualitative and quantitative, reveal that all methods perform well, with TransUNet demonstrating the best performance, achieving a Dice coefficient of 0.9364. Notably, U-Net, serving as the foundational model, outperforms U-Net++ and U2-Net, highlighting its robust generalization capability.
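The Dice coefficient reported above follows its standard definition; a generic sketch of the metric (not code from the compared frameworks) is:

```python
import numpy as np

def dice_coefficient(pred, gt, eps=1e-7):
    """Dice similarity coefficient between two binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return (2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)
```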
The segmentation of pulmonary arteries and veins in computed tomography scans is crucial for the diagnosis and assessment of pulmonary diseases. This paper discusses the challenges in segmenting these vascular structures, such as the classification of terminal pulmonary vessels relying on information from distant root vessels, and the complex branches and crossings of arteriovenous vessels. To address these difficulties, we introduce a fully automatic segmentation method that utilizes a multiple 3D residual U-blocks module, a semantic embedding module, and a semantic perception module (SPM). The 3D residual U-blocks module extracts multi-scale features under a large receptive field, the semantic embedding module embeds semantic information to help the network exploit the anatomical characteristic that pulmonary arteries run parallel to the bronchi, and the SPM perceives semantic information and decodes it into classification results for pulmonary arteries and veins. Our approach was evaluated on a dataset of 57 lung CT scans and demonstrated competitive performance compared to existing medical image segmentation models.
Monocular depth estimation is a widely studied task. Owing to the difficulty of obtaining true depth labels for the bronchus and to the characteristics of bronchial images, such as scarce texture, smooth surfaces, and many holes, bronchial depth estimation poses many challenges. Hence, we propose using a ray tracing algorithm to generate virtual images along with their corresponding depth maps to train an asymmetric encoder-decoder transformer network for bronchial depth estimation. We propose an edge-aware unit to enhance awareness of the bronchial internal structure, considering that the bronchus has few texture features and many edges and holes, and we propose an asymmetric encoder-decoder for multi-layer feature fusion. Experimental results on virtual bronchial images demonstrate that our method achieves the best results on several metrics, including an MAE of 0.915 ± 0.596 and an RMSE of 1.471 ± 1.097.
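The MAE and RMSE figures above follow the standard definitions; a generic sketch (not the paper's evaluation code) is:

```python
import numpy as np

def depth_errors(pred, gt):
    """Mean absolute error and root-mean-square error between depth maps."""
    diff = pred.astype(np.float64) - gt.astype(np.float64)
    mae = np.abs(diff).mean()
    rmse = np.sqrt((diff ** 2).mean())
    return mae, rmse
```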
Optical coherence tomography (OCT) is a non-invasive imaging modality suitable for assessing retinal diseases. Since the thickness and shape of the retinal layers are diagnostic indicators for many ophthalmic diseases, segmentation of the retinal layers in OCT images is a critical step. Much effort has gone into automated segmentation of OCT images, but challenges remain, such as the lack of context information, ambiguous boundaries, and inconsistent prediction of retinal lesion regions. In this work, we propose a new framework of Densely Encoded Attention Networks (DEAN) that combines dense encoders with position attention in a U-architecture for retinal layer segmentation. Since the spatial position of each layer in an OCT image is relatively fixed, we use convolution in dense connections to obtain diverse feature maps in the encoder and employ position attention to improve the spatial information of the learning targets. Moreover, up-sampling and skip connections in the decoder restore resolution using the position indices saved during down-sampling, while supplementing the corresponding pixels to guide the network in capturing global context information. This method is evaluated on two public datasets, and the results demonstrate that our method is an effective strategy for improving the performance of retinal layer segmentation.
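A minimal PyTorch sketch of index-preserving down/up-sampling of the kind described, where pooling indices saved in the encoder are reused by the decoder; tensor sizes and names are illustrative:

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)

x = torch.randn(1, 64, 128, 128)   # an encoder feature map
down, idx = pool(x)                # down-sampled features + saved position indices
up = unpool(down, idx)             # decoder restores the original 128x128 resolution
```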
Pulmonary vessel segmentation from CT images is essential for the diagnosis and treatment of lung diseases, particularly in treatment planning and clinical outcome evaluation. The main challenges for pulmonary vessel segmentation are the complicated structures of the vascular trees and their intensity values, which are similar to those of other tissues such as the tracheal wall and lung nodules. This paper presents a novel relation extractor U-shaped network combining convolution and a self-attention mechanism in an encoder-decoder mode. In particular, we employ convolution in the shallow layers to extract local information of vessels over a short range and apply self-attention in the deep layers to capture the long-range contextual relationship between ancestors and descendants of the vascular tree. We evaluate our proposed method on 50 computed tomography volumes, with the experimental results showing that our method can improve the average Dice coefficient and recall to 85.60 and 86.04, respectively.
To handle multi-task segmentation, detection, and classification of colon polyps, and to address the clinical problems of small polyps against similar backgrounds, missed detections, and difficult classification, we develop a computer-based method that supports early diagnosis and correct treatment in gastrointestinal endoscopy. We apply a residual U-structure network with image processing to segment polyps, and a Dynamic Attention Deconvolutional Single Shot Detector (DAD-SSD) to classify various polyps in colonic narrow-band images. The residual U-structure network is a two-level nested U-structure that captures more contextual information, and the image processing improves the segmentation. DAD-SSD consists of an Attention Deconvolutional Module (ADM) and a Dynamic Convolutional Prediction Module (DCPM) to extract and fuse context features. We evaluated the method on narrow-band images, and the experimental results validate its effectiveness in dealing with such multi-task detection and classification. In particular, the mean average precision (mAP) and accuracy, at 76.55% and 74.4% respectively, are superior to the other methods in our experiment.
Accurate pulmonary nodule segmentation in computed tomography (CT) images is of great importance for early diagnosis and analysis of lung diseases. Although deep convolutional network-driven medical image analysis methods have been reported for this segmentation task, it remains a challenge to precisely extract lung nodules from CT images due to their various types and shapes. This work proposes an effective and efficient deep learning framework called enhanced square U-Net (ESUN) for accurate pulmonary nodule segmentation. We trained and tested our proposed method on the publicly available LUNA16 data. The experimental results show that our proposed method achieves a Dice coefficient of 0.6896, better than other approaches, with high computational efficiency, while significantly reducing the number of network parameters from 44.09M to 7.36M.
The large variation in shape and location of the pancreas and the complex background of many neighboring tissues hinder pancreas segmentation and thus the early detection and diagnosis of pancreatic diseases. The U-Net family has achieved great success in various medical image processing tasks such as segmentation and classification. This work comparatively evaluates 2D U-Net, 2D U-Net++, and 2D U-Net3+ for CT pancreas segmentation. We also modify the U-Net series by replacing standard convolution with depth-wise separable convolution (DWC). Without DWC, U-Net3+ works better than the other two networks and achieves an average Dice similarity coefficient of 0.7555. More interestingly, we find that U-Net plus a simple DWC module works better than U-Net++ with its redesigned dense skip connections and U-Net3+ with its full-scale skip connections and deep supervision, obtaining an average Dice similarity coefficient of 0.7613. Moreover, the U-Net series plus DWC significantly reduces the number of training parameters from (39.4M, 47.2M, 27.0M) to (14.3M, 18.4M, 3.15M), respectively, while also improving the Dice similarity compared to standard convolution.
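A minimal PyTorch sketch of a depth-wise separable convolution used as a drop-in replacement for a standard 3x3 convolution; the class name and channel arguments are illustrative:

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """3x3 depth-wise convolution followed by a 1x1 point-wise convolution."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3,
                                   padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))
```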
Kidney segmentation is fundamental for accurate diagnosis and treatment of kidney diseases. Computed tomography urography imaging is commonly used for the radiologic diagnosis of patients with urologic disease. Recently, 2D and 3D fully convolutional networks have been widely employed for medical image segmentation. However, most 2D fully convolutional networks do not take inter-slice spatial information into consideration, resulting in incomplete and inaccurate segmentation of targets in 3D volumes, even though this spatial information is truly important for 3D volume segmentation. To tackle these problems, we propose a computed tomography urography kidney segmentation method based on spatiotemporal fully convolutional networks that employ a convolutional long short-term memory network to model inter-slice features of computed tomography urography images. We trained and tested our proposed method on kidney computed tomography urography data. The experimental results demonstrate that our proposed method can effectively leverage inter-slice spatial information to achieve better (or comparable) results than current 2D and 3D fully convolutional networks.
Abdominal kidney segmentation plays an essential role in the diagnosis and treatment of kidney diseases, particularly in surgical planning and clinical outcome analysis before and after kidney surgery. It remains challenging to precisely segment the kidneys from CT images. Current segmentation approaches still suffer from CT image noise and variations caused by different CT scans, kidney location discrepancy, pathological morphological diversity among patients, and partial volume artifacts. This paper proposes a fully automatic kidney segmentation method that employs a volumetric convolution driven cascaded V-Net architecture and false positive reduction to precisely extract the kidney regions. We evaluate our method on publicly available kidney CT data. The experimental results demonstrate that our proposed method is a promising method for accurate kidney segmentation, providing a Dice coefficient of 0.95, better than other approaches, with less computational time.
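The abstract does not specify the false-positive-reduction step; one common form of such post-processing, shown purely for illustration and not necessarily the paper's, keeps only the largest connected components of the binary prediction:

```python
import numpy as np
from scipy.ndimage import label

def keep_largest_components(mask, n=2):
    """Keep the n largest connected components (e.g., the two kidneys)."""
    labeled, num = label(mask)
    if num == 0:
        return mask
    sizes = np.bincount(labeled.ravel())[1:]        # component sizes, background excluded
    keep = np.argsort(sizes)[::-1][:n] + 1          # label values of the n largest
    return np.isin(labeled, keep).astype(mask.dtype)
```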
Endoscopic video sequences provide surgeons with rich structural information (e.g., vessels and neurovascular bundles) that guides them in accurately manipulating various surgical tools and avoiding surgical risks. Unfortunately, it is difficult for surgeons to intuitively perceive these small structures, whose pulsation motion is tiny, in endoscopic images. This work proposes a new endoscopic video motion magnification method to accurately generate an amplified pulsation motion that surgeons can intuitively and easily visualize. The proposed method explores a new temporal filter for Eulerian motion magnification to precisely magnify the tiny pulsation motion while simultaneously suppressing noise and artifacts in endoscopic videos. We evaluate our approach on surgical endoscopic videos acquired during robotic prostatectomy. The experimental results demonstrate that our proposed temporal filtering method substantially outperforms the filters used in current video motion magnification approaches, providing better visual quality and quantitative assessment than other methods.
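For context, a generic Eulerian motion magnification step temporally band-passes each pixel's intensity around the pulse band and adds the amplified band back; the sketch below uses an off-the-shelf Butterworth band-pass for illustration only and does not reproduce the paper's proposed temporal filter:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def magnify_pulsation(frames, fps, low_hz=1.0, high_hz=2.0, alpha=10.0):
    """frames: array of shape (T, H, W); returns frames with the temporal
    band around the pulse frequency amplified by alpha."""
    b, a = butter(2, [low_hz / (fps / 2), high_hz / (fps / 2)], btype="band")
    band = filtfilt(b, a, frames.astype(np.float64), axis=0)  # temporal band-pass
    return frames + alpha * band
```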
Three-dimensional (3-D) scene reconstruction from stereoscopic binocular laparoscopic videos is an effective way to expand the limited surgical field and augment the structure visualization of the organ being operated on in minimally invasive surgery. However, currently available reconstruction approaches are limited by image noise, occlusions, and textureless and blurred structures. In particular, an endoscope inside the body has only a limited light source, resulting in illumination non-uniformities in the visualized field. These limitations unavoidably deteriorate the stereo image quality and hence lead to low-resolution and inaccurate disparity maps, resulting in blurred edge structures in the 3-D scene reconstruction. This paper proposes an improved stereo correspondence framework that integrates cost-volume filtering with joint upsampling for robust disparity estimation. Joint bilateral upsampling, joint geodesic upsampling, and tree filtering upsampling were compared to enhance the disparity accuracy. The experimental results demonstrate that joint upsampling provides an effective way to boost disparity estimation and hence to improve 3-D reconstruction of the surgical endoscopic scene. Moreover, bilateral upsampling generally outperforms the other two upsampling methods in disparity estimation.
This paper studies uncalibrated stereo rectification and stable disparity range determination for surgical scene three-dimensional (3-D) reconstruction. Stereoscopic endoscope calibration is sometimes unavailable and also increases the complexity of the operating-room environment. Stereo from uncalibrated endoscopic cameras is an alternative for reconstructing the surgical field visualized by binocular endoscopes within the body. Uncalibrated rectification is usually performed on the basis of a number of matched feature points (semi-dense correspondence) between the left and right images of stereo pairs. After uncalibrated rectification, the corresponding feature points can be used to determine a proper disparity range, which helps improve the reconstruction accuracy and reduce the computational time of disparity map estimation. Therefore, the matching accuracy and robustness of feature point descriptors are important for surgical field 3-D reconstruction. This work compares four feature detectors, (1) scale invariant feature transform (SIFT), (2) speeded up robust features (SURF), (3) affine scale invariant feature transform (ASIFT), and (4) gauge speeded up robust features (GSURF), with applications to uncalibrated rectification and stable disparity range determination. We performed our experiments on surgical endoscopic video images collected during robotic prostatectomy. The experimental results demonstrate that ASIFT outperforms the other feature detectors in uncalibrated stereo rectification and also provides a stable disparity range for surgical scene reconstruction.
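A minimal OpenCV sketch of such an uncalibrated pipeline, assuming SIFT as the detector (swapping in SURF/ASIFT/GSURF only changes the detection step); file names and the ratio-test threshold are placeholders:

```python
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # placeholder paths
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(left, None)
kp2, des2 = sift.detectAndCompute(right, None)

# Match descriptors and keep good matches via Lowe's ratio test.
matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.7 * n.distance]
pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

# Uncalibrated rectification from the fundamental matrix.
F, inliers = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)
ok, H1, H2 = cv2.stereoRectifyUncalibrated(pts1, pts2, F, left.shape[::-1])

# The horizontal offsets of the rectified matches bound the stable disparity range.
```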
This paper proposes an adaptive fiducial-free registration method that uses a multiple-point selection strategy based on sensor orientation and endoscope radius information. To develop a flexible endoscopy navigation system in which an electromagnetic tracker with positional sensors estimates bronchoscope movements, we must synchronize the tracker and pre-operative image coordinate systems using either marker-based or fiducial-free registration methods. Fiducial-free methods assume that bronchoscopes are operated along bronchial centerlines. Unfortunately, this assumption is easily violated during interventions. To relax this strong assumption, we utilize an adaptive strategy that generates multiple points from the sensor measurements and the bronchoscope radius information. From these generated points, we adaptively choose the optimal point, namely the one closest to its assigned bronchial centerline, to perform registration. The experimental results from phantom validation demonstrate that our proposed adaptive strategy significantly improved the fiducial-free registration accuracy from at least 5.4 mm to 2.2 mm compared to currently available methods.
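A small NumPy sketch of the adaptive selection step as described: from the candidate points generated around a sensor measurement, pick the one closest to the assigned bronchial centerline (array names are illustrative):

```python
import numpy as np

def choose_optimal_point(candidates, centerline):
    """candidates: (N, 3) points generated from the sensor measurement and
    the bronchoscope radius; centerline: (M, 3) sampled centerline points.
    Returns the candidate closest to the centerline."""
    d = np.linalg.norm(candidates[:, None, :] - centerline[None, :, :], axis=-1)
    return candidates[np.argmin(d.min(axis=1))]
```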
Localization of a bronchoscope and estimation of its motion is a core component of a bronchoscopic navigation system that can guide physicians in performing bronchoscopic interventions such as transbronchial lung biopsy (TBLB) and transbronchial needle aspiration (TBNA). To overcome the limitations of current methods, e.g., image registration (IR) and electromagnetic (EM) localizers, this study develops a new external tracking technique based on an optical mouse (OM) sensor and IR augmented by sequential Monte Carlo (SMC) sampling (here called IR-SMC). We first construct an external tracking model with an OM sensor that is used to directly measure bronchoscope movement information, including the insertion depth and the rotation of the viewing direction of the bronchoscope. To utilize the OM sensor measurements, we employ IR with SMC sampling to determine the bronchoscopic camera motion parameters. The proposed method was validated on a dynamic phantom. Experimental results demonstrate that our external tracking prototype is a promising means to estimate bronchoscope motion compared to the state of the art, especially image-based methods, improving the tracking performance by 17.7% in terms of successfully processed video images.
This paper presents an improved bronchoscope tracking method for bronchoscopic navigation using scale-invariant features and sequential Monte Carlo sampling. Although image-based methods are widely discussed in the bronchoscope tracking community, they are still limited to characteristic information such as bronchial bifurcations or folds and cannot automatically resume the tracking procedure after failures, which usually result from problematic bronchoscopic video frames or airway deformation. To overcome these problems, we propose a new approach that integrates scale-invariant feature-based camera motion estimation into sequential Monte Carlo sampling to achieve accurate and robust tracking. In our approach, sequential Monte Carlo sampling is employed to recursively estimate the posterior probability densities of the bronchoscope camera motion parameters according to an observation model based on scale-invariant feature-based camera motion recovery. We evaluate our proposed method on patient datasets. Experimental results illustrate that our proposed method can track a bronchoscope more accurately and robustly than the current state-of-the-art method, increasing the tracking performance by 38.7% without using an additional position sensor.
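For illustration, one sequential-Monte-Carlo update over camera motion parameters can be sketched as below; the observation likelihood (an arbitrary `observe` callable here) abstracts away the SIFT-based motion recovery and image similarity, and the names and noise model are not the paper's exact formulation:

```python
import numpy as np

def smc_step(particles, weights, observe, motion_noise=0.01):
    """particles: (N, D) camera motion parameter hypotheses; observe(p) returns
    the observation likelihood of a hypothesis. Predict, weight, and resample."""
    particles = particles + np.random.normal(0.0, motion_noise, particles.shape)
    weights = weights * np.array([observe(p) for p in particles])
    weights = weights / weights.sum()
    idx = np.random.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```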
Image-guided bronchoscopy usually requires tracking the bronchoscope camera position and orientation to align the preinterventional 3-D computed tomography (CT) images to the intrainterventional 2-D bronchoscopic video frames. Current state-of-the-art image-based algorithms often fail in bronchoscope tracking due to a lack of information on the depth and the rotation around the viewing (running) direction of the bronchoscope camera. To address these problems, this paper presents a novel bronchoscope tracking method for bronchoscopic navigation based on a low-cost optical mouse sensor, bronchial structure information, and image registration. We first utilize an optical mouse sensor to automatically measure the insertion depth and the rotation of the viewing direction along the bronchoscope. We integrate the outputs of this 2-D sensor by performing centerline matching on the basis of bronchial structure information before optimizing the bronchoscope camera motion parameters during image registration. Our new method is assessed on phantom data. Experimental results illustrate that the proposed method is a promising means for bronchoscope tracking, significantly improving the tracking performance compared to our previous image-based method.
This paper presents a hybrid camera tracking method that uses electromagnetic (EM) tracking and intensity-based image registration, and its evaluation on a dynamic motion phantom. As respiratory motion can significantly affect rigid registration of the EM tracking and CT coordinate systems, a standard tracking approach that initializes intensity-based image registration with absolute pose data acquired by EM tracking will fail when the initial camera pose is too far from the actual pose. We propose two new schemes to address this problem. Both schemes intelligently combine absolute pose data from EM tracking with relative motion data derived from EM tracking and intensity-based image registration. These schemes significantly improve the overall camera tracking performance. We constructed a dynamic phantom simulating the respiratory motion of the airways to evaluate these schemes. Our experimental results demonstrate that these schemes can track a bronchoscope more accurately and robustly than our previously proposed method, even when the maximum simulated respiratory motion reaches 24 mm.
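The combination of absolute and relative pose data can be illustrated, under the stated idea, by composing the relative EM motion onto the previously registered camera pose rather than initializing registration with the absolute EM pose; the sketch below assumes 4x4 homogeneous transforms and is not the paper's exact scheme:

```python
import numpy as np

def predict_initial_pose(prev_registered_pose, em_prev, em_curr):
    """Compose the relative motion measured by the EM tracker between two
    frames onto the previously registered camera pose (4x4 matrices)."""
    em_relative = np.linalg.inv(em_prev) @ em_curr
    return prev_registered_pose @ em_relative
```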