KEYWORDS: Image segmentation, Medical imaging, Kidney, Ultrasonography, Monte Carlo methods, Performance modeling, Image enhancement, Data modeling, Reliability, Uncertainty analysis
The Segment Anything Model (SAM) is a recently developed general-purpose foundation model for image segmentation. It can use sparse manual prompts such as bounding boxes to generate pixel-level segmentations in natural images, but it struggles with medical images such as low-contrast, noisy ultrasound. We propose a refined test-phase prompt-augmentation technique designed to improve SAM's performance in medical image segmentation. The method couples multi-box prompt augmentation with an aleatoric-uncertainty-based false-negative (FN) and false-positive (FP) correction (FNPC) strategy. We evaluate the method on two ultrasound datasets and show improvements in SAM's accuracy and robustness to inaccurate prompts, without any further training or tuning. Moreover, we present the Single-Slice-to-Volume (SS2V) method, which enables 3D pixel-level segmentation from only a bounding-box annotation on a single 2D slice. Our results enable efficient use of SAM even in noisy, low-contrast medical images. The source code has been released at: https://github.com/MedICL-VU/FNPC-SAM
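As a minimal sketch of the multi-box prompt-augmentation idea (not the released FNPC-SAM implementation), the snippet below jitters a single bounding-box prompt and aggregates the resulting masks; `predict_mask` is a stand-in for any box-promptable segmenter such as a SAM predictor, and the jitter magnitude is an assumed parameter:

```python
import numpy as np

def augment_boxes(box, n=8, jitter=0.05, rng=None):
    """Jitter an (x0, y0, x1, y1) box prompt; `jitter` is a fraction of the box size."""
    rng = rng or np.random.default_rng(0)
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    noise = rng.uniform(-jitter, jitter, size=(n, 4)) * np.array([w, h, w, h])
    return np.asarray(box, dtype=float) + noise

def consensus_and_uncertainty(predict_mask, image, box, n=8):
    """Run the segmenter once per jittered box. The pixel-wise mean acts as a
    consensus mask; the variance is an aleatoric-uncertainty proxy that can
    flag likely false-negative/false-positive pixels for correction."""
    masks = np.stack([predict_mask(image, b) for b in augment_boxes(box, n)])
    return masks.mean(axis=0), masks.var(axis=0)
```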
KEYWORDS: Computed tomography, Image segmentation, Transformers, Windows, Medical imaging, 3D image processing, 3D modeling, Prostate, Magnetic resonance imaging, Data modeling
Convolutional neural networks (CNNs) exhibit strong performance in medical image segmentation tasks by capturing fine-grained local information, such as edges and textures. However, due to the limited field of view of convolution kernels, it is hard for CNNs to fully represent global information. Recently, transformers have shown good performance for medical image segmentation thanks to their ability to model long-range dependencies. Nevertheless, transformers struggle to capture fine-grained spatial features as effectively as CNNs. A good segmentation model should learn a representation combining local and global features to be both precise and semantically accurate. In our previous work, we proposed CATS, a U-shaped segmentation network augmented with a transformer encoder. In this work, we extend this model and propose CATS v2 with hybrid encoders. Specifically, the hybrid encoders consist of a CNN-based encoder path in parallel with a shifted-window transformer path; together, they better leverage both local and global information to produce robust 3D medical image segmentations. We fuse the information from the convolutional encoder and the transformer at the skip connections of different resolutions to form the final segmentation. The proposed method is evaluated on three public challenge datasets: Beyond the Cranial Vault (BTCV), Cross-Modality Domain Adaptation (crossMoDA), and task 5 of the Medical Segmentation Decathlon (MSD-5), segmenting abdominal organs, vestibular schwannoma (VS), and the prostate, respectively. Compared with state-of-the-art methods, our approach demonstrates superior performance in terms of higher Dice scores. Our code is publicly available at https://github.com/MedICL-VU/CATS.
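The sketch below illustrates one way such a fusion at a skip connection could look in PyTorch; the concatenation-plus-1x1-convolution fusion and the channel sizes are illustrative assumptions, not the CATS v2 configuration:

```python
import torch
import torch.nn as nn

class SkipFusion(nn.Module):
    """Fuse same-resolution features from a CNN path and a transformer path.

    Concatenation followed by a 1x1x1 convolution is one simple fusion choice;
    the channel counts below are placeholders."""
    def __init__(self, cnn_ch, tr_ch, out_ch):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv3d(cnn_ch + tr_ch, out_ch, kernel_size=1),
            nn.InstanceNorm3d(out_ch),
            nn.LeakyReLU(inplace=True),
        )

    def forward(self, cnn_feat, tr_feat):
        return self.fuse(torch.cat([cnn_feat, tr_feat], dim=1))

# Toy usage: two parallel encoders produce features at the same resolution.
cnn_feat = torch.randn(1, 32, 16, 16, 16)
tr_feat = torch.randn(1, 48, 16, 16, 16)
fused = SkipFusion(32, 48, 64)(cnn_feat, tr_feat)  # -> (1, 64, 16, 16, 16)
```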
Automated segmentation of multiple sclerosis (MS) lesions from MRI scans is important for quantifying disease progression. In recent years, convolutional neural networks (CNNs) have shown top performance for this task when a large amount of labeled data is available. However, the accuracy of CNNs suffers when dealing with few and/or sparsely labeled datasets. A potential solution is to leverage the information available in large public datasets in conjunction with a target dataset that has only limited labeled data. In this paper, we propose a training framework, SSL2 (self-supervised + semi-supervised), for multi-modality MS lesion segmentation with limited supervision. We adopt self-supervised learning to leverage the knowledge in large public 3T datasets to tackle the limitations of a small 7T target dataset. To leverage the information in unlabeled 7T data, we also evaluate state-of-the-art semi-supervised methods under other limited-annotation settings, such as small labeled training sets and sparse annotations. We use the shifted-window (Swin) transformer as our backbone network. The effectiveness of the self-supervised and semi-supervised training strategies is evaluated on our in-house 7T MRI dataset. The results indicate that each strategy improves lesion segmentation both for limited training data and for sparse-labeling scenarios, and the combined framework improves performance substantially over either component alone. Our proposed framework thus provides a promising solution for future data- and label-hungry 7T MS studies.
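The overall flow is to pretrain the backbone on unlabeled 3T data with a pretext loss, then fine-tune on the small labeled 7T set (optionally with a semi-supervised term on unlabeled 7T scans). As a stand-in, since the abstract does not enumerate the exact pretext tasks, a minimal masked-reconstruction pretext loss might look like:

```python
import torch
import torch.nn.functional as F

def masked_recon_loss(backbone, volume, mask_ratio=0.4):
    """Generic masked-reconstruction pretext loss: hide random voxels of an
    unlabeled 3T volume and train the backbone to inpaint them. This is an
    illustrative stand-in, not the SSL2 pretext tasks; `backbone` is assumed
    to map a volume to a same-shaped reconstruction."""
    mask = (torch.rand_like(volume) < mask_ratio).float()
    recon = backbone(volume * (1.0 - mask))
    return F.mse_loss(recon * mask, volume * mask)
```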
KEYWORDS: Image segmentation, Medical imaging, Education and training, Content addressable memory, Control systems, Magnetic resonance imaging, Machine learning, Data modeling, White matter, Biomedical applications
Medical image harmonization aims to transform the image 'style' among heterogeneous datasets while preserving the anatomical content. It enables data-sensitive, learning-based approaches to fully leverage large multi-site datasets with different image acquisitions. Recently, the attention mechanism has achieved excellent performance on image-to-image (I2I) translation of natural images. In this work, we explore the potential of the attention mechanism to improve medical image harmonization. We introduce, for the first time in the context of cross-scanner MRI harmonization, two attention-based frameworks with outstanding performance in the natural I2I scenario. We compare them with existing, commonly used harmonization frameworks by evaluating their ability to enhance a downstream subcortical segmentation task on T1-weighted (T1w) MRI datasets from 1.5T vs. 3T scanners. Both qualitative and quantitative results show that the attention mechanism contributes a noticeable improvement in harmonization ability.
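As a concrete example of the kind of module involved, the snippet below shows a SAGAN-style self-attention block of the sort attention-based I2I generators insert between convolutional layers; it is illustrative only, not one of the specific frameworks compared in this work:

```python
import torch
import torch.nn as nn

class SelfAttention2d(nn.Module):
    """SAGAN-style self-attention block (requires ch >= 8). Each spatial
    position attends to all others, adding a long-range pathway that plain
    convolutions lack; the residual weight gamma starts at zero."""
    def __init__(self, ch):
        super().__init__()
        self.q = nn.Conv2d(ch, ch // 8, 1)
        self.k = nn.Conv2d(ch, ch // 8, 1)
        self.v = nn.Conv2d(ch, ch, 1)
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)   # (b, hw, c//8)
        k = self.k(x).flatten(2)                   # (b, c//8, hw)
        attn = torch.softmax(q @ k, dim=-1)        # (b, hw, hw)
        v = self.v(x).flatten(2)                   # (b, c, hw)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                # learned residual mix
```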
Recently, deep-learning methods have achieved human-level performance on multiple sclerosis (MS) lesion segmentation. However, most established methods are not robust enough for practical use in the real world: they do not generalize well to images obtained from different clinical sites, or when the training and testing datasets contain different MRI modalities. To address these robustness issues and bring deep neural networks closer to clinical use, we propose adding data augmentation and modality dropout during training to achieve unsupervised domain generalization. We hypothesize that data augmentation can close the gap between different datasets and render the trained models more generalizable, and that random modality dropout can help the model learn to predict results given any combination of MRI modalities. We conducted an extensive set of comparisons on three publicly available datasets and demonstrate that our method outperforms the baseline without any augmentation and approaches the performance of fully supervised methods. To provide a fair comparison with other MS lesion segmentation methods, we evaluate our models, trained on the other two datasets, on the test set of the Longitudinal MS Lesion Segmentation Challenge. The overall score of our approach is substantially higher than current transfer-learning-based methods and comparable to state-of-the-art supervised methods.
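A minimal sketch of the modality-dropout idea is shown below; the per-modality dropout probability and the channel layout are assumptions, not the paper's exact settings:

```python
import torch

def modality_dropout(x, p=0.5, training=True):
    """Randomly zero out whole MRI modalities (channels) during training so the
    network learns to segment from any modality subset; at least one modality
    is always kept per sample. `x` has shape (batch, modalities, ...)."""
    if not training:
        return x
    keep = torch.rand(x.shape[0], x.shape[1], device=x.device) > p
    # Guarantee at least one surviving modality per sample.
    rows = (~keep.any(dim=1)).nonzero(as_tuple=True)[0]
    keep[rows, torch.randint(x.shape[1], (rows.numel(),), device=x.device)] = True
    return x * keep.view(x.shape[0], x.shape[1], *([1] * (x.dim() - 2))).float()
```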
Brain extraction, also known as skull stripping, from magnetic resonance images (MRIs) is an essential preprocessing step for many medical image analysis tasks and is also useful as a stand-alone task for estimating total brain volume. Many proposed methods perform excellently on T1-weighted images, especially for healthy adults. However, such methods do not always generalize well to more challenging datasets, such as pediatric, severely pathological, or heterogeneous data. In this paper, we propose an automatic deep learning framework for brain extraction on T1-weighted MRIs of healthy adult controls, Huntington's disease patients, and pediatric Aicardi-Goutières syndrome (AGS) patients. We evaluate our method on the PREDICT-HD and AGS datasets, which are multi-site datasets with different protocols/scanners. Compared to current state-of-the-art methods, our method produced the best segmentations, with the highest Dice score, lowest average surface distance, and lowest 95-percent Hausdorff distance on both datasets. These results indicate that our method has better accuracy and generalizability for heterogeneous T1-weighted MRI datasets.
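Since the Dice score recurs as the primary overlap metric in this and the following studies, a minimal reference implementation is included below; the surface-distance metrics (average surface distance, 95-percent Hausdorff) are typically computed with dedicated medical-imaging libraries and are omitted here:

```python
import numpy as np

def dice_score(pred, gt):
    """Dice coefficient between two binary masks of the same shape:
    2|A n B| / (|A| + |B|). Returns 1.0 when both masks are empty."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    return 1.0 if denom == 0 else 2.0 * np.logical_and(pred, gt).sum() / denom
```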
KEYWORDS: 3D modeling, Magnetic resonance imaging, Data modeling, Image segmentation, Performance modeling, Tumors, Acoustics, Convolutional neural networks, 3D magnetic resonance imaging, 3D acquisition
Acoustic neuroma (AN) is a noncancerous, slow-growing tumor that affects the human hearing system. Magnetic resonance images (MRIs) are routinely used to monitor tumor progression. Quantifying tumor growth in an automated manner would allow more precise studies, both at the population level and for the clinical management of individual patients. In recent years, deep learning methods have shown excellent performance for many medical image segmentation tasks. However, most current methods do not work well on heterogeneous datasets where MRIs are acquired with vastly different protocols. In this paper, we propose a deep learning framework with ensembled convolutional neural networks (CNNs) to segment acoustic neuromas even in heterogeneous datasets. We ensemble a 2.5D CNN model and a 3D CNN model, with augmentations added for better inter-dataset segmentation performance. We test our method on two datasets: the publicly available crossMoDA challenge dataset and an in-house dataset. We train with supervision on the crossMoDA dataset and directly apply the trained model to the in-house dataset. We use the Dice score, average surface distance (ASD), and 95-percent Hausdorff distance (95HD) as evaluation metrics. Our method outperforms the baseline methods, not only in intra-dataset segmentation accuracy but also in inter-dataset generalizability.
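A minimal sketch of the ensembling step is given below; plain averaging of softmax probabilities is an assumed fusion rule, and the model interfaces are placeholders rather than the paper's configuration:

```python
import torch

def ensemble_predict(models, volume):
    """Average class probabilities across ensemble members (e.g., a 2.5D and a
    3D CNN) and take the argmax. Each model is assumed to map the input volume
    to class logits of shape (C, D, H, W)."""
    with torch.no_grad():
        probs = [torch.softmax(m(volume), dim=0) for m in models]
    return torch.stack(probs).mean(dim=0).argmax(dim=0)  # label map (D, H, W)
```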
The subcortical structures of the brain are relevant for many neurodegenerative diseases, such as Huntington's disease (HD). Quantitative segmentation of these structures from magnetic resonance images (MRIs) has been studied in clinical and neuroimaging research. Recently, convolutional neural networks (CNNs) have been used successfully for many medical image analysis tasks, including subcortical segmentation. In this work, we propose a two-stage cascaded 3D subcortical segmentation framework that uses the same 3D CNN architecture for both stages. Attention gates, residual blocks, and output addition are used in our proposed 3D CNN. In the first stage, we apply our model to downsampled images to output a coarse segmentation. Next, we crop an extended subcortical region from the original image based on this coarse segmentation and input the cropped region to the second CNN to obtain the final segmentation. Left and right pairs of the thalamus, caudate, pallidum, and putamen are considered in our segmentation. We use the Dice coefficient as our metric and evaluate our method on two datasets: the publicly available IBSR dataset and a subset of the PREDICT-HD database, which includes healthy controls and HD subjects. We train our models only on healthy control subjects and test on both healthy controls and HD subjects to examine model generalizability. Compared with state-of-the-art methods, our method has the highest mean Dice score on all considered subcortical structures (except the thalamus on IBSR), with more pronounced improvement for HD subjects. This suggests that our method may be better at segmenting MRIs of subjects with neurodegenerative disease.
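The two-stage logic might be sketched as follows; the downsampling factor, crop margin, and network interfaces are illustrative assumptions rather than the paper's settings:

```python
import numpy as np

def cascade_segment(coarse_net, fine_net, image, factor=2, margin=8):
    """Two-stage cascade: a coarse mask on a downsampled volume locates the
    subcortical region, which is then cropped (with a safety margin) from the
    full-resolution image and passed to the second network."""
    out = np.zeros(image.shape, dtype=np.uint8)
    coarse = coarse_net(image[::factor, ::factor, ::factor])  # binary mask
    idx = np.nonzero(coarse)
    if idx[0].size == 0:
        return out  # nothing detected at the coarse stage
    lo = [max(0, int(v.min()) * factor - margin) for v in idx]
    hi = [min(s, (int(v.max()) + 1) * factor + margin)
          for v, s in zip(idx, image.shape)]
    roi = image[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
    out[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]] = fine_net(roi)
    return out
```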
Longitudinal information is important for monitoring the progression of neurodegenerative diseases such as Huntington's disease (HD). Specifically, longitudinal magnetic resonance imaging (MRI) studies may allow the discovery of subtle intra-subject changes over time that might otherwise go undetected because of inter-subject variability. For HD patients, the primary imaging-based marker of disease progression is the atrophy of subcortical structures, mainly the caudate and putamen. To better understand the course of subcortical atrophy in HD and its correlation with clinical outcome measures, highly accurate segmentation is important. In recent years, subcortical segmentation methods have moved towards deep learning, given the state-of-the-art accuracy and computational efficiency of these models. However, these methods are not designed for longitudinal analysis; rather, they treat each time point as an independent sample, discarding the longitudinal structure of the data. In this paper, we propose a deep-learning-based subcortical segmentation method that takes this longitudinal information into account. Our method takes a longitudinal pair of 3D MRIs as input and jointly computes the corresponding segmentations. We use bi-directional convolutional long short-term memory (C-LSTM) blocks in our model to leverage the longitudinal information between scans. We test our method on the PREDICT-HD dataset and use the Dice coefficient, average surface distance, and 95-percent Hausdorff distance as our evaluation metrics. Compared to cross-sectional segmentation, we improve the overall accuracy of segmentation, and our method performs more consistently across time points. Furthermore, our method identifies a stronger correlation between subcortical volume loss and decline in the total motor score, an important clinical outcome measure for HD.
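A minimal sketch of a bi-directional convolutional LSTM sweep over a longitudinal pair is shown below; the cell design and interfaces are illustrative assumptions, not the paper's exact blocks:

```python
import torch
import torch.nn as nn

class ConvLSTMCell3d(nn.Module):
    """Minimal 3D convolutional LSTM cell: the four gates are computed by one
    convolution over the concatenated input and hidden state."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.hid_ch = hid_ch
        self.gates = nn.Conv3d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

def bidirectional_clstm(cell_fwd, cell_bwd, feats):
    """Sweep a short longitudinal sequence of per-time-point feature maps in
    both directions and concatenate the two hidden states, so each time point
    is informed by the other scan(s) in the pair."""
    b, _, d, h, w = feats[0].shape
    zero = lambda cell: feats[0].new_zeros(b, cell.hid_ch, d, h, w)
    hf, cf = zero(cell_fwd), zero(cell_fwd)
    hb, cb = zero(cell_bwd), zero(cell_bwd)
    fwd, bwd = [], []
    for x in feats:
        hf, cf = cell_fwd(x, (hf, cf))
        fwd.append(hf)
    for x in reversed(feats):
        hb, cb = cell_bwd(x, (hb, cb))
        bwd.append(hb)
    return [torch.cat(pair, dim=1) for pair in zip(fwd, reversed(bwd))]
```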