TOPICS: Artificial intelligence, Medical imaging, Clinical practice, Pathology, Lung, Imaging devices, Cardiovascular magnetic resonance imaging, Cancer detection, Breast cancer, 3D image enhancement
Artificial intelligence (AI) presents an opportunity in anatomic pathology to provide quantitative objective support to a traditionally subjective discipline, thereby enhancing clinical workflows and enriching diagnostic capabilities. AI requires access to digitized pathology materials, which, at present, are most commonly generated from the glass slide using whole-slide imaging. Models are developed collaboratively or sourced externally, and best practices suggest validation with internal datasets most closely resembling the data expected in practice. Although an array of AI models that provide operational support for pathology practices or improve diagnostic quality and capabilities has been described, most of them can be categorized into one or more discrete types. However, their function in the pathology workflow can vary, as a single algorithm may be appropriate for screening and triage, diagnostic assistance, virtual second opinion, or other uses depending on how it is implemented and validated. Despite the clinical promise of AI, the barriers to adoption have been numerous, for which inclusion of new stakeholders and expansion of reimbursement opportunities may be among the most impactful solutions.
TOPICS: Medical imaging, Data modeling, Medical devices, Education and training, Imaging devices, Performance modeling, Instrument modeling, Machine learning, Design and modelling, Medical device development
To introduce developers to medical device regulatory processes and data considerations in artificial intelligence and machine learning (AI/ML) device submissions and to discuss ongoing AI/ML-related regulatory challenges and activities.
Approach
AI/ML technologies are being used in an increasing number of medical imaging devices, and the fast evolution of these technologies presents novel regulatory challenges. We provide AI/ML developers with an introduction to U.S. Food and Drug Administration (FDA) regulatory concepts, processes, and fundamental assessments for a wide range of medical imaging AI/ML device types.
Results
The device type for an AI/ML device and the appropriate premarket regulatory pathway are based on the level of risk associated with the device and are informed by both its technological characteristics and intended use. AI/ML device submissions contain a wide array of information and testing to facilitate the review process; for many submissions, the model description, data, nonclinical testing, and multi-reader multi-case testing are critical aspects of the review. The agency is also involved in AI/ML-related activities that support guidance document development, good machine learning practice development, AI/ML transparency, AI/ML regulatory research, and real-world performance assessment.
Conclusion
FDA’s AI/ML regulatory and scientific efforts support the joint goals of ensuring patients have access to safe and effective AI/ML devices over the entire device lifecycle and stimulating medical AI/ML innovation.
To integrate and evaluate an artificial intelligence (AI) system that assists in checking endotracheal tube (ETT) placement on chest x-rays (CXRs) in clinical practice.
Approach
Over 17 months of clinical use, intensive care unit (ICU) physicians ordered 214 CXR images to check ETT placement with AI assistance. The system was built on the SimpleMind Cognitive AI platform and integrated into a clinical workflow. It automatically identified the ETT and checked its placement relative to the trachea and carina. The ETT overlay and misplacement alert messages generated by the AI system were compared with radiology reports as the reference. A survey study was also conducted to evaluate the usefulness of the AI system in clinical practice.
Results
The alert messages indicating that either the ETT was misplaced or not detected had a positive predictive value of 42% (21/50) and negative predictive value of 98% (161/164) based on the radiology reports. In the survey, radiologist and ICU physician users indicated that they agreed with the AI outputs and that they were useful.
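As a quick check of how these predictive values follow from the quoted counts, a minimal sketch in Python (counts as reported above; the variable names are ours):

# Minimal sketch: predictive values of the AI alert vs. radiology reports.
true_positive_alerts = 21     # alerts confirmed as misplaced/undetected by the report
total_alerts = 50             # all "misplaced or not detected" alerts
true_negative_no_alert = 161  # no-alert cases confirmed as well placed
total_no_alert = 164          # all cases without an alert

ppv = true_positive_alerts / total_alerts          # 0.42
npv = true_negative_no_alert / total_no_alert      # ~0.98
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")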
Conclusions
The AI system performance in real-world clinical use was comparable to that seen in previous experiments. Based on this and the physician survey results, the system can be deployed more widely at our institution, with insights gained from this evaluation guiding further algorithm improvements and quality assurance of the AI system.
Lung transplantation is the standard treatment for end-stage lung diseases. A crucial factor affecting its success is size matching between the donor's lungs and the recipient's thorax. Computed tomography (CT) scans can accurately determine the recipient's lung size, but the donor's lung size is often unknown due to the absence of medical images. We aim to predict the donor's right, left, and total lung volume, thoracic cavity volume, and heart volume from subject demographics alone to improve the accuracy of size matching.
Approach
A cohort of 4610 subjects with chest CT scans and basic demographics (i.e., age, gender, race, smoking status, smoking history, weight, and height) was used in this study. The right and left lungs, thoracic cavity, and heart depicted on chest CT scans were automatically segmented using U-Net, and their volumes were computed. Eight machine learning models [i.e., random forest, multivariate linear regression, support vector machine, extreme gradient boosting (XGBoost), multilayer perceptron (MLP), decision tree, k-nearest neighbors, and Bayesian regression] were developed and used to predict the volume measures from subject demographics. Ten-fold cross-validation was used to evaluate the performance of the prediction models. R-squared (R2), mean absolute error (MAE), and mean absolute percentage error (MAPE) were used as performance metrics.
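A minimal sketch of this style of evaluation, using a scikit-learn model on synthetic placeholder data rather than the study cohort or the full eight-model comparison:

# Sketch: 10-fold cross-validation of demographic-only volume prediction.
# Synthetic placeholder data; the real study used 4610 subjects and 8 models.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score, mean_absolute_error, mean_absolute_percentage_error

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 7))   # stand-ins for age, sex, race, smoking status/history, weight, height
y = 4.0 + X @ rng.normal(size=7) * 0.3 + rng.normal(scale=0.5, size=500)  # "total lung volume" (L)

r2s, maes, mapes = [], [], []
for train_idx, test_idx in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    model = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    r2s.append(r2_score(y[test_idx], pred))
    maes.append(mean_absolute_error(y[test_idx], pred))
    mapes.append(mean_absolute_percentage_error(y[test_idx], pred))

print(f"R2={np.mean(r2s):.3f}  MAE={np.mean(maes):.3f} L  MAPE={np.mean(mapes):.1%}")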
Results
The MLP model demonstrated the best performance for predicting the thoracic cavity volume (R2: 0.628, MAE: 0.736 L, MAPE: 10.9%), right lung volume (R2: 0.501, MAE: 0.383 L, MAPE: 13.9%), and left lung volume (R2: 0.507, MAE: 0.365 L, MAPE: 15.2%), and the XGBoost model demonstrated the best performance for predicting the total lung volume (R2: 0.514, MAE: 0.728 L, MAPE: 14.0%) and heart volume (R2: 0.430, MAE: 0.075 L, MAPE: 13.9%).
Conclusions
Our results demonstrate the feasibility of predicting lung, heart, and thoracic cavity volumes from subject demographics with superior performance compared with available studies in predicting lung volumes.
Population-based screening programs for the early detection of breast cancer have significantly reduced mortality in women, but they are resource intensive in terms of time, cost, and workload, and they still have limitations, mainly due to the use of 2D imaging techniques, which may cause overlapping of tissues, and to interobserver variability. Artificial intelligence (AI) systems may be a valuable tool to assist radiologists when reading and classifying mammograms based on the malignancy of the detected lesions. However, there are several factors that can influence the outcome of a mammogram and thus also the detection capability of an AI system. The aim of our work is to analyze the robustness of the diagnostic ability of an AI system designed for breast cancer detection.
Approach
Mammograms from a population-based screening program were scored with the AI system. The sensitivity and specificity by means of the area under the receiver operating characteristic (ROC) curve were obtained as a function of the mammography unit manufacturer, demographic characteristics, and several factors that may affect the image quality (age, breast thickness and density, compression applied, beam quality, and delivered dose).
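As an illustration of this type of subgroup analysis, a sketch computing the ROC AUC with a bootstrap confidence interval per acquisition subgroup; the scores, labels, and the "vendor" grouping are synthetic placeholders, not the screening data:

# Sketch: ROC AUC with a bootstrap confidence interval, per acquisition subgroup.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 2000
labels = rng.integers(0, 2, size=n)                # 1 = cancer, 0 = normal (placeholder)
scores = labels * 1.5 + rng.normal(size=n)         # AI malignancy score (placeholder)
vendor = rng.choice(["A", "B"], size=n)            # e.g., mammography unit manufacturer

def auc_with_ci(y, s, n_boot=1000):
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))
        if len(set(y[idx])) < 2:                   # need both classes in the resample
            continue
        aucs.append(roc_auc_score(y[idx], s[idx]))
    return roc_auc_score(y, s), np.percentile(aucs, [2.5, 97.5])

for v in ["A", "B"]:
    m = vendor == v
    auc, (lo, hi) = auc_with_ci(labels[m], scores[m])
    print(f"vendor {v}: AUC={auc:.2f} (95% CI {lo:.2f}-{hi:.2f})")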
Results
The area under the curve (AUC) from the scoring ROC curve was 0.92 (95% confidence interval: 0.89 to 0.95). It showed no dependence on any of the parameters considered, as the differences in the AUC across the different value intervals were not statistically significant.
Conclusion
The results suggest that the AI system analyzed in our work has a robust diagnostic capability, and that its accuracy is independent of the studied parameters.
TOPICS: Image segmentation, Super resolution, 3D modeling, Education and training, 3D image processing, Data modeling, 3D image enhancement, Interpolation, Image resolution, Deep learning
High-resolution late gadolinium enhanced (LGE) cardiac magnetic resonance imaging (MRI) volumes are difficult to acquire due to the limitations of the maximal breath-hold time achievable by the patient. This results in anisotropic 3D volumes of the heart with high in-plane resolution but low through-plane resolution. Thus, we propose a 3D convolutional neural network (CNN) approach to improve the through-plane resolution of cardiac LGE-MRI volumes.
Approach
We present a 3D CNN-based framework with two branches: a super-resolution branch to learn the mapping between low-resolution and high-resolution LGE-MRI volumes, and a gradient branch that learns the mapping between the gradient map of low-resolution LGE-MRI volumes and the gradient map of high-resolution LGE-MRI volumes. The gradient branch provides structural guidance to the CNN-based super-resolution framework. To assess the performance of the proposed CNN-based framework, we train two CNN models with and without gradient guidance, namely, the dense deep back-projection network (DBPN) and the enhanced deep super-resolution network. We train and evaluate our method on the 2018 atrial segmentation challenge dataset. We also evaluate these trained models on the left atrial and scar quantification and segmentation challenge 2022 dataset to assess their generalization ability. Finally, we investigate the effect of the proposed CNN-based super-resolution framework on the 3D segmentation of the left atrium (LA) from these cardiac LGE-MRI image volumes.
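The two-branch DBPN/EDSR architecture is not reproduced here; the sketch below only shows, under our own assumptions about the gradient operator (finite differences) and loss weighting, how a gradient-map term can be combined with a voxel-wise super-resolution loss in PyTorch:

# Sketch: voxel-wise super-resolution loss plus a gradient-map loss (structural guidance).
import torch
import torch.nn.functional as F

def gradient_map(vol):
    """Finite-difference gradient magnitude of a 5D tensor (N, C, D, H, W)."""
    dz = vol[:, :, 1:, :, :] - vol[:, :, :-1, :, :]
    dy = vol[:, :, :, 1:, :] - vol[:, :, :, :-1, :]
    dx = vol[:, :, :, :, 1:] - vol[:, :, :, :, :-1]
    # pad back to the original shape so the maps can be compared voxel-wise
    dz = F.pad(dz, (0, 0, 0, 0, 0, 1))
    dy = F.pad(dy, (0, 0, 0, 1, 0, 0))
    dx = F.pad(dx, (0, 1, 0, 0, 0, 0))
    return torch.sqrt(dz**2 + dy**2 + dx**2 + 1e-8)

def combined_loss(sr_pred, hr_target, grad_pred=None, weight=0.1):
    """L1 image loss plus an L1 loss between gradient maps; weight is an assumption."""
    image_loss = F.l1_loss(sr_pred, hr_target)
    grad_target = gradient_map(hr_target)
    if grad_pred is None:                  # fall back to gradients of the SR output
        grad_pred = gradient_map(sr_pred)
    return image_loss + weight * F.l1_loss(grad_pred, grad_target)

sr = torch.rand(1, 1, 32, 64, 64, requires_grad=True)
hr = torch.rand(1, 1, 32, 64, 64)
loss = combined_loss(sr, hr)
loss.backward()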
Results
Experimental results demonstrate that our proposed CNN method with gradient guidance consistently outperforms bicubic interpolation and the CNN models without gradient guidance. Furthermore, the segmentation results, evaluated using Dice score, obtained using the super-resolved images generated by our proposed method are superior to the segmentation results obtained using the images generated by bicubic interpolation (p < 0.01) and the CNN models without gradient guidance (p < 0.05).
Conclusion
The presented CNN-based super-resolution method with gradient guidance improves the through-plane resolution of the LGE-MRI volumes, and the structural guidance provided by the gradient branch can be useful in aiding the 3D segmentation of cardiac chambers, such as the LA, from the 3D LGE-MRI images.
To validate the effectiveness of an approach called batch-balanced focal loss (BBFL) in enhancing convolutional neural network (CNN) classification performance on imbalanced datasets.
Materials and Methods
BBFL combines two strategies to tackle class imbalance: (1) batch-balancing to equalize model learning of class samples and (2) focal loss to add hard-sample importance to the learning gradient. BBFL was validated on two imbalanced fundus image datasets: a binary retinal nerve fiber layer defect (RNFLD) dataset (n = 7,258) and a multiclass glaucoma dataset (n = 7,873). BBFL was compared to several imbalanced learning techniques, including random oversampling (ROS), cost-sensitive learning, and thresholding, based on three state-of-the-art CNNs. Accuracy, F1-score, and the area under the receiver operating characteristic curve (AUC) were used as the performance metrics for binary classification. Mean accuracy and mean F1-score were used for multiclass classification. Confusion matrices, t-distributed stochastic neighbor embedding (t-SNE) plots, and GradCAM were used for the visual assessment of performance.
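A minimal sketch of the two ingredients described above (a class-balanced batch sampler plus focal loss) for a binary toy problem; hyperparameters and the exact BBFL formulation may differ from the paper's:

# Sketch: class-balanced batch sampling combined with binary focal loss.
import torch
import torch.nn.functional as F
from torch.utils.data import TensorDataset, DataLoader, WeightedRandomSampler

torch.manual_seed(0)

def focal_loss(logits, targets, gamma=2.0):
    """Binary focal loss: down-weights easy examples, emphasizes hard ones."""
    ce = F.binary_cross_entropy_with_logits(logits, targets.float(), reduction="none")
    p_t = torch.exp(-ce)                       # probability assigned to the true class
    return ((1 - p_t) ** gamma * ce).mean()

# Toy imbalanced dataset: ~95% negatives, ~5% positives.
x = torch.randn(1000, 16)
y = (torch.rand(1000) < 0.05).long()
class_counts = torch.bincount(y, minlength=2).float()
sample_weights = 1.0 / class_counts[y]         # inverse-frequency weight per sample
sampler = WeightedRandomSampler(sample_weights, num_samples=len(y), replacement=True)
loader = DataLoader(TensorDataset(x, y), batch_size=32, sampler=sampler)

model = torch.nn.Linear(16, 1)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for xb, yb in loader:                          # each batch is roughly class-balanced
    loss = focal_loss(model(xb).squeeze(1), yb)
    opt.zero_grad(); loss.backward(); opt.step()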
Results
In binary classification of RNFLD, BBFL with InceptionV3 (93.0% accuracy, 84.7% F1, 0.971 AUC) outperformed ROS (92.6% accuracy, 83.7% F1, 0.964 AUC), cost-sensitive learning (92.5% accuracy, 83.8% F1, 0.962 AUC), and thresholding (91.9% accuracy, 83.0% F1, 0.962 AUC), among other methods. In multiclass classification of glaucoma, BBFL with MobileNetV2 (79.7% accuracy, 69.6% average F1 score) outperformed ROS (76.8% accuracy, 64.7% F1), cost-sensitive learning (78.3% accuracy, 67.8% F1), and random undersampling (76.5% accuracy, 66.5% F1).
Conclusion
The BBFL-based learning method can improve the performance of a CNN model in both binary and multiclass disease classification when the data are imbalanced.
Diagnosis and surveillance of thoracic aortic aneurysm (TAA) involves measuring the aortic diameter at various locations along the length of the aorta, often using computed tomography angiography (CTA). Currently, measurements are performed by human raters using specialized software for three-dimensional analysis, a time-consuming process, requiring 15 to 45 min of focused effort. Thus, we aimed to develop a convolutional neural network (CNN)-based algorithm for fully automated and accurate aortic measurements.
Approach
Using 212 CTA scans, we trained a CNN to perform segmentation and localization of key landmarks jointly. The segmentation mask and landmarks are subsequently used to obtain the centerline and cross-sectional diameters of the aorta. A cubic spline is then fit to the aortic boundary at the sinuses of Valsalva to avoid errors related to inclusion of the coronary artery origins. Performance was evaluated on a test set of 60 scans, with automated measurements compared against expert manual raters.
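The full measurement pipeline is not shown here; the sketch below illustrates only the boundary-smoothing idea, fitting a periodic cubic spline to synthetic 2D cross-sectional boundary points with SciPy and reading off a maximal diameter:

# Sketch: smoothing a 2D cross-sectional boundary with a periodic cubic spline
# and measuring a maximal chord length. Synthetic boundary points only.
import numpy as np
from scipy.interpolate import splprep, splev
from scipy.spatial.distance import pdist

theta = np.linspace(0, 2 * np.pi, 60, endpoint=False)
r = 15 + 1.5 * np.sin(3 * theta) + np.random.default_rng(0).normal(0, 0.3, 60)  # mm
pts = np.stack([r * np.cos(theta), r * np.sin(theta)])      # noisy boundary (2, 60)

tck, _ = splprep(pts, per=True, s=5.0)                       # periodic cubic spline fit
u = np.linspace(0, 1, 400)
smooth = np.array(splev(u, tck)).T                           # densely resampled boundary

max_diameter = pdist(smooth).max()                           # maximal chord length (mm)
print(f"max diameter = {max_diameter:.1f} mm")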
Results
Compared to training separate networks for each task, joint training yielded higher accuracy for segmentation, especially at the boundary (p < 0.001), but a marginally worse (0.2 to 0.5 mm) accuracy for landmark localization (p < 0.001). Mean absolute error between human and automated measurements was ≤1 mm at six of nine standard clinical measurement locations. However, higher errors were noted in the aortic root and arch regions, ranging between 1.4 and 2.2 mm, although agreement between manual raters was also lower in these regions.
Conclusion
Fully automated aortic diameter measurements in TAA are feasible using a CNN-based algorithm. Automated measurements demonstrated low errors that are comparable in magnitude to those with manual raters; however, measurement error was highest in the aortic root and arch.
Assessing the complex three-dimensional (3D) structure of the cochlea is crucial to understanding the fundamental aspects of signal transduction in the inner ear and is a prerequisite for the development of novel cochlear implants. X-ray phase-contrast computed tomography offers destruction-free 3D imaging with little sample preparation, thus preserving the delicate structure of the cochlea. The use of heavy metal stains enables higher contrast and resolution and facilitates segmentation of the cochlea.
Approach
For μ-CT of small animal and human cochlea, we explore the heavy metal osmium tetroxide (OTO) as a radiocontrast agent and delineate laboratory μ-CT from synchrotron CT. We investigate how phase retrieval can be used to improve the image quality of the reconstructions, both for stained and unstained specimens.
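Phase retrieval schemes vary; as one common single-distance example, a Paganin-type filter can be applied to each flat-field-corrected projection before reconstruction. The sketch below uses assumed values for the delta/beta ratio, propagation distance, wavelength, and pixel size and is not necessarily the scheme used in this work:

# Sketch: single-distance Paganin-type phase retrieval on one flat-field-corrected
# projection; all parameter values are illustrative assumptions.
import numpy as np

def paganin_filter(proj, pixel_size, dist, wavelength, delta_beta):
    """Return a map proportional to the projected thickness from proj = I/I0."""
    ny, nx = proj.shape
    u = np.fft.fftfreq(nx, d=pixel_size)         # spatial frequencies (1/m)
    v = np.fft.fftfreq(ny, d=pixel_size)
    uu, vv = np.meshgrid(u, v)
    denom = 1.0 + np.pi * wavelength * dist * delta_beta * (uu**2 + vv**2)
    filtered = np.real(np.fft.ifft2(np.fft.fft2(proj) / denom))
    return -np.log(np.clip(filtered, 1e-6, None))

rng = np.random.default_rng(0)
projection = np.clip(0.8 + 0.05 * rng.normal(size=(512, 512)), 0.05, 1.5)
thickness = paganin_filter(projection, pixel_size=0.65e-6, dist=0.1,
                           wavelength=1.2e-10, delta_beta=100.0)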
Results
Image contrast for soft tissue in an aqueous solution is insufficient under the in-house conditions, whereas the OTO stain increases contrast for lipid-rich tissue components, such as the myelin sheaths in nervous tissue, enabling contrast-based rendering of the different components of the auditory nervous system. The overall morphology of the cochlea with the three scalae and membranes is very well represented. Further, the image quality of the reconstructions improves significantly when a phase retrieval scheme is used, which is also suitable for non-ideal laboratory μ-CT settings. With highly brilliant synchrotron radiation (SR), we achieve high contrast for unstained whole cochleae at the cellular level.
Conclusions
The OTO stain is suitable for 3D imaging of small animal and human cochlea with laboratory μ-CT, and relevant pathologies, such as a loss of sensory cells and neurons, can be visualized. With SR and optimized phase retrieval, the cellular level can be reached even for unstained samples in aqueous solution, as demonstrated by the high visibility of single hair cells and spiral ganglion neurons.
TOPICS: Education and training, Denoising, Mammography, Data modeling, Breast, Monte Carlo methods, Image processing, Surface plasmons, Image restoration, Digital mammography
Recent research suggests that image quality degradation with reduced radiation exposure in mammography can be mitigated by postprocessing mammograms with denoising algorithms based on convolutional neural networks. Breast microcalcifications, along with extended soft-tissue lesions, are the primary breast cancer biomarkers in a clinical x-ray examination, with the former being more sensitive to quantum noise. We test one such publicly available denoising method to observe if an improvement in detection of small microcalcifications can be achieved when deep learning-based denoising is applied to half-dose phantom scans.
Approach
An existing denoiser model (that was previously trained on clinical data) was applied to mammograms of an anthropomorphic physical phantom with hydroxyapatite microcalcifications. In addition, another model trained and tested using all synthetic (Monte Carlo) data was applied to a similar digital compressed breast phantom. Human reader studies were conducted to assess and compare image quality in a set of binary signal detection 4-AFC experiments, with proportion of correct responses used as a performance metric.
Results
In both the physical phantom/clinical system and simulation studies, we saw no apparent improvement in small microcalcification signal detection in denoised half-dose mammograms. However, in a Monte Carlo study, we observed a noticeable jump in 4-AFC scores when readers analyzed denoised half-dose images processed by the neural network trained on a dataset composed of 50% signal-present (SP) and 50% signal-absent regions of interest (ROIs).
Conclusions
Our findings suggest that deep-learning denoising algorithms may benefit from enriching training datasets with SP ROIs, at least in cases with clusters of 5 to 10 microcalcifications, each of size ≲240 μm.
TOPICS: Image segmentation, Data modeling, Convolution, Medical imaging, Visualization, 3D modeling, Education and training, Deep learning, Visual process modeling, Surgery
Explaining deep learning model decisions, especially those for medical image segmentation, is a critical step toward the understanding and validation that will enable these powerful tools to see more widespread adoption in healthcare. We introduce kernel-weighted contribution, a visual explanation method for three-dimensional medical image segmentation models that produces accurate and interpretable explanations. Unlike previous attribution methods, kernel-weighted contribution is explicitly designed for medical image segmentation models and assesses feature importance using the relative contribution of each considered activation map to the predicted segmentation.
Approach
We evaluate our method on a synthetic dataset that provides complete knowledge over input features and a comprehensive explanation quality metric using this ground truth. Our method and three other prevalent attribution methods were applied to five different model layer combinations to explain segmentation predictions for 100 test samples and compared using this metric.
Results
Kernel-weighted contribution produced superior explanations of obtained image segmentations when applied to both encoder and decoder sections of a trained model as compared to other layer combinations (p < 0.0005). In between-method comparisons, kernel-weighted contribution produced superior explanations compared with other methods using the same model layers in four of five experiments (p < 0.0005) and showed equivalently superior performance to GradCAM++ when only using non-transpose convolution layers of the model decoder (p = 0.008).
Conclusion
The reported method produced explanations of superior quality uniquely suited to fully utilize the specific architectural considerations present in image and especially medical image segmentation models. Both the synthetic dataset and implementation of our method are available to the research community.
General deep-learning (DL)-based semantic segmentation methods with expert-level accuracy may fail in 3D medical image segmentation due to complex tissue structures, lack of large datasets with ground truth, etc. For expeditious diagnosis, there is a compelling need to predict segmentation quality without ground truth. In some medical imaging applications, maintaining segmentation quality in the localized regions where disease is prevalent is more crucial than simply maintaining a high average segmentation quality globally. We propose a DL framework to identify regions of segmentation inaccuracies by combining a 3D generative adversarial network (GAN) and a convolutional regression network.
Approach
Our approach is based on the learned ability to reconstruct the original images and thereby identify regions of location-specific segmentation failure, where the reconstruction does not match the underlying original image. We use a conditional GAN to reconstruct input images masked by the segmentation results. The regression network is trained to predict the patch-wise Dice similarity coefficient (DSC), conditioned on the segmentation results. The method relies directly on the extracted segmentation-related features and does not need ground truth during the inference phase to identify erroneous regions in the computed segmentation.
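For reference, the patch-wise Dice similarity coefficient that the regression network is trained to predict can be computed as follows (patch size and inputs are illustrative; at inference the paper's method predicts this quantity without ground truth):

# Sketch: patch-wise Dice similarity coefficient (DSC) over non-overlapping 3D patches.
import numpy as np

def dice(a, b, eps=1e-7):
    inter = np.logical_and(a, b).sum()
    return (2.0 * inter + eps) / (a.sum() + b.sum() + eps)

def patchwise_dice(seg, ref, patch=32):
    """DSC computed independently on non-overlapping 3D patches."""
    scores = {}
    for z in range(0, seg.shape[0], patch):
        for y in range(0, seg.shape[1], patch):
            for x in range(0, seg.shape[2], patch):
                sl = (slice(z, z + patch), slice(y, y + patch), slice(x, x + patch))
                scores[(z, y, x)] = dice(seg[sl] > 0, ref[sl] > 0)
    return scores

rng = np.random.default_rng(0)
prediction = rng.random((64, 64, 64)) > 0.5
reference = rng.random((64, 64, 64)) > 0.5
low_quality = {k: v for k, v in patchwise_dice(prediction, reference).items() if v < 0.7}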
Results
We evaluated the proposed method on two public datasets: osteoarthritis initiative 4D (3D + time) knee MRI (knee-MR) and 3D non-small cell lung cancer CT (lung-CT). For the patch-wise DSC prediction, we observed mean absolute errors of 0.01 and 0.04 relative to the independent standard for the knee-MR and lung-CT data, respectively.
Conclusions
This method shows promising results in localizing erroneous segmentation regions, which may aid downstream analysis for disease diagnosis and prognosis prediction.
TOPICS: Education and training, Data modeling, Mammography, Breast cancer, Cancer, Tumor growth modeling, Breast, Tunable filters, Instrument modeling, Cancer detection
Risk-stratified breast cancer screening might improve early detection and efficiency without compromising quality. However, modern mammography-based risk models do not ensure adaptation across vendor-domains and rely on cancer precursors, associated with short-term risk, which might limit long-term risk assessment. We report a cross-vendor mammographic texture model for long-term risk.
Approach
The texture model was robustly trained using two systematically designed case-control datasets. Textural features, indicative of future breast cancer, were learned by excluding samples with diagnosed/potential malignancies from training. An augmentation-based domain adaptation technique, based on flavorization of mammographic views, ensured generalization across vendor-domains. The model was validated in 66,607 consecutively screened Danish women with flavorized Siemens views and 25,706 Dutch women with Hologic-processed views. Performances were evaluated for interval cancers (IC) within 2 years from screening and long-term cancers (LTC) from 2 years after screening. The texture model was combined with established risk factors to flag 10% of women with the highest risk.
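A sketch of the high-risk flagging step, selecting the top 10% of a combined risk score and measuring the share of interval cancers it captures; the scores and outcomes are synthetic placeholders, not the screening cohorts:

# Sketch: flag the top 10% of a combined risk score and count captured interval cancers.
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
risk_score = rng.normal(size=n)                               # combined texture + risk-factor score
interval_cancer = rng.random(n) < 0.003 * np.exp(risk_score)  # toy association with outcome

threshold = np.quantile(risk_score, 0.90)                     # top-10% cutoff
flagged = risk_score >= threshold
share_of_cancers = interval_cancer[flagged].sum() / interval_cancer.sum()
print(f"flagged {flagged.mean():.0%} of women, capturing {share_of_cancers:.1%} of ICs")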
Results
In Danish women, the texture model achieved an area under the receiver operating characteristic curve (AUC) of 0.71 and 0.65 for ICs and LTCs, respectively. In Dutch women with Hologic-processed views, the AUCs were not different from AUCs in Danish women with flavorized views. The AUC for texture combined with established risk factors increased to 0.68 for LTCs. The 10% of women flagged as high-risk accounted for 25.5% of ICs and 24.8% of LTCs.
Conclusions
The texture model robustly estimated long-term breast cancer risk while adapting to an unseen processed vendor-domain and identified a clinically relevant high-risk subgroup.
Deep supervised learning provides an effective approach for developing robust models for various computer-aided diagnosis tasks. However, there is often an underlying assumption that the frequencies of the samples between the different classes of the training dataset are either similar or balanced. In real-world medical data, the samples of positive classes often occur too infrequently to satisfy this assumption. Thus, there is an unmet need for deep-learning systems that can automatically identify and adapt to the real-world conditions of imbalanced data.
Approach
We propose a deep Bayesian ensemble learning framework to address the representation learning problem of long-tailed and out-of-distribution (OOD) samples when training from medical images. By estimating the relative uncertainties of the input data, our framework can adapt to imbalanced data for learning generalizable classifiers. We trained and tested our framework on four public medical imaging datasets with various imbalance ratios and imaging modalities across three different learning tasks: semantic medical image segmentation, OOD detection, and in-domain generalization. We compared the performance of our framework with those of state-of-the-art comparator methods.
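Monte Carlo dropout, mentioned in the conclusions as part of the framework, can be sketched as follows for uncertainty estimation at inference time; the full Bayesian ensemble and loss combination are not reproduced here, and the toy model is ours:

# Sketch: Monte Carlo dropout at inference time to estimate predictive uncertainty.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Dropout(p=0.3), nn.Linear(64, 2))

def mc_dropout_predict(model, x, n_samples=30):
    """Keep dropout active at test time and average softmax outputs."""
    model.train()                      # enables dropout during inference
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(n_samples)])
    return probs.mean(0), probs.std(0)   # predictive mean and per-class uncertainty

x = torch.randn(8, 32)
mean_prob, uncertainty = mc_dropout_predict(model, x)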
Results
Our proposed framework outperformed the comparator models significantly across all performance metrics (pairwise t-test: p < 0.01) in the semantic segmentation of high-resolution CT and MR images as well as in the detection of OOD samples (p < 0.01), thereby showing significant improvement in handling the associated long-tailed data distribution. The results of the in-domain generalization also indicated that our framework can enhance the prediction of retinal glaucoma, contributing to clinical decision-making processes.
Conclusions
Training of the proposed deep Bayesian ensemble learning framework with dynamic Monte-Carlo dropout and a combination of losses yielded the best generalization to unseen samples from imbalanced medical imaging datasets across different learning tasks.
TOPICS: Data modeling, Education and training, RGB color model, Echocardiography, Performance modeling, Deep learning, Motion models, Ablation, 3D modeling, Image classification
The inherent characteristics of transthoracic echocardiography (TTE) images such as low signal-to-noise ratio and acquisition variations can limit the direct use of TTE images in the development and generalization of deep learning models. As such, we propose an innovative automated framework to address the common challenges in the process of echocardiography deep learning model generalization on the challenging task of constrictive pericarditis (CP) and cardiac amyloidosis (CA) differentiation.
Approach
Patients with a confirmed diagnosis of CP or CA and normal cases from Mayo Clinic Rochester and Arizona were identified to extract baseline demographics and the apical 4-chamber view from TTE studies. We proposed an innovative preprocessing and image generalization framework to process the images for training the ResNet50, ResNeXt101, and EfficientNetB2 models. Ablation studies were conducted to assess the effect of each proposed processing step on the final classification performance.
Results
The models were initially trained and validated on 720 unique TTE studies from Mayo Rochester and further validated on 225 studies from Mayo Arizona. With our proposed generalization framework, EfficientNetB2 generalized the best with an average area under the curve (AUC) of 0.96 (±0.01) and 0.83 (±0.03) on the Rochester and Arizona test sets, respectively.
Conclusions
Leveraging the proposed generalization techniques, we successfully developed an echocardiography-based deep learning model that can accurately differentiate CP from CA and normal cases and applied the model to images from two sites. The proposed framework can be further extended for the development of echocardiography-based deep learning models.
Generative adversarial networks (GANs) can synthesize various feasible-looking images. We showed that a GAN, specifically a conditional GAN (CGAN), can simulate breast mammograms with normal, healthy appearances and can help detect mammographically occult (MO) cancer. However, similar to other GANs, CGANs can suffer from various artifacts, e.g., checkerboard artifacts, that may impact the quality of the final synthesized image, as well as the performance of detecting MO cancer. We explored the types of GAN artifacts that exist in mammogram simulations and their effect on MO cancer detection.
Approach
We first trained a CGAN using full-field digital mammograms (FFDMs) of 1366 women with normal/healthy breasts. Then, we tested the trained CGAN on an independent MO cancer dataset of 333 women with dense breasts (97 MO cancers). We trained a convolutional neural network (CNN) on the MO cancer dataset, in which real and simulated mammograms were fused, to identify women with MO cancer. Then, a radiologist who was independent of the development of the CGAN algorithms evaluated the entire MO cancer dataset to identify and annotate artifacts in the simulated mammograms.
Results
We found four artifact types, including checkerboard, breast boundary, nipple-areola complex, and black spots around calcification artifacts, with an overall incidence rate over 69% (individual incidence rates ranged from 9% to 53%) across both normal and MO cancer samples. We then evaluated their potential impact on MO cancer detection. Even though various artifacts existed in the simulated mammograms, we found that they still provided complementary information for MO cancer detection when combined with the real mammograms.
Conclusions
We found that artifacts were pervasive in the CGAN-simulated mammograms. However, they did not negatively affect our MO cancer detection algorithm; the simulated mammograms still provided complementary information for MO cancer detection when combined with real mammograms.
Acute respiratory distress syndrome (ARDS) is a life-threatening condition that can cause a dramatic drop in blood oxygen levels due to widespread lung inflammation. Chest radiography is widely used as a primary modality to detect ARDS due to its crucial role in diagnosing the syndrome, and the x-ray images can be obtained promptly. However, despite the extensive literature on chest x-ray (CXR) image analysis, there is limited research on ARDS diagnosis due to the scarcity of ARDS-labeled datasets. Additionally, many machine learning-based approaches result in high performance in pulmonary disease diagnosis, but their decisions are often not easily interpretable, which can hinder their clinical acceptance. This work aims to develop a method for detecting signs of ARDS in CXR images that can be clinically interpretable.
Approach
To achieve this goal, an ARDS-labeled dataset of chest radiography images is gathered and annotated for training and evaluation of the proposed approach. The proposed deep classification-segmentation model, Dense-Ynet, provides an interpretable framework for automatically diagnosing ARDS in CXR images. The model takes advantage of lung segmentation in diagnosing ARDS. By definition, ARDS causes bilateral diffuse infiltrates throughout the lungs. To consider the local involvement of lung areas, each lung is divided into upper and lower halves, and our model classifies the resulting lung quadrants.
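A sketch of the quadrant construction, splitting each lung mask at the midpoint of its vertical extent; the masks are synthetic, the midpoint rule is our assumption, and in practice the segmentation would come from the Dense-Ynet model:

# Sketch: deriving upper/lower quadrants from left and right lung masks.
import numpy as np

def lung_quadrants(left_mask, right_mask):
    """Return a dict of boolean masks for the upper/lower halves of each lung."""
    quadrants = {}
    for name, mask in (("left", left_mask), ("right", right_mask)):
        rows = np.where(mask.any(axis=1))[0]
        mid = (rows.min() + rows.max()) // 2          # vertical midpoint of this lung
        upper, lower = np.zeros_like(mask), np.zeros_like(mask)
        upper[:mid + 1] = mask[:mid + 1]
        lower[mid + 1:] = mask[mid + 1:]
        quadrants[f"{name}_upper"], quadrants[f"{name}_lower"] = upper, lower
    return quadrants

left = np.zeros((256, 256), bool); left[60:220, 30:110] = True
right = np.zeros((256, 256), bool); right[55:210, 150:230] = True
quads = lung_quadrants(left, right)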
Results
The quadrant-based classification strategy yields an area under the receiver operating characteristic curve of 95.1% (95% CI 93.5 to 96.1), providing a reference for the model's predictions. In terms of segmentation, the model accurately identifies lung regions in CXR images even when lung boundaries are unclear in abnormal images.
Conclusions
This study provides an interpretable decision system for diagnosing ARDS, by following the definition used by clinicians for the diagnosis of ARDS from CXR images.
Image-Guided Procedures, Robotic Interventions, and Modeling
Transcranial focused ultrasound (tFUS) is a therapeutic ultrasound method that focuses sound through the skull to a small region noninvasively and often under magnetic resonance imaging (MRI) guidance. CT imaging is used to estimate the acoustic properties that vary between individual skulls to enable effective focusing during tFUS procedures, exposing patients to potentially harmful radiation. A method to estimate acoustic parameters in the skull without the need for CT is desirable.
Approach
We synthesized CT images from routinely acquired T1-weighted MRI using a 3D patch-based conditional generative adversarial network and evaluated the performance of synthesized CT (sCT) images for treatment planning with tFUS. We compared the performance of sCT with real CT (rCT) images for tFUS planning using Kranion and simulations using the acoustic toolbox, k-Wave. Simulations were performed for 3 tFUS scenarios: (1) no aberration correction, (2) correction with phases calculated from Kranion, and (3) phase shifts calculated from time reversal.
Results
From Kranion, the skull density ratio, skull thickness, and number of active elements between rCT and sCT had Pearson's correlation coefficients of 0.94, 0.92, and 0.98, respectively. Among 20 targets, differences in simulated peak pressure between rCT and sCT were largest without phase correction (12.4% ± 8.1%) and smallest with Kranion phases (7.3% ± 6.0%). The distance between the peak focal locations of rCT and sCT was <1.3 mm for all simulation cases.
Conclusions
Real and synthetically generated skulls had comparable image similarity, skull measurements, and acoustic simulation metrics. Our work demonstrated similar results for 10 testing cases comparing MR-sCTs and rCTs for tFUS planning. Source code and a docker image with the trained model are available at https://github.com/han-liu/SynCT_TcMRgFUS.
Image Perception, Observer Performance, and Technology Assessment
TOPICS: Education and training, Signal detection, Imaging systems, Signal attenuation, Breast, Binary data, Information operations, Image restoration, Image processing, Signal processing
The objective assessment of image quality (IQ) has been advocated for the analysis and optimization of medical imaging systems. One method of computing such IQ metrics is through a numerical observer. The Hotelling observer (HO) is the optimal linear observer, but conventional methods for obtaining the HO can become intractable due to large image sizes or insufficient data. Channelized methods are sometimes employed in such circumstances to approximate the HO. The performance of such channelized methods varies, with different methods obtaining superior performance to others depending on the imaging conditions and detection task. A channelized HO method using an autoencoder (AE) is presented and implemented across several tasks to characterize its performance.
Approach
The process for training an AE is demonstrated to be equivalent to developing a set of channels for approximating the HO. The efficiency of the learned AE-channels is increased by modifying the conventional AE loss function to incorporate task-relevant information. Multiple binary detection tasks involving lumpy and breast phantom backgrounds across varying dataset sizes are considered to evaluate the performance of the proposed method and compare to current state-of-the-art channelized methods. Additionally, the ability of the channelized methods to generalize to images outside of the training dataset is investigated.
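For context, a channelized Hotelling observer can be computed as follows once a channel matrix is available; here the channels are random placeholders standing in for learned AE channels, and the images are a toy example rather than the lumpy or breast phantom backgrounds:

# Sketch: channelized Hotelling observer given a channel matrix.
import numpy as np

rng = np.random.default_rng(0)
n_pix, n_ch, n_img = 64 * 64, 10, 500
channels = rng.normal(size=(n_pix, n_ch))                 # stand-in for learned channels

signal = np.zeros(n_pix); signal[2016:2080] = 0.5         # small bright signal
g_absent = rng.normal(size=(n_img, n_pix))                # background-only images
g_present = rng.normal(size=(n_img, n_pix)) + signal      # signal-present images

v0, v1 = g_absent @ channels, g_present @ channels        # channel outputs
s_v = 0.5 * (np.cov(v0, rowvar=False) + np.cov(v1, rowvar=False))
w = np.linalg.solve(s_v, v1.mean(0) - v0.mean(0))         # channelized HO template

t0, t1 = v0 @ w, v1 @ w                                   # test statistics
snr = (t1.mean() - t0.mean()) / np.sqrt(0.5 * (t1.var() + t0.var()))
print(f"detectability SNR = {snr:.2f}")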
Results
AE-learned channels are demonstrated to have comparable performance with other state-of-the-art channel methods in the detection studies and superior performance in the generalization studies. Incorporating a cleaner estimate of the signal for the detection task is also demonstrated to significantly improve the performance of the proposed method, particularly in datasets with fewer images.
Conclusions
AEs are demonstrated to be capable of learning efficient channels for the HO. The resulting significant increase in detection performance for small dataset sizes when incorporating a signal prior holds promising implications for future assessments of imaging technologies.
Biomedical Applications in Molecular, Structural, and Functional Imaging
X-ray phase-contrast tomography (XPCT) is a non-destructive, three-dimensional imaging modality that provides higher contrast in soft tissue than absorption-based CT and allows one to cover the cytoarchitecture from the centi- and millimeter scale down to the nanoscale. To further increase the contrast and resolution of XPCT, for example, in view of addressing connectivity issues in the central nervous system (CNS), metal staining is indispensable. However, currently used protocols, for example, those based on osmium and/or uranium, are less suited for XPCT due to an excessive β/δ ratio. In this work, we explore the suitability of different staining agents for XPCT. In particular, neodymium(III) acetate (NdAc), which has recently been proposed as a non-toxic, non-radioactive, easy-to-use alternative to uranyl acetate (UAc) as a contrast agent in electron microscopy, is investigated. Due to its vertical proximity to UAc in the periodic table, similar chemical but better suited optical properties for phase contrast can be expected.
Approach
Differently stained whole-eye samples of wild-type mice and tissues of the CNS are embedded in EPON epoxy resin and scanned using synchrotron as well as laboratory radiation. Phase retrieval is performed on the projection images, followed by tomographic reconstruction, which enables a quantitative analysis based on the reconstructed electron densities. Segmentation techniques and rendering software are used to visualize structures of interest in the sample.
Results
We show that staining neuronal samples with NdAc enhances contrast, in particular for laboratory scans, allowing high-resolution imaging of biological soft tissue in-house. In the example of the murine retina, the rods and cones as well as the sclera and the ganglion cell layer appear to be specifically targeted by the stain. A comparison of electron densities based on histogram evaluation allowed quantitative measures to be determined that describe the differences between the examined stains.
Conclusion
The results suggest NdAc to be an effective stain for XPCT, with a preferential binding to anionic groups, such as phosphate and carboxyl groups at cell surfaces, targeting certain layers of the retina with a stronger selectivity compared to other staining agents. Due to the advantageous X-ray optical properties, the stain seems particularly well-suited for phase contrast, with a comparably small number density and an overall superior image quality at laboratory sources.
To validate a low-dose, single-volume quantitative CT myocardial flow technique in a cardiovascular flow phantom and a swine animal model of coronary artery disease.
Approach
A cardiovascular flow phantom was imaged dynamically over different flow rates (0.97 to 2.45 mL/min/g) using 15 mL of contrast per injection. Six swine (37 ± 8 kg) were also imaged dynamically, with different left anterior descending coronary artery balloon stenoses assessed under intracoronary adenosine stress, using 1 mL/kg of contrast per injection. The resulting images were used to simulate dynamic bolus tracking and peak volume scan acquisition, after which first-pass single-compartment modeling was performed to derive quantitative flow, where the pre-contrast myocardial attenuation was assumed to be spatially uniform. The accuracy of CT flow was then assessed versus ultrasound and microsphere flow in the phantom and animal models, respectively, using regression analysis.
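A sketch of the regression comparison against a reference flow measurement, reporting the fitted line, Pearson's r, and RMSE about the fit; all values are synthetic placeholders, not the study data:

# Sketch: regression of CT-derived flow against a reference (ultrasound or microspheres).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
q_ref = rng.uniform(0.9, 2.5, size=30)                     # reference flow (mL/min/g)
q_ct = 1.02 * q_ref - 0.08 + rng.normal(0, 0.1, size=30)   # CT flow estimates (toy values)

slope, intercept, r, _, _ = stats.linregress(q_ref, q_ct)
rmse = np.sqrt(np.mean((q_ct - (slope * q_ref + intercept)) ** 2))
print(f"Q_CT = {slope:.2f} Q_REF {intercept:+.2f}, r = {r:.2f}, RMSE = {rmse:.2f} mL/min/g")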
Results
Single-volume quantitative CT flow measurements in the phantom (QCT_PHANTOM) were related to reference ultrasound flow measurements (QUS) by QCT_PHANTOM = 1.04 QUS − 0.1 (Pearson's r = 0.98; RMSE = 0.09 mL/min/g). In the animal model (QCT_ANIMAL), they were related to reference microsphere flow measurements (QMICRO) by QCT_ANIMAL = 1.00 QMICRO − 0.05 (Pearson's r = 0.96; RMSE = 0.48 mL/min/g). The effective dose per CT measurement was 1.21 mSv.
Conclusions
The single-volume quantitative CT flow technique only requires bolus tracking data, spatially uniform pre-contrast myocardial attenuation, and a single volume scan acquired near the peak aortic enhancement for accurate, low-dose, myocardial flow measurement (in mL/min/g) under rest and adenosine stress conditions.