KEYWORDS: Breast, Image segmentation, Education and training, Visualization, Magnetic resonance imaging, Tumors, Statistical analysis, Binary data, Image classification, Breast cancer
Purpose: Current clinical assessment qualitatively describes background parenchymal enhancement (BPE) as minimal, mild, moderate, or marked based on the visually perceived volume and intensity of enhancement in normal fibroglandular breast tissue in dynamic contrast-enhanced (DCE)-MRI. Tumor enhancement may be included within the visual assessment of BPE, thus inflating BPE estimation due to angiogenesis within the tumor. Using a dataset of 426 MRIs, we developed an automated method to segment the breasts, electronically remove lesions, and calculate scores to estimate BPE levels.
Approach: A U-Net was trained for breast segmentation from DCE-MRI maximum intensity projection (MIP) images. Fuzzy c-means clustering was used to segment lesions; the lesion volume was removed prior to creating projections. U-Net outputs were applied to create projection images of both breasts, the affected breast, and the unaffected breast, before and after lesion removal. BPE scores were calculated from various projection images, including MIPs or average intensity projections of the first or second postcontrast subtraction MRIs, to evaluate the effect of varying image parameters on automatic BPE assessment. Receiver operating characteristic analysis was performed to determine the predictive value of the computed scores in BPE level classification tasks relative to radiologist ratings.
Results: Statistically significant trends were found between radiologist BPE ratings and calculated BPE scores for all breast regions (Kendall correlation, p < 0.001). Scores from all breast regions performed significantly better than guessing (p < 0.025 from the z-test). Results failed to show a statistically significant difference in performance with and without lesion removal. BPE scores of the affected breast in the second postcontrast subtraction MIP after lesion removal performed statistically significantly better than random guessing across various viewing projections and DCE time points.
Conclusions: Results demonstrate the potential for automatic BPE scoring to serve as a quantitative value for objective BPE level classification from breast DCE-MRI without the influence of lesion enhancement.
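As an illustration of the kind of quantitative BPE score described above, the following sketch computes a score from a postcontrast subtraction MIP given a breast mask and an optional lesion mask, all assumed to be NumPy arrays; the function name, the enhancement threshold, and the exact scoring formula are assumptions for illustration, not the study's published definition.

```python
import numpy as np

def bpe_score(subtraction_mip, breast_mask, lesion_mask=None, threshold=0.0):
    """Illustrative BPE score from a postcontrast subtraction MIP.

    subtraction_mip : 2D float array (postcontrast minus precontrast, projected)
    breast_mask     : 2D bool array from the breast-segmentation U-Net
    lesion_mask     : optional 2D bool array of the projected lesion region
    threshold       : minimum intensity counted as enhancing tissue (assumed)
    """
    roi = breast_mask.astype(bool)
    if lesion_mask is not None:
        roi = roi & ~lesion_mask.astype(bool)   # electronically remove the lesion
    if roi.sum() == 0:
        return 0.0
    enhancing = roi & (subtraction_mip > threshold)
    if not enhancing.any():
        return 0.0
    # One plausible score: fraction of breast area that enhances,
    # weighted by the mean intensity of that enhancing tissue.
    return float(enhancing.sum() / roi.sum() * subtraction_mip[enhancing].mean())
```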
KEYWORDS: Image segmentation, Breast, 3D image processing, 3D imaging standards, Magnetic resonance imaging, Education and training, Cross validation, 3D modeling, 3D image enhancement, Artificial intelligence
Purpose: Given the dependence of radiomic-based computer-aided diagnosis artificial intelligence on accurate lesion segmentation, we assessed the performances of 2D and 3D U-Nets in breast lesion segmentation on dynamic contrast-enhanced (DCE) magnetic resonance imaging (MRI) relative to fuzzy c-means (FCM) and radiologist segmentations.
Approach: Using 994 unique breast lesions imaged with DCE-MRI, three segmentation algorithms (FCM clustering and 2D and 3D U-Net convolutional neural networks) were investigated. Center slice segmentations produced by FCM, 2D U-Net, and 3D U-Net were evaluated using radiologist segmentations as truth, and volumetric segmentations produced by 2D U-Net slices and 3D U-Net were compared using FCM as a surrogate reference standard. Fivefold cross-validation by lesion was conducted on the U-Nets; the Dice similarity coefficient (DSC) and Hausdorff distance (HD) served as performance metrics. Segmentation performances were compared across different input image and lesion types.
Results: The 2D U-Net outperformed the 3D U-Net for center slice (DSC, HD p < 0.001) and volume segmentations (DSC, HD p < 0.001). The 2D U-Net outperformed FCM in center slice segmentation (DSC p < 0.001). The use of second postcontrast subtraction images yielded greater performance than the use of first postcontrast subtraction images with both the 2D and 3D U-Nets (DSC p < 0.05). Additionally, mass segmentation outperformed nonmass segmentation from first and second postcontrast subtraction images using the 2D and 3D U-Nets (DSC, HD p < 0.001).
Conclusions: Results suggest that the 2D U-Net is promising in segmenting mass and nonmass enhancing breast lesions from first and second postcontrast subtraction MRIs and thus could be an effective alternative to FCM or the 3D U-Net.
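For reference, the two performance metrics named above can be computed for binary masks as follows; this is a generic sketch using NumPy and SciPy, not the study's evaluation code.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_coefficient(pred, truth):
    """Dice similarity coefficient (DSC) between two binary masks."""
    pred, truth = np.asarray(pred, bool), np.asarray(truth, bool)
    denom = pred.sum() + truth.sum()
    return 2.0 * np.logical_and(pred, truth).sum() / denom if denom else 1.0

def hausdorff_distance(pred, truth):
    """Symmetric Hausdorff distance (HD) between the foreground voxels of two masks."""
    p, t = np.argwhere(pred), np.argwhere(truth)   # (N, ndim) voxel coordinates
    if len(p) == 0 or len(t) == 0:
        return np.inf
    return max(directed_hausdorff(p, t)[0], directed_hausdorff(t, p)[0])
```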
KEYWORDS: COVID-19, Chest imaging, Data modeling, Deep learning, Education and training, Performance modeling, Radiography, Medical imaging, Machine learning, Diseases and disorders
Purpose: Image-based prediction of coronavirus disease 2019 (COVID-19) severity and resource needs can be an important means to address the COVID-19 pandemic. In this study, we propose an artificial intelligence/machine learning (AI/ML) COVID-19 prognosis method to predict patients’ needs for intensive care by analyzing chest X-ray radiography (CXR) images using deep learning.
Approach: The dataset consisted of 8357 CXR exams from 5046 COVID-19–positive patients, as confirmed by reverse transcription polymerase chain reaction (RT-PCR) tests for the SARS-CoV-2 virus, with a training/validation/test split of 64%/16%/20% at the patient level. Our model involved a DenseNet121 network with a sequential transfer learning technique employed to train on a sequence of gradually more specific and complex tasks: (1) fine-tuning a model pretrained on ImageNet using a previously established CXR dataset with a broad spectrum of pathologies; (2) refining on another established dataset to detect pneumonia; and (3) fine-tuning using our in-house training/validation datasets to predict patients’ needs for intensive care within 24, 48, 72, and 96 h following the CXR exams. The classification performances were evaluated on our independent test set (CXR exams of 1048 patients) using the area under the receiver operating characteristic curve (AUC) as the figure of merit in the task of distinguishing between those COVID-19–positive patients who required intensive care following the imaging exam and those who did not.
Results: Our proposed AI/ML model achieved an AUC (95% confidence interval) of 0.78 (0.74, 0.81) when predicting the need for intensive care 24 h in advance, and at least 0.76 (0.73, 0.80) for 48 h or more in advance, using predictions based on the AI prognostic marker derived from CXR images.
Conclusions: This AI/ML prediction model for patients’ needs for intensive care has the potential to support both clinical decision-making and resource management.
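A minimal sketch of one fine-tuning phase in such a sequential transfer learning pipeline is given below, assuming PyTorch and torchvision (version 0.13 or later for the weights argument); the backbone matches the DenseNet121 named above, but the data loaders, learning rate, number of epochs, and output head are placeholders rather than the study's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_model(num_outputs=1, prior_state=None):
    """DenseNet121 backbone with a task-specific classification head."""
    net = models.densenet121(weights="IMAGENET1K_V1")     # phase 0: ImageNet weights
    net.classifier = nn.Linear(net.classifier.in_features, num_outputs)
    if prior_state is not None:                           # warm-start from the previous phase
        net.load_state_dict(prior_state, strict=False)
    return net

def fine_tune(model, loader, epochs=5, lr=1e-4):
    """One fine-tuning phase; called once per task in the training sequence."""
    criterion = nn.BCEWithLogitsLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images).squeeze(1), labels.float())
            loss.backward()
            optimizer.step()
    return model

# Phase 1: broad-spectrum CXR pathologies -> Phase 2: pneumonia detection ->
# Phase 3: need for intensive care within 24/48/72/96 h, each with its own loader.
```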
The coronavirus disease 2019 (COVID-19) pandemic has wreaked havoc across the world. It also created a need for the urgent development of efficacious predictive diagnostics, specifically artificial intelligence (AI) methods applied to medical imaging. This has led to a convergence of experts from multiple disciplines, including clinicians, medical physicists, imaging scientists, computer scientists, and informatics experts, to bring the best of these fields to bear on the challenges of the COVID-19 pandemic. However, such a convergence over a very brief period of time has had unintended consequences and created its own challenges. As part of the Medical Imaging Data and Resource Center (MIDRC) initiative, we discuss the lessons learned from career transitions across the three involved disciplines (radiology, medical imaging physics, and computer science) and draw recommendations based on these experiences by analyzing the challenges associated with each of the three transition types: (1) AI of non-imaging data to AI of medical imaging data, (2) medical imaging clinician to AI of medical imaging, and (3) AI of medical imaging to AI of COVID-19 imaging. The diffusion of knowledge among these disciplines, and the career transitions themselves, can be accomplished more effectively by recognizing the intricacies associated with each. These lessons learned in transitioning to AI in the medical imaging of COVID-19 can inform and enhance future AI applications, making the whole of these transitions more than the sum of the individual disciplines, whether for confronting an emergency like the COVID-19 pandemic or for solving emerging problems in biomedicine.
Purpose: We propose a deep learning method for the automatic diagnosis of COVID-19 at patient presentation on chest radiography (CXR) images and investigate the role of standard and soft tissue CXR in this task.
Approach: The dataset consisted of the first CXR exams of 9860 patients acquired within 2 days after their initial reverse transcription polymerase chain reaction tests for the SARS-CoV-2 virus, 1523 (15.5%) of whom tested positive and 8337 (84.5%) of whom tested negative for COVID-19. A sequential transfer learning strategy was employed to fine-tune a convolutional neural network in phases on increasingly specific and complex tasks. The COVID-19 positive/negative classification was performed on standard images, soft tissue images, and both combined via feature fusion. A U-Net variant was used to segment and crop the lung region from each image prior to performing classification. Classification performances were evaluated and compared on a held-out test set of 1972 patients using the area under the receiver operating characteristic curve (AUC) and the DeLong test.
Results: Using the full standard, cropped standard, cropped soft tissue, and both types of cropped CXR images yielded AUC values of 0.74 [0.70, 0.77], 0.76 [0.73, 0.79], 0.73 [0.70, 0.76], and 0.78 [0.74, 0.81], respectively. Using soft tissue images significantly underperformed using standard images, and using both types of CXR failed to significantly outperform using standard images alone.
Conclusions: The proposed method was able to automatically diagnose COVID-19 at patient presentation with promising performance, and the inclusion of soft tissue images did not result in a significant performance improvement.
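The feature-fusion variant mentioned above can be sketched as a two-stream network whose per-image features are concatenated before the final classification layer. The sketch below uses PyTorch with a DenseNet121 backbone for each stream; the backbone choice, feature dimensions, and single-logit output here are illustrative assumptions rather than the study's exact architecture.

```python
import torch
import torch.nn as nn
from torchvision import models

class FeatureFusionCXR(nn.Module):
    """Two-stream classifier: one CNN per image type (standard / soft tissue),
    with the pooled features concatenated before a shared classification head."""
    def __init__(self):
        super().__init__()
        def backbone():
            net = models.densenet121(weights="IMAGENET1K_V1")
            net.classifier = nn.Identity()       # keep the 1024-d pooled feature vector
            return net
        self.standard_stream = backbone()
        self.soft_tissue_stream = backbone()
        self.head = nn.Linear(1024 * 2, 1)       # COVID-19 positive/negative logit

    def forward(self, standard_img, soft_tissue_img):
        fused = torch.cat([self.standard_stream(standard_img),
                           self.soft_tissue_stream(soft_tissue_img)], dim=1)
        return self.head(fused)
```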
Computer-aided diagnosis based on features extracted from medical images relies heavily on accurate lesion segmentation before feature extraction. Using 994 unique breast lesions imaged with dynamic contrast-enhanced (DCE) MRI, several segmentation algorithms were investigated. The first method is fuzzy c-means (FCM), a well-established unsupervised clustering algorithm used on breast MRIs. The second and third methods are based on the U-Net convolutional neural network, a widely used deep learning method for image segmentation, applied to two- and three-dimensional MRI data, respectively. The purpose of this study was twofold: (1) to assess the performances of 2D (slice-by-slice) and 3D U-Nets in breast lesion segmentation on DCE-MRI when trained with FCM segmentations, and (2) to compare their performance to that of FCM. Center slice segmentations produced by FCM, 2D U-Net, and 3D U-Net were evaluated using radiologist segmentations as truth, and volumetric segmentations produced by 2D U-Net (slice-by-slice) and 3D U-Net were compared using FCM as a surrogate truth. Fivefold cross-validation was conducted on the U-Nets, and the Dice similarity coefficient (DSC) and Hausdorff distance (HD) were used as performance metrics. Although 3D U-Net performed well, 2D U-Net outperformed 3D U-Net, both for center slice (DSC p = 4.13 × 10⁻⁹, HD p = 1.40 × 10⁻²) and volume segmentations (DSC p = 2.72 × 10⁻⁸³, HD p = 2.28 × 10⁻¹⁰). Additionally, 2D U-Net outperformed FCM in center slice segmentation in terms of DSC (p = 1.09 × 10⁻⁷). The results suggest that 2D U-Net is promising in segmenting breast lesions and could be an effective alternative to FCM.
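For readers unfamiliar with FCM, the following is a compact NumPy implementation of fuzzy c-means clustering on voxel enhancement values, with the brighter cluster taken as the lesion; the fuzziness exponent, membership threshold, and one-dimensional feature space are simplifying assumptions, not the exact configuration used in this study.

```python
import numpy as np

def fuzzy_cmeans_1d(values, n_clusters=2, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Minimal fuzzy c-means on a 1D array of voxel enhancement values.
    Returns the cluster centers and the membership matrix (n_voxels x n_clusters)."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(values), n_clusters))
    u /= u.sum(axis=1, keepdims=True)              # memberships sum to 1 per voxel
    for _ in range(n_iter):
        um = u ** m
        centers = (um * values[:, None]).sum(axis=0) / um.sum(axis=0)
        dist = np.abs(values[:, None] - centers[None, :]) + 1e-12
        new_u = 1.0 / (dist ** (2.0 / (m - 1.0)))
        new_u /= new_u.sum(axis=1, keepdims=True)
        if np.abs(new_u - u).max() < tol:
            u = new_u
            break
        u = new_u
    return centers, u

def segment_lesion(roi_intensities, membership_threshold=0.5):
    """Label voxels whose membership in the brighter (enhancing) cluster exceeds a threshold."""
    centers, u = fuzzy_cmeans_1d(roi_intensities.ravel())
    lesion_cluster = int(np.argmax(centers))
    return (u[:, lesion_cluster] > membership_threshold).reshape(roi_intensities.shape)
```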
In this study, we aim to investigate the role of standard and soft tissue chest radiography (CXR) images in the task of COVID-19 diagnosis at patient presentation using deep learning. The dataset consisted of the initial CXR exams of 6687 patients after their reverse transcription polymerase chain reaction tests for the SARS-CoV-2 virus, 1040 (15.6%) of whom tested positive and 5647 (84.4%) of whom tested negative for COVID-19. Each CXR exam contained a standard image and a soft tissue image obtained either from dual-energy acquisition or postprocessing technology. A curriculum learning technique was employed to train the model on a sequence of gradually more specific and complex tasks by first fine-tuning a model optimized for natural images on a previously established CXR dataset to diagnose a broad spectrum of pathologies, then refining the model on another established dataset to detect pneumonia, and finally fine-tuning the model again on the COVID-19 dataset collected for this study. In the last phase of training, the COVID-19 positive/negative classification was performed on (1) the standard images, (2) the soft tissue images, and (3) the two combined via feature fusion. The classification performances were evaluated on a held-out test set of 1338 cases with the same disease prevalence as the training and validation sets using the area under the receiver operating characteristic curve (AUC). The three classification schemes with different inputs yielded overall AUC values of 0.76 [0.72, 0.80], 0.76 [0.73, 0.80], and 0.76 [0.72, 0.79], respectively. When compared using the DeLong test, the three schemes yielded equivalent performances with an equivalence margin of ΔAUC = 0.05, which was chosen prima facie. The value of including soft tissue images will continue to be investigated in the segmentation and feature extraction of COVID-19 involvement, which may contribute to improving the performance of early COVID-19 diagnosis.
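The study assessed equivalence with the DeLong test; as a rough, non-equivalent substitute, the sketch below estimates a bootstrap confidence interval for the AUC difference between two classification schemes and checks whether it lies entirely within a ±0.05 margin. It assumes scikit-learn and NumPy, and the case-level bootstrap shown here is only illustrative, not the statistical test used in the paper.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_delta_auc(y_true, scores_a, scores_b, n_boot=2000, seed=0):
    """Bootstrap 95% CI for AUC_A - AUC_B computed on the same test cases."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    scores_a, scores_b = np.asarray(scores_a), np.asarray(scores_b)
    n, deltas = len(y_true), []
    while len(deltas) < n_boot:
        idx = rng.integers(0, n, n)
        if len(np.unique(y_true[idx])) < 2:          # need both classes in the resample
            continue
        deltas.append(roc_auc_score(y_true[idx], scores_a[idx])
                      - roc_auc_score(y_true[idx], scores_b[idx]))
    lo, hi = np.percentile(deltas, [2.5, 97.5])
    equivalent = lo > -0.05 and hi < 0.05            # CI entirely within the margin
    return (lo, hi), equivalent
```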
Purpose: This study aims to develop and compare human-engineered radiomics methodologies that use multiparametric magnetic resonance imaging (mpMRI) to diagnose breast cancer.
Approach: The dataset comprises clinical multiparametric MR images of 852 unique lesions from 612 patients. Each MR study included a dynamic contrast-enhanced (DCE)-MRI sequence and a T2-weighted (T2w) MRI sequence, and a subset of 389 lesions were also imaged with a diffusion-weighted imaging (DWI) sequence. Lesions were automatically segmented using the fuzzy C-means algorithm. Radiomic features were extracted from each MRI sequence. Two approaches, feature fusion and classifier fusion, to utilizing multiparametric information were investigated. A support vector machine classifier was trained for each method to differentiate between benign and malignant lesions. Area under the receiver operating characteristic curve (AUC) was used to evaluate and compare diagnostic performance. Analyses were first performed on the entire dataset and then on the subset that was imaged using the three-sequence protocol.
Results: When using the full dataset, the single-parametric classifiers yielded the following AUCs and 95% confidence intervals: AUC_DCE = 0.84 [0.82, 0.87], AUC_T2w = 0.83 [0.80, 0.86], and AUC_DWI = 0.69 [0.62, 0.75]. The two multiparametric classifiers both yielded AUCs of 0.87 [0.84, 0.89] and significantly outperformed all single-parametric classifiers. When using the three-sequence subset, the mpMRI classifiers’ performances significantly decreased.
Conclusions: The proposed mpMRI radiomics methods can improve the performance of computer-aided diagnostics for breast cancer and handle missing sequences in the imaging protocol.
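The two fusion strategies compared above can be sketched with scikit-learn as follows: feature fusion concatenates the per-sequence radiomic feature vectors before a single SVM, whereas classifier fusion trains one SVM per sequence and averages the posterior probabilities of malignancy. Kernel choice, feature scaling, and function names are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def feature_fusion_classifier(dce_feats, t2w_feats, labels):
    """Feature fusion: concatenate radiomic features from both sequences,
    then train a single SVM on the fused vector."""
    fused = np.hstack([dce_feats, t2w_feats])
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
    return clf.fit(fused, labels)

def classifier_fusion_scores(dce_feats, t2w_feats, labels, test_dce, test_t2w):
    """Classifier fusion: train one SVM per sequence and average the
    posterior probabilities of malignancy."""
    clf_dce = make_pipeline(StandardScaler(), SVC(probability=True)).fit(dce_feats, labels)
    clf_t2w = make_pipeline(StandardScaler(), SVC(probability=True)).fit(t2w_feats, labels)
    p_dce = clf_dce.predict_proba(test_dce)[:, 1]
    p_t2w = clf_t2w.predict_proba(test_t2w)[:, 1]
    return (p_dce + p_t2w) / 2.0
```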
Purpose: Task-based image quality assessment using model observers (MOs) is an effective approach to radiation dose and scanning protocol optimization in computed tomography (CT) imaging, provided that the correlation between MOs and radiologists has been established in well-defined, clinically relevant tasks. Conventional MO studies were typically simplified to detection, classification, or localization tasks using tissue-mimicking phantoms, as traditional MOs cannot be readily used in complex anatomical backgrounds. However, anatomical variability can affect human diagnostic performance.
Approach: To address this challenge, we developed a deep-learning-based MO (DL-MO) for localization tasks and validated it in a lung nodule detection task, using previously validated projection-based lesion- and noise-insertion techniques. The DL-MO performance was compared with that of 4 radiologist readers over 12 experimental conditions, involving varying radiation dose levels, nodule sizes, nodule types, and reconstruction types. Each condition consisted of 100 trials (i.e., 30 images per trial) generated from a patient cohort of 50 cases. The DL-MO was trained using small image volumes of interest extracted across the entire volume of the training cases. For each testing trial, the nodule search of the DL-MO was confined to a 3-mm-thick volume to improve computational efficiency, whereas radiologist readers were tasked with reviewing the entire volume.
Results: A strong correlation between DL-MO and human readers was observed (Pearson’s correlation coefficient: 0.980 with a 95% confidence interval of [0.924, 0.994]). The averaged performance bias between DL-MO and human readers was 0.57%.
Conclusion: The experimental results indicated the potential of using the proposed DL-MO for diagnostic image quality assessment in realistic chest CT tasks.
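A confidence interval of the form reported above for a Pearson correlation coefficient is commonly obtained with the Fisher z-transformation; a minimal sketch using SciPy follows, though it is not necessarily the interval construction used in this study.

```python
import numpy as np
from scipy import stats

def pearson_with_ci(x, y, alpha=0.05):
    """Pearson correlation with a (1 - alpha) CI via the Fisher z-transformation."""
    r, p_value = stats.pearsonr(x, y)
    n = len(x)
    z = np.arctanh(r)                                 # Fisher transform of r
    se = 1.0 / np.sqrt(n - 3)                         # standard error in z-space
    z_crit = stats.norm.ppf(1 - alpha / 2)
    lo, hi = np.tanh([z - z_crit * se, z + z_crit * se])
    return r, (lo, hi), p_value
```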
In this study, we aim to develop a multiparametric breast MRI computer-aided diagnosis (CADx) methodology using residual neural network (ResNet) deep transfer learning to incorporate information from both dynamic contrast-enhanced (DCE)-MRI and T2-weighted (T2w) MRI in the task of distinguishing between benign and malignant breast lesions. This retrospective study included 927 unique lesions from 616 women who underwent breast MR exams. A pre-trained ResNet50 was used to extract features from the maximum intensity projection (MIP) images of the second postcontrast subtraction DCE series and from the center slice of the T2w series separately. Support vector machine classifiers were trained on the ResNet features to differentiate between benign and malignant lesions. The benefit of pooling features extracted from multiple levels of the network was examined on the DCE MIPs. Three multiparametric methods were investigated, in which information from the two sequences was integrated at the image level, feature level, or classifier level. Classification performances were evaluated with five-fold cross-validation using the area under the receiver operating characteristic curve (AUC) as the figure of merit. Using pooled features extracted from multiple layers of the ResNet statistically significantly outperformed using only features extracted from the end of the network (P = .002, 95% CI of ΔAUC: [0.007, 0.029]). The multiparametric classifiers using pooled features yielded AUC_ImageFusion = 0.85 ± 0.01, AUC_FeatureFusion = 0.87 ± 0.01, and AUC_ClassifierFusion = 0.86 ± 0.01, respectively. The feature fusion method statistically significantly outperformed using DCE alone (P = .01, 95% CI of ΔAUC: [0.004, 0.022]), and all three methods statistically significantly outperformed using T2w alone (P < .001).
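The pooled multi-layer feature extraction described above can be sketched in PyTorch by globally average-pooling the output of each ResNet50 stage and concatenating the results; the specific stages pooled here and the use of torchvision's pretrained weights are assumptions for illustration rather than the study's exact layer selection.

```python
import torch
import torch.nn as nn
from torchvision import models

class PooledResNetFeatures(nn.Module):
    """Globally average-pooled features from several ResNet50 stages, concatenated."""
    def __init__(self):
        super().__init__()
        net = models.resnet50(weights="IMAGENET1K_V1")
        self.stem = nn.Sequential(net.conv1, net.bn1, net.relu, net.maxpool)
        self.stages = nn.ModuleList([net.layer1, net.layer2, net.layer3, net.layer4])
        self.pool = nn.AdaptiveAvgPool2d(1)

    @torch.no_grad()
    def forward(self, x):
        x = self.stem(x)
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(self.pool(x).flatten(1))     # (N, C) vector per stage
        return torch.cat(feats, dim=1)                # 256 + 512 + 1024 + 2048 = 3840-d
```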
The incorporation of diffusion-weighted imaging (DWI) in breast magnetic resonance imaging (MRI) has shown potential in improving the accuracy of breast cancer diagnosis. Since DWI measures biological properties that are possibly complementary to dynamic contrast-enhanced (DCE) MRI parameters, DWI computer-aided diagnosis (CADx) can potentially improve the performance of current CADx systems in distinguishing between benign and malignant breast lesions. This study was performed on a database of 397 diffusion-weighted breast MR images (69 benign and 328 malignant). Lesions were automatically segmented using a fuzzy C-means method. Apparent diffusion coefficient (ADC)-based radiomic features were extracted and used to train a classifier. Another classifier was trained on convolutional neural network (CNN)-based features extracted by a pre-trained VGG19 network. The outputs from these two classifiers were fused by averaging the posterior probability of malignancy for each case to construct a fusion classifier. Performance evaluation for the three proposed classifiers was performed with five-fold cross-validation. The area under the receiver operating characteristic curve (AUC) was 0.68 (se = 0.04) for the ADC-based classifier, 0.74 (se = 0.03) for the CNN-based classifier, and 0.76 (se = 0.03) for the fusion classifier. The fusion classifier performed significantly better than the ADC-based classifier (p = 0.013). The CNN-based classifier failed to show a statistically significant performance difference from the ADC-based classifier or the fusion classifier. The findings demonstrate promising performance of the proposed classifiers and the potential for DWI CADx, as well as for the development of multiparametric CADx that incorporates information from both DWI and DCE-MRI in breast lesion classification.
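The CNN-based feature extraction step can be illustrated with a pretrained VGG19 used as a fixed feature extractor, as sketched below in PyTorch; the pooling choice and the assumption of 3-channel input images are illustrative, and the fusion step itself simply averages the two classifiers' predicted probabilities of malignancy, as described above.

```python
import torch
import torch.nn as nn
from torchvision import models

def vgg19_features(image_batch):
    """Pretrained VGG19 as a fixed feature extractor for lesion images.

    image_batch : tensor of shape (N, 3, H, W); returns a (N, 512) feature matrix.
    """
    net = models.vgg19(weights="IMAGENET1K_V1").eval()
    extractor = nn.Sequential(net.features,            # convolutional body only
                              nn.AdaptiveAvgPool2d(1), # global average pooling
                              nn.Flatten())            # 512-d vector per image
    with torch.no_grad():
        return extractor(image_batch)
```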
Mathematical model observers (MOs) have become popular in task-based CT image quality assessment since, once proven to be correlated with human observers (HOs), these MOs can be used to estimate HO performance. However, typical MO studies are limited to phantom data that involve only uniform backgrounds. In practice, anatomical background variability and tissue non-uniformity affect HO lesion detection performance. Recently, we proposed a deep-learning-based MO (DL-MO). In this study, we aim to investigate the correlation between this DL-MO and HOs for a lung-nodule localization task in chest CT. Using a patient database containing 50 lung cancer screening CT patient cases, 12 different experimental conditions were generated, involving 4 radiation dose levels, 3 nodule sizes, 2 nodule types, and 3 reconstruction types. These conditions were created using a validated noise and lesion insertion tool. Four subspecialized radiologists performed the HO study for all 12 conditions individually in a randomized fashion. The DL-MO was trained and tested on the same dataset. The performances of the DL-MO and HOs were compared across all experimental conditions. DL-MO performance was strongly correlated with HO performance (Pearson’s correlation coefficient: 0.988 with a 95% confidence interval of [0.894, 0.999]). These results demonstrate the potential to use the proposed DL-MO to predict HO performance in the task of lung nodule localization in chest CT.
Task-based image quality assessment using model observers is promising to provide an efficient, quantitative, and objective approach to CT dose optimization. Before this approach can be reliably used in practice, its correlation with radiologist performance for the same clinical task needs to be established. Determining human observer performance for a well-defined clinical task, however, has always been a challenge due to the tremendous effort needed to collect a large number of positive cases. To overcome this challenge, we developed an accurate projection-based lesion-insertion technique. In this study, we present a virtual clinical trial using this tool and a low-dose simulation tool to determine radiologist performance on lung-nodule detection as a function of radiation dose, nodule type, nodule size, and reconstruction method. The lesion insertion and low-dose simulation tools together were demonstrated to provide the flexibility to generate realistically appearing clinical cases under well-defined conditions. The reader performance data obtained in this virtual clinical trial can be used as the basis to develop model observers for lung nodule detection, as well as for dose and protocol optimization in lung cancer screening CT.