Purpose: Cells are building blocks for human physiology; consequently, understanding the way cells communicate, co-locate, and interrelate is essential to furthering our understanding of how the body functions in both health and disease. Hematoxylin and eosin (H&E) is the standard stain used in histological analysis of tissues in both clinical and research settings. Although H&E is ubiquitous and reveals tissue microanatomy, the classification and mapping of cell subtypes often require the use of specialized stains. The recent CoNIC Challenge focused on artificial intelligence classification of six types of cells on colon H&E but was unable to classify epithelial subtypes (progenitor, enteroendocrine, goblet), lymphocyte subtypes (B, helper T, cytotoxic T), and connective subtypes (fibroblasts). We propose to use inter-modality learning to label previously un-labelable cell types on H&E.
Approach: We took advantage of the cell classification information inherent in multiplexed immunofluorescence (MxIF) histology to create cell-level annotations for 14 subclasses. Then, we performed style transfer on the MxIF to synthesize realistic virtual H&E. We assessed the efficacy of a supervised learning scheme using the virtual H&E and 14 subclass labels. We evaluated our model on virtual H&E and real H&E.
Results: On virtual H&E, we were able to classify helper T cells and epithelial progenitors with positive predictive values of 0.34±0.15 (prevalence 0.03±0.01) and 0.47±0.1 (prevalence 0.07±0.02), respectively, when using ground truth centroid information. On real H&E, we needed to compute bounded metrics instead of direct metrics because our fine-grained virtual H&E predicted classes had to be matched to the closest available parent classes in the coarser labels from the real H&E dataset. For the real H&E, the bounded metrics for helper T cells and epithelial progenitors gave upper-bound positive predictive values of 0.43±0.03 (parent class prevalence 0.21) and 0.94±0.02 (parent class prevalence 0.49), respectively, when using ground truth centroid information.
Conclusions: This is the first work to provide cell type classification for helper T and epithelial progenitor nuclei on H&E.
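The results above pair each positive predictive value (PPV) with a prevalence, since a classifier that assigned labels at random would have an expected PPV equal to the prevalence. The following is a minimal sketch of how the two quantities relate for one target class. The function name, the label strings, and the toy label lists are illustrative assumptions and are not taken from the paper's code or data.

```python
def ppv_and_prevalence(true_labels, predicted_labels, target):
    """Return (PPV, prevalence) of `target` for paired per-nucleus labels.

    PPV        = true positives / all nuclei predicted as `target`
    prevalence = nuclei whose true label is `target` / all nuclei
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")

    predicted_positive = sum(1 for p in predicted_labels if p == target)
    true_positive = sum(
        1 for t, p in zip(true_labels, predicted_labels)
        if t == target and p == target
    )
    prevalence = sum(1 for t in true_labels if t == target) / len(true_labels)
    # PPV is undefined when the class is never predicted.
    ppv = true_positive / predicted_positive if predicted_positive else float("nan")
    return ppv, prevalence


# Toy labels, invented purely for illustration (not study data).
true = ["helper_T", "B", "goblet", "helper_T", "fibroblast",
        "B", "goblet", "goblet", "B", "fibroblast"]
pred = ["helper_T", "helper_T", "goblet", "B", "fibroblast",
        "B", "goblet", "goblet", "helper_T", "fibroblast"]

ppv, prevalence = ppv_and_prevalence(true, pred, "helper_T")
print(f"PPV={ppv:.2f}, prevalence={prevalence:.2f}")
```

Reading PPV against prevalence in this way is what makes a value such as 0.34 at prevalence 0.03 informative despite being far from 1.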
Crohn’s disease (CD) is a chronic and relapsing inflammatory condition that affects segments of the gastrointestinal tract. CD activity is determined by histological findings, particularly the density of neutrophils observed on hematoxylin and eosin (H&E) stained images. However, understanding the broader morphometry and local cell arrangement beyond cell counting and tissue morphology remains challenging. To address this, we characterize six distinct cell types from H&E images and develop a novel approach for the local spatial signature of each cell. Specifically, we create a 10-cell neighborhood matrix, representing neighboring cell arrangements for each individual cell. Using t-SNE for non-linear spatial projection, shown as scatter plots and kernel density estimation contour plots, our study examines differences in the cellular environment, expressed as the odds ratio of spatial patterns between active CD and control groups. This analysis is based on data collected at two research institutes. The findings reveal heterogeneous nearest-neighbor patterns, signifying distinct tendencies of cell clustering, with a particular focus on the rectum region. These variations underscore the impact of data heterogeneity on cell spatial arrangements in CD patients. Moreover, the spatial distribution disparities between the two research sites highlight the significance of collaborative efforts among healthcare organizations. All research analysis pipeline tools are available at https://github.com/MASILab/cellNN.
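One plausible reading of the neighborhood matrix described above is a per-cell count of cell types among the k nearest neighboring cells. The sketch below illustrates that reading only; the row layout, the function name, and the brute-force distance search are assumptions, and the cellNN repository may define the matrix differently.

```python
import math


def neighborhood_matrix(centroids, cell_types, type_names, k=10):
    """For each cell, count the types of its k nearest neighboring cells.

    centroids  : list of (x, y) nucleus centroids
    cell_types : list of type names, one per centroid
    type_names : column order of the returned matrix
    Returns one row per cell; each row sums to k.
    Brute-force O(n^2) search, suitable only for small illustrations.
    """
    n = len(centroids)
    if len(cell_types) != n:
        raise ValueError("centroids and cell_types must have the same length")
    if k >= n:
        raise ValueError("need more than k cells")

    column = {name: i for i, name in enumerate(type_names)}
    rows = []
    for i, (xi, yi) in enumerate(centroids):
        # Distances from cell i to every other cell, nearest first.
        distances = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(centroids)
            if j != i
        )
        row = [0] * len(type_names)
        for _, j in distances[:k]:
            row[column[cell_types[j]]] += 1
        rows.append(row)
    return rows


# Four toy cells on a line, two hypothetical types, k=2 for brevity.
toy_centroids = [(0, 0), (1, 0), (2, 0), (10, 0)]
toy_types = ["A", "B", "A", "B"]
toy_matrix = neighborhood_matrix(toy_centroids, toy_types, ["A", "B"], k=2)
print(toy_matrix)
```

Rows of such a matrix could then be passed to a projection method such as t-SNE; for slide-scale data a spatial index would replace the brute-force search.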
Understanding the way cells communicate, co-locate, and interrelate is essential to understanding human physiology. Hematoxylin and eosin (H&E) staining is ubiquitously available both for clinical studies and research. The Colon Nucleus Identification and Classification (CoNIC) Challenge has recently innovated on robust artificial intelligence labeling of six cell types on H&E stains of the colon. However, this is a very small fraction of the number of potential cell classification types. Specifically, the CoNIC Challenge is unable to classify epithelial subtypes (progenitor, endocrine, goblet), lymphocyte subtypes (B, helper T, cytotoxic T), or connective subtypes (fibroblasts, stromal). In this paper, we propose to use inter-modality learning to label previously un-labelable cell types on virtual H&E. We leveraged multiplexed immunofluorescence (MxIF) histology imaging to identify 14 subclasses of cell types. We performed style transfer to synthesize virtual H&E from MxIF and transferred the higher density labels from MxIF to these virtual H&E images. We then evaluated the efficacy of learning in this approach. We identified helper T and progenitor nuclei with positive predictive values of 0.34 ± 0.15 (prevalence 0.03 ± 0.01) and 0.47 ± 0.1 (prevalence 0.07 ± 0.02) respectively on virtual H&E. This approach represents a promising step towards automating annotation in digital pathology.
The Tangram algorithm is a benchmark method for aligning single-cell data to various forms of spatial data collected from the same region. With this alignment, the annotation of the single-cell data can be projected onto the spatial data. However, the cell composition of the single-cell data and the spatial data might differ because of heterogeneous cell distribution. Whether the Tangram algorithm can be adapted when the two datasets have different cell-type ratios has not been discussed in previous works. In our practical application, which maps the cell-type classification results of single-cell data to multiplex immunofluorescence (MxIF) spatial data, the cell-type ratios were different. In this work, both simulation and empirical validation were conducted to quantitatively explore the impact of mismatched cell-type ratios on the Tangram mapping in different situations. Results show that a mismatch in cell-type ratios has a negative influence on annotation mapping accuracy.
Multiplex immunofluorescence (MxIF) is an emerging imaging technology whose downstream molecular analytics rely heavily on the effectiveness of cell segmentation. In practice, multiple membrane markers (e.g., NaKATPase, PanCK and β-catenin) are employed to stain membranes for different cell types, so as to achieve a more comprehensive cell segmentation, since no single marker fits all cell types. However, prevalent watershed-based image processing might yield inferior capability for modeling complicated relationships between markers. For example, some markers can be misleading due to questionable stain quality. In this paper, we propose a deep learning based membrane segmentation method to aggregate complementary information that is uniquely provided by large scale MxIF markers. We aim to segment tubular membrane structure in MxIF data using global (membrane markers z-stack projection image) and local (separate individual markers) information to maximize topology preservation with deep learning. Specifically, we investigate the feasibility of four state-of-the-art 2D deep networks and four volumetric-based loss functions. We conducted a comprehensive ablation study to assess the sensitivity of the proposed method to various combinations of input channels. Beyond using the adjusted Rand index (ARI) as the evaluation metric, we propose a novel volumetric metric, inspired by clDice, that is specific to skeletal structure, denoted as clDiceSKEL. In total, 80 membrane MxIF images were manually traced for 5-fold cross-validation. Our model outperforms the baseline with a 20.2% and 41.3% increase in clDiceSKEL and ARI performance, respectively, which is significant (p<0.05) using the Wilcoxon signed rank test. Our work explores a promising direction for advancing MxIF imaging cell segmentation with deep learning membrane segmentation. Tools are available at https://github.com/MASILab/MxIF_Membrane_Segmentation.
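For readers unfamiliar with the adjusted Rand index used above, the sketch below computes it from the standard contingency-table formula. It assumes the two segmentations have been flattened into per-pixel label sequences of equal length; that flattening, and the function name, are assumptions for illustration rather than details taken from the paper. clDiceSKEL is not sketched because the abstract does not define it.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items.

    Returns 1.0 for identical partitions (up to label renaming) and
    values near 0.0 for agreement no better than chance.
    """
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("labelings must have the same length")
    if n < 2:
        raise ValueError("need at least two items")

    # Pairs of items grouped together in both / in each labeling.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        # Degenerate case: both labelings are one cluster or all singletons.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Same partition with swapped label names, then a maximally crossed one.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

Because the index depends only on which items are grouped together, it is insensitive to how instance labels are numbered, which suits comparing segmentations whose region IDs are arbitrary.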
Crohn’s disease (CD) is a debilitating inflammatory bowel disease with no known cure. Computational analysis of hematoxylin and eosin (H&E) stained colon biopsy whole slide images (WSIs) from CD patients provides the opportunity to discover unknown and complex relationships between tissue cellular features and disease severity. While there have been works using cell nuclei-derived features for predicting slide-level traits, this has not been performed on CD H&E WSIs for classifying normal tissue from CD patients vs active CD and assessing slide label-predictive performance while using both separate and combined information from pseudo-segmentation labels of nuclei from neutrophils, eosinophils, epithelial cells, lymphocytes, plasma cells, and connective cells. We used 413 WSIs of CD patient biopsies and calculated normalized histograms of nucleus density for the six cell classes for each WSI. We used a support vector machine to classify the truncated singular value decomposition representations of the normalized histograms as normal or active CD with four-fold cross-validation in rounds where nucleus types were first compared individually, the best was selected, and further types were added each round. We found that neutrophils were the most predictive individual nucleus type, with an AUC of 0.92 ± 0.0003 on the withheld test set. Adding information improved cross-validation performance for the first two rounds and on the withheld test set for the first three rounds, though performance metrics did not increase substantially beyond when neutrophils were used alone.
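The round-based procedure described above (compare nucleus types individually, keep the best, then add further types each round) resembles greedy forward selection. The sketch below shows that general pattern only. The `score` callable stands in for whatever cross-validated metric is used, and the candidate names and toy weights are invented for illustration; none of this is the study's pipeline or its results.

```python
def forward_select(candidates, score, max_rounds=None):
    """Greedy forward selection.

    candidates : iterable of feature-group names
    score      : callable taking a list of names and returning a number
                 (higher is better), e.g. a cross-validated AUC
    Returns a list of (selected_names_tuple, score) pairs, one per round.
    """
    remaining = list(candidates)
    selected = []
    history = []
    rounds = len(remaining) if max_rounds is None else max_rounds
    for _ in range(rounds):
        if not remaining:
            break
        # Try adding each remaining candidate to the current selection.
        best = max(remaining, key=lambda name: score(selected + [name]))
        selected.append(best)
        remaining.remove(best)
        history.append((tuple(selected), score(selected)))
    return history


# Hypothetical additive toy score; the weights are made up for the demo.
toy_weights = {"type_a": 0.5, "type_b": 0.3, "type_c": 0.1}


def toy_score(names):
    return sum(toy_weights[name] for name in names)


for names, value in forward_select(toy_weights, toy_score):
    print(names, round(value, 2))
```

In practice the score for each candidate set would come from fitting and validating a classifier, so each round costs one model evaluation per remaining candidate.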
Multiplex immunofluorescence (MxIF) is an emerging technique that allows multiple cellular and histological markers to be stained simultaneously on a single tissue section. However, with multiple rounds of staining and bleaching, it is inevitable that the scarce tissue may be physically depleted. Thus, a digital way of synthesizing such missing tissue would be appealing since it would increase the usable areas for downstream single-cell analysis. In this work, we investigate the feasibility of employing generative adversarial network (GAN) approaches to synthesize missing tissues using 11 MxIF structural molecular markers (i.e., epithelial and stromal). Briefly, we integrate a multi-channel high-resolution image synthesis approach to synthesize the missing tissue from the remaining markers. The performance of different methods is quantitatively evaluated via the downstream cell membrane segmentation task. Our contribution is that we, for the first time, assess the feasibility of synthesizing missing tissues in MxIF via quantitative segmentation. The proposed synthesis method has comparable reproducibility with the baseline method when reconstructing only the missing tissue region, but it improves by 40% on whole tissue synthesis, which is crucial for practical application. We conclude that GANs are a promising direction for advancing MxIF imaging with deep image synthesis.
Multi-modal learning (e.g., integrating pathological images with genomic features) tends to improve the accuracy of cancer diagnosis and prognosis as compared to learning with a single modality. However, missing data is a common problem in clinical practice, i.e., not every patient has all modalities available. Most previous works directly discarded samples with missing modalities, which loses the information in those samples and may increase the likelihood of overfitting. In this work, we generalize multi-modal learning in cancer diagnosis to handle missing data, using histological images and genomic data. Our integrated model can utilize all available data from patients with both complete and partial modalities. Experiments on the public TCGA-GBM and TCGA-LGG datasets show that data with missing modalities can contribute to multi-modal learning, which improves the model performance in glioma grade classification.
The Gut Cell Atlas (GCA), an initiative funded by the Helmsley Charitable Trust, seeks to create a reference platform to understand the human gut, with a specific focus on Crohn’s disease. Although a primary focus of the GCA is single-cell profiling, we seek to provide a framework to integrate other analyses on multimodality data such as electronic health record data, radiological images, and histology tissues/images. Herein, we use the research electronic data capture (REDCap) system as the central tool for a secure web application that supports protected health information (PHI) restricted access. Our innovations focus on addressing the challenges with tracking all specimens and biopsies, validating manual data entry at scale, and sharing organizational data across the group. We present a scalable, cross-platform barcode printing/record system that integrates with REDCap. The central informatics infrastructure to support our design is a tuple table to track longitudinal data entry and sample tracking. The current data collection (as of December 2020) is illustrated with types and formats of the data that the system collects. We estimate that one terabyte is needed for data storage per patient study. Our proposed data sharing informatics system addresses the challenges with integrating physical sample tracking, large files, and manual data entry with REDCap.