The U.S. Food and Drug Administration (FDA) has approved two digital pathology systems for primary diagnosis. These systems produce and consume whole slide images (WSIs) constructed from glass slides using advanced digital slide scanners. WSIs can greatly improve the workflow of pathologists through the development of novel image analytics software for automatic detection of cellular and morphological features and for disease diagnosis using histopathology slides. However, the gigabyte size of a single WSI poses a serious challenge for the storage and retrieval of millions of WSIs. In this paper, we propose a system for scalable storage of WSIs and fast retrieval of image tiles using DRAM. A WSI is partitioned into tiles and sub-tiles using a combination of a space-filling curve, recursive partitioning, and Dewey numbering. The tiles and sub-tiles are then stored as a collection of key-value pairs in DRAM. During retrieval, a tile is fetched using key-value lookups from DRAM. In a performance evaluation on a 24-node cluster using 100 WSIs, we observed that, compared to Apache Spark, our system stored the 100 WSIs three times faster and accessed a single tile 1,000 times faster, achieving millisecond latency. Such fast access to tiles is highly desirable when developing deep learning-based image analytics solutions on millions of WSIs.
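The partitioning and keying scheme described above can be illustrated with a minimal sketch. The abstract does not name the space-filling curve, the key format, or the store, so everything here is an assumption: a Z-order (Morton) curve indexes tiles, each tile is recursively split into quadrants labeled Dewey-style (`39.0`, `39.0.3`, ...), and a plain Python `dict` stands in for the DRAM key-value store. The names `morton_index`, `store_tile`, and `fetch_tile` are hypothetical.

```python
from typing import Dict, List

Tile = List[List[int]]  # hypothetical stand-in for a tile's pixel data


def morton_index(col: int, row: int, bits: int = 16) -> int:
    """Interleave the bits of (col, row) into a Z-order curve index."""
    index = 0
    for i in range(bits):
        index |= ((col >> i) & 1) << (2 * i)
        index |= ((row >> i) & 1) << (2 * i + 1)
    return index


def split_quadrants(tile: Tile) -> List[Tile]:
    """Split a tile into four quadrants: 0=TL, 1=TR, 2=BL, 3=BR."""
    h, w = len(tile) // 2, len(tile[0]) // 2
    return [
        [r[:w] for r in tile[:h]],
        [r[w:] for r in tile[:h]],
        [r[:w] for r in tile[h:]],
        [r[w:] for r in tile[h:]],
    ]


def _store(store: Dict[str, Tile], key: str, tile: Tile, depth: int) -> None:
    # At depth 0 the (sub-)tile is written as one key-value pair.
    if depth == 0:
        store[key] = tile
        return
    # Otherwise extend the Dewey label by one component per quadrant.
    for i, quad in enumerate(split_quadrants(tile)):
        _store(store, f"{key}.{i}", quad, depth - 1)


def store_tile(store: Dict[str, Tile], slide_id: str, col: int, row: int,
               tile: Tile, depth: int) -> None:
    """Store a tile as sub-tiles keyed by '<slide>/<morton>.<q>.<q>...'."""
    _store(store, f"{slide_id}/{morton_index(col, row)}", tile, depth)


def _fetch(store: Dict[str, Tile], key: str, depth: int) -> Tile:
    if depth == 0:
        return store[key]
    tl, tr, bl, br = (_fetch(store, f"{key}.{i}", depth - 1) for i in range(4))
    # Stitch quadrants back together row by row.
    return [a + b for a, b in zip(tl, tr)] + [a + b for a, b in zip(bl, br)]


def fetch_tile(store: Dict[str, Tile], slide_id: str, col: int, row: int,
               depth: int) -> Tile:
    """Reassemble a tile from its sub-tiles using only key lookups."""
    return _fetch(store, f"{slide_id}/{morton_index(col, row)}", depth)
```

In this sketch a fetch never scans the store: each key is computed directly from the tile coordinates and the partition depth, which is the property that makes pure key-value lookups sufficient for retrieval.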
Convolutional neural networks (CNNs) are widely used for cell/nuclei classification and segmentation in histopathology images. Despite their pervasiveness, CNNs must be fine-tuned on specific, large, labeled datasets, which are hard to collect and annotate, so this approach does not scale. In this work, we aim to gain deeper insights into the nature of the problem. We used a cervical cancer dataset with cells labeled into four classes by an expert pathologist. By pre-training on this dataset, we propose a one-shot learning model for cervical cell classification in histopathology tissue images. We extract regional maximum activation of convolutions (R-MAC) global descriptors and train a one-shot learning memory module, with the goal of using it for various cancer types and eliminating the need for expensive, difficult-to-collect, large, labeled whole slide image (WSI) datasets. Our model achieved 94.6% accuracy in detecting the four cell classes on the test dataset. Further, we present our analysis of the dataset and features to better understand and visualize the problem in general.
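The descriptor-plus-memory idea can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' model: the R-MAC computation is simplified (non-overlapping square regions, no PCA whitening), the feature map is a nested list standing in for a CNN's convolutional output, and the "memory" is a plain nearest-neighbor lookup by cosine similarity rather than a trained module. The names `rmac_descriptor` and `OneShotMemory` and the class labels are hypothetical.

```python
import math
from typing import List, Sequence

FeatureMap = List[List[List[float]]]  # [channel][row][col], hypothetical CNN output


def l2_normalize(vec: Sequence[float]) -> List[float]:
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def region_max(fmap: FeatureMap, top: int, left: int, size: int) -> List[float]:
    """Max-pool every channel over one square region."""
    return [
        max(ch[r][c]
            for r in range(top, top + size)
            for c in range(left, left + size))
        for ch in fmap
    ]


def rmac_descriptor(fmap: FeatureMap, sizes: Sequence[int]) -> List[float]:
    """Sum L2-normalized regional max vectors, then L2-normalize the sum."""
    height, width = len(fmap[0]), len(fmap[0][0])
    total = [0.0] * len(fmap)
    for size in sizes:
        # Simplification: regions tile the map without overlap.
        for top in range(0, height - size + 1, size):
            for left in range(0, width - size + 1, size):
                vec = l2_normalize(region_max(fmap, top, left, size))
                total = [t + v for t, v in zip(total, vec)]
    return l2_normalize(total)


class OneShotMemory:
    """Stores one descriptor per example; classifies by nearest stored key."""

    def __init__(self) -> None:
        self.keys: List[List[float]] = []
        self.labels: List[str] = []

    def write(self, descriptor: Sequence[float], label: str) -> None:
        self.keys.append(l2_normalize(descriptor))
        self.labels.append(label)

    def query(self, descriptor: Sequence[float]) -> str:
        if not self.keys:
            raise ValueError("memory is empty")
        q = l2_normalize(descriptor)
        # Keys and query are unit length, so the dot product is cosine similarity.
        sims = [sum(a * b for a, b in zip(q, k)) for k in self.keys]
        return self.labels[max(range(len(sims)), key=sims.__getitem__)]
```

Because a single stored descriptor per class is enough to answer a query, adding a new class in this sketch means one `write` call rather than retraining, which is the appeal of one-shot approaches when labeled WSI data is scarce.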