Phytoplankton species are highly diverse, differing in size, geometry, morphology and biochemical composition. Such diversity plays a critical role in the atmospheric carbon cycle and the marine ecosystem. Large-scale quantification and classification of phytoplankton with taxonomic information is thus significant for environmental monitoring and even biofuel production. To this end, we report a high-throughput, label-free imaging flow cytometer (>10,000 cells/sec) based on quantitative phase time-stretch imaging, combined with a supervised learning strategy for multi-class classification of phytoplankton (13 classes). This contrasts with previous demonstrations integrating machine learning with time-stretch imaging, which achieved high-accuracy binary (two-class) image-based classification. Leveraging interferometry-free quantitative phase time-stretch imaging, which yields high-resolution, high-contrast single-cell (phytoplankton) images with both quantitative phase and amplitude contrasts, we extract a catalogue of 109 content-rich image features (44 from the amplitude image and 65 from the phase image) that cover not only size and shape but also sub-cellular morphology, e.g. local dry mass density statistics. Using the random forest algorithm for feature ranking, we select the 30 most significant features for a multi-class support vector machine (SVM) model and achieve a high classification accuracy (>95%) across 13 classes of phytoplankton. Almost 50% of the selected features are derived from the quantitative phase image and play an important role in classifying morphologically similar species, e.g. Thalassiosira versus Prorocentrum, and Chaetoceros gracilis versus Merismopedia, demonstrating that this quantitative phase time-stretch imaging flow cytometer provides the classification power required for large-scale high-content screening and analysis.
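
The feature-selection and classification steps described above can be illustrated with a minimal sketch, assuming scikit-learn. It ranks 109 features by random forest importance, keeps the top 30, and trains a multi-class SVM on them. The data are synthetic stand-ins generated by `make_classification`, not phytoplankton measurements. The function name `rank_and_classify`, the RBF kernel, the hyperparameters, the train/test split, and the feature scaling are all assumptions made for illustration, not details taken from this work.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

N_FEATURES = 109  # 44 amplitude + 65 phase features, as stated in the abstract
N_CLASSES = 13    # number of phytoplankton classes
N_SELECTED = 30   # number of top-ranked features kept for the SVM


def rank_and_classify(X, y, n_selected=N_SELECTED, seed=0):
    """Rank features with a random forest, then train an SVM on the top ones.

    Hypothetical helper: kernel, hyperparameters and split ratio are assumed.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # Step 1: rank features by impurity-based random forest importance.
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X_train, y_train)
    ranking = np.argsort(forest.feature_importances_)[::-1]
    selected = ranking[:n_selected]

    # Step 2: multi-class SVM on the selected features only.
    # SVC handles multiple classes internally with a one-vs-one scheme.
    svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
    svm.fit(X_train[:, selected], y_train)

    accuracy = svm.score(X_test[:, selected], y_test)
    return selected, accuracy


# Synthetic stand-in data (NOT real cytometer features).
X, y = make_classification(
    n_samples=1300,
    n_features=N_FEATURES,
    n_informative=30,
    n_redundant=10,
    n_classes=N_CLASSES,
    n_clusters_per_class=1,
    class_sep=2.0,
    random_state=0,
)

selected, accuracy = rank_and_classify(X, y)
print(f"kept {len(selected)} of {X.shape[1]} features")
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Because the ranking is fitted on the training split only, the held-out accuracy is not biased by the feature selection itself. With real data, `X` would hold one row per imaged cell and one column per extracted amplitude or phase feature.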