Molecular classification of tumors holds great potential for cancer research, diagnosis, and treatment. In this study, we apply a novel classification technique to cDNA microarray data to discriminate between three subtypes of malignant lymphoma: CD5+ diffuse large B-cell lymphoma, CD5- diffuse large B-cell lymphoma, and mantle cell lymphoma. The proposed technique combines the k-Nearest Neighbor (k-NN) algorithm with optimized data quantization. The feature genes on which the classification is based are selected by ranking all genes according to a separability criterion computed from the between-class and within-class scatter. The classification errors, estimated using cross-validation, are significantly lower than those produced by classical variants of the k-NN algorithm. Multidimensional scaling and hierarchical clustering dendrograms are used to visualize the separation of the three lymphoma subtypes.
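The gene-ranking step can be illustrated with a minimal sketch. The abstract does not give the exact separability criterion, so the code below assumes a common per-gene form: the ratio of between-class scatter to within-class scatter. The function names `separability_scores` and `rank_genes`, and the toy data in the usage note, are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def separability_scores(samples, labels):
    """Score each gene as between-class scatter / within-class scatter.

    ASSUMPTION: this ratio is one common separability criterion; the
    study's exact formula is not stated in the abstract.
    samples: list of expression profiles (one list of floats per sample).
    labels:  class label of each sample, in the same order.
    """
    by_class = defaultdict(list)
    for row, label in zip(samples, labels):
        by_class[label].append(row)

    scores = []
    for g in range(len(samples[0])):
        values = [row[g] for row in samples]
        overall_mean = sum(values) / len(values)
        between = 0.0
        within = 0.0
        for rows in by_class.values():
            class_values = [row[g] for row in rows]
            class_mean = sum(class_values) / len(class_values)
            # Between-class scatter: spread of class means around the overall mean.
            between += len(class_values) * (class_mean - overall_mean) ** 2
            # Within-class scatter: spread of samples around their own class mean.
            within += sum((v - class_mean) ** 2 for v in class_values)
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class variation: perfectly separable, or a constant gene.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def rank_genes(samples, labels, top=None):
    """Return gene indices sorted from most to least separable."""
    scores = separability_scores(samples, labels)
    order = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return order if top is None else order[:top]
```

For example, with six made-up samples in three classes, where gene 0 differs strongly between classes and gene 1 does not, `rank_genes(samples, labels)` places gene 0 first. In practice the number of genes to keep would itself be a parameter to tune.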
This paper compares several discrimination methods for the classification of tumors using gene expression data. We introduce variations of known classification methods, compare the effects of quantizing the data before applying each method, and discuss the selection of the distance function. The error rates obtained with the new methods are shown to be smaller than those reported in recently published studies.
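A minimal sketch of k-NN classification on quantized data with a pluggable distance function follows. It is an illustration under stated assumptions, not the method of either paper: the fixed thresholds, the Manhattan distance default, and the names `quantize`, `manhattan`, and `knn_predict` are all hypothetical. The abstracts describe the quantization as optimized, which this sketch does not attempt; the thresholds here are simply passed in.

```python
from collections import Counter


def quantize(value, thresholds):
    """Map a value to the index of the bin it falls into.

    With thresholds (-0.5, 0.5) this yields 0 (low), 1 (middle), or 2 (high).
    """
    return sum(value > t for t in thresholds)


def manhattan(a, b):
    """Sum of absolute coordinate differences between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, labels, query, k=3, thresholds=(-0.5, 0.5),
                distance=manhattan):
    """Classify `query` by majority vote among its k nearest training samples.

    ASSUMPTION: thresholds are fixed and supplied by the caller; the papers
    describe an optimized quantization whose details are not given here.
    """
    q_train = [[quantize(v, thresholds) for v in row] for row in train]
    q_query = [quantize(v, thresholds) for v in query]
    # sorted() is stable, so ties in distance are broken by training order.
    nearest = sorted(range(len(q_train)),
                     key=lambda i: distance(q_train[i], q_query))[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

Any function taking two equal-length vectors can be passed as `distance`, which makes it straightforward to compare distance choices under the same quantization. Thresholds and `k` would normally be chosen by cross-validation on the training data.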