About twenty years have passed since hospitals began storing clinical information electronically in hospital information systems in the 1980s. The stored data range from accounting information to laboratory data, and even patient records are now being accumulated: in other words, a hospital cannot function without its information system, in which almost all pieces of medical information are stored as multimedia databases. In this paper, we applied temporal data mining and exploratory data analysis techniques to hospital management data. The analysis yielded several interesting results, which suggest that the reuse of stored data will provide a powerful tool for hospital management.
This paper reports the results of temporal analysis of platelet (PLT) data in a chronic hepatitis dataset. First,
we briefly introduce a cluster analysis system for temporal data that we have developed. Second, we show the
results of cluster analysis of PLT sequences. Third, we show the results of PLT value-based temporal analysis aimed at finding the years required to reach stage F4, the years elapsed between stages, and their relationships with virus types
and fibrotic stages. The results of cluster analysis indicate that the temporal courses of PLT can be grouped into several patterns, each of which exhibits similar average PLT levels and increase/decrease trends. The results of value-based analysis suggest that liver fibrosis may progress faster in exacerbating cases.
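The grouping criteria mentioned above (average PLT level and increase/decrease trend) can be illustrated with a minimal sketch; the sequences, threshold, and labels below are hypothetical and are not taken from the paper.

```python
import numpy as np

def characterize(seq, low=150.0):
    """Summarize a PLT sequence by its mean level and linear trend."""
    t = np.arange(len(seq))
    slope = np.polyfit(t, seq, 1)[0]                  # least-squares trend
    level = "low" if np.mean(seq) < low else "normal"
    trend = "decreasing" if slope < 0 else "increasing"
    return level, trend

# Two hypothetical platelet-count sequences (x10^3 / uL)
stable  = [210, 205, 215, 208, 212]
falling = [180, 160, 140, 115, 95]
print(characterize(stable))    # mean ~210, mildly rising
print(characterize(falling))   # mean ~138, clearly falling
```

Real cluster analysis would of course operate on dissimilarities between whole sequences rather than on two summary labels, but the same two axes (level and trend) describe the resulting groups.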
KEYWORDS: Liver, Data mining, Bessel functions, Health informatics, Medicine, Time series analysis, Information assurance, Convolution, Multiscale representation, Information technology
This paper proposes a new approach to temporal trajectory analysis for clinical laboratory examinations. When
we select m laboratory examinations, their temporal evolution for one patient can be viewed as a trajectory
in m-dimensional space. A multiscale comparison technique can be applied for segmentation and for calculating the structural similarities of such trajectories. Then, clustering can be applied to the calculated similarities for classification of these trajectories. The proposed method was evaluated on hepatitis datasets, and the results show
that the clustering captured several interesting patterns for severe chronic hepatitis.
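The trajectory view described above can be sketched as follows; the examination names, values, and the naive point-wise distance are illustrative assumptions, not the multiscale structural similarity used in the paper.

```python
import numpy as np

# Hypothetical values of m=2 examinations (e.g. ALB, PLT) per visit;
# each row is one point of the patient's trajectory in 2-D space.
# For simplicity both patients have the same number of visits.
patient_a = np.array([[3.9, 210], [3.7, 190], [3.4, 160], [3.1, 130]])
patient_b = np.array([[4.1, 220], [4.0, 215], [3.9, 210], [3.9, 205]])

def naive_trajectory_distance(a, b):
    """Crude stand-in for structural similarity: normalize each axis,
    then sum point-wise Euclidean distances between the trajectories."""
    scale = np.abs(np.vstack([a, b])).max(axis=0)
    return float(np.linalg.norm((a - b) / scale, axis=1).sum())

print(naive_trajectory_distance(patient_a, patient_b))
```

A dissimilarity like this, computed for every pair of patients, is what a subsequent clustering step would consume.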
KEYWORDS: Liver, Data mining, Convolution, Medicine, Pattern recognition, Neptunium, Health informatics, Data analysis, Information assurance, Computer intrusion detection
This paper presents a novel method for clustering time-series medical data based on improved multiscale matching. Multiscale matching, originally developed as a pattern recognition technique, has the ability to compare two shapes while partly changing observation scales. We have made some improvements to conventional multiscale matching in order to enable cross-scale, granularity-based comparison of long-term time-series sequences. The key idea is the development of a new segment representation that avoids the problem of shrinkage. We induced the shape parameters of a segment at a high scale directly from the base segments at the lowest scale, instead of using shapes represented by the multiscale description. We examined the usefulness of the method on the cylinder-bell-funnel dataset and a chronic hepatitis dataset. The results demonstrated that the dissimilarity matrix produced by the proposed method, combined with conventional clustering techniques, led to successful clustering for both synthetic and real-world data.
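For reference, the cylinder-bell-funnel dataset mentioned above is a standard synthetic benchmark of three class shapes. A minimal generator following the commonly cited formulation (the exact parameter ranges here are assumptions) might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def cbf(kind, n=128):
    """Generate one cylinder/bell/funnel sequence of length n."""
    a = rng.integers(16, 32)            # start of the event
    b = min(a + rng.integers(32, 96), n - 1)  # end of the event
    t = np.arange(n)
    mask = ((t >= a) & (t <= b)).astype(float)
    amp = 6 + rng.normal()              # noisy amplitude
    noise = rng.normal(size=n)
    if kind == "cylinder":
        shape = mask                    # flat plateau
    elif kind == "bell":
        shape = mask * (t - a) / (b - a)   # linear ramp up
    else:                               # "funnel"
        shape = mask * (b - t) / (b - a)   # linear ramp down
    return amp * shape + noise

series = [cbf(k) for k in ("cylinder", "bell", "funnel")]
```

Because the class difference is purely structural (plateau vs. ramp direction), the dataset is a natural test for shape-based dissimilarities such as multiscale matching.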
KEYWORDS: Probability theory, Data mining, Picosecond phenomena, Statistical analysis, Health informatics, Medicine, Information assurance, Surgery, Computer intrusion detection, Network security
A contingency table summarizes the conditional frequencies of two attributes and shows how these two attributes depend on each other, together with information on the partition of the universe generated by these attributes. Thus, this table can be viewed as a relation between two attributes with respect to information granularity.
This paper focuses on several characteristics of linear and statistical independence in a contingency table from the viewpoint of granular computing, showing that statistical independence in a contingency table is a special form of linear dependence. The discussion also shows that when a contingency table is viewed as a matrix, called a contingency matrix, statistical independence corresponds to its rank being equal to 1. Thus, the rank, as a degree of independence, plays a very important role in extracting a probabilistic model from a given contingency table.
Furthermore, it is found that in some cases partial rows or columns will satisfy the condition of statistical independence, which can be viewed as a solving process for Diophantine equations.
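The rank characterization above can be checked numerically: a contingency matrix whose cells are products of the marginal frequencies (i.e., statistically independent attributes) has rank 1, and perturbing any cell raises the rank. A small sketch with hypothetical marginals:

```python
import numpy as np

# Hypothetical marginal distributions of two attributes
row = np.array([0.2, 0.3, 0.5])   # 3 values of attribute A
col = np.array([0.6, 0.4])        # 2 values of attribute B

n = 1000
# Expected counts under statistical independence: n * P(A=i) * P(B=j).
# An outer product of two vectors always has rank 1.
independent = n * np.outer(row, col)
print(np.linalg.matrix_rank(independent))   # 1

# Perturbing one cell breaks independence and raises the rank
dependent = independent.copy()
dependent[0, 0] += 25
print(np.linalg.matrix_rank(dependent))     # 2
```

This is exactly the "degree of independence" reading of rank: the further the table is from a rank-1 outer product of its marginals, the more dependent the two attributes are.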
This paper presents a comparative study of the characteristics of clustering methods for inhomogeneous time-series medical datasets. Using various combinations of comparison methods and grouping methods, we performed clustering experiments on the hepatitis dataset and evaluated the validity of the results. The results suggested that (1) the complete-linkage (CL) criterion in agglomerative hierarchical clustering (AHC) outperformed the average-linkage (AL) criterion in terms of the interpretability of the dendrogram and clustering results, (2) the combination of dynamic time warping (DTW) and CL-AHC consistently produced interpretable results, (3) the combination of DTW and rough clustering (RC) could be used to find the core sequences of the clusters, and (4) multiscale matching may suffer from the treatment of 'no-match' pairs; however, this problem may be avoided by using RC as a subsequent grouping method.
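The DTW with CL-AHC combination from findings (1) and (2) can be sketched with a toy example; the sequences below are hypothetical, and the textbook O(nm) DTW here merely stands in for whatever implementation the study used.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def dtw(x, y):
    """Classic dynamic time warping distance between two sequences."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Hypothetical inhomogeneous (different-length) sequences:
# two rising and two falling courses
seqs = [np.linspace(0, 1, 20), np.linspace(0, 1.1, 25),
        np.linspace(1, 0, 20), np.linspace(1.2, 0, 30)]

k = len(seqs)
dist = np.zeros((k, k))
for i in range(k):
    for j in range(i + 1, k):
        dist[i, j] = dist[j, i] = dtw(seqs[i], seqs[j])

# Complete-linkage AHC on the DTW dissimilarity matrix, cut at 2 clusters
labels = fcluster(linkage(squareform(dist), method="complete"),
                  t=2, criterion="maxclust")
print(labels)   # rising pair in one cluster, falling pair in the other
```

Note that DTW tolerates the differing sequence lengths directly, which is why it pairs naturally with hierarchical grouping for inhomogeneous medical time series.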
KEYWORDS: Data mining, Data acquisition, Quantitative analysis, Head, Medicine, Convolution, Phase shifts, Health informatics, Data processing, Electroencephalography
This paper reports characteristics of dissimilarity measures used in multiscale matching. Multiscale matching is a method for comparing two planar curves by partially changing observation scales. Across all scales, it finds the best set of pairs of partial contours that contains no mis-matched or over-matched contours and that minimizes the accumulated differences between the partial contours. In order to make this method applicable to the comparison of temporal sequences, we have proposed a dissimilarity measure that compares subsequences according to the following aspects: rotation angle, length, phase and gradient. However, it empirically became apparent that it was difficult to tell from the results which aspects actually contributed to the resultant dissimilarity of the sequences. In order to investigate the fundamental characteristics of the dissimilarity measure, we performed a quantitative analysis of the induced dissimilarities using a simple sine wave and its variants. The results showed that differences in amplitude, phase and trend were respectively captured by the terms on rotation angle, phase and gradient, although these terms also showed weakness in linearity.
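The sine-wave experiment described above can be reproduced in outline; the variants below illustrate the amplitude, phase and trend changes, while a plain Euclidean distance stands in for the paper's rotation-angle/phase/gradient measure, which is not reproduced here.

```python
import numpy as np

t = np.linspace(0, 4 * np.pi, 200)
base = np.sin(t)

# Controlled deformations of the base wave, one per aspect under study
variants = {
    "amplitude": 2.0 * np.sin(t),         # amplitude doubled
    "phase":     np.sin(t + np.pi / 4),   # phase shifted by pi/4
    "trend":     np.sin(t) + 0.05 * t,    # linear trend added
}

# A reference dissimilarity per variant (Euclidean, for illustration only)
for name, v in variants.items():
    print(name, np.linalg.norm(v - base))
```

The point of such controlled variants is attribution: since each deformation changes exactly one aspect, any term of the dissimilarity measure that reacts to it can be credited with capturing that aspect.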
One of the key concepts in data mining is to produce a suitable partition of a dataset in an automatic way. On the one hand, classification methods find the partitions, given by combinations of attribute-value pairs, that best fit the partition given by the target concepts. On the other hand, clustering methods find the partitions that best characterize a given dataset by using a similarity measure. Therefore, the choice of distance or similarity measure is one of the most important research topics in data mining. However, such empirical comparisons have rarely been studied in the literature. In this paper, several types of similarity measures were compared in three clinical contexts: the first involves datasets composed of only categorical attributes; the second, datasets with a mixture of categorical and numerical attributes; and the third, datasets of only numerical attributes. Experimental results show that simple similarity measures perform as well as newly proposed measures.
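As a small illustration of the categorical case, two simple similarity measures such a comparison might include (the choice here is an assumption, not the paper's exact set) are simple matching and Jaccard similarity over attribute-value pairs:

```python
def simple_matching(a, b):
    """Fraction of attributes on which two records agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def jaccard(a, b):
    """Jaccard similarity over the sets of (attribute, value) pairs."""
    sa, sb = set(enumerate(a)), set(enumerate(b))
    return len(sa & sb) / len(sa | sb)

# Two hypothetical patient records: (sex, blood type, test result)
p1 = ("male", "A", "positive")
p2 = ("male", "B", "positive")
print(simple_matching(p1, p2))   # 2 of 3 attributes agree
print(jaccard(p1, p2))           # 2 shared pairs out of 4 distinct
```

Measures this simple are exactly the baselines against which newly proposed measures would be judged.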
This paper presents an automated method for segmenting CT images of the fractured foot. The segmentation boundary is determined by fuzzy inference using two types of knowledge acquired from orthopedic surgeons. Knowledge of joints is used to determine the boundaries of adjacent normal bones: it assigns a higher degree to the articular cartilage according to the local structure (parallelism) and the intensity distribution around a joint. Knowledge of fragments is used to find the contact place between fragments: it evaluates the Euclidean distance map (EDM) of the contact place and assigns a higher degree to the narrow part. Each type of knowledge is represented by fuzzy if-then rules, which provide degrees for the segmentation boundary. By evaluating these degrees in a region-growing process, the whole foot bone is decomposed into anatomically meaningful bones and fragments. An experiment was performed on CT images of subjects with depressed fractures of the calcaneus. The method effectively assigned higher degrees to the essential boundary while suppressing the generation of spurious boundaries caused by internal cavities in the bone. Each of the normal bones and fragments was correctly segmented.
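The Euclidean distance map step can be sketched with SciPy's distance transform on a toy binary mask; the mask below is hypothetical, not a real CT slice.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

# Tiny binary "bone" mask (1 = bone, 0 = background); a real input
# would be a thresholded CT slice.
mask = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
])

# Each foreground pixel gets its Euclidean distance to the nearest
# background pixel; interior pixels receive larger values.
edm = distance_transform_edt(mask)
print(edm.round(2))
```

On such a map, a narrow contact place between two fragments shows up as a ridge of small EDM values, which is what the "higher degree to the narrow part" rule exploits.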