On stability of optimal clusters / Toni Kathrina L. Yap.

Material type: Text
Language: English
Publication details: 2006
Description: 84 leaves
Dissertation note: Thesis (BS Applied Mathematics) -- University of the Philippines Mindanao, 2006
Abstract: Evaluating the quality of clustering results is an important part of cluster analysis. Two widely used approaches to this evaluation are the validity index and the stability index. This paper compares the stable number of clusters based on the stability index M(K) proposed by Levine and Domany (2001) with the optimal number of clusters from eight validity indices, namely the C index, DB index, Dunn index, Silhouette index, SD index, SD index, CH index, and KL index. The stability index M(K) identified the stable number of clusters under varying dilution factors. The K-means algorithm was used to cluster five data sets: Iris, E. coli, processed Cleveland heart disease, glass, and water treatment. The quality of the clustering results, as reflected in the number of clusters computed, was evaluated using both the validity indices and the stability index. The analysis showed that the stability index consistently identified stable clustering at K = 2 for all the data sets. The eight validity indices, on the other hand, performed differently on different data sets. Only the Dunn index and the CH index were consistent with the stability index in identifying the optimal number of clusters that best fit the data. The SD index, SD index, and Silhouette index performed fairly well in identifying optimal numbers of clusters that were also stable clustering solutions for all the data sets. The C index and the DB index tended to favor a large number of clusters. The KL index performed worst among the indices, identifying the stable number of clusters for only two of the data sets used.
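The general procedure described in the abstract (cluster with K-means for several values of K, score each result with a validity index, and take the best-scoring K) can be illustrated with a minimal sketch. This is not the thesis's code. The toy data, the function names `kmeans`, `silhouette`, and `best_k`, and the deterministic farthest-point initialization are assumptions made for this example. Only the Silhouette index is shown; the stability index M(K) and the other validity indices are not implemented here.

```python
import math


def kmeans(points, k, iters=100):
    """Lloyd's K-means; farthest-point initialization is an assumption
    chosen so the example is reproducible without random seeds."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest chosen center.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette width; requires at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for m, other in clusters.items()
            if m != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_k(points, candidates):
    """Return the candidate K with the highest silhouette, plus all scores."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores


# Hypothetical toy data: two compact, well-separated groups of four points.
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]

if __name__ == "__main__":
    k, scores = best_k(POINTS, [2, 3, 4])
    print(k, scores)
```

On this toy data the highest silhouette score occurs at K = 2, since splitting either compact group lowers the score. Real data sets such as those named in the abstract would need feature scaling and multiple K-means restarts, which this sketch leaves out.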
List(s) this item appears in: BS Applied Mathematics
Holdings
Item type: Thesis | Current library: University Library, Theses | Collection: Room-Use Only | Call number: LG993.5 2006 A64 Y36 | Status: Not For Loan | Barcode: 3UPML00011725
Item type: Thesis | Current library: University Library, Archives and Records | Collection: Preservation Copy | Call number: LG993.5 2006 A64 Y36 | Status: Not For Loan | Barcode: 3UPML00034064


University of the Philippines Mindanao
The University Library, UP Mindanao, Mintal, Tugbok District, Davao City, Philippines
Email: library.upmindanao@up.edu.ph
Contact: (082)295-7025
Copyright © 2022 | All Rights Reserved