Modified K-modes clustering algorithms for categorical data sets with missing values / Lyna Mie C. Daisog.

Material type: Text
Language: English
Publication details: 2000
Description: 67 leaves
Dissertation note: Thesis (BS Applied Mathematics) -- University of the Philippines Mindanao, 2000

Abstract: Clustering is the process of organizing objects in a database into groups such that objects within the same cluster have a high degree of similarity, while objects from different clusters have a high degree of dissimilarity. However, data sets, including those with categorical attributes, can only be clustered when the data set is complete. Existing methods address this problem: the usual ways of handling missing values are to delete the incomplete records and cluster only the complete data points, or to impute the missing values in a preprocessing step. However, these methods might jeopardize the quality of the resulting clusters. This study modifies the K-modes algorithm to handle missing values. The first modified algorithm makes use of the available information, while the second uses imputation during the clustering stage. The performance of the modified algorithms was compared with that of existing methods, namely casewise deletion, mode imputation, and K-nearest neighbor (KNN) imputation. The modified algorithms produced higher-quality clusters than casewise deletion and mode imputation. Although KNN imputation came out as the most stable method for handling missing values, the modified algorithm using the available-case approach produced clusters close to those of KNN. The methods were also used to cluster an actual incomplete data set to verify their performance, and similar behavior was observed.
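As a rough illustration of the "available information" idea mentioned in the abstract, the sketch below compares a record with a cluster mode using only the attributes observed in both, and computes modes from observed values only. This is not the thesis's algorithm, whose details are not given in this record. The function names, the use of None to mark a missing value, the normalization by the number of compared attributes, and the fallback value of 1.0 are all assumptions made for this example.

```python
from collections import Counter
from typing import List, Optional, Sequence

# Assumption of this sketch: None marks a missing categorical value.
Value = Optional[str]


def available_case_dissimilarity(record: Sequence[Value], mode: Sequence[Value]) -> float:
    """Share of mismatches among attributes observed in both record and mode."""
    pairs = [(a, b) for a, b in zip(record, mode) if a is not None and b is not None]
    if not pairs:
        # Nothing to compare: treated as maximally dissimilar (an assumption).
        return 1.0
    return sum(a != b for a, b in pairs) / len(pairs)


def column_modes(records: Sequence[Sequence[Value]]) -> List[Value]:
    """Most frequent observed value per attribute, ignoring missing entries."""
    modes: List[Value] = []
    for column in zip(*records):
        observed = [v for v in column if v is not None]
        modes.append(Counter(observed).most_common(1)[0][0] if observed else None)
    return modes
```

With three hypothetical records such as ["red", "small", None], ["red", None, "round"], and ["blue", "small", "round"], column_modes can be used to build a cluster mode without discarding the incomplete records, and available_case_dissimilarity then scores a new incomplete record against that mode.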
Holdings
Item type | Current library | Collection | Call number | Status | Barcode
Thesis | University Library | Theses | Room-Use Only LG993.5 2006 A64 D35 | Not For Loan | 3UPML00011618
Thesis | University Library | Archives and Records | Preservation Copy LG993.5 2000 A64 D35 | Not For Loan | 3UPML00021979

University of the Philippines Mindanao
The University Library, UP Mindanao, Mintal, Tugbok District, Davao City, Philippines