Modified K-modes clustering algorithms for categorical data sets with missing values/ (Record no. 529)
000 -LEADER | |
---|---|
fixed length control field | 02155nam a2200241 4500 |
001 - CONTROL NUMBER | |
control field | UPMIN-00000010893 |
003 - CONTROL NUMBER IDENTIFIER | |
control field | UPMIN |
005 - DATE AND TIME OF LATEST TRANSACTION | |
control field | 20230105163930.0 |
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION | |
fixed length control field | 230105b |||||||| |||| 00| 0 eng d |
040 ## - CATALOGING SOURCE | |
Original cataloging agency | DLC |
Transcribing agency | UPMin |
Modifying agency | upmin |
041 ## - LANGUAGE CODE | |
Language code of text/sound track or separate title | eng |
090 ## - LOCALLY ASSIGNED LC-TYPE CALL NUMBER (OCLC); LOCAL CALL NUMBER (RLIN) | |
Classification number (OCLC) (R) ; Classification number, CALL (RLIN) (NR) | LG993.5 2000 |
Local cutter number (OCLC) ; Book number/undivided call number, CALL (RLIN) | A64 D35 |
100 1# - MAIN ENTRY--PERSONAL NAME | |
Personal name | Daisog, Lyna Mie C. |
9 (RLIN) | 1075 |
245 00 - TITLE STATEMENT | |
Title | Modified K-modes clustering algorithms for categorical data sets with missing values/ |
Statement of responsibility, etc. | Lyna Mie C. Daisog. |
260 ## - PUBLICATION, DISTRIBUTION, ETC. | |
Date of publication, distribution, etc. | 2000 |
300 ## - PHYSICAL DESCRIPTION | |
Extent | 67 leaves. |
502 ## - DISSERTATION NOTE | |
Dissertation note | Thesis (BS Applied Mathematics) -- University of the Philippines Mindanao, 2000 |
520 3# - SUMMARY, ETC. | |
Summary, etc. | Clustering is the process of organizing objects in a database into groups such that objects within the same cluster have a high degree of similarity, while objects from different clusters have a high degree of dissimilarity. However, clustering data sets, including those with categorical attributes, can only be done when the data set is complete. Existing methods address this problem. The usual ways of handling missing values are to delete the incomplete data points and cluster only the complete ones, or to impute the missing values during preprocessing. However, these methods might jeopardize the quality of the resulting clusters. This study modifies the K-modes algorithm to handle missing values. The first modified algorithm makes use of the available information, while the second uses imputation during the clustering stage. The performance of the modified algorithms was compared to that of existing methods, namely casewise deletion, mode imputation, and K-nearest neighbor (KNN) imputation. The modified algorithms produced higher-quality clusters than casewise deletion and mode imputation. Although KNN imputation came out to be the most stable method for handling missing values, the modified algorithm using the available-case approach was found to produce clusters close to those of KNN. The methods were also used to cluster an actual incomplete data set to verify their performance, and similar behavior was observed. |
658 ## - INDEX TERM--CURRICULUM OBJECTIVE | |
Main curriculum objective | Undergraduate Thesis |
Curriculum code | AMAT200 |
905 ## - LOCAL DATA ELEMENT E, LDE (RLIN) | |
a | Fi |
905 ## - LOCAL DATA ELEMENT E, LDE (RLIN) | |
a | UP |
942 ## - ADDED ENTRY ELEMENTS (KOHA) | |
Source of classification or shelving scheme | Library of Congress Classification |
Koha item type | Thesis |
| Withdrawn status | Lost status | Source of classification or shelving scheme | Damaged status | Status | Collection | Home library | Current library | Shelving location | Date acquired | Source of acquisition | Accession Number | Total Checkouts | Full call number | Barcode | Date last seen | Price effective from | Koha item type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | Library of Congress Classification | | Not For Loan | Preservation Copy | University Library | University Library | Archives and Records | 2006-06-27 | donation | UAR-T-gd744 | | LG993.5 2000 A64 D35 | 3UPML00021979 | 2022-09-21 | 2022-09-21 | Thesis |
| | | Library of Congress Classification | | Not For Loan | Room-Use Only | College of Science and Mathematics | University Library | Theses | 2006-06-27 | donation | CSM-T-gd1429 | | LG993.5 2006 A64 D35 | 3UPML00011618 | 2022-09-21 | 2022-09-21 | Thesis |
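
The summary above mentions a modified K-modes algorithm that "makes use of available information" when values are missing. The record does not give the thesis's actual formulation, so the following is only a minimal sketch of one common available-case reading: mismatches are counted over observed attributes only, and cluster modes are computed from observed values only. The names `MISSING`, `available_case_dissimilarity`, `available_case_mode`, and `assign` are hypothetical and chosen for illustration.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing categorical value


def available_case_dissimilarity(record, mode):
    """Count attribute mismatches, skipping positions where either value is missing."""
    return sum(
        1
        for value, center in zip(record, mode)
        if value is not MISSING and center is not MISSING and value != center
    )


def available_case_mode(records):
    """Per-attribute mode of a cluster, computed from observed values only."""
    mode = []
    for column in zip(*records):
        observed = [v for v in column if v is not MISSING]
        # If every value in the column is missing, the mode stays missing.
        mode.append(Counter(observed).most_common(1)[0][0] if observed else MISSING)
    return mode


def assign(records, modes):
    """Assign each record to the index of its nearest mode (ties go to the lowest index)."""
    return [
        min(range(len(modes)), key=lambda k: available_case_dissimilarity(r, modes[k]))
        for r in records
    ]
```

In a full K-modes loop, `assign` and `available_case_mode` would alternate until the assignments stop changing. This sketch shows only those two steps and makes no claim about how the thesis handles ties, initialization, or normalization by the number of observed attributes.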