MARC details
000 -LEADER |
fixed length control field |
02804nam a22003013a 4500 |
001 - CONTROL NUMBER |
control field |
UPMIN-00002323666 |
003 - CONTROL NUMBER IDENTIFIER |
control field |
UPMIN |
005 - DATE AND TIME OF LATEST TRANSACTION |
control field |
20230105143709.0 |
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION |
fixed length control field |
221014b |||||||| |||| 00| 0 eng d |
040 ## - CATALOGING SOURCE |
Original cataloging agency |
DLC |
Transcribing agency |
UPMin |
Modifying agency |
upmin |
041 ## - LANGUAGE CODE |
Language code of text/sound track or separate title |
eng |
090 #0 - LOCALLY ASSIGNED LC-TYPE CALL NUMBER (OCLC); LOCAL CALL NUMBER (RLIN) |
Classification number (OCLC) (R) ; Classification number, CALL (RLIN) (NR) |
LG993.5 2006 |
Local cutter number (OCLC) ; Book number/undivided call number, CALL (RLIN) |
C6 G33 |
100 ## - MAIN ENTRY--PERSONAL NAME |
Personal name |
Gabiana, Marie Lou Manalili. |
9 (RLIN) |
986 |
245 #2 - TITLE STATEMENT |
Title |
A modified K-modes algorithm for clustering categorical data sets with missing values using Bhattacharyya distance function / |
Statement of responsibility, etc. |
Marie Lou Manalili Gabiana. |
260 ## - PUBLICATION, DISTRIBUTION, ETC. |
Date of publication, distribution, etc. |
2006 |
300 ## - PHYSICAL DESCRIPTION |
Extent |
61 leaves. |
502 ## - DISSERTATION NOTE |
Dissertation note |
Thesis (BS Computer Science) -- University of the Philippines Mindanao, 2006 |
520 3# - SUMMARY, ETC. |
Summary, etc. |
Clustering can be defined as the process of organizing objects in a database into clusters (groups) such that objects within the same cluster have a high degree of similarity, while objects belonging to different clusters have a high degree of dissimilarity. This study clustered data sets using the K-modes algorithm. However, this algorithm is designed only for complete data sets and not for data sets that contain missing values. This led to a modification of the K-modes algorithm that incorporates the Bhattacharyya distance. There were two modifications: the first was available case analysis, which uses the information that remains available in the data set, while the second was adaptive imputation, which imputes missing data during the clustering stage. The performance of these modifications was compared with that of existing methods, namely attribute deletion, mode imputation, KNN imputation, and K-modes clustering using Chi-square distance. The two modifications produced good-quality clustering results compared with K-modes after attribute deletion and K-modes after mode imputation. These modifications were also competitive with K-modes after KNN imputation. The first modification using Bhattacharyya distance produced higher-quality results than the first modification using Chi-square distance. The second modification using Bhattacharyya distance, on the other hand, produced poorer-quality results than the second modification using Chi-square distance. However, the differences between the results of the second modification under the two distance functions were not large. The two modifications using Bhattacharyya distance were later used to cluster an actual incomplete data set to further verify their clustering performance. |
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Bhattacharyya distance. |
9 (RLIN) |
987 |
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Clustering. |
9 (RLIN) |
366 |
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
K-modes algorithm. |
9 (RLIN) |
988 |
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Categorical data. |
9 (RLIN) |
989 |
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Missing values. |
9 (RLIN) |
990 |
658 ## - INDEX TERM--CURRICULUM OBJECTIVE |
Main curriculum objective |
Undergraduate Thesis |
Curriculum code |
AMAT200, |
Source of term or code |
BSAM |
905 ## - LOCAL DATA ELEMENT E, LDE (RLIN) |
a |
Fi |
905 ## - LOCAL DATA ELEMENT E, LDE (RLIN) |
a |
UP |
942 ## - ADDED ENTRY ELEMENTS (KOHA) |
Source of classification or shelving scheme |
Library of Congress Classification |
Koha item type |
Thesis |
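
Illustrative note on the summary (field 520): the abstract names two ingredients, the Bhattacharyya distance and an available-case treatment of missing values, without giving formulas. The sketch below shows one common way these ideas could be expressed; it is not taken from the thesis. The standard definition D_B(p, q) = -ln(sum_i sqrt(p_i * q_i)) is used for the distance, and the function names, the `MISSING` marker, and the simple-matching dissimilarity are all assumptions made for illustration.

```python
import math
from collections import Counter

MISSING = None  # hypothetical marker for a missing attribute value


def bhattacharyya_distance(p, q):
    """Standard Bhattacharyya distance between two discrete distributions.

    p and q are dicts mapping category -> probability.
    Returns infinity when the distributions share no category.
    """
    categories = set(p) | set(q)
    coefficient = sum(math.sqrt(p.get(c, 0.0) * q.get(c, 0.0)) for c in categories)
    if coefficient == 0.0:
        return math.inf
    # Clamp at zero to absorb tiny negative values from rounding.
    return max(0.0, -math.log(coefficient))


def category_distribution(values):
    """Relative frequencies of the observed (non-missing) categories."""
    observed = [v for v in values if v is not MISSING]
    if not observed:
        return {}
    counts = Counter(observed)
    return {category: n / len(observed) for category, n in counts.items()}


def available_case_mismatch(record, mode):
    """Share of mismatching attributes, counting only attributes
    that are observed in the record (available case analysis)."""
    pairs = [(r, m) for r, m in zip(record, mode) if r is not MISSING]
    if not pairs:
        return 0.0  # assumption: a fully missing record carries no evidence
    return sum(r != m for r, m in pairs) / len(pairs)
```

For example, `available_case_mismatch(("a", None, "c"), ("a", "b", "d"))` ignores the missing second attribute and compares only the two observed ones. How the thesis actually combines the distance with the K-modes update step is not stated in this record.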