Modified K-mean clustering algorithm for fixed numeric and categorical data sets with missing values / (Record no. 2476)

MARC details
000 -LEADER
fixed length control field	02345nam a22002893a 4500
001 - CONTROL NUMBER
control field	UPMIN-00004810112
003 - CONTROL NUMBER IDENTIFIER
control field	UPMIN
005 - DATE AND TIME OF LATEST TRANSACTION
control field	20230201170028.0
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field	230201b |||||||| |||| 00| 0 eng d
040 ## - CATALOGING SOURCE
Original cataloging agency	DLC
Transcribing agency	UPMin
Modifying agency	upmin
041 ## - LANGUAGE CODE
Language code of text/sound track or separate title	eng
090 #0 - LOCALLY ASSIGNED LC-TYPE CALL NUMBER (OCLC); LOCAL CALL NUMBER (RLIN)
Classification number (OCLC) (R) ; Classification number, CALL (RLIN) (NR)	LG993.5 2010
Local cutter number (OCLC) ; Book number/undivided call number, CALL (RLIN)	A64 M34
100 ## - MAIN ENTRY--PERSONAL NAME
Personal name	Madarang, Jennelle Rizza M.
9 (RLIN)	2021
245 ## - TITLE STATEMENT
Title	Modified K-mean clustering algorithm for fixed numeric and categorical data sets with missing values /
Statement of responsibility, etc.	Jennelle Rizza M. Madarang
260 ## - PUBLICATION, DISTRIBUTION, ETC.
Date of publication, distribution, etc.	2010
300 ## - PHYSICAL DESCRIPTION
Extent	65 leaves.
502 ## - DISSERTATION NOTE
Dissertation note	Thesis (BS Applied Mathematics) -- University of the Philippines Mindanao, 2010
520 3# - SUMMARY, ETC.
Summary, etc.	Clustering is a data mining technique that organizes a given set of objects into groups, or clusters, such that objects within the same cluster are more similar to each other than to objects in other clusters. However, most clustering algorithms handle only complete data sets that are either numeric or categorical, but not mixed. Ahmad and Dey (2007) proposed an algorithm for clustering complete mixed data sets. To handle incomplete data sets, that is, data sets with missing values, the algorithm of Ahmad and Dey (2007) was modified. The modification combined two techniques for handling missing values: available case analysis, which uses the information that remains in the data set, and adaptive imputation, which imputes missing data during the clustering stage. The performance of the modified algorithm was tested on two data sets, one small and one large, and was compared with existing methods, namely case deletion, mean and mode imputation, and kNN imputation, using the Adjusted Rand Index. The modified algorithm produced clusters of fair quality on the small data set. It was competitive with K-mean after mean and mode imputation and K-mean after kNN imputation. However, the quality of the resulting clusters on the large data set was very poor for all methods. It seemed that the modified K-mean algorithm performed worse as the size of the data set increased.
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element	Clustering
9 (RLIN)	366
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element	K-mean algorithm
9 (RLIN)	2022
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element	Missing values
9 (RLIN)	990
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element	Mixed numeric and categorical data
9 (RLIN)	2023
658 ## - INDEX TERM--CURRICULUM OBJECTIVE
Main curriculum objective	Undergraduate Thesis
Curriculum code	AMAT200
905 ## - LOCAL DATA ELEMENT E, LDE (RLIN)
a	Fi
905 ## - LOCAL DATA ELEMENT E, LDE (RLIN)
a	UP
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme	Library of Congress Classification
Koha item type	Thesis
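
Illustrative note (not part of the catalogued record): the summary reports cluster quality using the Adjusted Rand Index. The sketch below shows one common way that index can be computed from two label assignments, using the standard pair-counting formula. The function name adjusted_rand_index and the toy labels are hypothetical and are not taken from the thesis; the thesis's own implementation is not described in this record.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting Adjusted Rand Index between two labelings.

    Hypothetical helper for illustration only; it is not the code
    used in the thesis.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        # Fewer than two objects: the partitions trivially agree.
        return 1.0

    # Contingency counts: cells, row totals, and column totals.
    cell_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of object pairs placed together in each grouping.
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2         # upper bound on agreement

    if max_index == expected:
        # Degenerate case, e.g. both labelings use a single cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy labels (assumed for illustration): cluster names differ, grouping is identical.
same_grouping = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Toy labels: every true cluster is split evenly across the predicted clusters.
crossed_grouping = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

A value of 1 indicates identical partitions, values near 0 indicate agreement no better than chance, and negative values indicate agreement worse than chance.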

Holdings
Withdrawn status	Lost status	Source of classification or shelving scheme	Damaged status	Status	Collection	Home library	Current library	Shelving location	Date acquired	Source of acquisition	Accession Number	Total Checkouts	Full call number	Barcode	Date last seen	Price effective from
		Library of Congress Classification		Not For Loan	Preservation Copy	University Library	University Library	Archives and Records	2010-07-06	donation	UAR-T-gd1575		LG993.5 2010 A64 M34	3UPML00033262	2022-10-05	2022-10-05
		Library of Congress Classification		Not For Loan	Room-Use Only	College of Science and Mathematics	University Library	Theses	2010-05-13	donation	CSM-T-gd2248		LG993.5 2010 A64 M34	3UPML00012582	2022-10-05