Clustering of datasets with missing values using principal feature analysis as a feature selection tool / (Record no. 2244)

MARC details
000 -LEADER
fixed length control field 02133nam a22003373a 4500
001 - CONTROL NUMBER
control field UPMIN-00003211650
003 - CONTROL NUMBER IDENTIFIER
control field UPMIN
005 - DATE AND TIME OF LATEST TRANSACTION
control field 20230208143956.0
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 230208b |||||||| |||| 00| 0 eng d
040 ## - CATALOGING SOURCE
Original cataloging agency DLC
Modifying agency upmin
Transcribing agency UPMin
041 ## - LANGUAGE CODE
Language code of text/sound track or separate title eng
090 #0 - LOCALLY ASSIGNED LC-TYPE CALL NUMBER (OCLC); LOCAL CALL NUMBER (RLIN)
Classification number (OCLC) (R) ; Classification number, CALL (RLIN) (NR) LG993.5 2008
Local cutter number (OCLC) ; Book number/undivided call number, CALL (RLIN) A64 P44
100 ## - MAIN ENTRY--PERSONAL NAME
Personal name Pelpinosas, Frank B.
9 (RLIN) 2176
245 ## - TITLE STATEMENT
Title Clustering of datasets with missing values using principal feature analysis as a feature selection tool /
Statement of responsibility, etc. Frank B. Pelpinosas.
260 ## - PUBLICATION, DISTRIBUTION, ETC.
Date of publication, distribution, etc. 2008
300 ## - PHYSICAL DESCRIPTION
Extent 51 leaves.
502 ## - DISSERTATION NOTE
Dissertation note Thesis (BS Applied Mathematics) -- University of the Philippines Mindanao, 2008
520 3# - SUMMARY, ETC.
Summary, etc. One of the most prevalent problems in clustering is the presence of redundant and irrelevant features, which can degrade and mislead the clustering results. Principal Feature Analysis (PFA) is used as a filter feature selection tool to reduce high-dimensional datasets to fewer dimensions while preserving the original structure of the data. The problem is worsened by the presence of missing values in the data. The study compares the clustering results of the complete (base) datasets with those of datasets imputed using K-NN and mean imputation across three levels of degradation. The features retained by PFA were used to cluster the samples, and the resulting clusters were assessed using the Adjusted Rand Index. Results showed that PFA did reduce the dimensions of the data. PFA also could hardly drop some features even when the level of degradation changed. Both feature retention and cluster recovery were negatively affected by the number of missing values in the data in all comparisons.
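The summary names two steps that are easy to illustrate: mean imputation of missing values and scoring cluster recovery with the Adjusted Rand Index. The following is a minimal sketch of those two steps only, not the thesis's own code. The function names `mean_impute` and `adjusted_rand_index`, the use of `None` to mark a missing value, and the toy inputs are all assumptions made for illustration. PFA and K-NN imputation are not shown.

```python
from collections import Counter
from math import comb


def mean_impute(rows):
    """Replace None entries with the mean of the observed values in that column.

    Assumption: missing values are marked with None (hypothetical convention).
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        if not observed:
            raise ValueError(f"column {j} has no observed values")
        means.append(sum(observed) / len(observed))
    return [[means[j] if row[j] is None else row[j] for j in range(n_cols)]
            for row in rows]


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same samples."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    # Pair counts from the contingency table and from its row/column totals.
    pairs_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = pairs_a * pairs_b / comb(n, 2) if n > 1 else 0.0
    maximum = (pairs_a + pairs_b) / 2
    if maximum == expected:
        # Degenerate case (e.g. a single cluster in both labelings).
        return 1.0
    return (pairs_both - expected) / (maximum - expected)
```

In a comparison like the one the summary describes, the labels from clustering the complete (base) dataset would be passed as one argument and the labels from clustering an imputed dataset as the other. A value of 1 means the two partitions are identical up to label names, and values near 0 mean agreement no better than chance.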
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element Clustering.
9 (RLIN) 366
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element Feature selections.
9 (RLIN) 2177
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element Missing values.
9 (RLIN) 990
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element PFA Principal feature analysis.
9 (RLIN) 2178
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element Datasets.
9 (RLIN) 1958
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element Adjusted Rand Index.
9 (RLIN) 2100
650 17 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element MCAR (Missing completely at random)
9 (RLIN) 2103
658 ## - INDEX TERM--CURRICULUM OBJECTIVE
Main curriculum objective Undergraduate Thesis
Curriculum code AMAT200,
Source of term or code BSAM
905 ## - LOCAL DATA ELEMENT E, LDE (RLIN)
a Fi
905 ## - LOCAL DATA ELEMENT E, LDE (RLIN)
a UP
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme Library of Congress Classification
Koha item type Thesis
Holdings
Withdrawn status | Lost status | Source of classification or shelving scheme | Damaged status | Status | Collection | Home library | Current library | Shelving location | Date acquired | Source of acquisition | Accession Number | Total Checkouts | Full call number | Barcode | Date last seen | Price effective from | Koha item type
 | | Library of Congress Classification | | Not For Loan | Preservation Copy | University Library | University Library | Archives and Records | 2009-10-01 | donation | UAR-T-gd1301 | | LG993.5 2008 A64 P44 | 3UPML00032898 | 2022-10-05 | 2022-10-05 | Thesis
 | | Library of Congress Classification | | Not For Loan | Room-Use Only | College of Science and Mathematics | University Library | Theses | 2008-12-10 | donation | CSM-T-gd2037 | | LG993.5 2008 A64 P44 | 3UPML00012278 | 2022-10-05 | 2022-10-05 | Thesis
 