
Modified principal feature analysis (MPFA) as a feature selection algorithm for clustering large data sets within missing values.

Material type: Text
Language: English
Publication details: 2010
Description: 63 leaves
Dissertation note: Thesis (BS Applied Mathematics) -- University of the Philippines Mindanao, 2010
Abstract: Clustering is a technique for placing objects into groups such that objects within the same group exhibit a high degree of similarity, while objects from different groups exhibit a high degree of disparity. Unfortunately, high-dimensional datasets often contain unimportant features that can adversely affect the performance of clustering algorithms. Feature selection has emerged as a reduction technique that retains only the important features of the data, and it is commonly applied in preparation for clustering. However, the use of clustering and feature selection is limited to complete datasets. This study modified Principal Feature Analysis so that it can handle missing values. Modified Principal Feature Analysis (MPFA) makes use of all the available information in the data. MPFA was compared with case deletion, mean imputation, and KNN imputation, which are common methods of handling missing values. In general, MPFA reduced the datasets to a very low percentage of retention, and the clustering results on the reduced datasets were of low quality. In comparison with the existing approaches, MPFA also exhibited the least satisfactory performance. This is attributed to an inappropriate use of correlation and an erroneous choice of datasets. The competing approaches were further applied to an actual incomplete dataset, and a similar ranking of performance was observed.
Holdings
Item type: Thesis | Current library: University Library, General Reference | Collection: Room-Use Only | Call number: LG993.5 2010 A64 A48 | Status: Not For Loan | Barcode: 3UPML00012572
Item type: Thesis | Current library: University Library, Archives and Records | Collection: Preservation Copy | Call number: LG993.5 2010 A64 A48 | Status: Not For Loan | Barcode: 3UPML00033354

College of Science and Mathematics


University of the Philippines Mindanao
The University Library, UP Mindanao, Mintal, Tugbok District, Davao City, Philippines