Exploratory analysis on DPCoA on small data sets with missing values using imputation methods / Kristine Karen C. Mahinay

Material type: Text
Language: English
Publication details: 2010
Description: 64 leaves
Dissertation note: Thesis (BS Applied Mathematics) -- University of the Philippines Mindanao, 2010
Abstract: A new ordination method, double principal coordinate analysis (DPCoA), allows comparison among several communities containing species that differ in taxonomic features. However, missing information in a data set is inevitable, and DPCoA has no internal method for handling missing values. As an introductory and exploratory analysis, the study determined how four commonly used imputation methods (mean imputation, k-nearest neighbor imputation, regression imputation, and expectation-maximization imputation), applied as a pre-processing step to DPCoA for incomplete abundance data sets, affect the quadratic entropy and DPCoA plots of the sites studied. Three levels of degradation (1%, 5%, and 15%) were also investigated to see how well these imputation approaches behave when a large amount of missing values is present in the data set. Rao DIVCs and DPCoA plots obtained from the complete data and from the imputed data sets were compared using Spearman rank correlation and Procrustes analysis, respectively. Results showed that the imputation methods yield high Spearman correlation coefficients and high correlations of Procrustes rotation when the proportion of missing values is relatively small, because their estimates are close to the real values that were lost. As the amount of missing values increases, the imputation methods, especially expectation-maximization imputation, could perform more weakly. Although expectation-maximization imputation is expected to yield good estimates, it produced a lower Spearman coefficient and correlation of Procrustes rotation. This would lead to a higher risk of misinterpreting the relationships among several communities. However, since this study was time-bound, it is suggested that it be repeated several times to further evaluate the performance of the imputation methods.
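The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's code: the abundance table, the species dissimilarity matrix, the random seed, and all function names are made up for illustration, only mean imputation is shown, and Rao's quadratic entropy is assumed to take the form Q = sum over i, j of p_i * p_j * d_ij. The Procrustes comparison of DPCoA plots is not covered.

```python
import random

def rao_quadratic_entropy(abundances, dissim):
    """Assumed form: Q = sum_i sum_j p_i * p_j * d_ij for one site."""
    total = sum(abundances)
    p = [a / total for a in abundances]
    n = len(p)
    return sum(p[i] * p[j] * dissim[i][j] for i in range(n) for j in range(n))

def degrade(table, fraction, rng):
    """Copy the table and set about `fraction` of its cells to None."""
    cells = [(r, c) for r in range(len(table)) for c in range(len(table[0]))]
    n_missing = max(1, round(fraction * len(cells)))
    missing = set(rng.sample(cells, n_missing))
    return [[None if (r, c) in missing else v for c, v in enumerate(row)]
            for r, row in enumerate(table)]

def mean_impute(table):
    """Fill each missing cell with the mean of the observed values in its column."""
    col_means = []
    for c in range(len(table[0])):
        observed = [row[c] for row in table if row[c] is not None]
        col_means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[col_means[c] if v is None else v for c, v in enumerate(row)]
            for row in table]

def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when one ranking is constant
    return cov / (sxx * syy) ** 0.5

# Hypothetical data: 5 sites (rows) by 4 species (columns).
complete = [
    [10, 2, 5, 1],
    [3, 8, 2, 6],
    [7, 7, 1, 2],
    [1, 4, 9, 3],
    [6, 1, 3, 8],
]
# Hypothetical symmetric dissimilarities between the 4 species.
dissim = [
    [0.0, 1.0, 2.0, 3.0],
    [1.0, 0.0, 1.5, 2.5],
    [2.0, 1.5, 0.0, 1.0],
    [3.0, 2.5, 1.0, 0.0],
]

rng = random.Random(0)
degraded = degrade(complete, 0.15, rng)   # 15% degradation level
imputed = mean_impute(degraded)

q_complete = [rao_quadratic_entropy(site, dissim) for site in complete]
q_imputed = [rao_quadratic_entropy(site, dissim) for site in imputed]
rho = spearman(q_complete, q_imputed)
print("Spearman correlation of site diversities:", rho)
```

A value of `rho` close to 1 would indicate that the imputed table ranks the sites by diversity in nearly the same order as the complete table. Repeating the degradation with different seeds and levels would give a rough picture of how stable that agreement is.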
List(s) this item appears in: BS Applied Mathematics
Holdings
Cover image Item type Current library Collection Call number Status Date due Barcode
University Library Theses Room-Use Only LG993.5 2010 A64 M35 Not For Loan 3UPML00012583
University Library Archives and Records Preservation Copy LG993.5 2010 A64 M35 Not For Loan 3UPML00033348

University of the Philippines Mindanao
The University Library, UP Mindanao, Mintal, Tugbok District, Davao City, Philippines
Email: library.upmindanao@up.edu.ph
Contact: (082)295-7025