TY - BOOK
AU - Labayan, Luchie Marie A.
TI - Hierarchical clustering for mixed dataset based on variance and entropy
PY - 2009///
KW - Aggregation
KW - Clustering
KW - Dissimilarity coefficients
KW - Distance hierarchy
KW - Entropy
KW - Variance
KW - Mixed data
KW - Hierarchical clustering
KW - Dissimilarity measures
KW - Datasets
KW - Undergraduate Thesis
KW - AMAT200
N1 - Thesis (BS Applied Mathematics) -- University of the Philippines Mindanao, 2009
N2 - Hsu's coefficient for mixed data was modified in terms of the aggregation technique and the entropy-based distance function to efficiently cluster mixed datasets. There are six proposed dissimilarity coefficients, which use variance for numerical attributes and entropy measures such as Shannon's weighted entropy, Havrda-Charvat's structural α-entropy, and Jensen-Shannon divergence for categorical attributes. The type of data has a significant effect on the clustering produced, together with the aggregation function used and the entropy measure employed. For data whose categorical values have no level of similarity, the proposed dissimilarity coefficients that are closely related and generated similar dendrograms are those which have the same aggregation function. Based on performance, the proposed dissimilarity coefficients that used De Carvallo's dissimilarity measure as the aggregation function produced better clustering solutions. On the other hand, for data whose categorical values have different degrees of similarity, only the proposed dissimilarity coefficients that used Shannon's entropy weighted by the distance in the distance hierarchy deviated from the group. The proposed dissimilarity coefficients that used De Carvallo's extension of Ichino and Yaguchi's dissimilarity as the aggregation function worked well in clustering. The six proposed dissimilarity coefficients performed better with mixed data than the existing dissimilarity measures for mixed data.
ER -