An Efficient Density-Based Improved K-Medoids Clustering Algorithm

Abstract

This CSE project is about clustering. In cluster analysis, samples are grouped by similar characteristics according to their principal component analysis (PCA) scores. Analysis of variance (ANOVA) is then used to compare dimensions in corresponding clusters obtained from PCA and from factor analysis (FA).

PCA is a procedure that transforms highly correlated variables into a smaller number of principal components. Cluster analysis is then used to divide the dimensions of the samples into three types.
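As a rough illustration of the PCA step, the sketch below extracts the first principal component of a small data set using power iteration on the sample covariance matrix. The function names and the sample measurements are assumptions made for this example only; they are not taken from the project, and a real analysis would normally rely on a tested numerical library.

```python
import math

def covariance_matrix(rows):
    """Return (sample covariance matrix, mean-centred rows)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centred

def first_principal_component(rows, iterations=200):
    """Approximate the leading eigenvector of the covariance matrix."""
    cov, centred = covariance_matrix(rows)
    d = len(cov)
    v = [1.0] * d  # arbitrary starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("data has no variance")
        v = [x / norm for x in w]
    # Variance captured along v, as a share of the total variance.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    # PCA score of each sample: projection of the centred row onto v.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centred]
    return v, eigenvalue / total_variance, scores

# Hypothetical, perfectly correlated measurements (second column = 2 * first).
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, explained_ratio, pca_scores = first_principal_component(samples)
```

Because the two made-up variables are perfectly correlated, a single component captures all of the variance, which is the situation in which reducing the number of variables loses nothing.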

K-means clustering seeks to minimize the distance of each sample from the centre of the cluster to which it belongs. Its disadvantage is that the number of clusters must be specified in advance, and efforts have been made to find an automatic strategy for determining that number. Here, the cluster analysis is based on the factor scores of the two key factors, the height factor and the girth factor. The samples are then grouped into three clusters. The first and third clusters overlap in the height dimensions.
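The k-means objective described above can be sketched with the standard assign-then-update iteration (Lloyd's algorithm). This is a minimal illustration, not the project's implementation: the function names, the starting centres, and the two-dimensional points (standing in for height and girth factor scores) are all invented for the example. Note that the number of clusters is fixed by how many starting centres are supplied, which is exactly the limitation mentioned in the text.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, centres, max_iterations=100):
    """Cluster points around the given starting centres (max_iterations >= 1)."""
    centres = [list(c) for c in centres]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centre.
        new_labels = [
            min(range(len(centres)),
                key=lambda k: squared_distance(p, centres[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centres are stable
        labels = new_labels
        # Update step: move each centre to the mean of its members.
        for k in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centres[k] = [sum(col) / len(members) for col in zip(*members)]
    # Objective value: total squared distance of samples to their centres.
    inertia = sum(squared_distance(p, centres[lab])
                  for p, lab in zip(points, labels))
    return labels, centres, inertia

# Hypothetical (height score, girth score) pairs forming three groups.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0),
          (9.0, 8.0), (15.0, 1.0), (16.0, 2.0)]
start = [(1.0, 1.0), (8.0, 8.0), (15.0, 1.0)]
labels, centres, inertia = k_means(points, start)
```

A k-medoids variant would differ in the update step: instead of the mean, each centre would be restricted to an actual sample from its cluster.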

The popular methods we use are principal component analysis (PCA) and factor analysis (FA). Both can reduce the dimensionality of the sample data, but they differ: PCA analyses all the variance present in the data set, while FA analyses only the common variance.

We conclude that cluster analysis plays a vital role: it divides samples into groups with homogeneous characteristics according to their PCA scores. ANOVA is then used to compare dimensions in corresponding clusters between PCA and FA.
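To make the ANOVA comparison concrete, here is a minimal sketch of the one-way ANOVA F statistic, computed as the ratio of between-group to within-group mean squares. The function name and the numbers are hypothetical; in the setting described above, each group might hold the values of one dimension for a cluster obtained from a given method, but that mapping is an assumption of this example.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the values around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Made-up values of one dimension in three groups.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_statistic = one_way_anova_f(groups)
```

A larger F value means the group means differ more relative to the spread inside each group; judging significance additionally requires the F distribution with the two degrees of freedom computed in the function.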
