Using cluster analysis, persons or objects can be categorized into groups whose members have similar measurements on a number of characteristics or attributes. Cluster analysis techniques form groups such that the similarity among the members of each group is maximized.
The similarity measures used in clustering are:
Distance measure
Correlation coefficient
Association coefficient
Distance measure
For the distance measure, the Euclidean distance between two points is used. Because a larger distance means less similarity, the distance measure is also referred to as a dissimilarity measure.
In two-dimensional space, if the coordinates of two points are $(x_1, y_1)$ and $(x_2, y_2)$, then the distance between the two points is
$$d_{12} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$
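The distance formula can be checked with a few lines of Python. This is only a minimal sketch: the function name euclidean_distance and the sample points are chosen here for illustration and do not come from the original text. The function accepts points of any equal dimension, which reduces to the two-dimensional formula when each point has two coordinates.

```python
import math


def euclidean_distance(p, q):
    """Return the Euclidean distance between two points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))


# Illustrative points (x1, y1) = (1, 2) and (x2, y2) = (4, 6).
d12 = euclidean_distance((1.0, 2.0), (4.0, 6.0))
print(d12)  # 5.0, since sqrt(3^2 + 4^2) = 5
```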
For a cluster of points, the representative point used to calculate the distance between the cluster and a point, or between the cluster and another cluster, is the centroid.
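The coordinates of the centroid are calculated by the formula:
$$X = \frac{1}{n} \sum_{i=1}^{n} x_i$$
where X = x coordinate of the centroid
$x_i$ = x coordinate of the i-th point in the cluster
n = number of points in the cluster
The y coordinate of the centroid is calculated in the same way from the y coordinates of the points.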
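As a small illustration, the centroid can be computed as the coordinate-wise mean of the points in a cluster. The function name centroid and the sample cluster below are assumptions made for this sketch, not part of the original text.

```python
def centroid(points):
    """Return the centroid (coordinate-wise mean) of a non-empty list of points."""
    if not points:
        raise ValueError("a cluster needs at least one point")
    n = len(points)
    # zip(*points) groups all x values together, then all y values, and so on.
    return tuple(sum(coords) / n for coords in zip(*points))


# Illustrative cluster of three points.
cluster = [(1.0, 1.0), (3.0, 1.0), (2.0, 4.0)]
print(centroid(cluster))  # (2.0, 2.0)
```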
Association coefficient
Where each object is described by a number of attributes, a value of zero is assigned to an attribute for an object if the attribute is not present in that object. If the attribute is present, a value of one is assigned. For each object, the total score is then calculated, and it is termed the association coefficient.
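The scoring described above can be sketched as follows. The attribute names, the two objects, and the function name association_score are all hypothetical and serve only to show the zero/one coding and the total score per object.

```python
def association_score(object_attributes, all_attributes):
    """Total score for one object: 1 per attribute present, 0 per attribute absent."""
    return sum(1 if attr in object_attributes else 0 for attr in all_attributes)


# Hypothetical attribute list and objects, used only for illustration.
attributes = ["wings", "feathers", "fur", "lays_eggs"]
objects = {
    "sparrow": {"wings", "feathers", "lays_eggs"},
    "bat": {"wings", "fur"},
}

for name, present in objects.items():
    # The 0/1 vector shows which attributes are present in the object.
    coding = [1 if attr in present else 0 for attr in attributes]
    print(name, coding, association_score(present, attributes))
```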
Clustering Techniques
Clustering techniques are classified into hierarchical clustering and non-hierarchical clustering.
In hierarchical clustering, there is a bottom-up approach termed the agglomerative method. In this method, all objects are initially treated as independent clusters, and clustering proceeds by reducing the number of clusters until one cluster containing all objects is formed. At some intermediate point in the process, the clusters formed will give the best grouping of the objects.
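The agglomerative process can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: it assumes Euclidean distance, uses the single linkage rule (distance between the closest pair of points) to decide which clusters to merge, and relies on math.dist, which requires Python 3.8 or later. The function names and sample points are chosen here for illustration.

```python
import math
from itertools import combinations


def single_link(a, b):
    """Smallest Euclidean distance between any point in a and any point in b."""
    return min(math.dist(p, q) for p in a for q in b)


def agglomerate(points):
    """Merge the two closest clusters repeatedly.

    Returns the list of clusterings seen along the way, from every point in
    its own cluster down to a single cluster holding all points.
    """
    clusters = [[p] for p in points]
    history = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        history.append([list(c) for c in clusters])
    return history


# Illustrative data: two visibly separate pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
for step in agglomerate(points):
    print(len(step), step)
```

Each entry in the returned history is one candidate clustering. Choosing which level to keep is a separate decision that this sketch does not make.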
In the top-down approach, termed the divisive method, all objects are included in a single cluster at the start, and the clustering process breaks them into a larger number of clusters.
In non-hierarchical techniques, some initial solution is used as the starting point for clustering. In these techniques, objects are allowed to change clusters at the various steps in the process.
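One familiar example of a non-hierarchical technique is k-means. The text does not name a specific method, so k-means is an assumption made here purely for illustration, as are the function names, the sample points, and the initial centroids. The sketch starts from an initial solution (the initial centroids) and reassigns objects to the nearest centroid until no object changes cluster. It uses math.dist, which requires Python 3.8 or later.

```python
import math


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def k_means(points, initial_centroids, max_iter=100):
    """Assign points to the nearest centroid, update centroids, and repeat."""
    centroids = list(initial_centroids)
    assignment = None
    for _ in range(max_iter):
        # Each point joins the cluster whose centroid is nearest.
        new_assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no object changed cluster, so the solution is stable
        assignment = new_assignment
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[k] = mean_point(members)
    return assignment, centroids


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
# A deliberately poor initial solution: both centroids sit in the same group.
assignment, centroids = k_means(points, [(0.0, 0.0), (0.0, 1.0)])
print(assignment)  # [0, 0, 1, 1]
print(centroids)   # [(0.0, 0.5), (5.0, 5.5)]
```

In this run the point (0.0, 1.0) starts in the second cluster and moves to the first one on the next pass, which is the kind of cluster change that hierarchical methods do not allow.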
Some of the popular hierarchical clustering techniques are:
Single linkage clustering method
Complete linkage clustering method
Average linkage clustering method
Ward's method
Centroid method
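These methods differ mainly in how the distance between two clusters is measured before the closest pair is merged. The sketch below uses the commonly given textbook definitions, which are not spelled out in the original text: single linkage takes the closest pair of points, complete linkage the farthest pair, average linkage the mean over all pairs, the centroid method the distance between centroids, and Ward's method the increase in the within-cluster sum of squared distances caused by the merge. The function names and sample clusters are illustrative, and math.dist requires Python 3.8 or later.

```python
import math


def _centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(c) / n for c in zip(*cluster))


def single_linkage(a, b):
    """Distance between the closest pair of points, one from each cluster."""
    return min(math.dist(p, q) for p in a for q in b)


def complete_linkage(a, b):
    """Distance between the farthest pair of points, one from each cluster."""
    return max(math.dist(p, q) for p in a for q in b)


def average_linkage(a, b):
    """Mean distance over all pairs of points, one from each cluster."""
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))


def centroid_distance(a, b):
    """Distance between the centroids of the two clusters."""
    return math.dist(_centroid(a), _centroid(b))


def ward_increase(a, b):
    """Increase in within-cluster sum of squares if the two clusters are merged."""
    weight = len(a) * len(b) / (len(a) + len(b))
    return weight * math.dist(_centroid(a), _centroid(b)) ** 2


# Illustrative clusters: two vertical pairs of points, 3 units apart.
a = [(0.0, 0.0), (0.0, 2.0)]
b = [(3.0, 0.0), (3.0, 2.0)]
for measure in (single_linkage, complete_linkage, average_linkage,
                centroid_distance, ward_increase):
    print(measure.__name__, measure(a, b))
```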
___________________________________________________________________________________________
Included in chapter data analysis
http://knol.google.com/k/narayana-rao-k-v-s-s/-/2utb2lsm2k7a/3531
Narayana Rao - 08 Dec 2010