Using cluster analysis, persons or objects can be categorized into groups whose members have similar measurements on a number of characteristics or attributes. Cluster analysis techniques form groups such that the similarity among the members of each group is maximized.

The similarity measures used in clustering are:

Distance measure

Correlation coefficient

Association coefficient

Distance measure

The distance measure used is the Euclidean distance between two points. Because a larger distance means the two points are less alike, the distance measure is also called a dissimilarity measure.

In two-dimensional space, if the coordinates of two points are (x1, y1) and (x2, y2), then the distance between the two points is

$$d_{12} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$
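The distance formula can be checked with a few lines of Python. This is a minimal sketch; the function name `euclidean_distance` and the sample points are illustrative assumptions, not part of the original text.

```python
import math


def euclidean_distance(p, q):
    """Return the Euclidean distance between two (x, y) points."""
    (x1, y1), (x2, y2) = p, q
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)


# Sample points chosen so the result is easy to verify by hand:
# sqrt(3^2 + 4^2) = sqrt(25)
d12 = euclidean_distance((1.0, 2.0), (4.0, 6.0))
print(d12)  # 5.0
```

A distance of zero means the two points coincide; larger values mean the points are less similar.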

When a cluster contains several points, its centroid serves as the representative point for calculating the distance between the cluster and a point or another cluster.

The coordinates of the centroid are calculated by the formula:

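$$X = \frac{1}{n} \sum_{i=1}^{n} x_i$$

where X = x coordinate of the centroid

x_i = x coordinate of the i-th point in the cluster

n = number of points in the cluster

The y coordinate of the centroid is calculated in the same way from the y values of the points.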
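The centroid calculation might be sketched as follows. The function name `centroid` and the sample cluster are assumptions made for illustration; the y coordinate is averaged the same way as the x coordinate.

```python
def centroid(points):
    """Return the (x, y) centroid of a non-empty list of (x, y) points."""
    n = len(points)
    if n == 0:
        raise ValueError("a cluster must contain at least one point")
    x_bar = sum(x for x, _ in points) / n  # mean of the x coordinates
    y_bar = sum(y for _, y in points) / n  # mean of the y coordinates
    return (x_bar, y_bar)


cluster = [(1.0, 1.0), (3.0, 1.0), (2.0, 4.0)]
print(centroid(cluster))  # (2.0, 2.0)
```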

Association coefficient

When each object is described by a number of attributes, an attribute is assigned a value of one if it is present in the object and a value of zero if it is not. The total score is then calculated for each object, and this total is termed the association coefficient.
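The scoring described above can be illustrated with a short sketch. The objects "A" and "B", their attribute vectors, and the function name `association_score` are hypothetical examples, not data from the original text.

```python
def association_score(attributes):
    """Sum a 0/1 attribute vector (1 = attribute present, 0 = absent)."""
    if any(a not in (0, 1) for a in attributes):
        raise ValueError("attributes must be coded as 0 or 1")
    return sum(attributes)


# Hypothetical objects, each coded on the same four attributes.
objects = {
    "A": [1, 0, 1, 1],
    "B": [0, 0, 1, 0],
}
scores = {name: association_score(values) for name, values in objects.items()}
print(scores)  # {'A': 3, 'B': 1}
```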

Clustering Techniques

Clustering techniques are classified as hierarchical or non-hierarchical.

Hierarchical clustering includes a bottom-up approach termed the agglomerative method. In this method, all objects are initially treated as independent clusters, and clustering proceeds by reducing the number of clusters until a single cluster containing all objects is formed. The most useful grouping is found at some intermediate stage of this process.
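The agglomerative process can be sketched in plain Python. This is a minimal illustration under stated assumptions: clusters are compared by the distance between their centroids (one of several possible merging rules), and the names `agglomerate` and `history` and the sample points are hypothetical.

```python
import math


def centroid(points):
    """Return the (x, y) centroid of a non-empty list of points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def agglomerate(points):
    """Repeatedly merge the two closest clusters; return every stage."""
    clusters = [[p] for p in points]  # each object starts as its own cluster
    history = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ci, cj = centroid(clusters[i]), centroid(clusters[j])
                d = math.hypot(ci[0] - cj[0], ci[1] - cj[1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        history.append([list(c) for c in clusters])
    return history


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
history = agglomerate(points)
for stage in history:
    print(len(stage), stage)
```

The run goes from four clusters down to one. Here the two-cluster stage separates the two visibly distinct groups of points, which is the kind of intermediate stage an analyst would pick as the final solution.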

In the top-down approach, termed the divisive method, all objects start in a single cluster, and the clustering process splits it into progressively more clusters.

In non-hierarchical techniques, clustering starts from an initial solution. In these techniques, objects are allowed to change clusters at various steps in the process.
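The text does not name a specific non-hierarchical algorithm. As one illustration, the sketch below uses a k-means-style procedure, which is an assumption on our part: it starts from initial cluster centers and lets objects switch clusters until the assignments stop changing. The function names and sample data are hypothetical.

```python
import math


def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [
        min(range(len(centers)),
            key=lambda k: math.hypot(p[0] - centers[k][0], p[1] - centers[k][1]))
        for p in points
    ]


def update(points, labels, centers):
    """Move each center to the centroid of its members (keep it if empty)."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if members:
            n = len(members)
            new_centers.append((sum(x for x, _ in members) / n,
                                sum(y for _, y in members) / n))
        else:
            new_centers.append(old)
    return new_centers


def partition(points, initial_centers, max_iter=100):
    """Reassign objects until no object changes cluster."""
    centers = list(initial_centers)
    labels = assign(points, centers)
    for _ in range(max_iter):
        centers = update(points, labels, centers)
        new_labels = assign(points, centers)
        if new_labels == labels:  # stable: nothing moved
            break
        labels = new_labels
    return labels, centers


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
# A deliberately poor initial solution: both centers sit in the same group.
labels, centers = partition(points, [(0.0, 0.0), (0.0, 1.0)])
print(labels)   # [0, 0, 1, 1]
print(centers)  # [(0.0, 0.5), (5.0, 5.5)]
```

In this run the point (0.0, 1.0) starts in cluster 1 and moves to cluster 0 after the first update, which shows how an object can change clusters during the process.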

Some of the popular hierarchical clustering techniques are:

Single linkage clustering method

Complete linkage clustering method

Average linkage clustering method

Ward's method

Centroid method
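The methods listed above differ in how they measure the distance between two clusters when deciding which pair to merge. The sketch below uses the standard textbook definitions, which the original text does not spell out: the smallest pairwise distance (single linkage), the largest (complete linkage), the mean of all pairwise distances (average linkage), the distance between centroids (centroid method), and the increase in within-cluster sum of squares caused by a merge (Ward's method). Function names and sample clusters are illustrative assumptions.

```python
import math
from itertools import product


def dist(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def single_linkage(a, b):
    """Distance between the closest pair of points, one from each cluster."""
    return min(dist(p, q) for p, q in product(a, b))


def complete_linkage(a, b):
    """Distance between the farthest pair of points, one from each cluster."""
    return max(dist(p, q) for p, q in product(a, b))


def average_linkage(a, b):
    """Mean of all pairwise distances between the two clusters."""
    return sum(dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))


def centroid_linkage(a, b):
    """Distance between the two cluster centroids."""
    return dist(centroid(a), centroid(b))


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if a and b are merged."""
    na, nb = len(a), len(b)
    return na * nb / (na + nb) * dist(centroid(a), centroid(b)) ** 2


a = [(0.0, 0.0), (0.0, 2.0)]
b = [(3.0, 0.0), (3.0, 4.0)]
print(single_linkage(a, b))    # 3.0
print(complete_linkage(a, b))  # 5.0
print(average_linkage(a, b))
print(centroid_linkage(a, b))
print(ward_cost(a, b))
```

At each agglomerative step, the pair of clusters with the smallest value under the chosen rule is merged.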

___________________________________________________________________________________________


## Included in chapter data analysis

http://knol.google.com/k/narayana-rao-k-v-s-s/-/2utb2lsm2k7a/3531

Narayana Rao - 08 Dec 2010