Saturday, May 5, 2012

Cluster Analysis


 
 
Using cluster analysis, persons or objects can be categorized into groups whose members have similar measurements on a number of characteristics or attributes. Cluster analysis techniques form groups such that the similarity among the members of each group is maximized.
 
The similarity measures used in clustering are:
 
Distance measure
Correlation coefficient
Association coefficient
 
Distance measure
 
For the distance measure, the Euclidean distance between two points is used. Because a larger distance indicates less similarity, the distance measure is also referred to as a dissimilarity measure.
 
In two dimensional space, if the coordinates of two points are (x1,y1) and (x2, y2), then the distance between two points is
 
d12 = SQRT[ (x2-x1)^2 + (y2-y1)^2 ]
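The distance calculation can be sketched in a few lines of Python. This is only an illustration: the function name euclidean_distance and the sample points are assumptions made for the example, not part of the original text. The function accepts points of any dimension, and the two-dimensional case matches the formula above.

```python
import math


def euclidean_distance(p, q):
    """Return the Euclidean distance between two points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))


# Hypothetical points (x1, y1) = (1, 2) and (x2, y2) = (4, 6).
d12 = euclidean_distance((1.0, 2.0), (4.0, 6.0))
print(d12)  # 5.0, since SQRT[3^2 + 4^2] = 5
```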
 
In the case of a cluster of points, the representative point of the cluster for calculating the distance between a cluster and a point or another cluster is the centroid.
 
The coordinates of the centroid are calculated by the formula:
 
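X = (Sum of xi, i = 1 to n) / n
 
where X = x coordinate of the centroid
xi = x coordinate of the i-th point in the cluster
n = number of points in the cluster
 
The y coordinate of the centroid is calculated in the same way from the yi values.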
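As a small illustration, the centroid of a cluster can be computed by averaging each coordinate over the points in the cluster. The function name centroid and the sample cluster below are assumptions made for this sketch.

```python
def centroid(points):
    """Return the centroid (coordinate-wise mean) of a non-empty list of points."""
    if not points:
        raise ValueError("a cluster must contain at least one point")
    n = len(points)
    dims = len(points[0])
    # Average the k-th coordinate over all n points, for every dimension k.
    return tuple(sum(p[k] for p in points) / n for k in range(dims))


# Hypothetical cluster of three points in two dimensions.
cluster = [(1.0, 1.0), (3.0, 1.0), (2.0, 4.0)]
print(centroid(cluster))  # (2.0, 2.0)
```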
 
Association coefficient
 
When each object is described by a number of attributes, an attribute is assigned a value of one for an object if it is present in that object and a value of zero if it is absent. The total score is then found for each object, and it is termed the association coefficient.
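The scoring described above can be sketched as follows. The object names, the attribute vectors and the function name association_score are all hypothetical and serve only to illustrate the 0/1 coding and the total score.

```python
def association_score(attributes):
    """Return the total of the 0/1 attribute indicators for one object."""
    if any(value not in (0, 1) for value in attributes):
        raise ValueError("each attribute must be coded as 0 (absent) or 1 (present)")
    return sum(attributes)


# Hypothetical objects, each coded on the same four attributes.
objects = {
    "A": [1, 0, 1, 1],
    "B": [0, 0, 1, 0],
}
scores = {name: association_score(attrs) for name, attrs in objects.items()}
print(scores)  # {'A': 3, 'B': 1}
```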
 
Clustering Techniques
 
Clustering techniques are classified into hierarchical and non-hierarchical techniques.
 
In hierarchical clustering, the bottom-up approach is termed the agglomerative method. In this method, all objects are initially treated as independent clusters, and clustering proceeds by reducing the number of clusters step by step until a single cluster containing all objects is formed. The most useful grouping is found at one of the intermediate stages of this process.
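A minimal sketch of the agglomerative process is given below. It assumes the single linkage rule (distance between the closest pair of points) for deciding which two clusters to merge at each step. That choice, the function names and the sample points are assumptions for illustration only. The function keeps the clusters obtained at every stage, so that any intermediate grouping can be inspected. It uses math.dist, which needs Python 3.8 or later.

```python
import math
from itertools import combinations


def single_link(a, b):
    """Distance between clusters a and b: the closest pair of points."""
    return min(math.dist(p, q) for p in a for q in b)


def agglomerate(points):
    """Merge the two closest clusters repeatedly; return the clusters at every stage."""
    clusters = [[p] for p in points]  # every object starts as its own cluster
    history = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest single-link distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        history.append([list(c) for c in clusters])
    return history


# Hypothetical data: two tight pairs and one isolated point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
history = agglomerate(points)
print([len(stage) for stage in history])  # [5, 4, 3, 2, 1]
```

In this example the three-cluster stage, history[2], separates the two tight pairs from the isolated point.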
 
In the top-down approach, termed the divisive method, all objects start in a single cluster and the clustering process successively splits them into a larger number of clusters.
 
In non-hierarchical techniques, clustering starts from some initial solution. In these techniques, objects are allowed to change clusters at the various steps of the process.
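The text does not name a specific non-hierarchical technique. As one common example, the sketch below uses k-means, in which the initial solution is a set of starting centroids and each object moves to the cluster whose centroid is nearest. The function name, the data and the starting centroids are assumptions for illustration. It uses math.dist, which needs Python 3.8 or later.

```python
import math


def kmeans(points, initial_centroids, max_iter=100):
    """Basic k-means: reassign points and update centroids until nothing changes."""
    centroids = [tuple(c) for c in initial_centroids]
    assignment = None
    for _ in range(max_iter):
        # Assign every point to the index of its nearest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no object changed cluster, so the solution is stable
        assignment = new_assignment
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment, centroids


# Hypothetical data with a deliberately poor initial solution.
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
assignment, centroids = kmeans(points, initial_centroids=[(0, 0), (1, 0)])
print(assignment)  # [0, 0, 0, 1, 1, 1]
```

With these starting centroids the point (1, 0) is first placed in the second cluster and moves to the first cluster in the next pass, which shows an object changing clusters during the process.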
 
Some of the popular hierarchical clustering techniques are:
 
Single linkage clustering method
Complete linkage clustering method
Average linkage clustering method
Ward's method
Centroid method
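The methods listed above differ mainly in how the distance between two clusters is defined. The sketch below shows the usual textbook definitions under the assumption of Euclidean distance. The function names and sample clusters are hypothetical. For Ward's method, the value returned is the increase in the within-cluster sum of squares that a merge would cause. It uses math.dist, which needs Python 3.8 or later.

```python
import math


def _centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))


def cluster_distance(a, b, method="single"):
    """Distance between clusters a and b under the chosen linkage rule."""
    pairwise = [math.dist(p, q) for p in a for q in b]
    if method == "single":    # closest pair of points
        return min(pairwise)
    if method == "complete":  # farthest pair of points
        return max(pairwise)
    if method == "average":   # mean over all pairs of points
        return sum(pairwise) / len(pairwise)
    between = math.dist(_centroid(a), _centroid(b))
    if method == "centroid":  # distance between the two centroids
        return between
    if method == "ward":      # increase in within-cluster sum of squares on merging
        return len(a) * len(b) / (len(a) + len(b)) * between ** 2
    raise ValueError(f"unknown method: {method}")


# Hypothetical clusters of two points each.
a = [(0, 0), (0, 2)]
b = [(3, 0), (3, 2)]
for method in ("single", "complete", "average", "centroid", "ward"):
    print(method, round(cluster_distance(a, b, method), 3))
```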
 
 
 
 
 
 
 
 

Comments

Included in chapter data analysis

http://knol.google.com/k/narayana-rao-k-v-s-s/-/2utb2lsm2k7a/3531

Narayana Rao - 08 Dec 2010
