A. M. Fahim and G. Saake and A. M. Salem and F. A. Torkey and M. A. Ramadan DCBOR A Density Clustering Based on Outlier Removal
2917 - 2922
2008
2
9
International Journal of Computer, Electrical, Automation, Control and Information Engineering http://waset.org/publications/10072
http://waset.org/publications/21
World Academy of Science, Engineering and Technology
Data clustering is an important data exploration technique
with many applications in data mining. We present an enhanced
version of the well known single link clustering algorithm. We will
refer to this algorithm as DCBOR. The proposed algorithm alleviates
the chain effect by removing the outliers from the given dataset.
So this algorithm provides outlier detection and data clustering
simultaneously. This algorithm does not need to update the distance
matrix, since the algorithm depends on merging the most knearest
objects in one step and the cluster continues grow as long as possible
under specified condition. So the algorithm consists of two phases;
at the first phase, it removes the outliers from the input dataset. At
the second phase, it performs the clustering process. This algorithm
discovers clusters of different shapes, sizes, densities and requires
only one input parameter; this parameter represents a threshold for
outlier points. The value of the input parameter is ranging from 0 to
1. The algorithm supports the user in determining an appropriate
value for it. We have tested this algorithm on different datasets
contain outlier and connecting clusters by chain of density points,
and the algorithm discovers the correct clusters. The results of
our experiments demonstrate the effectiveness and the efficiency of
DCBOR.
International Science Index 21, 2008