Improved K-means algorithm: automatic acquisition of initial clustering center
- Category: Information technologies, systems analysis and administration
- Last Updated on 23 June 2016
- Published on 23 June 2016
Authors:
Guangbin Sun, China University of Petroleum, Beijing, China
Hongqi Li, China University of Petroleum, Beijing, China
Haiying Huang, Daqing Oilfield Engineering Co., Ltd., Daqing, Heilongjiang, China
Abstract:
Purpose. The traditional K-means algorithm requires the number of clusters K to be specified in advance, and it is sensitive to the choice of initial clustering centers: different initial centers often lead to different clustering results. To address these shortcomings, the article proposes a method for obtaining the clustering centers based on density and the max-min distance method. Selection of the clustering centers and classification of the objects can be carried out simultaneously.
Methodology. Noise was eliminated according to the densities of the objects, and the densest object was selected as the first clustering center. The max-min distance method was then used to search for the remaining cluster centers; at the same time, the cluster to which each object belongs was determined.
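The steps in the methodology can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not define density or the noise criterion, so the neighbor-count density, the radius `eps`, the cutoff `min_neighbors`, the function name `density_maxmin_seed`, and the requirement `theta > 0` are all assumptions. The stopping rule is the classical max-min distance rule, in which a candidate becomes a new center only if its distance to the nearest existing center exceeds `theta` times the distance between the first two centers.

```python
import math

def density_maxmin_seed(points, eps, min_neighbors, theta):
    """Sketch: density-based first center, then max-min distance selection.

    eps, min_neighbors: hypothetical density parameters (not given in the abstract).
    theta: threshold fraction of the distance between the first two centers.
    Returns (centers, labels); labels[i] is -1 for points treated as noise.
    """
    if theta <= 0:
        raise ValueError("theta must be positive (assumed range)")
    n = len(points)
    # Assumed density definition: number of other points within radius eps.
    density = [
        sum(1 for j in range(n) if j != i and math.dist(points[i], points[j]) <= eps)
        for i in range(n)
    ]
    # Points with too few neighbors are treated as noise and ignored.
    kept = [i for i in range(n) if density[i] >= min_neighbors]
    labels = [-1] * n
    if not kept:
        return [], labels
    # First center: the densest remaining point.
    first = max(kept, key=lambda i: density[i])
    centers = [points[first]]
    nearest = {i: math.dist(points[i], points[first]) for i in kept}
    for i in kept:
        labels[i] = 0
    ref = None  # distance between the first two centers
    while True:
        # Candidate: the point farthest from its nearest center.
        cand = max(kept, key=lambda i: nearest[i])
        if ref is None:
            ref = nearest[cand]
            if ref == 0:  # all kept points coincide
                break
        elif nearest[cand] <= theta * ref:
            break
        centers.append(points[cand])
        new_label = len(centers) - 1
        # Reassign points that are closer to the new center, so that
        # center selection and classification happen in the same pass.
        for i in kept:
            d = math.dist(points[i], points[cand])
            if d < nearest[i]:
                nearest[i] = d
                labels[i] = new_label
    return centers, labels

# Toy data (made up for illustration): three blobs and one outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (0, 10), (0, 11), (1, 10), (1, 11),
          (50, 50)]
centers, labels = density_maxmin_seed(points, eps=1.5, min_neighbors=2, theta=0.5)
print(centers)
print(labels)
```

On this toy data each blob yields one center and the outlier is labeled -1. The resulting centers could then be passed to a standard K-means run as its initial centers.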
Findings. Clustering results depend on the choice of the parameter θ. If the sample distribution is unknown, θ can only be chosen by trial, optimizing over multiple tests. With prior knowledge to guide the selection of θ, the algorithm converges quickly. Therefore, θ should be optimized.
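The dependence on θ can be illustrated with a small trial sweep. The sketch below assumes the classical max-min distance stopping rule (a candidate becomes a center only if its distance to the nearest existing center exceeds θ times the distance between the first two centers) and assumes the first center is already given; the function name `maxmin_center_count`, the toy data, and the θ values are hypothetical and not taken from the article.

```python
import math

def maxmin_center_count(points, first, theta):
    """Count the centers found by max-min distance selection from a given first center."""
    if theta <= 0:
        raise ValueError("theta must be positive (assumed range)")
    # nearest[i] = distance from points[i] to its closest chosen center.
    nearest = [math.dist(p, first) for p in points]
    count, ref = 1, None
    while True:
        cand = max(range(len(points)), key=nearest.__getitem__)
        if ref is None:
            ref = nearest[cand]  # distance between the first two centers
            if ref == 0:
                break
        elif nearest[cand] <= theta * ref:
            break
        count += 1
        nearest = [min(d, math.dist(p, points[cand])) for p, d in zip(points, nearest)]
    return count

# Toy data (made up for illustration): three well-separated blobs.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (0, 10), (0, 11), (1, 10), (1, 11)]

# Trial sweep: a smaller theta admits more centers, a larger theta fewer.
for theta in (0.05, 0.3, 0.5, 0.9):
    print(theta, maxmin_center_count(points, points[0], theta))
```

In this sketch a large θ merges distinct groups and a very small θ splits them, which is why, without prior knowledge of the sample distribution, several values have to be tried.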
Originality. The article proposes a new density-based method for obtaining the first initial clustering center, followed by a new method based on the maximum and minimum distance for obtaining the remaining centers. Experimental analysis shows that the improved algorithm ensures higher and more stable accuracy.
Practical value. The experiments showed that the algorithm obtains the k clustering centers automatically and achieves higher clustering accuracy when processing unknown datasets.