
A Grid-Based Two-Pass K-means Clustering Algorithm (cited by: 4)

Two times K-means algorithm based on grid
Abstract: The classical K-means algorithm is a widely used clustering method, but it is sensitive to the choice of initial cluster centers and is easily affected by noise and outliers. This paper therefore proposes a grid-based two-pass K-means clustering algorithm. The algorithm first partitions the data space into equal-sized grid cells and identifies the dense cells according to a given density threshold. It then runs a first round of clustering on the points in the dense cells and uses the resulting cluster means as the initial means for the second round of clustering. This removes the influence of noise and outliers while preserving the completeness of the information. Experiments show that the algorithm is effective.
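The steps in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `cell_size` and `density_threshold` parameters, the use of a per-cell point count as the density measure, Euclidean distance, and random seeding of the first pass are all choices made here, since the abstract does not specify them. The second pass is run over all points, which is one reading of "preserving the completeness of the information".

```python
import math
import random
from collections import defaultdict


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd-style K-means starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def dense_cell_points(points, cell_size, density_threshold):
    """Return the points lying in grid cells that hold at least
    density_threshold points (assumed density measure: point count)."""
    cells = defaultdict(list)
    for p in points:
        key = tuple(math.floor(x / cell_size) for x in p)
        cells[key].append(p)
    return [p for pts in cells.values()
            if len(pts) >= density_threshold for p in pts]


def two_pass_kmeans(points, k, cell_size, density_threshold, seed=0):
    """Hypothetical two-pass scheme: cluster dense-cell points first,
    then reuse those means to seed K-means on all points."""
    dense = dense_cell_points(points, cell_size, density_threshold)
    if len(dense) < k:
        raise ValueError("not enough points in dense cells for k clusters")
    # First pass: random initial centers drawn from dense cells only
    # (an assumption; the abstract does not say how they are chosen).
    rng = random.Random(seed)
    first_centers = kmeans(dense, rng.sample(dense, k))
    # Second pass: all points, seeded with the first-pass means.
    return kmeans(points, first_centers)
```

Calling `two_pass_kmeans(points, k=2, cell_size=1.0, density_threshold=3)` on a list of coordinate tuples returns the final cluster means. In this sketch, points in sparse cells never influence the initial centers, but they still take part in the second pass and can shift the final means.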
Authors: OUYANG Hao (欧阳浩), CHEN Bo (陈波), WANG Meng (王萌), HUANG Zheng-jin (黄镇谨) (Department of Computer Engineering, Guangxi University of Technology, Liuzhou 545006, China)
Source: Journal of Guangxi University of Technology (《广西工学院学报》), CAS, 2012, No. 1, pp. 24-27, 33 (5 pages)
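Funding: Supported by the Guangxi Science and Technology Key Research Program (桂科攻0992006-13) and the Guangxi University of Technology Doctoral Fund (院科博11Z05)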
Keywords: data mining; clustering; K-means; grid
About the author: OUYANG Hao, lecturer, M.S.; research interests: data mining and artificial intelligence. E-mail: ouyanghao@tom.com
