A K-Means Algorithm Based on Feature Weighting
Authors: 徐艳, 付学良, 李宏慧, 董改芳, 王晴
Affiliation: College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia
Journal: Computer Science and Application, 2018, Vol.8 (8), pp.1164-1171
Chinese journal title: 计算机科学与应用, 2018, Vol.8 (8), pp.1164-1171
Source database: Hans Pubs Journal
DOI: 10.12677/CSA.2018.88128
English abstract: Cluster analysis is a statistical technique that divides research objects into relatively homogeneous groups; its core task is to find useful clusters of objects. The K-means clustering algorithm has long attracted attention from scholars because of its excellent speed and good scalability. However, the traditional K-means algorithm does not consider that attributes differ in their influence on the final clustering result, which affects clustering accuracy to some extent. To address this problem, this paper proposes an improved feature-weighting algorithm. The improved algorithm uses information entropy and the ReliefF feature selection algorithm to weight the features and correct the distance function between clustering objects, so that the...
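Chinese abstract: Cluster analysis is a statistical analysis technique that divides research objects into relatively homogeneous groups; its core is to discover useful clusters of objects. The K-means clustering algorithm has long attracted wide attention from scholars because of its excellent speed and good scalability. However, the traditional K-means algorithm does not account for differences in how much each attribute influences the final clustering result, which affects clustering accuracy to some extent. To address this problem, this paper proposes an improved feature-weighting algorithm. The improved algorithm uses information entropy and the ReliefF feature selection algorithm to weight and select features and to correct the distance function between clustering objects, so that the algorithm achieves more accurate and more efficient clustering. Simulation experiments show that, compared with the traditional K-means algorithm, the improved algorithm produces stable clustering results with clearly improved accuracy.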
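Illustrative sketch: the abstract's idea of weighting features and correcting the distance function might look roughly like the Python below. This is not the paper's implementation. The abstract does not give the weighting formula, so the sketch assumes the standard entropy weight method, omits the ReliefF step entirely, and uses hypothetical function names (`entropy_weights`, `weighted_distance`, `weighted_kmeans`). It also assumes at least two samples and numeric features.

```python
import math
import random


def entropy_weights(data):
    """Assumed entropy weight method: one non-negative weight per feature, summing to 1."""
    n, d = len(data), len(data[0])
    divergences = []
    for j in range(d):
        column = [row[j] for row in data]
        lo, hi = min(column), max(column)
        span = (hi - lo) or 1.0
        # Min-max scale, then shift slightly so every proportion is positive.
        scaled = [(v - lo) / span + 1e-9 for v in column]
        total = sum(scaled)
        probs = [v / total for v in scaled]
        # Normalized Shannon entropy of the column, in [0, 1].
        entropy = -sum(p * math.log(p) for p in probs) / math.log(n)
        divergences.append(max(0.0, 1.0 - entropy))
    total_div = sum(divergences)
    if total_div <= 0.0:
        # No feature carries information: fall back to equal weights.
        return [1.0 / d] * d
    return [g / total_div for g in divergences]


def weighted_distance(a, b, weights):
    """Euclidean distance with each squared difference scaled by its feature weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def weighted_kmeans(data, k, weights, iterations=100, seed=0):
    """Plain Lloyd-style K-means that uses the weighted distance for assignment."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(data, k)]
    labels = [0] * len(data)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda c: weighted_distance(p, centers[c], weights))
            for p in data
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep the old center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers
```

A call such as `weighted_kmeans(data, 2, entropy_weights(data))` clusters `data` with distances scaled per feature. Replacing `entropy_weights` with a function that returns equal weights recovers ordinary K-means, which makes the two easy to compare on the same data.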