A Topic-Oriented Clustering Approach for Domain Services

Li Zheng (李征); Wang Jian (王健); Zhang Neng (张能); Li Zhao (李昭); He Chengwan (何成万); He Keqing (何克清)

Abstract

With the development of SOA and SaaS technologies, the number of services available on the Internet is growing rapidly. Given such abundant and heterogeneous services, discovering the services a user wants efficiently and accurately has become a key issue in service-oriented software engineering. Service clustering is an important technique for facilitating service discovery. However, existing clustering approaches handle only a single type of service document and do not consider the domain characteristics of services. To address these limitations, and building on a domain-oriented classification of services, this paper proposes a probabilistic service clustering model that incorporates domain characteristics, the domain service clustering model (DSCM), and then proposes a topic-oriented clustering approach for domain services based on this model. The approach can cluster services described in WSDL, OWL-S, and plain text, and so is not restricted to a single service document type. Experiments on real services from ProgrammableWeb validate the approach. The results show that it clusters different types of service documents accurately and that it achieves better cluster purity and F-measure than classical latent Dirichlet allocation (LDA) and K-means, thereby better supporting on-demand service discovery and composition.
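For readers unfamiliar with the purity metric mentioned in the abstract, the following is a minimal sketch of how cluster purity is commonly computed: each cluster is credited with its most frequent ground-truth label, and the credited counts are divided by the total number of items. The function name `cluster_purity` and the toy labels are illustrative assumptions; the abstract does not give the exact formulation used in the paper's experiments.

```python
from collections import Counter


def cluster_purity(cluster_ids, true_labels):
    """Return the purity of a clustering (standard definition, assumed here).

    cluster_ids[i] is the cluster assigned to item i;
    true_labels[i] is the reference (e.g. domain) label of item i.
    """
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have equal length")
    if not cluster_ids:
        return 0.0

    # Count how often each reference label occurs inside each cluster.
    label_counts = {}
    for cluster, label in zip(cluster_ids, true_labels):
        label_counts.setdefault(cluster, Counter())[label] += 1

    # Each cluster contributes the size of its majority label.
    majority_total = sum(max(c.values()) for c in label_counts.values())
    return majority_total / len(cluster_ids)


# Hypothetical toy data: six services, two clusters, two reference labels.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["travel", "travel", "finance", "finance", "finance", "travel"]
purity = cluster_purity(clusters, labels)
```

In this toy case each cluster has two items with its majority label, so four of the six items are credited. Purity alone rewards many small clusters, which is one reason it is usually reported together with a measure such as F-measure.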
