Abstract:
A clustering validity index indicates whether a clustering result is good enough. Most current indexes are based on statistical theory or fuzzy theory and, limited by these underlying theories, can give incorrect indications in some special cases. This paper presents a new clustering validity index based on modal logic. The clustering is described by a Kripke structure in which similarity is defined as a binary relation on the data set. Each cluster is represented by a propositional sentence, so the clustering result can be expressed by logical formulas. Following the minimum description length principle, the index is built from the veracity and the complexity of this representation. Because the new index imposes no additional restrictions on the similarity measure used for clustering, it is more general than current indexes, which usually assume a default similarity measure. Experiments comparing the new index with common indexes show that it is consistent with them in normal cases and more effective in some special cases, such as the two-rings data set.
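
To make the Kripke-structure view concrete, the following is a minimal sketch, not the index proposed in the paper. It assumes that data points act as possible worlds, that the similarity relation is obtained by thresholding Euclidean distance with a hypothetical parameter `eps`, and that a cluster is the truth set of one proposition. The helper names `similarity_relation`, `box`, and `diamond` are illustrative only.

```python
from itertools import combinations


def similarity_relation(points, eps):
    """Assumed relation: i R j iff Euclidean distance <= eps (reflexive, symmetric)."""
    rel = {i: {i} for i in range(len(points))}
    for i, j in combinations(range(len(points)), 2):
        dist_sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
        if dist_sq <= eps * eps:
            rel[i].add(j)
            rel[j].add(i)
    return rel


def box(rel, truth_set):
    """Worlds where the proposition holds at every R-successor (necessity)."""
    return {w for w, succ in rel.items() if succ <= truth_set}


def diamond(rel, truth_set):
    """Worlds where the proposition holds at some R-successor (possibility)."""
    return {w for w, succ in rel.items() if succ & truth_set}


# Toy data: a chain of three nearby points and a separate pair.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0)]
rel = similarity_relation(points, eps=0.15)

good_cluster = {0, 1, 2}   # closed under the similarity relation
bad_cluster = {0, 1}       # cuts the chain between points 1 and 2

print(box(rel, good_cluster), diamond(rel, good_cluster))
print(box(rel, bad_cluster), diamond(rel, bad_cluster))
```

In this toy setting, a cluster whose necessity and possibility sets both equal the cluster itself is closed under the similarity relation, while a gap between the two sets marks points whose similar neighbours were assigned elsewhere. How such formulas are scored for veracity and complexity under the minimum description length principle is left to the paper itself.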