Abstract:
This paper proposes a novel direct clustering algorithm based on generalized information distance (GID). First, drawing on information theory, the basic concept of a measure of diversity is introduced and an inequality concerning it is proved. Based on this inequality, the concept of increment of diversity is discussed and defined. Second, through an analysis of distance measures, two new concepts, generalized information distance (GID) and improved generalized information distance (IGID), are proposed, and a new direct clustering algorithm based on GID and IGID is designed. Finally, the algorithm is applied to soil fertility data and compared with the hierarchical clustering algorithm (HCA). The simulation results show that the proposed algorithm is feasible and effective. Because of its simplicity and robustness, it provides a new research approach for pattern recognition theory.