Abstract:
Content-based image retrieval (CBIR) has been a focal point of multimedia technology since the 1990s, and automatic image annotation is one of its important but highly challenging problems. Image annotation is treated as an image classification task in which each class label corresponds to a distinct keyword. In the training data set, keywords are usually associated with whole images rather than with individual regions, which poses a major challenge for any learning strategy. A new procedure for learning the correspondence between image regions and keywords under the Multiple-Instance Learning (MIL) framework is presented: the Heuristic Support Vector Machine-based MIL algorithm (HSVM-MIL). It extends the conventional Support Vector Machine (SVM) to the MIL setting by introducing alternative generalizations of the maximum margin used in SVM classification. The resulting learning problem is a hard mixed-integer program, which is solved iteratively by heuristic optimization: in each iteration, HSVM-MIL tries to change the class label of only one instance so as to minimize the classification risk. Because it classifies individual image regions, the algorithm can directly estimate the correspondence between image regions and keywords, which most MIL algorithms cannot do. Finally, HSVM-MIL is evaluated on both image annotation data sets and the benchmark MUSK data sets, where it achieves high classification accuracy compared with other MIL methods.
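The one-label-per-iteration idea can be illustrated with a small, self-contained sketch. This is not the authors' HSVM-MIL implementation: the function names, the toy bags, the plain subgradient-descent linear SVM, and the restriction to flipping labels from +1 to -1 are all assumptions made for illustration. The sketch starts with every instance inheriting its bag's label, then repeatedly flips the single instance label that most reduces the regularized hinge risk of a retrained SVM, while keeping at least one positive instance in each positive bag.

```python
# Hypothetical sketch of a single-label-flip heuristic for SVM-style MIL.
# All names and the training routine are illustrative assumptions.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularized mean hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # margin violated: hinge term is active
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def risk(X, y, lam=0.01):
    """Regularized hinge risk of an SVM retrained on the candidate labels."""
    w, b = train_linear_svm(X, y, lam)
    hinge = sum(max(0.0, 1 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y))
    return 0.5 * lam * dot(w, w) + hinge / len(X)

def flip_one_label_per_iteration(bags, bag_labels):
    """Greedy search over instance labels, changing one label per iteration."""
    X = [x for bag in bags for x in bag]
    bag_of = [i for i, bag in enumerate(bags) for _ in bag]
    y = [bag_labels[i] for i in bag_of]  # instances inherit their bag label
    best = risk(X, y)
    while True:
        best_k = None
        for k in range(len(y)):
            if y[k] != 1:
                continue  # only positive labels are candidates for a flip
            positives_in_bag = sum(
                1 for j in range(len(y)) if bag_of[j] == bag_of[k] and y[j] == 1
            )
            if positives_in_bag < 2:
                continue  # a positive bag must keep at least one positive instance
            trial = y[:k] + [-1] + y[k + 1:]
            r = risk(X, trial)
            if r < best:
                best, best_k = r, k
        if best_k is None:
            return y, best  # no single flip lowers the risk any further
        y[best_k] = -1

# Toy data: each positive bag holds one "object" region and one background region.
bags = [
    [(2.0, 2.0), (-2.0, -2.0)],
    [(2.5, 1.5), (-1.5, -2.5)],
    [(-2.0, -1.5), (-2.5, -2.0)],
    [(-1.5, -2.0), (-2.2, -2.2)],
]
bag_labels = [1, 1, -1, -1]
labels, final_risk = flip_one_label_per_iteration(bags, bag_labels)
print(labels, final_risk)
```

Because the final labels are assigned per instance, they can be read directly as a region-to-keyword correspondence, which is the property the abstract highlights. Retraining for every candidate flip is affordable only on toy data; a practical variant would need a cheaper way to score candidates.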