Abstract:
A cross-media retrieval approach is proposed to solve the problem of measuring cross-media correlation between different modalities, such as image and audio data. First, both intra-media and cross-media correlations among multimodal datasets are explored. Intra-media correlation measures the similarity between multimedia data of the same modality, while cross-media correlation measures the semantic similarity between two multimedia objects of different modalities. Cross-media correlation is difficult to measure because of the heterogeneity of low-level features: for example, images are represented by visual feature vectors, whereas audio clips are represented by auditory feature vectors. Intra-media correlation is calculated from geodesic distances, and cross-media correlation is estimated from link information among Web pages. Both kinds of correlation are then formalized in a cross-media correlation graph. Based on this graph, cross-media retrieval is performed by ranking candidates according to the weight of the shortest path from the query object. A relevance feedback technique is developed to update the knowledge of multimodal correlations by learning from user behavior and to progressively enhance retrieval performance. This approach breaks through the modality limitation of the retrieval process and is applicable to query-by-example and cross-modal multimedia applications. Experimental results on an image-audio dataset are encouraging and show that the approach is effective.
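
To make the graph-based retrieval step more concrete, the following Python sketch illustrates one way the idea could be realized: intra-media edges approximate geodesic distances through a k-nearest-neighbour graph within each modality, cross-media edges stand in for page-link co-occurrence between images and audio clips, and retrieval ranks objects of the other modality by accumulated shortest-path weight from the query. All identifiers, the toy feature vectors, and the edge weights are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch of correlation-graph retrieval, assuming toy feature vectors
# and made-up cross-media link weights; not the paper's actual implementation.
import heapq
import math
from collections import defaultdict


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_geodesic_edges(features, k=2):
    """Approximate intra-media geodesic structure with a k-NN graph:
    each object is linked to its k nearest neighbours in feature space,
    so path lengths through the graph approximate geodesic distances."""
    edges = []
    ids = list(features)
    for i in ids:
        dists = sorted((euclidean(features[i], features[j]), j)
                       for j in ids if j != i)
        for d, j in dists[:k]:
            edges.append((i, j, d))
    return edges


class CorrelationGraph:
    """Undirected weighted graph holding both intra- and cross-media edges."""

    def __init__(self):
        self.adj = defaultdict(dict)

    def add_edge(self, u, v, w):
        # Keep the smallest weight if an edge is added twice.
        self.adj[u][v] = min(w, self.adj[u].get(v, float("inf")))
        self.adj[v][u] = self.adj[u][v]

    def shortest_path_lengths(self, source):
        """Dijkstra from the query object; smaller accumulated weight
        is interpreted as stronger estimated correlation."""
        dist = {source: 0.0}
        heap = [(0.0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue
            for v, w in self.adj[u].items():
                nd = d + w
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        return dist


if __name__ == "__main__":
    # Toy visual / auditory feature vectors (hypothetical values).
    images = {"img1": [0.10, 0.20], "img2": [0.15, 0.22], "img3": [0.90, 0.80]}
    audios = {"aud1": [0.50, 0.10, 0.30], "aud2": [0.45, 0.12, 0.28]}

    g = CorrelationGraph()
    # Intra-media edges from k-NN (geodesic-style) distances, per modality.
    for feats in (images, audios):
        for u, v, w in knn_geodesic_edges(feats, k=1):
            g.add_edge(u, v, w)
    # Cross-media edges, e.g. image/audio pairs linked on the same Web page;
    # the weights here are invented purely for illustration.
    g.add_edge("img1", "aud1", 0.2)
    g.add_edge("img3", "aud2", 0.6)

    # Query by example: rank audio clips for an image query by path weight.
    dists = g.shortest_path_lengths("img1")
    ranking = sorted((d, n) for n, d in dists.items() if n in audios)
    print(ranking)
```

In this sketch, relevance feedback could be layered on top by decreasing the weight of edges along paths that lead to results the user marks as relevant, which mirrors the progressive refinement described above.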