Abstract:
Fast audio retrieval is demanding because of the high dimensionality and ever-growing volume of audio on the Internet. Although audio fingerprinting greatly reduces dimensionality while keeping audio identifiable, audio fingerprints are still too high-dimensional to scale to big audio data, so the number of audio files that must be checked has to be kept small. This paper proposes a robust and fast audio retrieval method for big audio data that combines audio fingerprinting with a filtering-and-refining method. An audio middle fingerprint of considerably smaller dimension is devised for quickly filtering the most likely audio files by applying the bag-of-features (BoF) technique to the classical Philips audio fingerprint; it reduces the search scope and gives a 130-fold speed gain over Fibonacci hashing retrieval. A matching algorithm is developed that reduces computational complexity by comparing samples taken at a fixed interval from two audio files against thresholds, which yields a maximal speed gain of 140 times. Experimental results show that the average time to retrieve audio clips of different lengths from about 100,000 audio files is less than 1 s. After MP3 conversion, resampling, and random shearing, the recall rates are all above 99.47%, and the theoretical accuracy is close to 100%.
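The filtering-and-refining idea can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this paper: all function names and parameter values are hypothetical, the BoF codebook is replaced by a simple stand-in (the top bits of each 32-bit sub-fingerprint serve as the codeword), and the query is assumed to be aligned with, and as long as, each database fingerprint.

```python
CODEWORD_BITS = 6  # assumed codebook size of 2**6 codewords


def middle_fingerprint(subfps, bits=CODEWORD_BITS):
    """Bag-of-features histogram over 32-bit sub-fingerprints.

    Stand-in codebook (an assumption): the codeword of a sub-fingerprint
    is its top `bits` bits, not a learned cluster centre.
    """
    hist = [0] * (1 << bits)
    for fp in subfps:
        hist[fp >> (32 - bits)] += 1
    return hist


def histogram_distance(h1, h2):
    """L1 distance between two codeword histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def filter_candidates(query_hist, db_hists, keep):
    """Filtering step: keep the `keep` entries with the closest histograms."""
    ranked = sorted(
        db_hists, key=lambda name: histogram_distance(query_hist, db_hists[name])
    )
    return ranked[:keep]


def sampled_bit_error_rate(fp_a, fp_b, step):
    """Refining step: bit error rate over every `step`-th sub-fingerprint."""
    idx = range(0, min(len(fp_a), len(fp_b)), step)
    if not idx:
        return 1.0  # nothing to compare, treat as a mismatch
    errors = sum(bin(fp_a[i] ^ fp_b[i]).count("1") for i in idx)
    return errors / (32 * len(idx))


def retrieve(query, database, keep=5, step=4, ber_threshold=0.35):
    """Return the best-matching name, or None if no candidate passes.

    `keep`, `step` and `ber_threshold` are illustrative values only.
    The histograms are rebuilt here for brevity; a real system would
    precompute them once for the whole database.
    """
    db_hists = {name: middle_fingerprint(fp) for name, fp in database.items()}
    best_name, best_ber = None, ber_threshold
    for name in filter_candidates(middle_fingerprint(query), db_hists, keep):
        ber = sampled_bit_error_rate(query, database[name], step)
        if ber < best_ber:
            best_name, best_ber = name, ber
    return best_name
```

Calling `retrieve(query, database)` with `database` given as a dictionary that maps names to lists of 32-bit integers first discards all but `keep` entries using the low-dimensional histograms and then verifies only those entries on a sparse sample of sub-fingerprints, so the costly bitwise comparison touches a small fraction of the data.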