Abstract:
In almost all existing audio authentication strategies, the smallest authentication entity is a segment of fixed duration, typically a frame whose size is kept the same in both the protection stage and the verification stage. Such fixed-length segmentation raises at least two problems. First, if the audio to be authenticated has undergone synchronization manipulations such as time-scale modification (TSM) or random cropping, the displacement between the frame sequences of the tested audio and its original may lead to unreliable authentication results. Second, authentication based on fixed-length frames may be inconvenient for users, since in most cases a frame does not cover a complete semantic unit of the audio. In this paper, a novel music content authentication scheme based on note segmentation and fuzzy classification is proposed. We break the restriction of the fixed-length segmentation used in previous audio authentication methods and partition each track into a sequence of musical notes, which correspond to the minimal meaningful semantic units of the music. Feature extraction and robust hashing are performed on each note-based segment, thereby addressing the synchronization problem that remains a major challenge for most existing audio authentication algorithms. In the verification stage, three newly defined measures that characterize statistical and temporal properties of the distortion between the original and the suspect music are computed and combined with fuzzy classification to make the final authentication decision. Moreover, tampered regions are located for unauthentic music signals. Experimental results show that the proposed method achieves high discrimination between acceptable manipulations and malicious ones, and at the same time delivers satisfactory tamper localization performance.
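As a rough illustration of the note-based pipeline summarized above, the following Python sketch segments a signal at energy-based onsets, computes a simple binary hash per note, and compares two recordings note by note via Hamming distance. The function names, the energy-ratio onset detector, and the band-energy hash are hypothetical simplifications chosen for brevity; they are not the paper's actual feature extraction, hash design, or fuzzy classification procedure.

```python
import numpy as np

def segment_notes(signal, frame_len=1024, hop=512, thresh=2.0):
    """Illustrative note segmentation (assumption): cut the signal
    wherever the short-time energy rises sharply, as a stand-in for
    a proper musical onset detector."""
    n_frames = max(1, 1 + (len(signal) - frame_len) // hop)
    energy = np.array([np.sum(signal[i * hop:i * hop + frame_len] ** 2)
                       for i in range(n_frames)])
    rise = energy[1:] / (energy[:-1] + 1e-12)          # frame-to-frame energy ratio
    onset_frames = np.where(rise > thresh)[0] + 1       # assumed onset positions
    bounds = [0] + [int(f * hop) for f in onset_frames] + [len(signal)]
    return [signal[bounds[i]:bounds[i + 1]] for i in range(len(bounds) - 1)]

def note_hash(note, n_bits=32):
    """Illustrative robust hash (assumption): threshold band energies
    of the note's spectrum against their median to get a binary code."""
    spectrum = np.abs(np.fft.rfft(note))
    band_energy = np.array([b.sum() for b in np.array_split(spectrum, n_bits)])
    return (band_energy > np.median(band_energy)).astype(np.uint8)

def note_distances(reference, suspect):
    """Per-note Hamming distances between the hash sequences of the
    reference and the suspect recording; large values would flag
    candidate tampered notes in this simplified setting."""
    ref_hashes = [note_hash(n) for n in segment_notes(reference)]
    sus_hashes = [note_hash(n) for n in segment_notes(suspect)]
    n = min(len(ref_hashes), len(sus_hashes))
    return np.array([int(np.count_nonzero(ref_hashes[i] != sus_hashes[i]))
                     for i in range(n)])
```

In this simplified view, the per-note distance sequence plays the role of the distortion measurements that the paper feeds into its fuzzy classification stage; the actual measures and decision rule are defined in the body of the paper.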