基于音乐内容分析的音频认证算法

汪竹蓉; 李伟; 朱碧磊; 李晓强

Audio Authentication Based on Music Content Analysis

Abstract

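Abstract (Chinese version): A novel music content authentication method based on note segmentation and fuzzy classification is proposed. The algorithm breaks with the fixed-length segmentation used in conventional audio authentication: the music signal is partitioned into a sequence of variable-length note segments, each carrying complete semantic information, which serve as the basic units of authentication. Combined with dynamic time warping (DTW) alignment, this effectively addresses the fragility to synchronization distortion that affects most existing algorithms. A robust hash value based on the semitone pitch-class (Chroma) feature is computed for each note segment. Based on the statistical and temporal distribution of the hash differences between the original music and the music to be authenticated, three newly defined measures are fuzzy-classified to obtain the final authentication result. For music signals that fail authentication, the system can also detect the tampered regions. Experimental results show that the algorithm effectively distinguishes admissible operations from malicious tampering and achieves high accuracy in tamper localization.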
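The DTW alignment step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each note segment is already summarized by one feature vector (for example a 12-bin chroma vector) and uses a plain Euclidean local cost with the standard three-step recursion. The function name `dtw_align` and these choices are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def dtw_align(a, b):
    """Align two sequences of per-note feature vectors with classic DTW.

    a, b: array-likes of shape (n, d) and (m, d).
    Returns (total_cost, path), where path is a list of (i, j) index pairs.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # Local cost: Euclidean distance between every pair of note features.
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    # Accumulated cost with an infinite border so paths start at (0, 0).
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return float(acc[n, m]), path

# Toy example: the second sequence repeats the first note once,
# as might happen when a note is split by the segmenter.
original = [[1, 0], [0, 1], [1, 1]]
tested = [[1, 0], [1, 0], [0, 1], [1, 1]]
cost, path = dtw_align(original, tested)
print(cost, path)
```

Because the path may map one note to several, a displaced or duplicated segment in the tested signal does not shift every later comparison, which is the synchronization property the abstract relies on.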

Abstract: In almost all existing audio authentication strategies, the smallest authentication entity is a segment of fixed duration, typically a frame whose size is the same in both the protection stage and the verification stage. Such fixed-length segmentation causes at least two problems. First, if the audio data to be authenticated has undergone synchronization manipulations such as time-scale modification (TSM) or random cropping, the displacement between the frame sequences of the tested audio and its original may make the authentication result unreliable. Second, authentication based on fixed-length frames may not be friendly to users, since in most cases a frame does not cover a complete semantic entity of the audio data. In this paper, a novel music content authentication scheme based on note segmentation and fuzzy classification is proposed. We break the restriction of the fixed-length segmentation used in previous audio authentication methods and partition each track into a sequence of musical notes, which correspond to the minimum meaningful semantic elements of the music. Feature extraction and robust hashing are performed on each note-based segment, thus addressing the synchronization problem that remains a major challenge for most existing audio authentication algorithms. In the verification stage, three newly defined measures that characterize the statistical and temporal properties of the distortion between the original music and the music under test are combined with fuzzy classification to make the final authentication decision. Moreover, tampered regions are located for music signals that fail authentication. Experimental results show that the proposed method distinguishes well between acceptable manipulations and malicious ones, and at the same time achieves satisfactory performance in tamper localization.
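As a rough illustration of per-note robust hashing and tamper localization, the sketch below derives a 12-bit hash from a chroma vector and flags note segments whose bit error rate against the original exceeds a threshold. The hash rule (compare each chroma bin with its cyclic neighbour), the function names, and the fixed threshold of 0.25 are all assumptions made for this example. The paper's three measures and its fuzzy classifier are not specified in the abstract and are not reproduced here.

```python
import numpy as np

def chroma_hash(chroma):
    """Hypothetical 12-bit hash: bit k is 1 if bin k exceeds bin k+1 (cyclic)."""
    c = np.asarray(chroma, dtype=float)
    return (c > np.roll(c, -1)).astype(np.uint8)

def bit_error_rate(h1, h2):
    """Fraction of hash bits that differ between two equal-length hashes."""
    return float(np.mean(np.asarray(h1) != np.asarray(h2)))

def flag_tampered(orig_hashes, test_hashes, threshold=0.25):
    """Return indices of aligned note segments whose BER exceeds the threshold.

    The hard threshold is a simplification that stands in for the
    fuzzy classification described in the abstract.
    """
    return [
        k
        for k, (h1, h2) in enumerate(zip(orig_hashes, test_hashes))
        if bit_error_rate(h1, h2) > threshold
    ]

# Toy example: three notes, the middle one replaced by different content.
note = np.arange(12, 0, -1)          # descending chroma energy
altered = np.arange(1, 13)           # ascending chroma energy
orig = [chroma_hash(note)] * 3
test = [chroma_hash(0.5 * note), chroma_hash(altered), chroma_hash(note)]
print(flag_tampered(orig, test))
```

Because the hash depends only on the ordering of neighbouring chroma bins, a uniform gain change leaves it unchanged, while replacing a note's pitch content flips most bits.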
