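• China's Outstanding Sci-Tech Journal
  • CCF-recommended Class A Chinese-language journal
  • T1-class high-quality sci-tech journal in the computing field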
Shi Wenhua, Ni Yongjing, Zhang Xiongwei, Zou Xia, Sun Meng, Min Gang. Deep Neural Network Based Monaural Speech Enhancement with Sparse Non-Negative Matrix Factorization[J]. Journal of Computer Research and Development, 2018, 55(11): 2430-2438. DOI: 10.7544/issn1000-1239.2018.20170580

Deep Neural Network Based Monaural Speech Enhancement with Sparse Non-Negative Matrix Factorization

  • Published Date: October 31, 2018
  • This paper proposes a monaural speech enhancement method that combines a deep neural network (DNN) with sparse non-negative matrix factorization (SNMF). The method exploits the sparsity of speech signals in the time-frequency (T-F) domain and the ability of DNNs to preserve spectral structure in speech enhancement. It aims to reduce the distortion that conventional non-negative matrix factorization (NMF) methods introduce at low signal-to-noise ratios (SNRs) and for unvoiced components, which lack structural characteristics. First, the magnitude spectrogram of the noisy speech is decomposed by NMF with a sparsity constraint to obtain the coding coefficients for the speech and noise dictionaries, which are pre-trained independently. Wiener filtering is then used to obtain the separated speech and noise. Finally, a DNN models the non-linear function that maps the log-magnitude spectrum of the Wiener-filtered speech to that of the target clean speech. The method is evaluated on the IEEE dataset with both stationary and non-stationary noise types. The experimental results show that the proposed method effectively suppresses noise while preserving the speech component of the corrupted signal, and that it outperforms the baseline methods in terms of perceptual quality and log-spectral distortion.
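The separation stage described in the abstract (sparse NMF with fixed, pre-trained dictionaries followed by Wiener filtering) can be illustrated with a minimal NumPy sketch. The abstract does not state the cost function, the update rule, or the exact mask, so everything specific here is an assumption: a Euclidean cost with an L1 penalty, multiplicative updates, and a power-domain Wiener-style mask. The function names and the `sparsity` and `n_iter` parameters are hypothetical. The DNN stage is not implemented; `log_magnitude` only shows the kind of feature it would take as input.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero and log(0)


def sparse_nmf_activations(V, W, sparsity=0.1, n_iter=200, seed=0):
    """Estimate non-negative activations H so that V ~= W @ H, with W fixed.

    Assumed objective: 0.5 * ||V - W H||_F^2 + sparsity * sum(H),
    minimised with multiplicative updates (an illustrative choice).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    WtV = W.T @ V  # constant across iterations because W is fixed
    WtW = W.T @ W
    for _ in range(n_iter):
        # The L1 penalty appears as an additive term in the denominator.
        H *= WtV / (WtW @ H + sparsity + EPS)
    return H


def wiener_separate(V, W_speech, W_noise, sparsity=0.1, n_iter=200):
    """Split a noisy magnitude spectrogram V into speech and noise estimates."""
    W = np.hstack([W_speech, W_noise])  # concatenated pre-trained dictionaries
    H = sparse_nmf_activations(V, W, sparsity, n_iter)
    k = W_speech.shape[1]
    S_hat = W_speech @ H[:k]  # speech part of the NMF reconstruction
    N_hat = W_noise @ H[k:]   # noise part of the NMF reconstruction
    # Wiener-style soft mask on power spectra (assumed form).
    mask = S_hat**2 / (S_hat**2 + N_hat**2 + EPS)
    return mask * V, (1.0 - mask) * V, mask


def log_magnitude(X):
    """Log-magnitude feature of the kind a DNN mapping stage would consume."""
    return np.log(X + EPS)
```

In this sketch `V` has shape (frequency bins, frames), and each dictionary has shape (frequency bins, atoms). The separated speech estimate would then be converted with `log_magnitude` and passed to a trained network that predicts the clean log-magnitude spectrum.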
