Zhao Bo, Huang Shujian, Dai Xinyu, Yuan Chunfeng, Huang Yihua. Parallel Algorithm for Hierarchical Phrase Machine Translation Based on Distributed Memory Storage[J]. Journal of Computer Research and Development, 2014, 51(12): 2724-2732. DOI: 10.7544/issn1000-1239.2014.20131335

Parallel Algorithm for Hierarchical Phrase Machine Translation Based on Distributed Memory Storage

  • Published Date: November 30, 2014
  • In recent years, massive corpora have been widely used to train language and translation models in order to improve the accuracy of statistical machine translation (SMT) systems. As these models grow, computing performance becomes a challenge: existing single-machine translation algorithms and systems struggle to complete decoding in time, especially for online translation. To overcome the limitations of single-machine decoding and move large-scale SMT toward a practical online translation system, this paper proposes a distributed, parallel translation decoding algorithm and framework that applies distributed storage and parallel querying to both the language model and the translation model. We develop a parallel hierarchical phrase-based decoder that uses a distributed in-memory database to store and query large-scale translation and language model tables. To further speed up parallel decoding, we make three additional optimizations: 1) transform the synchronous rules in the translation model table and the trie data structure of the language model table into a hash-indexed key-value structure suitable for the distributed in-memory database; 2) modify the cube-pruning algorithm to make it suitable for batch queries; 3) adopt and optimize batch queries against the language model and translation model tables to reduce network overhead. The implemented algorithm achieves fast SMT decoding with models trained on large-scale corpora and provides good scalability. Experimental results show that, compared with the single-machine decoder, the parallel decoder reaches a speedup of 2.7 times for single-sentence translation and 11.7 times for batch translation jobs.
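
The first and third optimizations in the abstract (flattening a trie into a hash-indexed key-value table, then querying it in batches) can be illustrated with a minimal Python sketch. The abstract does not give the key layout or name the database, so everything here is an assumption for illustration: the nested-dict trie, the space-joined n-gram keys, and the `InMemoryKV` class, which merely stands in for a distributed in-memory store and counts round trips.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical trie node layout: {"prob": float or None, "children": {word: node}}
TrieNode = dict


def flatten_trie(node: TrieNode, prefix: Tuple[str, ...] = ()) -> Dict[str, float]:
    """Flatten an n-gram trie into a flat {"w1 w2 ... wn": log_prob} table."""
    flat: Dict[str, float] = {}
    if prefix and node.get("prob") is not None:
        flat[" ".join(prefix)] = node["prob"]
    for word, child in node.get("children", {}).items():
        flat.update(flatten_trie(child, prefix + (word,)))
    return flat


class InMemoryKV:
    """Local stand-in for a remote key-value store (an assumption, not the paper's system)."""

    def __init__(self, data: Dict[str, float]) -> None:
        self._data = dict(data)
        self.round_trips = 0  # each mget call models one network round trip

    def mget(self, keys: List[str]) -> List[Optional[float]]:
        self.round_trips += 1
        return [self._data.get(key) for key in keys]


def batch_score(store: InMemoryKV,
                ngrams: Sequence[Tuple[str, ...]]) -> List[Optional[float]]:
    """Look up many n-grams with a single batched request instead of one per n-gram."""
    keys = sorted({" ".join(ngram) for ngram in ngrams})  # deduplicate before sending
    table = dict(zip(keys, store.mget(keys)))
    return [table[" ".join(ngram)] for ngram in ngrams]


# Toy language model with three entries: "the", "the cat", "cat".
trie = {"children": {
    "the": {"prob": -1.0, "children": {"cat": {"prob": -0.5, "children": {}}}},
    "cat": {"prob": -2.0, "children": {}},
}}
store = InMemoryKV(flatten_trie(trie))
scores = batch_score(store, [("the",), ("the", "cat"), ("dog",), ("the",)])
print(scores, store.round_trips)
```

With flat keys, each n-gram lookup becomes a single hash probe rather than a pointer walk through trie levels, and collecting the keys first lets one request replace many, which is the kind of network overhead the batch query optimization targets. A real decoder would also need backoff handling for missing n-grams, which this sketch leaves out.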