Zhu Jizhao, Jia Yantao, Xu Jun, Qiao Jianzhong, Wang Yuanzhuo, Cheng Xueqi. SparkCRF: A Parallel Implementation of CRFs Algorithm with Spark[J]. Journal of Computer Research and Development, 2016, 53(8): 1819-1828. DOI: 10.7544/issn1000-1239.2016.20160197

SparkCRF: A Parallel Implementation of CRFs Algorithm with Spark

  • Published Date: July 31, 2016
  • Conditional random fields (CRFs) have been successfully applied to various text analysis tasks in natural language processing, such as sequence labeling, Chinese word segmentation, named entity recognition, and relation extraction. Traditional single-node CRF tools face many challenges when dealing with large-scale texts. On the one hand, a personal computer hits a performance bottleneck; on the other hand, a single server cannot complete the analysis efficiently, and upgrading the server's hardware to increase computing capability is not always feasible due to cost constraints. To tackle these problems, following the idea of “divide and conquer”, we design and implement SparkCRF, a distributed CRF implementation that runs in a cluster environment on Apache Spark. We perform three experiments on the NLPCC2015 and the 2nd International Chinese Word Segmentation Bakeoff datasets to evaluate SparkCRF in terms of performance, scalability, and accuracy. The results show that: 1) compared with CRF++, SparkCRF runs almost 4 times faster on our cluster in the sequence labeling task; 2) it scales well as the number of working cores is adjusted; 3) SparkCRF achieves accuracy comparable to state-of-the-art CRF tools such as CRF++ in text analysis tasks.
