• China Excellent Sci-Tech Journal
  • CCF-Recommended Class A Chinese Journal
  • T1-Class High-Quality Sci-Tech Journal in the Computing Field
Citation: Ma Anxiang, Zhang Bin, Gao Kening, Qi Peng, and Zhang Yin. Deep Web Data Extraction Based on Result Pattern[J]. Journal of Computer Research and Development, 2009, 46(2): 280-288.

Deep Web Data Extraction Based on Result Pattern

More Information
  • Published Date: February 14, 2009
  • With the rapid development of the World Wide Web, improving the efficiency and precision of Deep Web data extraction has become increasingly important for effective Deep Web data integration. The main bottlenecks to improving efficiency and precision are repeated semantic annotation and the existence of nested attributes. This paper defines the result pattern and proposes a novel approach to Deep Web data extraction based on it. The approach has two stages: result pattern generation, and data extraction based on the result pattern. Based on the features of Deep Web result pages, a feature matrix of Web page data is defined; constructing and analyzing this matrix yields the result pattern. Attribute semantic annotation is completed during result pattern generation, which resolves the problem of repeated semantic annotation. An effective method for dividing nested attributes is also proposed. Experimental results show that Deep Web data extraction based on the result pattern improves both efficiency and precision, and lays a solid foundation for Deep Web data integration.
