Ru Liyun, Li Zhichao, Ma Shaoping. Indexing Page Collection Selection Method for Search Engine[J]. Journal of Computer Research and Development, 2014, 51(10): 2239-2247. DOI: 10.7544/issn1000-1239.2014.20130340

Indexing Page Collection Selection Method for Search Engine

  • Published Date: September 30, 2014
  • With the rapid development of the Internet, the number of Web pages is growing explosively, which poses a major challenge for search engines that provide Web page search services. Many pages have similar or even identical content, and many are of low quality. For a search engine, indexing such pages does little to improve retrieval results but increases the indexing and retrieval burden. This paper proposes a page selection algorithm that builds the indexing page collection from massive Web data. On the one hand, a signature-based clustering algorithm filters out similar pages to reduce the size of the indexing page collection; on the other hand, the algorithm combines a variety of page-dimension and user-dimension features to ensure the quality of the collection. The algorithm not only clusters and selects pages quickly but also achieves a higher compression ratio while preserving the amount of information in the indexing page collection. Experiments on real page collections show that the indexing page collection selected by the proposed algorithm is about 1/3 the size of the entire page collection and still meets the vast majority of user click needs, indicating strong practical value.
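
The two-stage idea in the abstract (group similar pages by signature, then keep the best page of each group) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the signature function or the features, so the `signature` scheme (a hash of the most frequent tokens), the field names `page_quality` and `user_score`, and the weights `w_page` and `w_user` are all hypothetical choices made for illustration.

```python
import hashlib
import re
from collections import Counter, defaultdict


def signature(text, k=5):
    """Hypothetical page signature: MD5 of the k most frequent tokens.

    Only tokens of length >= 4 are counted, as a crude stop-word filter.
    The paper's actual signature scheme is not given in the abstract.
    """
    tokens = [t for t in re.findall(r"[a-z0-9]+", text.lower()) if len(t) >= 4]
    counts = Counter(tokens)
    # Sort by (-count, token) so that ties are broken deterministically.
    top = sorted(counts, key=lambda t: (-counts[t], t))[:k]
    return hashlib.md5(" ".join(sorted(top)).encode("utf-8")).hexdigest()


def select_index_pages(pages, w_page=0.5, w_user=0.5):
    """Cluster pages by signature and keep one representative per cluster.

    The representative is the page with the highest weighted sum of an
    assumed page-dimension score and an assumed user-dimension score.
    """
    clusters = defaultdict(list)
    for page in pages:
        clusters[signature(page["text"])].append(page)

    def combined(page):
        return w_page * page["page_quality"] + w_user * page["user_score"]

    return [max(members, key=combined) for members in clusters.values()]


# Toy data: pages "a" and "b" carry the same content in a different order.
pages = [
    {"url": "a", "text": "Search engines index pages; search engines rank pages.",
     "page_quality": 0.4, "user_score": 0.2},
    {"url": "b", "text": "Search engines rank pages, search engines index pages.",
     "page_quality": 0.6, "user_score": 0.9},
    {"url": "c", "text": "Weather forecast today: sunny weather tomorrow.",
     "page_quality": 0.5, "user_score": 0.5},
]

selected = select_index_pages(pages)
print([p["url"] for p in selected])
```

Because pages are grouped by exact signature match, clustering takes a single pass over the collection, which is one plausible reading of the abstract's claim that pages can be clustered and selected quickly. A production system would need a signature that is more robust to small content differences than this toy one.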