Survey on DBSCAN Acceleration Algorithms for Large Scale Data

Chen Yewang; Cao Hailu; Chen Yi; Kang Zhao; Lei Zhen; Du Jixiang

doi:10.7544/issn1000-1239.202220311

Chen Yewang, Cao Hailu, Chen Yi, Kang Zhao, Lei Zhen, Du Jixiang. Survey on DBSCAN Acceleration Algorithms for Large Scale Data[J]. Journal of Computer Research and Development, 2023, 60(9): 2028-2047. DOI: 10.7544/issn1000-1239.202220311

Chen Yewang^{1,2,5,6,7},
Cao Hailu^{1},
Chen Yi^{2},
Kang Zhao^{3},
Lei Zhen^{4},
Du Jixiang^{1,6}

1. College of Computer Science and Technology, Huaqiao University, Xiamen, Fujian 361021
2. Beijing Key Laboratory of Big Data Technology for Food Safety (Beijing Technology and Business University), Beijing 100048
3. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731
4. State Key Laboratory of Pattern Recognition (Institute of Automation, Chinese Academy of Sciences), Beijing 100190
5. Xiamen Key Laboratory of Data Security and Blockchain Technology (Huaqiao University), Xiamen, Fujian 361021
6. Fujian Key Laboratory of Big Data Intelligence and Security (Huaqiao University), Xiamen, Fujian 361021
7. Jiangsu Provincial Key Laboratory for Computer Information Processing Technology (Soochow University), Suzhou, Jiangsu 215006

Funds: This work was supported by the National Natural Science Foundation of China (61673186, 71771094, 61876068, 61972010), the Guidance Science and Technology Plan of Fujian Province (2021H0019), and the Natural Science Foundation of Fujian Province (2020J05059, 2021J01317).

Author Bio:
Chen Yewang: born in 1978. PhD, associate professor, master's supervisor. His main research interests include natural language processing, machine learning, and pattern recognition.

Cao Hailu: born in 1996. Master. Her main research interests include machine learning and pattern recognition.

Chen Yi: born in 1963. PhD, professor, PhD supervisor. Her main research interests include information visualization, visual analytics, and big data technology for food quality and safety, including visual analytics of high-dimensional, hierarchical, spatio-temporal, and graph data.

Kang Zhao: born in 1983. PhD, associate professor, master's supervisor. Member of CCF. His main research interests include unsupervised machine learning, graph signal processing, social media analysis, and knowledge graphs.

Lei Zhen: born in 1983. PhD, professor, PhD supervisor. His main research interests include computer vision, pattern recognition, image processing, and face recognition.

Du Jixiang: born in 1977. PhD, professor, PhD supervisor. His main research interests include image processing and pattern recognition.

Received Date: April 17, 2022
Revised Date: August 22, 2022
Available Online: April 13, 2023

Abstract

DBSCAN (density-based spatial clustering of applications with noise) is one of the most widely used and studied density-based clustering algorithms because it is simple and easy to implement. However, its high time complexity, O(n²), makes it unable to handle large-scale data, because DBSCAN performs a large number of redundant distance computations when calculating density. Accelerating DBSCAN so that it suits big data environments has therefore become a research hotspot, and much fruitful work has emerged. In terms of acceleration goals, this work falls broadly into two categories: reducing redundant computations and parallelization. In terms of specific acceleration techniques, it falls into six main categories: distributed techniques, sampling, approximation, fast neighbor search, space partitioning, and GPU acceleration. Following this classification, the existing work is systematically reviewed and cross-compared. The comparison finds that algorithms combining multiple acceleration techniques outperform those that use a single technique; that approximation, parallelization, and distribution are currently the most effective ways to accelerate DBSCAN; and that high-dimensional data remain difficult to handle. In addition, applications of fast DBSCAN in many fields are tracked and reported. Finally, future directions for fast DBSCAN are discussed.
Keywords:
- fast DBSCAN
- density clustering
- clustering algorithm
- big data
- data mining
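
The redundancy described in the abstract can be made concrete with a small sketch. The Python code below is an illustration only, not the implementation of any surveyed algorithm: it runs a textbook DBSCAN twice, once with a brute-force range query and once with a simple grid index (one instance of the space-partitioning idea), and counts distance computations in each run. All function names (`brute_force_query`, `grid_query`, `compare`, `make_points`), the 2-D synthetic data, and the values of `eps` and `min_pts` are assumptions chosen for the example.

```python
import math
import random
from collections import defaultdict

NOISE = -1

def brute_force_query(points, i, eps, counter):
    """Scan every point: n distance computations per range query."""
    hits = []
    for j, q in enumerate(points):
        counter[0] += 1
        if math.dist(points[i], q) <= eps:
            hits.append(j)
    return hits

def build_grid(points, eps):
    """Hash each 2-D point into a square cell of side eps."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(math.floor(x / eps), math.floor(y / eps))].append(idx)
    return grid

def grid_query(points, grid, i, eps, counter):
    """Check only the 3x3 block of cells that can hold eps-neighbors."""
    cx, cy = (math.floor(c / eps) for c in points[i])
    hits = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for j in grid.get((cx + dx, cy + dy), ()):
                counter[0] += 1
                if math.dist(points[i], points[j]) <= eps:
                    hits.append(j)
    return sorted(hits)  # same order as the brute-force scan

def dbscan(n, min_pts, region_query):
    """Textbook DBSCAN; region_query(i) returns the eps-neighbors of i."""
    labels = [None] * n  # None = not yet visited
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = region_query(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        k = 0
        while k < len(seeds):
            j = seeds[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster  # border point, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            neighbors = region_query(j)
            if len(neighbors) >= min_pts:
                seeds.extend(neighbors)  # j is a core point
        cluster += 1
    return labels

def make_points(seed=0):
    """Hypothetical test data: two Gaussian blobs plus uniform noise."""
    rng = random.Random(seed)
    blobs = [(rng.gauss(c, 0.3), rng.gauss(c, 0.3))
             for c in (0.0, 5.0) for _ in range(150)]
    noise = [(rng.uniform(-2, 7), rng.uniform(-2, 7)) for _ in range(20)]
    return blobs + noise

def compare(points, eps=0.3, min_pts=5):
    """Run both variants; return (labels_a, labels_b, n_brute, n_grid)."""
    brute, fast = [0], [0]
    grid = build_grid(points, eps)
    a = dbscan(len(points), min_pts,
               lambda i: brute_force_query(points, i, eps, brute))
    b = dbscan(len(points), min_pts,
               lambda i: grid_query(points, grid, i, eps, fast))
    return a, b, brute[0], fast[0]

if __name__ == "__main__":
    a, b, n_brute, n_grid = compare(make_points())
    print("same labels:", a == b)
    print("distance computations:", n_brute, "vs", n_grid)
```

In this sketch each point triggers exactly one range query, so the brute-force variant performs n² distance computations, while the grid variant inspects only points in adjacent cells and returns the same clustering. The grid index used here is the simplest possible choice and degrades as dimensionality grows, which is consistent with the abstract's remark that high-dimensional data remain difficult.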
