Architecture and Technology of OceanBase Distributed Relational Database

Yang Zhenkun; Yang Chuanhui; Han Fusheng; Wang Guoping; Yang Zhifeng; Cheng Xiaojun

DOI: 10.7544/issn1000-1239.202330835
CSTR: 32373.14.issn1000-1239.202330835

Citation: Yang Zhenkun, Yang Chuanhui, Han Fusheng, Wang Guoping, Yang Zhifeng, Cheng Xiaojun. Architecture and Technology of OceanBase Distributed Relational Database[J]. Journal of Computer Research and Development, 2024, 61(3): 540-554. DOI: 10.7544/issn1000-1239.202330835

Beijing OceanBase Technology Co., Ltd., Beijing 100102

Author Bio:
Yang Zhenkun: born in 1965. PhD. CCF fellow. One of the first Cheung Kong Scholars, Peking University. His main research interests include distributed systems and database systems.

Yang Chuanhui: born in 1985. Master. Member of CCF. His main research interests include distributed systems and database systems.

Han Fusheng: born in 1985. Bachelor. His main research interests include transaction processing, storage engines, and consensus protocols in database systems.

Wang Guoping: born in 1986. PhD. His main research interests include query processing and optimization in database systems.

Yang Zhifeng: born in 1983. Master. His main research interests include query processing and resource scheduling in database systems.

Cheng Xiaojun: born in 1982. Master. His main research interests include distributed system testing and database system testing.

Received Date: October 18, 2023
Revised Date: December 13, 2023
Available Online: December 21, 2023

Abstract

Relational databases are key information infrastructure in today's society. The Internet and digitization have brought high concurrency and massive data volumes, and the centralized architectures of traditional relational databases leave their processing power and storage capacity stretched to the limit. OceanBase is a distributed relational database built on commodity PC servers. It provides online horizontal scalability, automatic lossless disaster recovery from data center failures, and high-ratio data compression, and it has been deployed in finance, government affairs, telecommunications, the Internet industry, and other fields. We introduce the architecture and some key technologies of OceanBase, including distributed transaction processing, the LSM-tree-based storage system, and the distributed SQL optimizer. We also explain in detail the high availability and data consistency of OceanBase, which ensure a recovery point objective (RPO) of 0 and a recovery time objective (RTO) of less than 8 seconds. In addition, we introduce OceanBase's multi-tenant mechanism, a native multi-tenant design that provides multiple independent database services within a single cluster. Comparative experiments based on the Sysbench and TPC-H benchmarks show that 1) in stand-alone mode, the performance of OceanBase is 1.27 times to over 2 times that of MySQL; 2) in single-master mode, the performance of OceanBase is 1.25 times to nearly 2 times that of MySQL; 3) in multi-master mode, the performance of OceanBase is 1.09 to 3.1 times that of MySQL, and for complex OLAP queries, the performance of OceanBase is 6 to 327 times that of MySQL.
Keywords:
- relational database
- distributed transaction
- LSM-tree-based storage
- distributed SQL optimizer
- multi-tenant


Supplements
- Video: https://www.bilibili.com/video/BV13c41147Ts/?spm_id_from=333.1387.search.video_card.click&vd_source=b27d2d557b6a91b712aa544f6c38158d
