ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2017, Vol. 54 ›› Issue (2): 248-257. doi: 10.7544/issn1000-1239.2017.20170005

Special Issue: 2017 Special Topic on Scientific Big Data Management


Data Management Challenges and Real-Time Processing Technologies in Astronomy

Yang Chen1, Weng Zujian1, Meng Xiaofeng1, Ren Wei1, Xin Rihui1, Wang Chunkai1, Du Zhihui2, Wan Meng3, Wei Jianyan3   

  1(School of Information, Renmin University of China, Beijing 100872); 2(Department of Computer Science and Technology, Tsinghua University, Beijing 100084); 3(National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100012)
  • Online: 2017-02-01

Abstract: In recent years, many large telescopes capable of producing petabytes or exabytes of data have come into operation. These telescopes help not only to discover new astronomical phenomena but also to confirm existing astrophysical models. However, the star tables they produce are so large that a single database cannot manage them efficiently. Take the ground-based wide-angle camera array (GWAC), which has 40 cameras and was designed in China, as an example: it takes a high-resolution image every 15 s, and its database must make each star table queryable within 15 s. Moreover, the database has to process data from multiple cameras, find abnormal stars in real time, query their recent historical data very quickly, and persist star tables and query them offline as fast as possible. To address these problems, we first design a distributed data generator to simulate the GWAC working process. Second, we propose a two-level cache architecture that can not only process multi-camera data and find abnormal stars in local memory, but also query star tables in a distributed memory system. Third, we propose a storage format named star cluster, which stores a group of stars in one physical file to balance persistence efficiency against query efficiency. Finally, our query engine, which is based on an index table, can query both the second-level cache and the star cluster format. The experimental results show that our distributed system prototype can satisfy the demands of GWAC on our server cluster.
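
The two-level cache idea in the abstract can be illustrated with a toy, single-process Python sketch. It is not the authors' implementation: the class name, the moving-average rule for flagging abnormal stars, the window size, and the threshold are all assumptions made for illustration, and a plain dictionary stands in for the distributed memory system.

```python
from collections import defaultdict, deque
from statistics import mean


class TwoLevelCache:
    """Toy model: level 1 is local memory, level 2 stands in for a shared store."""

    def __init__(self, window=3, threshold=1.0):
        # Hypothetical parameters, not taken from the paper.
        self.window = window        # recent frames kept locally per star
        self.threshold = threshold  # magnitude deviation treated as abnormal
        # Level 1: star_id -> most recent magnitudes (bounded, local memory).
        self.local = defaultdict(lambda: deque(maxlen=window))
        # Level 2: star_id -> [(timestamp, magnitude)], queried for recent history.
        self.shared = defaultdict(list)

    def ingest(self, star_id, timestamp, magnitude):
        """Store one observation and report whether it looks abnormal."""
        history = self.local[star_id]
        # Assumed rule: abnormal if far from the mean of the local window.
        abnormal = bool(history) and abs(magnitude - mean(history)) > self.threshold
        history.append(magnitude)
        self.shared[star_id].append((timestamp, magnitude))
        return abnormal

    def recent(self, star_id, since):
        """Return observations of one star with timestamp >= since."""
        return [(t, m) for t, m in self.shared[star_id] if t >= since]


cache = TwoLevelCache(window=3, threshold=1.0)
# Three frames taken 15 s apart for a single hypothetical star "s1".
flags = [cache.ingest("s1", t, m) for t, m in [(0, 10.0), (15, 10.1), (30, 12.5)]]
history = cache.recent("s1", since=15)
```

In this sketch the anomaly check touches only the bounded local window, so its cost per observation stays constant, while history queries go to the second level. A real deployment would replace the dictionary with a distributed in-memory store and add persistence.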

Key words: astronomy big data management, the ground-based wide-angle camera array (GWAC), two-level cache, star cluster, index table
