ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展) ›› 2017, Vol. 54 ›› Issue (2): 284-294. doi: 10.7544/issn1000-1239.2017.20160850

Topic collection: 2017 Special Topic on Scientific Big Data Management

• Software Technology •

Crowdsourcing-Based Scientific Data Processing

Zhao Jianghua (赵江华)1,2, Mu Shuting (穆舒婷)3, Wang Xuezhi (王学志)1, Lin Qinghui (林青慧)1, Zhang Xi (张兮)3, Zhou Yuanchun (周园春)1

  1(Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190); 2(University of Chinese Academy of Sciences, Beijing 100049); 3(College of Management and Economics, Tianjin University, Tianjin 300072) (zjh@cnic.cn)
  • Published: 2017-02-01
  • Online: 2017-02-01
  • Funding:
    National Key Research and Development Program of China (2016YFB1000600, 2016YFB0501900); National Natural Science Foundation of China (71571133); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06010307)

Abstract: The ultimate goal of acquiring scientific data is to extract useful knowledge from it according to specific needs and to apply that knowledge in specific domains to support decision makers. As scientific data grow in volume and become structurally more complex (for example, semi-structured or unstructured), they are increasingly difficult to process automatically by computer. By efficiently mobilizing human resources, crowdsourcing has become one of the solutions for processing scientific big data. Based on the characteristics of crowdsourced scientific big data processing, this paper studies the scientific data crowdsourcing system from three aspects: the talent selection mechanism, the task processing mode, and the result assessment strategy. Crowdsourcing-based experiments on information extraction from remote sensing imagery in the geoscience domain are then carried out on the Geospatial Data Cloud platform. The results show that scientific data can be processed through the crowdsourcing paradigm and, moreover, that a reasonably designed crowdsourcing procedure yields high-quality data results.

Key words: crowdsourcing, scientific big data, data processing, talent selection, quality assessment

CLC number: