ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2015, Vol. 52, Issue (2): 309-317. doi: 10.7544/issn1000-1239.2015.20140267

Special topic: 2015 Big Data Management

• Software Technology •

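Theme-Aware Task Assignment in Crowd Computing on Big Data

Zhang Xiaohang1,2, Li Guoliang2, Feng Jianhua2

  1(Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084); 2(Department of Computer Science and Technology, Tsinghua University, Beijing 100084) (zhangxiaohang12@mails.tsinghua.edu.cn)
  • Published: 2015-02-01
  • Funding:
    Supported by the National Natural Science Foundation of China (61373024, 61472198) and the National Basic Research Program of China (973 Program) (2015CB358700)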

Theme-Aware Task Assignment in Crowd Computing on Big Data

Zhang Xiaohang1,2, Li Guoliang2, Feng Jianhua2

  1(Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084); 2(Department of Computer Science and Technology, Tsinghua University, Beijing 100084)
  • Online: 2015-02-01

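Abstract (Chinese version): The inherent characteristics of big data problems, namely large and complex scale, rapid growth, diverse forms, and low value density, pose severe challenges to traditional computing methods. On the one hand, the scale and rapid growth of big data create a demand for massive computation and analysis; on the other hand, its diversity and low value density make big data computing tasks highly dependent on complex cognitive reasoning. Faced with the coexistence of these two demands, traditional computer-based algorithms can no longer meet increasingly stringent data-processing requirements, and crowd computing based on human-machine collaboration is an effective solution. In crowd computing on big data, the most fundamental issue is how tasks are assigned. Because the large population of web users differs in professional background and trustworthiness, tasks cannot simply be handed to the crowd at random. To address this problem, an iterative task assignment algorithm based on user theme awareness is proposed. The algorithm uses test questions with known answers to iteratively detect the professional backgrounds of different groups of users and their accuracy in completing tasks. Once the users' true themes and accuracy are well understood, suitable questions are assigned to them. Comparisons with a random task assignment algorithm on simulated and real data demonstrate the accuracy of theme-aware task assignment.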

Key words (Chinese version): crowd computing, human computation, big data, crowdsourcing, human-machine collaboration

Abstract: Big data poses tremendous challenges for the traditional computing model because of its inherent characteristics: large volume, high velocity, high variety, and low value density. On the one hand, large volume and high velocity require techniques for massive data computation and analysis; on the other hand, high variety and low value density make big data computing tasks highly dependent on complex cognitive reasoning. Because massive data analysis and complex cognitive reasoning must be handled together, crowd computing based on human-machine collaboration is an effective way to solve big data problems. In crowd computing, task assignment is one of the basic problems. However, current crowdsourcing platforms do not support active task assignment, which iteratively assigns tasks to appropriate workers based on the knowledge background of the users. To address this problem, we propose an iterative theme-aware task assignment framework and deploy it on existing crowdsourcing platforms. The framework has two components. The first is task modeling, which represents the tasks as a graph whose vertices are tasks and whose edges are relationships between tasks. The second is the iterative task assignment algorithm, which identifies the themes of the workers from their historical records, computes each worker's accuracy on different themes, and assigns tasks to the appropriate workers. Experiments validate the effectiveness of our method.

Key words: crowd computing, human computation, big data, crowdsourcing, human-computer interaction
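
The assignment step summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes each task already carries a theme label (the task-graph modeling is omitted), estimates per-theme accuracy from answers to questions with known results using simple additive smoothing, and picks the highest-scoring workers. The names `estimate_accuracy`, `assign_tasks`, the smoothing constants, and the sample workers are all hypothetical.

```python
from collections import defaultdict


def estimate_accuracy(history, prior_correct=1.0, prior_total=2.0):
    """Estimate each worker's accuracy per theme.

    history: iterable of (worker, theme, is_correct) records obtained from
    questions whose answers are known. The additive smoothing is an
    assumption of this sketch, not something stated in the abstract.
    """
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for worker, theme, is_correct in history:
        record = counts[worker][theme]
        record[0] += int(is_correct)  # number of correct answers
        record[1] += 1                # number of answers
    return {
        worker: {
            theme: (c + prior_correct) / (n + prior_total)
            for theme, (c, n) in themes.items()
        }
        for worker, themes in counts.items()
    }


def assign_tasks(tasks, accuracy, workers_per_task=1, default=0.5):
    """Assign each task to the workers with the best accuracy on its theme.

    tasks: dict mapping task id -> theme label (hypothetical input format).
    Workers with no record on a theme get the `default` score; ties are
    broken by worker name so the result is deterministic.
    """
    assignment = {}
    for task_id, theme in tasks.items():
        ranked = sorted(
            accuracy,
            key=lambda w: (-accuracy[w].get(theme, default), w),
        )
        assignment[task_id] = ranked[:workers_per_task]
    return assignment


# Hypothetical records from test questions with known answers.
history = [
    ("alice", "sports", True),
    ("alice", "sports", True),
    ("alice", "movies", False),
    ("bob", "movies", True),
    ("bob", "movies", True),
    ("bob", "sports", False),
]
accuracy = estimate_accuracy(history)
assignment = assign_tasks({"t1": "sports", "t2": "movies"}, accuracy)
print(assignment)
```

To make the procedure iterative in the spirit of the abstract, new test-question results would be appended to `history` after each round and `estimate_accuracy` re-run before the next batch of tasks is assigned.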

CLC number: