    Citation: Tang Wen, Zhang Chunming, Tan Guangming, Zhang Peiheng, Sun Ninghui. A Customized Coprocessor Acceleration of Genome Re-Sequencing[J]. Journal of Computer Research and Development, 2014, 51(9): 1980-1992. DOI: 10.7544/issn1000-1239.2014.20130987

    A Customized Coprocessor Acceleration of Genome Re-Sequencing

    • Abstract: Since high-throughput (next-generation) DNA sequencing platforms came into use in January 2008, sequencing throughput has risen steadily while sequencing cost has fallen. The resulting growth in genomic data, however, has outpaced Moore's Law, and the capacity to process this volume of data has become the bottleneck that limits wider adoption of sequencing applications. As an emerging data-intensive workload, high-throughput re-sequencing tools such as Hash-based programs behave differently from traditional computational applications: their low arithmetic intensity and irregular memory access patterns are the major sources of inefficiency on commodity multi-core platforms. This paper takes a Hash-index-based re-sequencing (short-read mapping) algorithm as its target, analyzes its computation and memory access behavior, and proposes an architecture that uses field programmable gate arrays (FPGAs) as a coprocessor. The architecture is designed and implemented on a Convey HC-1ex reconfigurable computer. Each processing element (PE) is fully pipelined, uses FIFOs to decouple its computation modules from its memory access modules, and executes the complete core flow of the re-sequencing algorithm. Binding each PE one-to-one to an exclusive memory port yields 64 parallel PEs on 4 Xilinx Virtex-6 LX760 FPGAs operating at 150 MHz, with an aggregate average memory read bandwidth of 22.59 GB/s. Compared with an 8-core Intel Xeon processor, the design achieves a 28.5-fold speedup. The proposed design is therefore a potential solution to the large-data challenge in high-throughput re-sequencing.
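    For readers unfamiliar with Hash-index-based short-read mapping, the general idea can be sketched in software as follows. This is only an illustrative sketch of a generic seed-and-verify scheme, not the algorithm or the FPGA design from the paper; the function names (`build_index`, `map_read`), the seed length `k`, and the mismatch threshold are all assumptions made for the example. The index lookups hit effectively random locations and the verification step does very little arithmetic per byte fetched, which is consistent with the low arithmetic intensity and irregular memory access described in the abstract.

```python
from collections import defaultdict


def build_index(reference: str, k: int) -> dict:
    """Hash index: map every k-mer of the reference to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read: str, reference: str, index: dict, k: int,
             max_mismatches: int) -> list:
    """Return (position, mismatches) pairs where the read aligns without gaps."""
    # Seed step: look up non-overlapping k-mers of the read in the hash index.
    # Each hit proposes a candidate start position for the whole read.
    candidates = set()
    for offset in range(0, len(read) - k + 1, k):
        for pos in index.get(read[offset:offset + k], ()):
            start = pos - offset
            if 0 <= start <= len(reference) - len(read):
                candidates.add(start)

    # Verify step: count mismatches between the read and each candidate window.
    hits = []
    for start in sorted(candidates):
        window = reference[start:start + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


if __name__ == "__main__":
    reference = "ACGTACGTTAGCCGATAGGCTTAACG"  # toy reference, not real data
    index = build_index(reference, k=4)
    # The read is reference[8:20] with one substituted base.
    print(map_read("TAGCCTATAGGC", reference, index, k=4, max_mismatches=1))
```

    In this sketch the seed step is dominated by hash-table lookups and the verify step by fetching reference windows, so memory access rather than computation tends to bound the throughput.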

       
