ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2017, Vol. 54, Issue (1): 134-141. doi: 10.7544/issn1000-1239.2017.20150674

• System Architecture •

A 4-Way Parallel Memory Access Method Supporting Reformulated Radix-2^4 FFT


  1. (College of Computer, National University of Defense Technology, Changsha 410073)
  • Published: 2017-01-01
  • Funding: This work was supported by the National Natural Science Foundation of China (61472432).

An Address Parallel Access Method Supporting Four Reformulated Radix-2^4 FFT

Yang Chao, Chen Haiyan, Liu Sheng   

  1. (College of Computer, National University of Defense Technology, Changsha 410073)
  • Online: 2017-01-01

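Abstract (Chinese-language version): IEEE 802.15.3c is the unified international standard for high-rate wireless personal area networks (WPANs). At a sampling frequency of 2.592 GHz, it requires a 512-point FFT to be completed within 222.2 ns, which places extremely demanding requirements on the FFT processor. To meet this requirement, some FFT processors adopt a reformulated radix-2^4 FFT algorithm together with multiple parallel processing elements (PEs). With multiple PEs running in parallel, their advantage can be fully exploited only if the memory system supports conflict-free parallel operand access as well as parallel normal-order data input and output. Based on the operand access pattern of four parallel reformulated radix-2^4 FFT processing elements, an address transformation method is designed that supports parallel operand access by four PEs. The method also supports parallel normal-order input and output, which resolves the difficulty that the bit reversal required on input or output data poses for parallel normal-order I/O. Finally, logic synthesis under identical synthesis constraints shows that the method saves 46% in area and 28% in power compared with the previous method, and that it supports continuous-flow operation and in-place computation.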

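Keywords (Chinese-language version): IEEE 802.15.3c standard, radix-2^4, FFT algorithm, address transformation, parallel, in-place computation, continuous flow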
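The timing requirement quoted in the abstract can be turned into a few derived numbers. The short Python sketch below only does arithmetic on the two figures given in the text (2.592 GHz and 222.2 ns); the function and constant names are illustrative, not from the paper. The deadline works out to roughly 576 sample periods, that is, 64 more than the 512 FFT points. Reading those extra periods as a guard interval is an assumption, since the abstract does not say so.

```python
# Arithmetic on the figures quoted in the abstract; names are illustrative.
SAMPLE_RATE_HZ = 2.592e9   # sampling rate stated in the abstract
FFT_POINTS = 512           # FFT size stated in the abstract
DEADLINE_S = 222.2e-9      # time budget stated in the abstract


def sample_periods_in_deadline(deadline_s=DEADLINE_S, rate_hz=SAMPLE_RATE_HZ):
    """Number of sample periods that fit into the time budget."""
    return deadline_s * rate_hz


def required_points_per_second(points=FFT_POINTS, deadline_s=DEADLINE_S):
    """Sustained throughput needed to finish one FFT per time budget."""
    return points / deadline_s


if __name__ == "__main__":
    print(f"sample periods per deadline: {sample_periods_in_deadline():.1f}")
    print(f"required throughput: {required_points_per_second() / 1e9:.3f} Gpoints/s")
```

A single processing element that consumed one point per cycle would therefore need a clock above 2 GHz, which is the motivation the abstract gives for running several PEs in parallel.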

Abstract: IEEE 802.15.3c is the unified international standard for high-rate wireless personal area networks (high-rate WPANs), which supports high-data-rate applications such as high-definition streaming content downloads and home theater. It requires a 512-point FFT to be completed in only 222.2 ns at a sampling rate of 2.592 GHz. To satisfy this demand, some FFT processors adopt parallel processing elements (PEs) and the reformulated radix-2^4 FFT algorithm, which reduces the required number of butterfly stages. When parallel PEs are employed, only a memory system that supports parallel operand access by these PEs and normal-order I/O can realize the full advantage of the parallel PEs. Based on the access pattern of four reformulated radix-2^4 FFT PEs, this paper designs an address transformation method that supports four reformulated radix-2^4 FFT PEs. The method also supports normal-order I/O, which resolves the difficulty caused by bit reversal of the initial or result data and yields a high-throughput design. A single address transformation unit is simple to implement, requiring only three two-input XOR gates and one three-input XOR gate. Under the same synthesis conditions, the method saves 47% in area and 24% in power compared with the previous method. It also supports continuous-flow and in-place operation.

Key words: IEEE 802.15.3c, radix-2^4, FFT, address schedule, parallel, in-place, continuous-flow
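The difficulty that bit reversal creates for parallel normal-order I/O can be illustrated with a small, generic Python sketch. It is not the address transformation proposed in the paper, whose details are not given in the abstract; it is a textbook-style XOR bank mapping chosen only to show the kind of conflict involved. The assumptions are: 512 words spread over 4 memory banks, 4 words accessed per cycle, and hypothetical helper names (`naive_bank`, `xor_bank`, `check`). With plain low-order interleaving, four consecutive addresses land in four different banks, but their bit-reversed counterparts all share the same two low address bits and collide in one bank. XORing the two highest address bits into the bank index removes the collision in both orders.

```python
# Generic illustration only; NOT the paper's address transformation.
N_BITS = 9    # 512-point FFT -> 9-bit word addresses
N_BANKS = 4   # assumed: four banks, four words accessed per cycle


def bit_reverse(x, bits=N_BITS):
    """Reverse the lowest `bits` bits of x."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (x & 1)
        x >>= 1
    return r


def naive_bank(addr):
    """Plain low-order interleaving: bank = addr mod 4."""
    return addr & 0b11


def xor_bank(addr):
    """Bank index = two low bits XOR two high bits (two 2-input XOR gates)."""
    low = addr & 0b11
    high = (addr >> (N_BITS - 2)) & 0b11
    return low ^ high


def conflict_free(bank_of, addrs):
    """True if every address in the group maps to a different bank."""
    return len({bank_of(a) for a in addrs}) == len(addrs)


def check(bank_of):
    """Return (normal_order_ok, bit_reversed_order_ok) over all groups of 4."""
    normal_ok = True
    reversed_ok = True
    for base in range(0, 1 << N_BITS, N_BANKS):
        group = [base + j for j in range(N_BANKS)]
        normal_ok = normal_ok and conflict_free(bank_of, group)
        reversed_ok = reversed_ok and conflict_free(
            bank_of, [bit_reverse(a) for a in group]
        )
    return normal_ok, reversed_ok


if __name__ == "__main__":
    print("naive:", check(naive_bank))
    print("xor:  ", check(xor_bank))
```

The sketch covers only the input/output ordering problem. A mapping that must additionally stay conflict-free for the operand pattern of every butterfly stage, as the abstract describes, needs a more elaborate parity function, which is consistent with the slightly larger gate count the abstract reports for the actual unit.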