In RNA secondary structure prediction, the Zuker algorithm is one of the most popular methods based on free energy minimization. However, general-purpose computers, including parallel and multi-core machines, achieve an efficiency of no more than 50% on it. FPGA chips offer a new way to accelerate the Zuker algorithm through fine-grained custom design. The algorithm exhibits complicated data dependences: the dependence distance is variable, and the dependences span two dimensions. We propose a systolic-like array, consisting of one master PE and multiple slave PEs, for fine-grained hardware implementation on FPGA. We partition the tasks by columns and assign them to PEs so that the load is balanced. We exploit data reuse to reduce the need to load matrix data from external memory, using a sliding triangle window cache and transferring local elements to adjoining PEs. We also propose several methods (fitting curves with linear functions, replacing scattered points with register constants, compressing the address space, and shortening the data length) that together reduce the size of the energy parameter tables by more than 85%. Experimental results show a speedup of 18x over the ViennaRNA-1.6.5 software for a 2981-residue RNA sequence running on a PC platform with an AMD Phenom 9650 Quad CPU, while the power consumption of our FPGA accelerator is only about 20% of that of the PC platform.
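The dependence pattern and the column-wise partitioning described above can be illustrated with a minimal Python sketch. It is not the hardware design itself: it assumes a simplified bifurcation term of the form min over k of W(i,k) + W(k+1,j), and the round-robin column mapping, the per-cell unit cost, and the function names `dependencies`, `column_work` and `assign_columns_cyclic` are hypothetical choices made only for illustration.

```python
def dependencies(i, j):
    """Cell pairs read by an assumed bifurcation term
    min_k W(i, k) + W(k + 1, j) for i <= k < j.

    The first cell of each pair lies in row i, the second in column j,
    so the dependences run along two dimensions, and their number
    (j - i) grows with the distance from the diagonal.
    """
    return [((i, k), (k + 1, j)) for k in range(i, j)]


def column_work(n):
    """Number of cells in column j of the upper triangle (i <= j).

    Assumption: every cell counts as one unit of work, which ignores
    the fact that cells far from the diagonal are more expensive.
    """
    return [j + 1 for j in range(n)]


def assign_columns_cyclic(n, num_pes):
    """Hypothetical round-robin mapping: column j goes to PE j % num_pes.

    Returns the total number of cells assigned to each PE.
    """
    loads = [0] * num_pes
    for j, work in enumerate(column_work(n)):
        loads[j % num_pes] += work
    return loads


if __name__ == "__main__":
    print(dependencies(0, 3))
    print(assign_columns_cyclic(8, 4))
```

For an 8-column triangle and 4 PEs, the sketch assigns 6, 8, 10 and 12 cells to the four PEs. Interleaving columns keeps the per-PE totals closer than giving each PE a contiguous block of columns would, since later columns are longer; the actual assignment scheme used by the accelerator may differ.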