ISSN 1000-1239 CN 11-1777/TP

• Paper • Previous Articles     Next Articles

An Improved TFQMR Algorithm for Large Linear Systems Suited to Parallel Computing

Liu Jie1, Chi Lihua1, Hu Qingfeng1, and Li Xiaomei2   

  1. 1(College of Computer Science, National University of Defense Technology, Changsha 410073) 2(Institute of Equipment and Command Technology, Beijing 101416)
  • Online:2005-07-15

Abstract: The transpose-free QMR(TFQMR) algorithm is a Krylov subspace algorithm that can be used to obtain fast solutions for linear systems with very large and very sparse coefficient matrices. In this paper, by changing the computation sequence in the TFQMR algorithm, an improved transpose-free QMR (ITFQMR) algorithm is proposed. The numerical stability of the ITFQMR algorithm is the same as TFQMR algorithm, but the synchronization overhead that represents the bottleneck of the parallel performance is effectively reduced by a factor of two. And all inner products of a single iteration step are independent and communication time required for inner product can be overlapped efficiently with computation time of vector updates. From the theoretical and experimental analysis it is found that the ITFQMR algorithm is faster than the TFQMR algorithm as the number of processors increases. The experiments performed on a 64-processor cluster indicate that the ITFQMR is approximately 20% faster than the TFQMR.

Key words: transpose-free QMR algorithm, synchronization overhead, parallel computing, cluster, large sparse linear systems