Abstract:
Evolutionary tree reconstruction is a very important topic in biology. A reliable evolutionary tree has many important applications in medical and biological research, such as drug discovery and conservation biology. A rich variety of tree reconstruction methods have been developed, which fall into three categories: 1) maximum parsimony methods, 2) distance based methods, and 3) approaches applying the maximum likelihood principle. Maximum likelihood programs can recover the correct tree more frequently than other methods. But, inference of evolutionary trees using the maximum likelihood principle is NP-hard. The reconstruction method quartet puzzling(QP) is currently receiving considerable attention. QP is a divide-and-conquer approach to the maximum likelihood problem. QP works by estimating the set Q of quartet topologies, then combining the information in Q into an evolutionary tree on the whole set by some recombination technique. However, it is demonstrated that current performance of QP is far from satisfactory as expected. How to reconcile the different topologies of the quartets into one tree is still a major problem faced by QP. In order to improve QP, a new method named QPNJ is proposed, which has the same computational complexity as QP. Experiments with simulated datasets show that QPNJ is more accurate than others. Moreover, unlike QP, QPNJ is less dependent on the shape of the correct tree.