Dr. Ming-Yang Kao

Department of Computer Science

McCormick School of Engineering and Applied Science
Northwestern University

Time : Monday, April 01,  2:00 p.m.

Location: SB 106

 

Provably Accurate reconstruction of
Evolutionary Trees with Optimal Time and Space Complexities

 

Abstract

Evolutionary trees are useful for modeling the evolutionary history of organisms or species. An ultimate goal of biology is to reconstruct the evolutionary history of all known species. The determination of the evolutionary relationship of all green plants alone will involve hundreds of millions of species. Tree reconstructions of such scales require multidisciplinary collaboration as well as enormous computational resources. While we are not certain when, if ever, the necessary data will be ready, DNA sequences of more and more species are becoming available. In this talk, we will discuss a new algorithm for reconstructing evolutionary trees from biomolecular sequences. In  probabilistic models of evolution, the algorithm is mathematically proven to use sample sequences of length only polynomial in n to recover the correct tree topology with high probability, where n is the number of species involved. This accuracy guarantee is  supported by preliminary simulated experiments on the algorithm and its variants. Given the pairwise distances between n input sequences, the algorithm runs optimally in O(n^2) time using O(n) additional work space in the worst case. Previously, the most popular distance-based method, namely, Neighbor Joining, requires O(n^3) time and is not known to have an accuracy guarantee. The other known polynomial-time algorithms with accuracy guarantees take longer than O(n^3) time.

 (This talk is based on joint work with Miklos Csuros.)