Chandni Rajan Valiathan

Howard Hughes Honors Summer Institute

Summer 2004

chandni@brandeis.edu

Dennis Shasha

Courant Institute of Mathematical Sciences

Department of Computer Science

New York University

shasha@cs.nyu.edu

- Introduction
- Problems with trees
- Phylogenetic Networks
- Project Goals
- Algorithm 1
- Algorithm 2
- Results
- References

Back to top

- The input data was erroneous
- Reticulation events such as horizontal gene transfer and hybridization has taken place
- Convergent evolution has taken place

Back to top

At present there is no universally accepted network reconstruction algorithm.

Back to top

Back to top

- Reticulation Cycle: Suppose there is a node
*w*in a network from which two directed paths come out, and that the two paths meet at a node*x*. This node*x*is called a reticulation node, and the 2 directed paths (that start at*w*and end at*x*) together form a reticulation cycle. - Gall: A reticulation cycle that shares no nodes with any other reticulation cycle is known as a gall
- Galled Network: A network in which all the reticulation cycles are galls is called a galled network
- Partition: Every edge
*e*in a tree has two sets of nodes associated with it: the first set contains all those nodes below the edge and the second set contain all the remaining nodes. These two sets of nodes together constitute a partition of*e* - Incompatible Edges: An edge
*e1*in tree T1 is said to be incompatible with tree T2 if there are no edges in tree T2 with the same partition as*e1* - Ends of a path: These are the subtrees attached to the first or last nodes of a path.

The partition of the dashed edge (3, 6) is {{g,h},{1,2,4,5,7,8,9,10,11,a,b,c,d,e,f}}

- Given two different gene trees T1 and T2, we first find the incompatible edges in the two trees.
- For each path
*p*made up of the incompatible edges, we find the subtree X in one of the ends of*p*such that when X is removed from both trees T1 and T2, we get a common tree T3. - Once you find such a subtree X which is always attached to one of the ends of
*p*, we can build the network by adding a new edge from the other end of*p*to the root of X.

- An ortholog of a gene can only appear once in a gene tree (i.e there is no convergence)
- The true Phylogenetic nework is a galled network

- there are moderate levels of recombination
- most recombination events are recent
- we are studying a single recombination event

To download this algorithm, please contact chandni@brandeis.edu

Back to top

Suppose the gene tree for a set of species {Sp1, Sp2, Sp3} for gene A is (A1, (A21, A22)) where

Sp1:A

Sp2:A

Sp3:A

Suppose also that the gene tree for the same set of species for gene B is (B

Sp1: B

Sp2:B

Sp3:B

Combining this data we get that

Sp1:A

Sp2:A

Sp3:A

We can already see a conflict in that Sp1 and Sp2 must have an ancestor AB

- An ortholog of a gene can only appear once in a gene tree (i.e the phylogeny of a gene is a tree)
- For each gene A in the phylogenetic network for species, the ortholog of A must be the direct descendent of its parent/s or equal to the ortholog of the parent

To download this program, please contact shasha@cs.nyu.edu

Back to top

Here are the results for algorithm 1 and algorithm 2

- B.M.E. Moret, L. Nakhleh, T. Warnow, C.R. Linder, A. Tholse, A. Padolina, J. Sun, and R. Timme, "Phylogenetic networks: modeling, reconstructibility, and accuracy." To appear in the IEEE/ACM Transactions on Computational Biology and Bioinformatics, 1(1), 2004. PDF
- L. Nakhleh, T. Warnow, and C.R. Linder, "Reconstructing Reticulate Evolution in Species -- Theory and Practice." Proceedings of the Eighth Annual International Conference on Research in Computational Molecular Biology (RECOMB 2004), 337-346.PDF
- D. Gusfield, S. Eddhu, and C. Langley. "Efficient Reconstruction of Phylogenetic Networks with Constrained Recombination." Proceedings of Computational Systems Bioinformatics (CSB 03), 2003. PDF
- W.P. Maddison. "Gene Trees in Species Trees" systematic biology, 46(3):523-536, 1997.
- Genome-scale approaches to resolving incongruence in molecular phylogenies
- Deriving the genomic tree of life in the prescence of Horizontal Gene Transfer: Conditioned reconstruction
- Lake 1994 - Reconstructing evolutionary trees from DNA and protein sequences: paralinear distances
- NeighborNet: An agglomerative method for the construction of planar phylogenetic networks