Distribution and asymptotic behavior of the phylogenetic transfer distance



Bibliographic Details
Main authors: Dávila Felipe, Miraine, Domelevo Entfellner, Jean-Baka, Lemoine, Frédéric, Truszkowski, Jakub, Gascuel, Olivier
Format: Journal Article
Language: English
Published: Springer 2019
Online access: https://hdl.handle.net/10568/129451
Description
Summary: The transfer distance (TD) was introduced in the classification framework and studied in the context of phylogenetic tree matching. Recently, Lemoine et al. (Nature 556(7702):452–456, 2018. https://doi.org/10.1038/s41586-018-0043-0) showed that TD can be a powerful tool to assess the branch support on large phylogenies, thus providing a relevant alternative to Felsenstein’s bootstrap. This distance allows a reference branch $\beta$ in a reference tree $\mathcal{T}$ to be compared to a branch b from another tree T (typically a bootstrap tree), both on the same set of n taxa. The TD between these branches is the number of taxa that must be transferred from one side of b to the other in order to obtain $\beta$. By taking the minimum TD from $\beta$ to all branches in T we define the transfer index, denoted by $\phi(\beta, T)$, measuring the degree of agreement of T with $\beta$. Let us consider a reference branch $\beta$ having p tips on its light side and define the transfer support (TS) as $1 - \phi(\beta, T)/(p-1)$. Lemoine et al. (2018) used computer simulations to show that the TS defined in this manner is close to 0 for random “bootstrap” trees. In this paper, we demonstrate that result mathematically: when T is randomly drawn, TS converges in probability to 0 when n tends to $\infty$. Moreover, we fully characterize the distribution of $\phi$ on caterpillar trees, indicating that the convergence is fast, and that even when n is small, moderate levels of branch support cannot appear by chance.
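
The definitions in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each branch is represented by the set of taxa on one of its sides (a bipartition of the n taxa), and it takes the TD between two branches to be the smaller of the two symmetric differences obtained by matching the sides either way. The function names, the toy taxa, and the example tree are hypothetical and chosen only for illustration.

```python
def transfer_distance(ref_side, other_side, taxa):
    """Number of taxa to move across branch b to obtain the reference branch.

    Each branch is given as the set of taxa on one of its sides (assumed
    representation). Both ways of matching the sides are considered.
    """
    h = len(ref_side ^ other_side)      # mismatches for one side matching
    return min(h, len(taxa) - h)        # the other matching gives n - h


def transfer_index(ref_side, tree_branches, taxa):
    """Minimum transfer distance from the reference branch to any branch of T."""
    return min(transfer_distance(ref_side, b, taxa) for b in tree_branches)


def transfer_support(ref_side, tree_branches, taxa):
    """Transfer support 1 - phi / (p - 1), with p the size of the light side."""
    p = min(len(ref_side), len(taxa) - len(ref_side))
    if p < 2:
        raise ValueError("transfer support needs a light side with p >= 2 tips")
    return 1 - transfer_index(ref_side, tree_branches, taxa) / (p - 1)


# Toy example on six hypothetical taxa.
taxa = frozenset("abcdef")
reference_branch = frozenset("abc")     # reference split abc | def, so p = 3

# Internal branches of a hypothetical tree T = ((a,b),d,(c,(e,f))).
tree_branches = [frozenset("ab"), frozenset("ef"), frozenset("cef")]

phi = transfer_index(reference_branch, tree_branches, taxa)
ts = transfer_support(reference_branch, tree_branches, taxa)
print(phi, ts)
```

In this toy case, moving the single taxon c across the branch ab | cdef yields the reference split, so the transfer index is 1 and the transfer support is 0.5. A tree containing the reference split itself would give an index of 0 and a support of 1.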