CEA, LITEN,
Laboratoire des systèmes solaires (L2S).
50 av. du Lac Léman,
73375 LE BOURGET DU LAC - CEDEX

 

Lespinats, S., Grando, D., Maréchal, E., Hakimi, M.A., Tenaillon, O., and Bastien, O. (2011) “How phylogenetic inference methods can benefit from Multi Dimensional Scaling evolution.” Evolutionary bioinformatics, 7, pp 61-85.

Abstract:
Whatever the phylogenetic method, genetic sequences are often described as strings of characters, so molecular sequences can be viewed as elements of a multi-dimensional space. As a consequence, studying motion in this space (i.e., the evolutionary process) must contend with the counterintuitive properties of high-dimensional spaces, such as the concentration of measure phenomenon.
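As an illustration of the concentration phenomenon mentioned above, here is a minimal Python sketch that is not taken from the article. It draws random sequences and measures how the relative spread of their pairwise distances changes with sequence length. The four-letter alphabet, the uniform random model, and the function names are assumptions made for the example.

```python
import random
from itertools import combinations
from statistics import mean, pstdev

ALPHABET = "ACGT"  # assumption: nucleotide sequences, uniform composition


def random_sequence(length, rng):
    """Draw one sequence of the given length, uniformly at random."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def hamming_fraction(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def relative_spread(length, n_sequences=30, seed=0):
    """Standard deviation of pairwise distances divided by their mean."""
    rng = random.Random(seed)
    seqs = [random_sequence(length, rng) for _ in range(n_sequences)]
    dists = [hamming_fraction(a, b) for a, b in combinations(seqs, 2)]
    return pstdev(dists) / mean(dists)


if __name__ == "__main__":
    for length in (10, 100, 1000):
        print(length, relative_spread(length))
```

Under these assumptions the ratio shrinks as the sequences get longer: all pairwise distances become nearly equal, which makes near and far neighbors harder to tell apart.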
To study how these features might influence phylogeny reconstructions, we examined a particularly popular method: the Fitch-Margoliash algorithm, which belongs to the Least Squares methods. We show that the Least Squares methods are closely related to Multi Dimensional Scaling. Indeed, the criteria for Fitch-Margoliash and Sammon’s mapping are similar. However, prolific research in Multi Dimensional Scaling has since produced methods that outperform Sammon’s mapping.
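The similarity between the two criteria can be made concrete with a short Python sketch. It uses the commonly cited textbook forms, which are an assumption here and are not quoted from the article: the Fitch-Margoliash criterion weights each squared residual by 1/D², while Sammon's stress weights it by 1/D and normalizes by the sum of observed distances. The function names and the toy matrices are hypothetical.

```python
from itertools import combinations


def fitch_margoliash_criterion(observed, fitted):
    """Sum over pairs i<j of (D_ij - d_ij)^2 / D_ij^2."""
    n = len(observed)
    return sum(
        (observed[i][j] - fitted[i][j]) ** 2 / observed[i][j] ** 2
        for i, j in combinations(range(n), 2)
    )


def sammon_stress(observed, fitted):
    """Sum over pairs i<j of (D_ij - d_ij)^2 / D_ij, divided by sum of D_ij."""
    pairs = list(combinations(range(len(observed)), 2))
    total = sum(observed[i][j] for i, j in pairs)
    weighted = sum(
        (observed[i][j] - fitted[i][j]) ** 2 / observed[i][j]
        for i, j in pairs
    )
    return weighted / total


if __name__ == "__main__":
    # Toy data: observed pairwise distances and distances read from a fitted tree.
    observed = [[0, 2, 4], [2, 0, 4], [4, 4, 0]]
    fitted = [[0, 2, 3], [2, 0, 5], [3, 5, 0]]
    print(fitch_margoliash_criterion(observed, fitted))
    print(sammon_stress(observed, fitted))
```

Both functions penalize the same residuals D - d and differ only in the weighting and the normalization, which is the sense in which the two criteria resemble each other.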
Least Squares methods for tree reconstruction can now take advantage of these improvements in Multi Dimensional Scaling. However, “false neighborhoods” and “tears” are the two main risks in the dimensionality reduction field: a “false neighborhood” occurs when data that are widely separated in the original space appear close together in the representation space, whereas a “tear” occurs when neighboring data are displayed in remote positions. To address this problem, we brought the concepts of “continuity” and “trustworthiness” to the tree reconstruction field; these limit the risk of “false neighborhoods” and “tears”. We also point out the concentration of measure phenomenon as a source of error and introduce new criteria to build phylogenies with improved preservation of distances and robustness.
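The two kinds of error can be illustrated with a simplified Python sketch based on k-nearest-neighbor sets. This is an assumption made for illustration, not the rank-based definition of trustworthiness and continuity and not the criteria introduced in the article. Here the “representation” distances stand in for distances read from a reconstructed tree, and all names are hypothetical.

```python
def k_nearest(dist, i, k):
    """Indices of the k items closest to item i, excluding i itself."""
    others = [j for j in range(len(dist)) if j != i]
    # Ties are broken by index so the result is deterministic.
    return set(sorted(others, key=lambda j: (dist[i][j], j))[:k])


def false_neighbors_and_tears(original, representation, k):
    """Compare k-nearest-neighbor sets in the two spaces for every item."""
    report = {}
    for i in range(len(original)):
        near_orig = k_nearest(original, i, k)
        near_repr = k_nearest(representation, i, k)
        report[i] = {
            # Close in the representation but not in the original space.
            "false_neighbors": near_repr - near_orig,
            # Close in the original space but pulled apart in the representation.
            "tears": near_orig - near_repr,
        }
    return report


if __name__ == "__main__":
    # Toy data: items 0 and 1 are close originally, as are items 2 and 3.
    original = [[0, 1, 5, 6], [1, 0, 5, 6], [5, 5, 0, 1], [6, 6, 1, 0]]
    # The representation instead pairs 0 with 2 and 1 with 3.
    representation = [[0, 5, 1, 6], [5, 0, 6, 1], [1, 6, 0, 5], [6, 1, 5, 0]]
    for item, errors in false_neighbors_and_tears(original, representation, 1).items():
        print(item, errors)
```

In this toy case item 0 gains item 2 as a false neighbor and is torn away from item 1. A representation that preserves every neighborhood would yield empty sets for all items.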

