Laboratoire des systèmes solaires (L2S).
50 av. du Lac Léman,


Lespinats, S., Grando, D., Sautel, C.F., Hakimi, M.A. and Bastien, O. (2007) “Pointing out complex phylogenic relationships from apicomplexan methytransferase data by Multi Dimensional Scaling.” 1er Congrès de la Fédération Réaumur du Vivant, 2007 oct. Grenoble, France.

Posttranslational histone modifications modulate chromatin-templated processes in various biological systems. H4K20 methylation is considered to have an evolutionarily ancient role in DNA repair and genome integrity, whereas its role in heterochromatin function and gene expression is thought to have arisen later in evolution. We recently identified and characterized H4K20 methylases of the Set8 family in Plasmodium and Toxoplasma that exhibit original features: H4K20 mono-, di-, and trimethylase activities, in striking contrast to the monomethylase-restricted human Set8 [1]. Our findings provide new insights into the evolution of Set8-mediated biochemical pathways and suggest that the heterochromatic function of this mark is not restricted to metazoans. However, the origin of the SET domain in the apicomplexan phylum remains unclear. Going beyond typical phylogenetic prediction, we explore here the SET-domain sequence space (sequence space has recently been described formally [2]) using a new non-linear “Multi Dimensional Scaling” (MDS) method named DD-HDS [3] (Data-Driven High Dimensional Scaling). This method represents the data in a two-dimensional Euclidean space while preserving the original distances as far as possible. Like many MDS methods, DD-HDS gives priority to preserving short distances, but it also takes the concentration-of-measure phenomenon into account and avoids both “false neighbourhoods” and “tears” [3]. No phylogeny can be delineated by this means, but we gain flexibility in how distances are expressed. We can then depict subtle relationships between data points, which supports more robust interpretations [4], and picture the data structure in an easily readable layout. Applying this method to the SET-domain sequence space reveals multiple putative origins of the domain. This work highlights the information gained on the evolutionary history of homologous sequences by taking the spatial structure of the data into account.
We conclude briefly by presenting evidence that the original data are often distributed in complex structures, which leads to incongruent phylogenetic results.
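
As a rough illustration of the general idea of mapping a sequence space onto a plane, the sketch below applies classical (Torgerson) MDS to a matrix of pairwise distances. It is not DD-HDS, which is non-linear and weights short distances differently; classical MDS is swapped in here only because it is compact. The toy aligned fragments, the p-distance measure, and the names `classical_mds` and `p_distance` are assumptions made for the example and do not come from the study.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Embed points from a symmetric distance matrix (classical MDS, not DD-HDS)."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-centre the squared distances to obtain a Gram (inner-product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip negative eigenvalues, which appear when the distances are not Euclidean.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def p_distance(a, b):
    """Fraction of differing positions between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Hypothetical aligned fragments, used only to exercise the code.
sequences = ["GWGVRA", "GWGVKA", "GFGLRT", "GFGLKT"]
dist = np.array([[p_distance(a, b) for b in sequences] for a in sequences])

# One row of 2-D coordinates per sequence.
coords = classical_mds(dist)
```

Distances in the resulting layout only approximate the input distances, which is why methods such as DD-HDS choose explicitly which distances to favour.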