CEA, LITEN,
Laboratoire des systèmes solaires (L2S).
50 av. du Lac Léman,
73375 LE BOURGET DU LAC - CEDEX

 

Lespinats, S. and Fertil, B. (2011) "ColorPhylo: A colour code to accurately display taxonomic classifications", Evolutionary Bioinformatics (in press).

Abstract:
Colour can be very useful for visualising complex data. In taxonomy, colour can help reveal how various species characteristics correlate with classification. However, choosing the number of subclasses to display is often difficult: on the one hand, assigning a limited number of colours to taxa of interest hides the structure embedded in the subtrees of the taxonomy; on the other hand, giving many taxa their own colours without considering the underlying taxonomy may lead to unreadable results, since the colour code would not reflect the relationships between the displayed taxa.
This paper proposes an automatic colour-coding scheme that visualises the levels of taxonomic relationships as an overlay on any kind of data plot. To this end, a dimensionality reduction method maps taxonomic “distances” onto a two-dimensional Euclidean space. The resulting map is projected onto a 2D colour space (the Hue – Saturation – Brightness colorimetric space with brightness set to 1). Proximity in the taxonomic classification corresponds to proximity on the map and is therefore rendered as colour proximity. As a result, each species is assigned a colour code that shows its position in the taxonomic tree. The resulting tool, ColorPhylo, displays taxonomic relationships intuitively and can be combined with any biological result. A Matlab version of ColorPhylo is available at http://sy.lespi.free.fr/ColorPhylo-homepage.html.
In addition, an ad hoc distance is proposed for taxonomies with unknown edge lengths.
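
The pipeline outlined in the abstract (taxonomic distances, a 2D map, then a colour per species) can be illustrated with a minimal Python sketch. It is not the ColorPhylo Matlab code, and several choices are assumptions made only for illustration: `lineage_distance` is a hypothetical stand-in for the ad hoc distance (it counts the ranks separating two lineages from their deepest shared rank), classical multidimensional scaling stands in for the unspecified dimensionality reduction method, and the map is converted to colour by taking the angle around the centroid as hue and the normalised radius as saturation. The five example lineages are simplified.

```python
import colorsys
import math


def lineage_distance(a, b):
    """Hypothetical distance: ranks below the deepest shared rank, summed over both lineages."""
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return (len(a) - shared) + (len(b) - shared)


def top_eigenpair(m, iters=500):
    """Dominant eigenpair of a symmetric matrix, by plain power iteration."""
    n = len(m)
    v = [math.sin(i + 1.0) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:
            return 0.0, [0.0] * n
        v = [x / norm for x in w]
    mv = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(x * y for x, y in zip(v, mv)), v  # Rayleigh quotient, unit vector


def classical_mds_2d(d):
    """Assumed reduction step: classical MDS of a distance matrix onto two axes."""
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    total = sum(row) / n
    # Double-centred Gram matrix B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row[i] - row[j] + total) for j in range(n)]
         for i in range(n)]
    axes = []
    for _ in range(2):
        lam, v = top_eigenpair(b)
        axes.append([math.sqrt(max(lam, 0.0)) * x for x in v])
        # Deflate so the next pass finds the following eigenpair
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return list(zip(axes[0], axes[1]))


def map_to_colours(points):
    """Assumed colour mapping: angle -> hue, normalised radius -> saturation, brightness = 1."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    rmax = max(radii) or 1.0
    colours = []
    for (x, y), r in zip(points, radii):
        hue = (math.atan2(y - cy, x - cx) / (2 * math.pi)) % 1.0
        colours.append(colorsys.hsv_to_rgb(hue, r / rmax, 1.0))
    return colours


# Simplified example lineages, from the broadest rank down to the species.
lineages = {
    "Homo sapiens": ["Chordata", "Mammalia", "Primates", "Homo sapiens"],
    "Pan troglodytes": ["Chordata", "Mammalia", "Primates", "Pan troglodytes"],
    "Mus musculus": ["Chordata", "Mammalia", "Rodentia", "Mus musculus"],
    "Gallus gallus": ["Chordata", "Aves", "Galliformes", "Gallus gallus"],
    "Danio rerio": ["Chordata", "Actinopterygii", "Cypriniformes", "Danio rerio"],
}
names = list(lineages)
dist = [[lineage_distance(lineages[p], lineages[q]) for q in names] for p in names]
points = classical_mds_2d(dist)
colours = dict(zip(names, map_to_colours(points)))

for name, (r, g, b) in colours.items():
    print(f"{name:16s} #{round(r * 255):02x}{round(g * 255):02x}{round(b * 255):02x}")
```

In this toy example the two primates land on almost the same point of the map and so receive nearly the same colour, while the bird and the fish fall in clearly different hues. Power iteration is used only to keep the sketch dependency-free; a library eigensolver would be the usual choice for a real taxonomy.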

