CEA, LITEN,
Laboratoire des systèmes solaires (L2S).
50 av. du Lac Léman,
73375 LE BOURGET DU LAC - CEDEX

 

Lespinats, S., Giron, A., Deschavanne, P. and Fertil, B. (2003) “Genomic Signature: Deciphering the style of DNA.” Journées de Post Génomique de la DOUA, JPDG03, May 2003.

Abstract:
The usage of short oligonucleotides in sequences (the so-called genomic signature) has been shown to be species-specific. Since the genomic signature can be observed in DNA fragments as short as 1 kb, it appears to result from a “style” that characterizes the organization of DNA throughout each genome. As a consequence, given a short DNA fragment, it is generally possible to find its origin, provided that the signature of the species the fragment comes from is already known.
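As an illustration, a signature can be sketched as the vector of frequencies of all words of a given length in a fragment. This is a minimal sketch and not the authors' implementation: the word length `k`, the counting of overlapping words on a single strand, the skipping of words that contain characters other than A, C, G, T, and the function name `signature` are all assumptions made for the example.

```python
from collections import Counter
from itertools import product


def signature(seq, k=4):
    """Return the frequencies of all 4**k words of length k in seq.

    Assumptions (not from the abstract): overlapping words are counted
    on a single strand, and words containing characters other than
    A, C, G, T are ignored.
    """
    seq = seq.upper()
    # Fixed word order so that signatures of different fragments align.
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]
```

With this representation, any two fragments yield vectors of the same length (4**k), which can then be compared with a distance measure.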

Using a Euclidean metric to measure the distances between signatures, we have undertaken the systematic analysis of 49 genomes, screening each of them with a sliding window to obtain local DNA signatures. The origin of DNA fragments is found with high efficiency. However, oligonucleotides contribute unequally to the recognition process. In particular, some subsets of oligonucleotides identified by means of a genetic algorithm are highly efficient. The analysis of intra- and inter-species variations of oligonucleotide frequencies shows a consensus among species about the usage of some of them. In particular, the oligonucleotides with the most variable usage along genomes are common to most species. Others share the property of frequency invariance along and among genomes.
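The screening step can be sketched as follows: each window of a sequence gets a local signature, which is assigned to the reference signature at the smallest Euclidean distance. This is only a sketch under stated assumptions, not the procedure used in the study: the function names, the word length, the window and step sizes, and the nearest-reference assignment rule are all hypothetical choices, and the toy references in the usage note are artificial sequences rather than real genomes.

```python
import math
from collections import Counter
from itertools import product


def signature(seq, k=2):
    """Frequencies of all 4**k words of length k (overlapping, one strand)."""
    seq = seq.upper()
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


def euclidean(a, b):
    """Euclidean distance between two signatures of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_windows(seq, references, window=1000, step=500, k=2):
    """Assign each window of seq to the nearest reference signature.

    references maps a (hypothetical) species name to its signature,
    computed with the same k. Returns a list of (start, species) pairs.
    """
    results = []
    for start in range(0, len(seq) - window + 1, step):
        local = signature(seq[start:start + window], k)
        best = min(references, key=lambda name: euclidean(local, references[name]))
        results.append((start, best))
    return results
```

For example, with two artificial references built from `"AT" * 50` and `"GC" * 50`, calling `classify_windows("AT" * 20 + "GC" * 20, references, window=20, step=20)` labels the first two windows with the AT-rich reference and the last two with the GC-rich one.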

Some elements of a DNA syntax may therefore be proposed. Based on their frequency properties, sets of “function” oligonucleotides can be identified as having a syntactic role common to most species (like “the”, “of” or “or” in human sentences). “Content” oligonucleotides, on the other hand, may be characterized by a highly variable, and less species-specific, usage along the genome. In fact, “content” oligonucleotides are rarely found in the best discriminant sets. In this context, the resolving power of the genomic signature, and consequently of the style of each species, seems to result more from the characteristics of function oligonucleotides than from those of content oligonucleotides.

Download the poster