Phys. Rev. E 79, 061911 (2009) [11 pages]

Quantitative measure of randomness and order for complete genomes

Sing-Guan Kong1,2, Wen-Lang Fan1,2, Hong-Da Chen1,2, Jan Wigger4, Andrew E. Torda4, and H. C. Lee2,3,5
1Department of Physics, National Central University, Chungli, Taiwan 32001, Republic of China
2Graduate Institute of Biophysics, National Central University, Chungli, Taiwan 32001, Republic of China
3Graduate Institute of Systems Biology and Bioinformatics, National Central University, Chungli, Taiwan 32001, Republic of China
4Center for Bioinformatics, University of Hamburg, Bundesstrasse 43, D-20146 Hamburg, Germany
5National Center for Theoretical Sciences, Hsinchu, Taiwan 30043, Republic of China

Received 7 July 2008; revised 14 April 2009; published 9 June 2009

We propose an order index, ϕ, which gives a quantitative measure of randomness and order of complete genomic sequences. It maps genomes to a number from 0 (random and of infinite length) to 1 (fully ordered) and applies regardless of sequence length. The 786 complete genomic sequences in GenBank were found to have ϕ values in a very narrow range, $\phi_g = 0.031^{+0.028}_{-0.015}$. We show this implies that genomes are halfway toward being completely random, or, at the “edge of chaos.” We further show that artificial “genomes” converted from literary classics have ϕ’s that almost exactly coincide with ϕg, but sequences of low information content do not. We infer that ϕg represents a high information-capacity “fixed point” in sequence space, and that genomes are driven to it by the dynamics of a robust growth and evolution process. We show that a growth process characterized by random segmental duplication can robustly drive genomes to the fixed point.

© 2009 The American Physical Society

URL:
http://link.aps.org/doi/10.1103/PhysRevE.79.061911
DOI:
10.1103/PhysRevE.79.061911
PACS:
87.14.G−, 02.50.−r, 05.45.−a, 87.15.Cc