Phys. Rev. E 72, 011908 (2005) [7 pages]
Model for the distributions of k-mers in DNA sequences
Received 21 January 2005; published 18 July 2005
The evolutionary features based on the distributions of k-mers in the DNA sequences of various organisms are studied. The organisms are classified into three groups based on their evolutionary periods: (a) E. coli and T. pallidum; (b) yeast, zebrafish, A. thaliana, and fruit fly; (c) mouse, chicken, and human. The distributions of 6-mers of these three groups are shown to be, respectively, (a) unimodal, (b) unimodal with peaks generally shifted to smaller frequencies of occurrence, and (c) bimodal. To describe the bimodal feature of the k-mer distributions of group (c), a model based on the cytosine-guanine “CG” content of the DNA sequences is introduced and shown to provide reasonably good agreement.
© 2005 The American Physical Society
URL:
http://link.aps.org/doi/10.1103/PhysRevE.72.011908
DOI:
10.1103/PhysRevE.72.011908
PACS:
87.14.Gg, 87.15.Aa, 02.50.−r
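The k-mer distributions described in the abstract can be illustrated with a minimal sketch. It assumes overlapping windows, an alphabet restricted to A, C, G, and T, and raw occurrence counts as the "frequency of occurrence"; the function names `kmer_counts` and `occurrence_distribution` are hypothetical and not taken from the paper. It does not attempt the CG-content model, whose details are not given in the abstract.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count overlapping k-mers, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    alphabet = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= alphabet:
            counts[kmer] += 1
    return counts


def occurrence_distribution(counts, k):
    """Map each occurrence count to the number of distinct k-mers with that count.

    k-mers that never appear are included under the key 0, so the values
    always sum to 4**k.
    """
    dist = Counter(counts.values())
    dist[0] = 4 ** k - len(counts)
    return dict(dist)


if __name__ == "__main__":
    # Toy sequence for illustration only; real genomes would use k = 6.
    toy = "ACGTACGT"
    counts = kmer_counts(toy, 2)
    print(occurrence_distribution(counts, 2))
```

Plotting the resulting mapping (occurrence count on the horizontal axis, number of k-mers on the vertical axis) for k = 6 would give the kind of histogram whose unimodal or bimodal shape the abstract refers to.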
