Phys. Rev. E 66, 031910 (2002) [9 pages]

Recognition of an organism from fragments of its complete genome

V. V. Anh (1,*), K. S. Lau (2,†), and Z. G. Yu (1,3,‡)
(1) Centre in Statistical Science and Industrial Mathematics, Queensland University of Technology, P.O. Box 2434, Brisbane Q4001, Australia
(2) Department of Mathematics, Chinese University of Hong Kong, Shatin, Hong Kong
(3) Department of Mathematics, Xiangtan University, Hunan 411105, People’s Republic of China

Received 13 May 2002; revised 13 June 2002; published 24 September 2002

This paper considers the problem of matching a fragment to an organism using its complete genome. Our method is based on the probability measure representation of a genome. We first demonstrate that these probability measures can be modeled as recurrent iterated function systems (RIFS) consisting of four contractive similarities. Our hypothesis is that the multifractal characteristics of the probability measure of a complete genome, as captured by the RIFS, are preserved in its reasonably long fragments. We compute the RIFS of fragments of various lengths and random starting points, and compare them with the RIFS of the complete genome, using the Euclidean distance for recognition. A demonstration on five randomly selected organisms supports the above hypothesis.
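
The comparison step described above can be illustrated with a minimal sketch. It is not the authors' RIFS procedure: as a simpler stand-in, it takes the relative frequencies of all length-k substrings of a sequence (k = 3 is an arbitrary choice) as that sequence's feature vector, then assigns a fragment to the genome whose vector is nearest in Euclidean distance. The function names (`kstring_measure`, `euclidean`, `recognize`) and the toy sequences are hypothetical and serve only the illustration.

```python
from collections import Counter
from itertools import product
import math

ALPHABET = "acgt"  # assumed nucleotide alphabet and ordering


def kstring_measure(seq, k):
    """Relative frequency of every length-k string over a, c, g, t.

    Stand-in feature vector (an assumption of this sketch); the paper
    compares estimated RIFS parameters instead.
    """
    seq = seq.lower()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Fixed lexicographic order so vectors of different sequences align.
    keys = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Windows containing other symbols (e.g. 'n') are ignored.
    total = sum(counts[key] for key in keys)
    if total == 0:
        raise ValueError("sequence has no valid windows of length k")
    return [counts[key] / total for key in keys]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def recognize(fragment, genomes, k=3):
    """Return the name of the genome whose feature vector is closest."""
    frag_vec = kstring_measure(fragment, k)
    return min(
        genomes,
        key=lambda name: euclidean(frag_vec, kstring_measure(genomes[name], k)),
    )


# Toy sequences, not real genomes: two artificial periodic "organisms".
toy_genomes = {"x": "acgt" * 500, "y": "aaccggtt" * 250}
fragment = toy_genomes["x"][37:637]  # fragment with an arbitrary start
print(recognize(fragment, toy_genomes))
```

In the setting of the abstract, the feature vector would presumably hold the estimated RIFS parameters of each sequence; only the nearest-neighbour comparison carries over directly from this sketch.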

© 2002 The American Physical Society

URL: http://link.aps.org/doi/10.1103/PhysRevE.66.031910
DOI: 10.1103/PhysRevE.66.031910
PACS: 87.14.Gg, 87.10.+e, 47.53.+n

*Email address: v.anh@qut.edu.au

†Email address: kslau@math.cuhk.edu.hk

‡Author to whom correspondence should be addressed. Email address: yuzg@hotmail.com or z.yu@qut.edu.au