Gorban, A.N. and Popova, T.G. and Zinovyev, A.Yu. (2004) Four basic symmetry types in the universal 7-cluster structure of 143 complete bacterial genomic sequences. [Preprint]
Full text available as:
|
PDF
383Kb |
Abstract
Coding information is the main source of heterogeneity (non-randomness) in the sequences of bacterial genomes. This information can be naturally modeled by analysing cluster structures in the ``in-phase'' triplet distributions of relatively short genomic fragments (200-400bp). We found a universal 7-cluster structure in all 143 completely sequenced bacterial genomes available in Genbank in August 2004, and explained its properties. The 7-cluster structure is responsible for the main part of sequence heterogeneity in bacterial genomes. In this sense, our 7 clusters is the basic model of bacterial genome sequence. We demonstrated that there are four basic ``pure'' types of this model, observed in nature: ``parallel triangles'', ``perpendicular triangles'', degenerated case and the flower-like type. We show that codon usage of bacterial genomes is a multi-linear function of their genomic G+C-content with high accuracy (more precisely, by two similar functions, one for eubacterial genomes and the other one for archaea). All 143 cluster animated 3D-scatters are collected in a database and is made available on our web-site: http://www.ihes.fr/~zinovyev/7clusters The finding can be readily introduced into any software for gene prediction, sequence alignment or bacterial genomes classification.
Item Type: | Preprint |
---|---|
Keywords: | codon usage, cluster structure, mean field, frequency dictionary |
Subjects: | Biology > Theoretical Biology |
ID Code: | 3915 |
Deposited By: | Gorban, Prof Alexander N. |
Deposited On: | 06 Nov 2004 |
Last Modified: | 11 Mar 2011 08:55 |
References in Article
Select the SEEK icon to attempt to find the referenced article. If it does not appear to be in cogprints you will be forwarded to the paracite service. Poorly formated references will probably not work.
Metadata
- ASCII Citation
- Atom
- BibTeX
- Dublin Core
- EP3 XML
- EPrints Application Profile (experimental)
- EndNote
- HTML Citation
- ID Plus Text Citation
- JSON
- METS
- MODS
- MPEG-21 DIDL
- OpenURL ContextObject
- OpenURL ContextObject in Span
- RDF+N-Triples
- RDF+N3
- RDF+XML
- Refer
- Reference Manager
- Search Data Dump
- Simple Metadata
- YAML
Repository Staff Only: item control page