There appear to be conserved constraints on the distribution of nucleotide sequences in cellular genomes

Research output: Contribution to journal › Article › peer-review

16 Citations (Scopus)

Abstract

The data from a genomic library can be sorted into the frequencies of every possible tetranucleotide in the sequence. This tabulation, a short sequence distribution, contains the frequency of occurrence of the 256 tetranucleotides and thus seems to serve as a vehicle for averaging sequence information. Two such distributions can be readily compared by correlation. Reported here are correlations (Spearman r_s) of the distributions from all of the genomic libraries in GenBank 44.0 with sizes equal to or larger than that of Salmonella typhimurium, except for the data for mouse and humans. All of the organisms examined showed highly significant correlations between the two DNA strands (not the complementarity expected from base pairing). Of 155 comparisons between libraries, 132 showed significant correlations at the 99% confidence level. Application of the correlation coefficients as a similarity matrix clustered most organisms in a phenogram in a pattern consistent with other hypotheses. This suggests a highly conserved pattern underlying all other genetic information in cellular DNA and affecting both DNA strands, perhaps caused by interaction with conserved factors necessary for DNA packaging.
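The tabulate-then-correlate procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`tetranucleotide_counts`, `reverse_complement`, `average_ranks`, `spearman`) are hypothetical, overlapping windows are an assumption (the abstract does not say how tetranucleotides were counted), and the strand comparison shown in the usage note is only one possible reading of "correlations between the two DNA strands".

```python
from itertools import product

BASES = "ACGT"
# All 4**4 = 256 possible tetranucleotides, in a fixed order.
TETRAMERS = ["".join(p) for p in product(BASES, repeat=4)]
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def tetranucleotide_counts(seq):
    """Count overlapping 4-mers (assumption); windows with non-ACGT symbols are skipped."""
    counts = dict.fromkeys(TETRAMERS, 0)
    seq = seq.upper()
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if word in counts:
            counts[word] += 1
    return [counts[w] for w in TETRAMERS]


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman r_s, computed as the Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Under these assumptions, two libraries `a` and `b` would be compared with `spearman(tetranucleotide_counts(a), tetranucleotide_counts(b))`, and a strand comparison could be approximated with `spearman(tetranucleotide_counts(a), tetranucleotide_counts(reverse_complement(a)))`, which asks whether each tetranucleotide is about as frequent as its reverse complement on the same strand. Significance testing at the 99% level and the phenogram clustering are not covered here.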

Original language: English
Pages (from-to): 24-30
Number of pages: 7
Journal: Journal of Molecular Evolution
Volume: 32
Issue number: 1
Publication status: Published - Jan 1991
Externally published: Yes

Keywords

  • Asymmetric nucleotide sequences
  • Averaged sequence
  • Evolution
  • Evolutionary constraints
  • GC content
  • Sequence constraints
  • Sequence structure
  • Short sequence distribution

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Genetics
