Nearly thirty years ago I undertook studies to test directly whether evolution proceeds neutrally (including nearly neutrally), at random, or selectively (as in the synthetic theory). To stay clear of any previous treatment of this subject, I set aside every distinction between coding and non-coding DNA segments, repeated and single sequences, structural restrictions (centromere or telomere regions), and any other functional or structural property of DNA. I examined only the sequence itself and asked whether nucleotides, with their bases, are distributed randomly or non-randomly.
I constructed the expected random distribution by treating a given base as a ball and the non-bases as the walls of the boxes into which the balls are distributed. Bases distributed neutrally (at random) among non-bases should follow a Bose-Einstein distribution, and I tested this expectation against the distributions observed in viral genomes, mtDNA, and DNA segments.
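The ball-and-wall construction can be illustrated with a minimal Python sketch. It assumes one plausible reading of the model: each occurrence of the chosen base is a ball, each other nucleotide is a wall, and under randomness every arrangement of balls and walls is equally likely (Bose-Einstein occupancy). The function names are hypothetical and this is not the author's own code.

```python
from collections import Counter
from math import comb


def occupancies(seq, base):
    """Number of `base` balls in each box; every other character is a wall."""
    base = base.upper()
    counts = [0]
    for ch in seq.upper():
        if ch == base:
            counts[-1] += 1      # one more ball in the current box
        else:
            counts.append(0)     # a wall closes the box and opens a new one
    return counts


def bose_einstein_pmf(r, n_balls, n_boxes):
    """P(a given box holds exactly r balls) under Bose-Einstein occupancy."""
    if r < 0 or r > n_balls:
        return 0.0
    if n_boxes == 1:
        return 1.0 if r == n_balls else 0.0
    favourable = comb(n_balls - r + n_boxes - 2, n_boxes - 2)
    total = comb(n_balls + n_boxes - 1, n_boxes - 1)
    return favourable / total


def observed_vs_expected(seq, base):
    """Map r -> (observed boxes with r balls, expected boxes with r balls)."""
    occ = occupancies(seq, base)
    n_boxes, n_balls = len(occ), sum(occ)
    observed = Counter(occ)
    return {
        r: (observed.get(r, 0), n_boxes * bose_einstein_pmf(r, n_balls, n_boxes))
        for r in range(max(occ) + 1)
    }


if __name__ == "__main__":
    # Toy sequence chosen only for illustration, not real data.
    for r, (obs, exp) in observed_vs_expected("AACAGTAAGCATA", "A").items():
        print(r, obs, round(exp, 3))
```

Comparing the observed and expected columns, for instance with a goodness-of-fit test, is one way to quantify how far a sequence departs from the random expectation.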
Bases were distributed so far from the expected neutral distribution that any interpretation within the neutral or nearly neutral theories was impossible. I reviewed the foundations of the neutral theory and found foundational errors in it; for example, neutral fixation is theoretically and factually impossible because of the equilibrium between direct and reverse mutation. It appeared that any base at any site is co-adapted and co-selected with all the other bases of the genome.
I then studied dinucleotides to test that hypothesis. If we take a nucleotide at random (first base) and then the nucleotide located K sites downstream (second base), we obtain the complete set of dinucleotides whose bases are separated by 0, 1, 2… K sites. Each of these dinucleotide sets can be tested for random distribution of its bases with a simple chi-square test with 9 degrees of freedom (χ²₉). The non-random distribution of the second base in relation to the first was even farther from neutrality than the sequence distribution. The test yielded probabilities of less than 10^(-100,000), indicating again the impossibility of neutral models and the compatibility with a generalized “cosmic intelligent design”. When the χ²₉ values were examined as a function of the number of separation sites, a strong and highly significant periodicity appeared, with a period of three sites (3K).
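The dinucleotide test described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the published procedure: expected counts are taken from the products of the first-base and second-base marginal frequencies (which gives 9 degrees of freedom when all four bases occur in both positions), characters other than A, C, G and T are skipped, and the function names are hypothetical.

```python
from collections import Counter

BASES = "ACGT"


def dinucleotide_chi2(seq, k):
    """Chi-square statistic for pairs of bases separated by k sites."""
    seq = seq.upper()
    step = k + 1  # k intervening sites means an offset of k + 1
    pairs = Counter(
        (a, b) for a, b in zip(seq, seq[step:]) if a in BASES and b in BASES
    )
    total = sum(pairs.values())
    if total == 0:
        return 0.0
    # Marginal counts of the first and second base.
    row = {a: sum(pairs[(a, b)] for b in BASES) for a in BASES}
    col = {b: sum(pairs[(a, b)] for a in BASES) for b in BASES}
    chi2 = 0.0
    for a in BASES:
        for b in BASES:
            expected = row[a] * col[b] / total
            if expected > 0:  # skip cells that cannot occur
                chi2 += (pairs[(a, b)] - expected) ** 2 / expected
    return chi2


def mean_chi2_by_residue(seq, k_max):
    """Average chi-square over separations k = 0..k_max, grouped by k mod 3."""
    sums = {0: 0.0, 1: 0.0, 2: 0.0}
    counts = {0: 0, 1: 0, 2: 0}
    for k in range(k_max + 1):
        sums[k % 3] += dinucleotide_chi2(seq, k)
        counts[k % 3] += 1
    return {r: sums[r] / counts[r] for r in sums if counts[r]}


if __name__ == "__main__":
    # Toy sequence chosen only for illustration, not real data.
    toy = "ATGGCCAAGTTCGACGGTACC" * 50
    print(mean_chi2_by_residue(toy, 30))
```

Grouping the statistic by K mod 3 is one simple way to look for a three-site periodicity; plotting the value against K directly would show the same pattern in more detail.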
For example, the series given by 3K+2 showed a large positive deviation from randomness (more dinucleotides observed than expected); the series 3K showed a smaller negative deviation (fewer dinucleotides observed than expected); and the series 3K+1 showed a still smaller negative deviation. Because the χ²₉ value is always positive, the periodicity is a periodicity of its magnitude. The periodicity was present in human chromosome 21 up to separations of 15 million sites, and in the archaeon M. smithii up to separations of 6,000 sites. These interactions and this periodicity cannot be due to local coding or structural functions. Some unknown properties of DNA remain to be uncovered. It should be stressed that this is a periodicity of the distance to neutrality and not a sequence periodicity. The Watson and Crick model described the structure or physical properties of DNA; the present model describes one of its evolutionary structures or properties. What is the evolutionary meaning of such a stochastic periodicity of non-randomness? The door is open for any sharp hypothesis.
These findings are described in the article entitled “Selective intra-dinucleotide interactions and periodicities of bases separated by K sites: new vision and tool for phylogeny analyses”, published in the journal Biological Research. This work was led by Carlos Y. Valenzuela from Universidad de Chile.