What's Wrong with Independent Birth of Organisms?
An investigation into the origin of life
A review by Gert Korthof. updated 10 Apr 2018 (first published 29 Dec 2002)
( Introduction updated 21 Jun 2014 )
The origin of life is an age-old and unsolved problem. Independent researcher Periannan Senapathy came up with an extraordinary solution: the independent origin of all organisms, including humans. It is a theory that seeks to replace everything Charles Darwin and evolutionary biologists have put forward over the last 150 years. Although many of his objections to Darwin's theory are the same as those of creationists, I found no evidence that he is a creationist.
Senapathy did computer simulations and found that it was very difficult to find genes of bacteria in computer-generated random DNA, but that it was easy to find the sequence of genes interrupted with meaningless pieces of DNA (introns). He concluded that random DNA sequences would automatically contain split genes typical of higher organisms. Next, he concluded that in a pre-biotic environment genes of plants and animals could be produced. The cumbersome and time-consuming Darwinian processes of mutation and natural selection would be completely unnecessary. It also implies that all other origin of life scenarios (including the RNA-world and pre-RNA-world) are superfluous.
Senapathy presents his extraordinary solution to one of the biggest problems in biology in an overconfident way. He is unaware of quite a few biological facts which happen to cause insurmountable problems for his theory. However, it would be neither fair nor interesting to disprove his theory using current biological knowledge (his book was published in 1994) or to use the theory of evolution dogmatically to refute his views. It is relatively easy to find millions of objections to his theory, but to determine the most fundamental, the most profound objection –that is the objection that makes everything else superfluous– is far from trivial. I called it 'The elephant in the room'. It is a conceptual error, and it does not depend on the latest research findings.
The results of this review go far beyond Senapathy's Independent Origin theory. This review evolved into a general Origin Of Life investigation with an emphasis on DNA. The comparison of different theories of evolution and origins of life continues to be a fascinating project. My motivation is: the effort deepens our insight into the Origin and Evolution of Life and enables us also to detect errors in other theories of the Origin of Life (312). In the end I will conclude that all the reasons why Independent Origin fails are indirect arguments for evolution.
Periannan Senapathy (1994) "Independent Birth of Organisms. A New Theory That Distinct Organisms Arose Independently From The Primordial Pond Showing That Evolutionary Theories Are Fundamentally Incorrect". (66)
thinking outside the box
Senapathy and the Greek Philosophers
In Antiquity basically two solutions for the origin of organisms were developed: design and accident. The creationists Socrates and Plato argued for design. The Atomists Empedocles and Epicurus argued for accident. The atomists needed an infinite universe to explain why accident could produce highly improbable adaptations such as the eye. Darwin improved the 'accident theory' by eliminating the huge improbabilities and replacing them by natural selection. Senapathy's solution is non-creationist and non-Darwinist. But he does not invoke an infinite universe (96), so he bears the full burden of the Boeing-747 argument, and at the same time he dismisses the greatest improvement since Antiquity of the atomist naturalistic theory: full common descent, full gradual evolution, and full natural selection. Therefore, he has all the disadvantages of the 'accident theory' and must do without all the advantages of Darwinian evolution.
Despite DNA, Senapathy ('immutable DNA') is closer to Greek philosophers than he knows: Lucretius believed in the fixity of species, and that all mutations occurred in one burst at the beginning of the world; all species were fully formed by spontaneous generation and never changed (85,p.149). Anaximander believed that all life arose in water (85,p.153). As an afterthought Senapathy adds 'natural selection' (Fig.7). The Greek philosophers proposed at least two rounds of selection: one for viability, and one for fine adaptations by competition between individual animals.
See: my review of David Sedley (2007) 'Creationism and its critics in Antiquity'.
The most recent updates can be traced through the last Notes.
See also page:
and the facts of life
(a general overview)
"Evolutionary scenarios are an artform. They usefully exercise the brain, causing us to look at old data in new ways and
stimulating us to collect new data. They do not have to be true!".
W. Ford Doolittle
"All observation must be for or against some view if it is to be of any service"
"Science should be a pluralistic pursuit of explanation. It can only benefit by being pluralistic. Monism warps the effort in every way–from funding, to publishing, to teaching, to ... well, understanding."
Stuart Firestein, Science, 8 Aug 2014
in a review of Is Water H2O? Evidence, Realism and Pluralism by Hasok Chang
A seductive computer experiment
Here is Senapathy's discovery in his own words:
"As I was working with the origin of genes from random genetic sequences, I realized that simple-to-complex gene evolution was quite unnecessary to explain the origin of complex genes found in multicellular creatures. I could demonstrate that the complex genes of multicellular creatures simply existed in very long random genetic sequences" (99)
In this figure from Senapathy's book a computer generated random sequence of the letters of the alphabet is shown:
Fig 7.1. The words TO, BE, OR, NOT, TO, BE found in a random string of letters (p.226).
Note: This example symbolizes the sense strand of single-stranded DNA.
But DNA is double stranded and always has a Positive-sense and a Negative-sense strand.
(see: wikipedia). See my criticism: here
Highlighted are the pieces of Shakespeare's phrase "TO BE OR NOT TO BE". Senapathy admits that it is hopeless to search for an uninterrupted simple phrase "TO BE OR NOT TO BE". That is why he allows for the words being separated by strings of an arbitrary number of arbitrary letters. The phrase is present completely by accident, but it occurs with predictable frequency in a random sequence of letters of about 3500 characters long. So far so good: this is uncontroversial. For practical reasons, the example phrase contains only 2 and 3 letter words. If longer words are needed, then the distance between the individual words would be larger. So would the whole string of letters. Furthermore, it would nearly always be possible to find the words in the right order, says Senapathy. Indeed, any sentence and even the complete works of Shakespeare (interrupted by nonsense words) can be found when the random sequence is long enough (22). Senapathy does not tell how long. This is not just a game with words. What is possible with words is possible with genes according to Senapathy. Just replace the words with 'exons' (335) and the nonsense between the words with 'introns' (335) and you have Senapathy's theory! Please note that this is a mathematical theory. It is a claim about probabilities. Only when applied to real genomes does it become an empirical claim.
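The word search described above can be sketched in a few lines of Python. This is only an illustration and not Senapathy's own program: the function names, the fixed seed and the use of the 26 capital letters are assumptions made here. The string length of 3500 is the figure quoted in the text. The sketch contrasts the split phrase (words in order, arbitrary gaps allowed) with the uninterrupted phrase.

```python
import random
import string

def find_split_phrase(text, words):
    """Return the start positions of the words found in order (gaps allowed), or None."""
    positions = []
    cursor = 0
    for word in words:
        hit = text.find(word, cursor)
        if hit == -1:
            return None
        positions.append(hit)
        cursor = hit + len(word)  # the next word must start after this one ends
    return positions

def find_contiguous(text, words):
    """Return the position of the uninterrupted phrase, or -1 if it is absent."""
    return text.find("".join(words))

WORDS = ["TO", "BE", "OR", "NOT", "TO", "BE"]

rng = random.Random(0)  # fixed seed, an arbitrary choice for repeatability
random_text = "".join(rng.choice(string.ascii_uppercase) for _ in range(3500))

print("split phrase positions:", find_split_phrase(random_text, WORDS))
print("uninterrupted phrase  :", find_contiguous(random_text, WORDS))
```

Taking the leftmost hit for each word is sufficient to decide whether the split phrase exists at all. The sketch says nothing about splice signals or about chemistry; it only reproduces the letter game.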
His central idea is:
- split genes (genes with introns) are easy to find in computer generated random DNA sequences. Genes without introns are impossible to find.
- so when in real life DNA sequences are randomly assembled from their building blocks, genes with introns will easily be formed by accident (split genes are primordial; they were never in one piece (200), and introns were never inserted)
- since split genes are only found in eukaryotes (plants and animals), eukaryotes must have originated first
- prokaryotes (which don't have introns) must have evolved from eukaryotes by losing introns (23).
Introns are pieces of DNA in the middle of a gene, which are removed after they are transcribed into mRNA (336) and before the gene is translated (250) in the cytoplasm into protein, so the intron sequence does not end up in the protein. Introns are central to Senapathy's theory because the occurrence of genes in random DNA sequences depends on finding split genes: genes with introns. He concludes that finding the uninterrupted gene sequences of today's prokaryotes in a computer-generated random DNA sequence is extremely unlikely (84). This is no surprise because bacteria have the most compact genomes and the highest gene densities of all species on earth (246, p.233). Therefore, in real life the random origin of current genes of prokaryotes must also be extremely improbable. On the other hand, eukaryotic genomes contain large amounts of DNA and a lot of junk DNA, so are ideal candidates.
Ignoring everything else (!), the crucial test for Senapathy's theory is: if split genes cannot be found in a random DNA sequence with sufficient probability, then his whole theory breaks down. In figure 1, the words are only 2-3 letters, otherwise we needed pages full of letters to demonstrate the effect. For longer words, we simply need longer sequences of letters. Senapathy does not calculate how long. However, we can see from the figure that the meaningless (135) pieces of letters are far greater than the real parts of the gene. This is exactly what is found in real genes of eukaryotes (organisms with a nucleus in their cells such as all animals and plants). Therefore, it is not very surprising that Senapathy became enthusiastic for the thesis of independent origin of organisms. If we add the fact that more than 98% of the human genome seems to be meaningless junk DNA anyway, the theory seems plausible at first sight (287). Furthermore, the existence of introns is one of the great unsolved puzzles of molecular and evolutionary biology and Senapathy claims to have solved it.
Probability of the human genome
data from de Duve (270)
Indeed, given enough time and resources, a computer could generate all the genomes in the world. Therefore, the probability is not zero. However, you have not achieved anything if you do not calculate the probability of at least one complete genome (90). The probability of one gene does not help. The human genome happens to be very large. The haploid human genome has a length of ∼ 3 billion base pairs and the diploid genome is ∼ 6 billion base pairs (296). The number of possible sequences of 6 billion bases is not calculated by Senapathy. So he has no idea how many trials he needs to produce a human being. Yes, it is true: the larger a piece of DNA, the higher the probability of finding genes. But at the same time, the larger a piece of DNA, the harder it is to synthesize it abiotically.
Please note that the diploid genome cannot be reduced to the haploid because of heterozygosity (different alleles) and the X/Y pair.
For illustration, we can calculate the number of possible sequences up to a length of 200 bases (see table). Surprisingly, there are more than 1 million possible sequences of only 10 bases (which is very much smaller than the smallest exon). A sequence of 200 bases is still nothing compared to 6 billion bases. The number of possible sequences of 6 billion bases must be practically indistinguishable from infinite, so the probability of the human sequence is indistinguishable from zero.
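The counts can be reproduced with a short Python sketch. With 4 possible bases per position there are 4^n sequences of length n. The particular lengths printed here are an assumption chosen for illustration and are not necessarily the rows of the table. For 6 billion bases the number itself cannot sensibly be printed, so only its base-10 logarithm (roughly its number of decimal digits) is computed.

```python
import math

def count_sequences(length, alphabet_size=4):
    """Exact number of distinct sequences of the given length."""
    return alphabet_size ** length

def log10_count(length, alphabet_size=4):
    """log10 of the count, usable when the exact integer is too large to print."""
    return length * math.log10(alphabet_size)

for n in (1, 2, 5, 10, 20, 50, 100, 200):
    print(f"{n:>4} bases: about 10^{log10_count(n):.1f} possible sequences")

print(count_sequences(10))  # 1048576, i.e. just over 1 million
print(f"6 billion bases: a number with about {log10_count(6_000_000_000):.3e} digits")
```

A number with about 3.6 billion decimal digits is what "indistinguishable from infinite" means in practice here.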
We need to answer some basic but very difficult questions:
- How many sequences of 6 billion bases are possible? (random sequences)
- How many of those will produce a human being? (human sequences)
- How many sequences of arbitrary length will produce a human being? (not considered here)
- What is the smallest DNA sequence that produces a human? (not considered here)
Let us assume for the moment that only one sequence of 6 billion bases produces a healthy human being; then the probability is 1/A, where A is the number of possible sequences. Since A is practically infinitely large (∞), the probability is 1/∞ or ∼ 0 (zero). However, there are now 7 billion people on earth and since each individual has a unique genome, there are 7 billion unique sequences (B) that produce a human being. But that is not all. If we count all human beings who ever lived on earth, we arrive at an estimated number of 108 billion (266). So, the probability of hitting a human genome is 108 billion divided by A, which is still close to zero (I guess).
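The "I guess" can be checked with logarithms. The sketch below takes A as the number of possible sequences of 6 billion bases and B as 108 billion, the figures used in the text, and computes log10(B/A). The function and constant names are made up for this illustration, and the calculation assumes that every sequence is equally likely.

```python
import math

GENOME_LENGTH = 6_000_000_000   # diploid length used in the text
HUMANS_EVER = 108_000_000_000   # estimate (266) quoted in the text, used as B

def log10_hit_probability(n_target_sequences, length, alphabet_size=4):
    """log10 of B / A, where A = alphabet_size ** length."""
    return math.log10(n_target_sequences) - length * math.log10(alphabet_size)

log_p_one = log10_hit_probability(1, GENOME_LENGTH)
log_p_all = log10_hit_probability(HUMANS_EVER, GENOME_LENGTH)

print(f"one target genome  : p = 10^{log_p_one:.0f}")
print(f"108 billion genomes: p = 10^{log_p_all:.0f}")
print(f"gain from counting all humans: {log_p_all - log_p_one:.2f} orders of magnitude")
```

Counting every human who ever lived improves the exponent by about 11, against an exponent of about minus 3.6 billion. The probability stays indistinguishable from zero.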
There is still more variability in the human genome we didn't include. For example, there are 38 million (mostly neutral) Single Nucleotide Polymorphisms (SNPs), 1.4 million bi-allelic indels and 14,000 large deletions in the human genome (295). A study found 1,146,401 autosomal protein-coding SNVs in 15,336 protein-coding genes of 6,515 individuals (304). There are 117,277 mutations in The Human Gene Mutation Database. Furthermore, there are naturally occurring structural genomic variants in the human genome. Further, a survey showed that the average healthy person has about 20 genes knocked out (334). If we add every possible combination of SNPs, mutations, indels and structural variants, B rises above 108 billion (by an unknown amount). In Nov 2014 a new genome sequencing technology detected 22,000 segments of 50 to 5,000 bases in length that had never been reported before (347). More variation means more possible genomes able to create a human.
Another way of estimating the amount of neutral variation –compatible with health– is the following. Evolutionary analyses indicate that natural selection has conserved five times more base pairs that don't code for proteins than ones that do (1.5%) (267), so we arrive at 7.5%. Another estimate is that up to 74% of the human DNA may be transcribed into RNA (232). Others estimate that probably around 60% of the mammalian genome is transcribed (261). Comparative analysis of 29 mammalian genomes reveals a high-resolution map of >3.5 million constrained elements that encompass ∼4% of the human genome and suggest potential functional classes for ∼60% of the constrained bases (268).
Human chromosomes under a scanning electron microscope. ©Nature2017
Above we defined B as all the sequences that produce a human being. However, we should in fact have asked: how many sequences of 6 billion bases, partitioned into 46 paired pieces (called 'chromosomes', see picture above) of specified length and contents, produce a human being? This is a far stronger requirement than simply 6 billion bases. Of course, any partition of the total DNA into chromosomes could produce a human, but the independent origin scenario demands the origin of the human genome as it is now. And we have ignored that we need not one human genome, but a female and a male genome! (See: §sex).
Another serious restriction applies: the actual human genome (46 chromosomes) must conform to a pattern of sequence and chromosome similarities with all of the 1.9 million species on earth, particularly, but not exclusively, primates and mammals. This significantly reduces the number of acceptable human genome sequences, because not every human genome fits that pattern.
Before discussing introns and exons, it is very useful to think about the analysis of John Maynard Smith:
"If we imagine the simplest conceivable organism whose hereditary mechanism depends on the processes of nucleic acid replication and protein synthesis, it would have to possess enough DNA to specify all the varieties of tRNA, the protein and RNA components of the ribosomes, the activating enzymes associated with the 20 amino acids, the various enzymes which replicate the DNA and make an RNA transcript of it, and more besides." (93)
So far he has only stated the problem that has to be solved. He continues:
"It is impossible that an organism of this degree of complexity should arise by physico-chemical processes, without natural selection." (John Maynard Smith, 93, p.111)
It is very useful to realize that according to Senapathy organisms with the complexity Maynard Smith described above can originate spontaneously. But that's not all. According to Senapathy, organisms far more complex, and so far more improbable than minimal life, can arise spontaneously:
"The very first cells were highly complex eukaryotic cells with a nucleus" (p.239). Please note that I am not assuming the theory of evolution; I am merely pointing out that what is highly improbable for Maynard Smith is highly probable for Senapathy. Senapathy's computer simulation is so seductive because it ignores the complexity of minimal life.
A computer simulation is a virtual world
Even if Senapathy found the complete sequence of the human genome in a computer simulation (he did not), this would not prove that the human genome could originate in the real world. A computer simulation is a seductive virtual world! The computer is an ideal symbol manipulator, and the sequence of 4 symbols –call it information– can easily be abstracted from DNA and manipulated on the computer. But there is a lot more in a chromosome than DNA alone! (see: A DNA sequence is not a genome) A computer simulation is a virtual world and life is chemistry! All life needs energy and molecules! A computer can produce almost instantaneously a virtual DNA sequence of a billion bases at virtually zero energetic costs. In the real world one needs at least Adenine, Thymine, Cytosine, Guanine, phosphate and deoxyribose for DNA synthesis and for RNA: ribose and Uracil. DNA must be synthesized! Energy is required! Even when these building blocks are present, it is still difficult to assemble nucleotides into chainlike DNA polymers that compose messages and carry out reactions (127, p.53).
If life is ever created artificially, it will be in a test tube, not in a computer! (58). The seductive nature of computer simulations is so strong that it completely obscures the fact that a computer simulation is a virtual world. However, it must be said that Richard Dawkins (193) demonstrated the principle of natural selection with a computer experiment, the Weasel program, and the experiment started with a random sequence of letters!
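For comparison, the principle behind the Weasel program can be sketched as follows. This is not Dawkins's original code: the number of offspring, the mutation rate, the seed and the decision to let the parent compete with its own offspring are all assumptions made for this illustration. What it shows is cumulative selection starting from a random string, as opposed to waiting for the whole phrase to turn up in one draw.

```python
import random
import string

ALPHABET = string.ascii_uppercase + " "
TARGET = "METHINKS IT IS LIKE A WEASEL"

def score(candidate, target=TARGET):
    """Number of positions that already match the target."""
    return sum(a == b for a, b in zip(candidate, target))

def mutate(parent, rate, rng):
    """Copy the parent, replacing each character with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else ch for ch in parent)

def weasel(target=TARGET, offspring=100, rate=0.05, seed=1):
    """Cumulative selection: keep the best of parent plus offspring each generation."""
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in target)  # random starting string
    generation = 0
    while parent != target:
        brood = [mutate(parent, rate, rng) for _ in range(offspring)]
        parent = max(brood + [parent], key=lambda s: score(s, target))
        generation += 1
    return parent, generation

final, generations = weasel()
print(final, "reached after", generations, "generations")
```

The single-draw probability of the 28-character phrase is (1/27)^28, yet selection reaches it in a modest number of generations. Like Senapathy's experiment this is a virtual world, but it models selection and not spontaneous assembly.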
Random sequence libraries
Recently, in vitro selection experiments have shown that functional RNAs (ribozymes, ligases, polymerases) can arise from random sequence libraries (235). The first and perhaps still the most robust of these ligases were selected by Bartel and Szostak (1993) from a pool that encompassed 220 random sequence positions. This looks like the origin of genomes from random sequences. Please note: (1) these are chemical experiments, not computer experiments; (2) the sequences are more than a million times shorter than the human genome; (3) this is about RNA, not DNA, and DNA cannot self-replicate; (4) no organism is created from random sequences, certainly not a eukaryotic organism. These experiments are in the context of an RNA world, not a DNA-protein world. In 1985 Ballivet and Kauffman (251) created tens of thousands of random DNA sequences, and so created what later became combinatorial chemistry. However, their goal was not to create genomes or organisms. Stuart Kauffman thinks it is possible to create minimal life forms from Collectively Autocatalytic Sets (CAS) (251), but these are certainly not complete prokaryotic or eukaryotic genomes. In 2009, Harvard University geneticist George Church unveiled a technique that lets researchers design millions of slightly different versions of a strand of DNA (272).
The Eigen Limit
The maximum length (informational content) of a nucleic acid sequence is inversely proportional to the error rate of its replication (the Eigen limit). When the error rate exceeds this limit, new errors accumulate in the system, compounding the original error and resulting in eventual randomization - 'mutational meltdown' or 'error-cascade' (173), (192). So, what Senapathy does makes no sense. Of course your computer can generate endlessly long sequences. The point is, however, that they have to be synthesized, copied, and maintained in the real world. It is important that any hypothesis be framed in light of our understanding of the physico-chemical properties of molecules (173). The Eigen limit forbids Senapathy's sequences, even when they 'self-assemble'.
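The inverse proportionality can be made concrete with the usual approximation of the error threshold, L_max ≈ ln(σ) / u, where u is the error rate per base per replication and σ is the selective superiority of the best sequence over its mutants. The value σ = 10 and the error rates below are assumptions picked for illustration; they are not taken from (173) or (192).

```python
import math

def max_genome_length(error_rate_per_base, superiority=10.0):
    """Approximate Eigen error threshold: L_max ~ ln(superiority) / error rate."""
    return math.log(superiority) / error_rate_per_base

# Illustrative error rates, from very sloppy copying to highly accurate copying.
for rate in (1e-2, 1e-4, 1e-6, 1e-9):
    print(f"error rate {rate:.0e} per base -> L_max ~ {max_genome_length(rate):,.0f} bases")
```

Because σ enters only through its logarithm, even a very generous superiority barely changes the result. Under this approximation a sequence of billions of bases requires an error rate of the order of one in a billion per base, which is far beyond what unassisted chemical copying is known to achieve.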
Manfred Eigen (1992) concludes:
"The genes found today cannot have arisen randomly, as it were by the throw of a dice. There must exist a process of optimization that works towards functional efficiency. Even if there are several routes to optimal efficiency, mere trial and error cannot be one of them". (192)
A test for randomness
Senapathy asks an interesting question about the real world: What would the DNA sequence of extant organisms look like if it originated randomly? Let's test for randomness.
- A test for randomness (improved 1 Mar 2018)
If DNA has directly arisen from random assembly of the building blocks and genomes are immutable, then all genomes we examine today should show randomness. A prediction is that the distribution of stop codons should be random. Is it? Actually there are two questions: What do we observe? What do we expect? Senapathy does not state this clearly at the beginning. Senapathy produces plots to answer both questions, but they are difficult to read, not well explained and do not seem to support his theory. A stop codon is the DNA code for the end of the protein. If the distribution of stop codons in extant genomes matched a purely random distribution, it would be strongly suggestive of the random origin of genomes. The frequency of stop codons in a random computer-generated DNA string must be calculated on the basis of the fact that 3 out of the 64 codons are stop codons (assuming this holds for the origin of life also). So we must expect that the average length of genes between two stop codons would be 64/3 ≈ 21 codons (= 63 bases) (my calculation, Senapathy does not show this!). However, this is far too small for a real gene! The average length of a human gene is approximately 450 codons (1350 bases) (343). The Random Hypothesis fails spectacularly! End of story!
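The figure of about 21 codons can be checked both analytically and by simulation. The sketch below assumes, as the text does, that all four bases are equally likely and independent, and it reads the random DNA in a single reading frame. The sample size and the seed are arbitrary choices made here.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}

def expected_spacing(n_stop=3, n_codons=64):
    """Mean number of codons from one stop codon to the next in random DNA."""
    return n_codons / n_stop

def simulated_spacing(n_codons=100_000, seed=0):
    """Draw random codons and average the distance between successive stop codons."""
    rng = random.Random(seed)
    gaps, since_last = [], 0
    for _ in range(n_codons):
        codon = "".join(rng.choice("ACGT") for _ in range(3))
        since_last += 1
        if codon in STOP_CODONS:
            gaps.append(since_last)
            since_last = 0
    return sum(gaps) / len(gaps)

print(f"analytic, 3 stop codons : {expected_spacing():.1f} codons")
print(f"simulated, 3 stop codons: {simulated_spacing():.1f} codons")
print(f"analytic, 1 stop codon  : {expected_spacing(1):.0f} codons")
```

Even with a single stop codon the expected spacing is only 64 codons, still far below the roughly 450 codons quoted for an average human gene.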
According to Senapathy the expected length of genes in computer generated random sequences would vary from 0 (two stop codons next to each other) up to 500 or 600 bases, but "More than 95% of all random genes are shorter than 100 bases" (p.234). Indeed! This is the reason that ORF detecting software defines ORFs as a minimum of 100 codons (343).
According to Senapathy genes in organisms are often 9000 bases (=3000 amino acids) long (p. 234). So this is far above the length of the genes found in the computer generated sequences. Therefore, one must conclude that typical eukaryotic genes could not be formed directly from randomly assembled DNA.
Figure 7.4. See also the figure in the PLOS ONE article where he repeats the idea.
Nevertheless, Senapathy is not discouraged by this result. He invokes 'a kind of processing' at the RNA level (see figure 7.4) that results in eliminating all the RNA sequences with too many stop codons (corresponding to introns) and combining long ORFs (corresponding to exons). So, he thinks the test for randomness is a success because he invoked 'processing' and because he thinks that is a permitted step in an Origin-of-Life scenario. But this 'processing' is nothing else but RNA splicing because it happens at the RNA level. The DNA stays intact, it does not change at all. Therefore, the DNA of extant eukaryotes must still have the signature of random DNA with abundant stop codons. But it does not. This kills the whole idea.
Please keep in mind that this RNA splicing must have occurred before the origin of life, before there would be any complete genomes, before there were any cells at all. So he confuses what is happening today in the cells of living eukaryotes with a period in the history of the earth when no life existed. Further, RNA splicing occurs after transcription (a process he simply assumes) in the nucleus (which he also simply assumes).
A further problem with this idea is that he assumes that the key sequences necessary for proper splicing are just there. How convenient! Furthermore, the processing assumes full-blown splicing machinery. This is cheating. One must not only find exons! This is not shown in the computer simulation (Figure 7.1)! So we should not search for: TO BE OR NOT TO BE, but:
wxyzTOabcdefghi ... wxyzBEabcdefghi ... wxyzORabcdefghi ... wxyzNOTabcdefghi ... wxyzTOabcdefghi ... wxyzBEabcdefghi
( wxyz and wxyz represent splice recognition sites (363) and ... represents an intron). How else can the words (exons) TO BE OR NOT TO BE be recognized? (26) As we know now, the real number of bases essential for proper splicing is unlikely to be less than 10 and is plausibly as high as 30 (39). To be correctly processed into proteins, the beginning and end of each exon need to be recognised. However, this would make the task of finding them in a random sequence far more difficult than Senapathy imagined.
Further questions: Senapathy does not tell us why there are only 3 stop codons and not 1 or 2 or more than 3. If only 1 stop codon existed, that would produce much longer sequences. Maybe there used to be only 1 stop codon at the origin of life and later 2 were added. So Senapathy's assumption is unsubstantiated, and he could speculatively have used 1 stop codon: 1.56% stop codons = 156 stop codons in 10,000 codons (= 30,000 bases), giving a mean length of 64 codons = 192 bases (still too short).
Structure of a eukaryotic gene, in: Senapathy, Appendix: Genetics Primer, figure 10, p. 555.
Start codons: 25 Sep 2011, 25 Feb 2018 Although he knows about start codons (see figure), Senapathy mistakenly ignores them in his calculations. However, an Open Reading Frame (ORF) is defined as the sequence between a Start codon and a Stop codon. The necessity of Start codons (usually AUG) gives a further restriction of the length of ORFs. Start codons occur 1 in 64 codons (see genetic code table). If we combine Start and Stop codon frequencies, the predicted average ORF length would be reduced from 21 to 17 codons! (362) There is absolutely no room for introns and exons in such a short piece. So, the discussion of introns and exons is totally irrelevant.
- Transposable Elements 9 Nov 2012
Transposable Elements constitute two-thirds of our own genome and 85% of the corn genome (297). They originate by duplication. Moreover, TEs themselves encode an enzyme called transposase. That means that two-thirds of our own genome is not random DNA.
- Long terminal repeats 9 Nov 2012
Long terminal repeats (LTRs) are sequences of DNA that repeat hundreds or thousands of times. They are found in retroviral DNA and in retrotransposons. Because of the repeats they are not random DNA.
- Codon usage bias
The genetic code is redundant. The number of codons for amino acids varies from 1 - 6. However, the usage of synonymous codons is non random. In most sequenced genomes, synonymous codons (encode the same amino acid) are not used in equal frequencies. Some are rarely used, others with high frequency. For the amino acid LEU the most frequently used codon is used 140 times more often than the least frequently used codon (33). This is a genome wide bias. Some species such as Thermus thermophilus avoid certain codons almost entirely (306).
There should be no such huge codon bias when genomes arose by chance from the primordial pond. In the independent birth scenario all codons for an amino acid are expected to occur on average with the same frequency. There is an explanation for codon usage bias: (354).
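How such a bias is measured can be sketched as follows. The function counts the six leucine codons of the standard genetic code (written in the DNA alphabet) in an in-frame coding sequence and compares their shares with the uniform expectation of 1/6. The toy sequence is made up for this illustration and is not real data.

```python
from collections import Counter

LEUCINE_CODONS = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")  # standard code, DNA alphabet

def leucine_usage(coding_sequence):
    """Relative use of each leucine codon in an in-frame coding sequence."""
    codons = [coding_sequence[i:i + 3] for i in range(0, len(coding_sequence) - 2, 3)]
    counts = Counter(c for c in codons if c in LEUCINE_CODONS)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {c: counts[c] / total for c in LEUCINE_CODONS}

toy_gene = "ATGCTGCTGTTACTGCTCCTGTAA"  # made-up sequence, not real data
for codon, share in leucine_usage(toy_gene).items():
    print(f"{codon}: {share:.2f} (uniform expectation {1 / len(LEUCINE_CODONS):.2f})")
```

Applied to all coding sequences of a genome, the ratio between the largest and the smallest share gives the kind of figure quoted above for LEU.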
- Non-random stop codons 4 Jan 2013
In the standard genetic code, there are 3 stop codons: TAA , TGA , TAG . In a random genome the stop codons are expected to have equal frequency (about 33%). However, the distribution of stop codons within the genome of an organism is non-random. For example, the E. coli K-12 genome contains 63% TAA, 29% TGA, and 8% TAG (wiki), so is highly non-random. However, note that the whole idea of a 'stop codon' depends on a full standard genetic code! And in order to function as a stop codon a Release factor is required, which is a protein. Where does that protein come from? See: elephant in the room.
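Whether 63% / 29% / 8% could still be a chance deviation from equal use can be judged with a chi-square goodness-of-fit statistic. The percentages are those given above for E. coli K-12. The gene count of 4300 is an assumed round number used only to turn percentages into counts, and 13.82 is the standard critical value for 2 degrees of freedom at p = 0.001.

```python
def chi_square_uniform(observed_fractions, n_observations):
    """Chi-square statistic for observed fractions against equal expected frequencies."""
    expected = 1 / len(observed_fractions)
    return n_observations * sum((f - expected) ** 2 / expected for f in observed_fractions)

ECOLI_STOP_USAGE = {"TAA": 0.63, "TGA": 0.29, "TAG": 0.08}  # fractions quoted in the text
ASSUMED_GENE_COUNT = 4300   # assumption: rough number of genes, one stop codon each
CRITICAL_P001_DF2 = 13.82   # chi-square critical value, 2 degrees of freedom, p = 0.001

stat = chi_square_uniform(list(ECOLI_STOP_USAGE.values()), ASSUMED_GENE_COUNT)
print(f"chi-square = {stat:.0f}, critical value = {CRITICAL_P001_DF2}")
print("equal usage rejected" if stat > CRITICAL_P001_DF2 else "equal usage not rejected")
```

The statistic scales with the assumed gene count, but any realistic count leaves it far above the critical value.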
- Mononucleotide repeats 18 Nov 11
Are nucleotide sequences actually used by organisms a random sample of all the possible sequences encoding that particular amino acid sequence, or do they deviate from a random choice? Short mononucleotide repeats occur at about the frequency expected by chance, but longer mononucleotide repeats are substantially rarer than predicted by the null model in all three organisms C. elegans, S. cerevisiae, and E. coli. For example, in E. coli, the codon TTT is avoided in favor of TTC at positions immediately followed by a T. This reduces the frequency of runs of four Thymines, and thus indirectly also of longer stretches of Thymine that necessarily contain runs of four. (237). Explanation: as mononucleotide repeats are prone to slippage during transcription and translation, the most parsimonious explanation is selection against error-prone nucleotide composition.
- Dinucleotide frequency 12 Nov 12 update
If the genome of every organism arose randomly, each genome must have identical statistical properties. How could they differ significantly? For example, within the human genome each of the 4x4=16 possible dinucleotides (see table) should be present in equal frequency (6.25%). However, there is a notably depressed level of the CG dinucleotide (0.99% compared with 9.80% for the TT dinucleotide) (108, p. 11). Vertebrates: CpG dinucleotides have long been observed to occur with a much lower frequency in the sequence of vertebrate genomes than would be expected due to random chance. On the other hand, there are CpG islands: genomic regions of several hundred base pairs with a high GC content. This alone refutes random origin of DNA sequences. See also: Chargaff's GC rule.
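The 6.25% expectation is easy to verify on computer-generated random DNA. The sketch below counts overlapping dinucleotides. The sequence length and the seed are arbitrary choices, and the function name is made up for this illustration.

```python
import random
from collections import Counter
from itertools import product

def dinucleotide_frequencies(sequence):
    """Frequency of each of the 16 overlapping dinucleotides in a DNA string."""
    pairs = Counter(sequence[i:i + 2] for i in range(len(sequence) - 1))
    total = sum(pairs.values())
    return {a + b: pairs[a + b] / total for a, b in product("ACGT", repeat=2)}

rng = random.Random(0)  # fixed seed, arbitrary
random_dna = "".join(rng.choice("ACGT") for _ in range(200_000))
freqs = dinucleotide_frequencies(random_dna)

print(f"CG in random DNA: {freqs['CG']:.4f} (expected {1 / 16:.4f})")
print(f"TT in random DNA: {freqs['TT']:.4f} (expected {1 / 16:.4f})")
```

In random DNA both values stay close to 0.0625, whereas the human figures quoted above are 0.0099 for CG and 0.0980 for TT.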
- Microsatellites, or short tandem repeats 28 sep 12
Microsatellites consist of repeats of two, three or four nucleotides, which can be repeated 3 to 100 times. The higher repeat counts in particular are very unlikely to occur in random DNA. They are non-random.
- Chargaff's cluster rule
Apart from the first parity rule (A=T C=G), Erwin Chargaff made three other fundamental observations on the base composition of DNA, which are only now being incorporated into mainstream biology. The first species-invariant observation was that individual bases are clustered to a greater extent than expected on a random basis (197) which contradicts random origin.
"Another consequence of our studies on deoxyribonucleic acids of animal and plant origin is the conclusion that at least 60% of the pyrimidines occur as oligonucleotide tracts [runs] containing three or more pyrimidines in a row; and a corresponding statement must, owing to the equality relationship [between the two strands], apply also to the purines." (197)
- Chargaff's second parity rule
The second species-invariant observation was that Chargaff's first parity rule also applies, to a close approximation, to single-stranded DNA. The validity of the rule became clearer when full genome sequences became available. For example, the "top" strand of Vaccinia virus has 63921 A, 63776 T, 32010 C, 32030 G (197). The bacterium Sarcina lutea has an extreme (A+T)/(G+C) ratio of 0.35 (Biopolymer Chemistry). If DNA were random 25% A, 25% T, 25% C, 25% G are predicted. So, again random origin is refuted.
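The Vaccinia numbers can be checked directly. The sketch below uses only the four counts quoted above and computes the A/T and C/G ratios on the single strand, together with the base fractions to compare with the 25% each predicted for random DNA. The function names are made up for this illustration.

```python
VACCINIA_TOP_STRAND = {"A": 63921, "T": 63776, "C": 32010, "G": 32030}  # counts quoted in the text (197)

def parity_ratios(counts):
    """A/T and C/G ratios on a single strand; both are close to 1 under the second parity rule."""
    return counts["A"] / counts["T"], counts["C"] / counts["G"]

def base_fractions(counts):
    """Share of each base in the strand."""
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}

a_over_t, c_over_g = parity_ratios(VACCINIA_TOP_STRAND)
print(f"A/T = {a_over_t:.4f}, C/G = {c_over_g:.4f}")
for base, share in base_fractions(VACCINIA_TOP_STRAND).items():
    print(f"{base}: {share:.1%} (random expectation 25.0%)")
```

The two ratios are within about 0.3% of 1, while A and T each make up about a third of the strand and C and G about a sixth.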
- Chargaff's GC rule
The ratio of C + G to the total bases (A+C+G+T) (GC-content) tends to be constant in a particular species, but varies between species (197). The GC-content or AT/GC ratio of genomes is not 1:1. Already in 1952 GC percentages as low as 34.8% were discovered in the DNA of insect viruses (34). In the bacterial kingdom the GC percentage varies from 25% to 75% (35). Acidianus infernus and Methanococcus jannaschii have 31% GC-content (255). A species with an extremely low GC-content is Plasmodium falciparum: ~20% (wiki). The genomes of extremophile organisms such as Thermus thermophilus are particularly GC-rich (wiki). Although some deviation from the mean of 50% is to be expected in a random sequence of DNA, extreme deviations are highly improbable and contradict random origin.
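How improbable "extreme" is can be estimated from the binomial distribution. If every base is drawn independently with equal probability, the GC fraction of a sequence of N bases has a standard deviation of 0.5/√N. The sketch below is a rough illustration under that assumption. The genome length of one million bases is chosen here for convenience and is smaller than most real genomes, which makes the estimate conservative.

```python
import math

def gc_content(sequence):
    """Fraction of G and C among the A, C, G, T characters of a DNA string."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at)

def sigmas_from_half(observed_gc, genome_length):
    """Distance of an observed GC fraction from 0.5, in binomial standard deviations."""
    std = 0.5 / math.sqrt(genome_length)
    return abs(observed_gc - 0.5) / std

print(gc_content("ATGCGCAT"))  # 0.5
print(f"GC = 20% in 1,000,000 bases: {sigmas_from_half(0.20, 1_000_000):.0f} standard deviations")
print(f"GC = 31% in 1,000,000 bases: {sigmas_from_half(0.31, 1_000_000):.0f} standard deviations")
```

Deviations of hundreds of standard deviations cannot be produced by equiprobable random assembly. A random process with unequal base supplies could produce them, but that is a different hypothesis from the one tested here.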
- A test for randomness of phase 0,1,2 introns 16 Mar 11
An intron could start between or within triplet codons. The same is true for the end of an intron. Between codons is called phase 0, within codons is called phase 1 (after the first base) or phase 2 (after the second base). If introns have the same phase at the beginning and end, they are symmetric, otherwise they are asymmetric. There are three symmetric intron types (0,0), (1,1), and (2,2) and six asymmetric types (0,1), (0,2), (1,2), (1,0), (2,0), and (2,1). The theory of random origin of introns and exons predicts that all 9 phase combinations should have approximately the same frequency in random DNA. Already in 1992 Fedorov et al. showed that the proportions of the three intron phases were significantly unequal. In 1993 Green et al. showed excess phase 0 introns and excess symmetric exons in ancient conserved regions (ACRs). Later publications based on GenBank give the same results. The proportions of the three intron phases and the frequencies of the nine phase combinations showed a significantly nonrandom distribution: 48% phase 0, 28% phase 1, and 24% phase 2, and all symmetric exons (0,0), (1,1), and (2,2) showed significant excess over a random prediction (199). Conclusion: random origin is refuted.
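The nine phase combinations and the two possible "random" baselines can be written out explicitly. The sketch below enumerates the combinations and computes the share of symmetric ones, first for equal phase frequencies and then for the observed phase proportions combined independently. Which baseline reference (199) actually used is not stated here, so the second calculation is an assumption for illustration.

```python
from itertools import product

phases = (0, 1, 2)
pairs = list(product(phases, repeat=2))            # (start phase, end phase)
symmetric = [p for p in pairs if p[0] == p[1]]
asymmetric = [p for p in pairs if p[0] != p[1]]

# Phase proportions quoted in the text (199).
observed = {0: 0.48, 1: 0.28, 2: 0.24}
uniform = {ph: 1 / 3 for ph in phases}

def symmetric_share(freq):
    """Share of symmetric pairs if both phases are drawn independently."""
    return sum(freq[a] * freq[b] for a, b in symmetric)

print(len(symmetric), "symmetric,", len(asymmetric), "asymmetric")
print(f"equal phase frequencies:    {symmetric_share(uniform):.3f}")
print(f"observed phase frequencies: {symmetric_share(observed):.3f}")
```

With equal phase frequencies each of the nine combinations is expected at 1/9 and the symmetric ones together at one third. The observed phase bias alone raises that baseline only slightly, so a significant excess of symmetric types beyond it needs a separate explanation.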
- A test for randomness of proteins
Ptitsyn (136) concludes "that primary structures of proteins are basically just the examples of random amino acid sequences which have only been 'edited' during biological evolution". Pande et al. (137) examine the possible random nature of protein sequences. The hypothesis was that proteins are slightly edited random sequences. They found pronounced deviations from pure randomness. Keefe & Szostak (138) state that "Functional primordial proteins presumably originated from random sequences". However, even if proteins emerge spontaneously prebiologically, this is irrelevant for the origin of life, because proteins cannot be translated back into DNA (Crick's central dogma of molecular biology). So, the result is useless for the question of random origin of DNA. Douglas L. Theobald (130) wrote "Many proteins probably do exist that have independent origins. For instance, in the Metazoa certain protein domains have probably evolved de novo that are not found in either Bacteria or Archaea. However, the independent evolution of unique Metazoan proteins, by itself, is not evidence for or against UCA." (UCA = Universal Common Ancestor).
- Non-random genome architecture
Are genomes randomly arranged assemblages of genes or is gene order non-random? Genome architecture (that is, the order, spacing and orientation of genes in the genome) can be highly non-random (239).
- Low-Complexity Sequences or Repetitive Sequences
Eukaryotic genomes contain vast amounts of repetitive DNA derived from transposable elements (TEs) (269). The more Low-Complexity or Repetitive Sequences occur in eukaryote genomes, the more difficult they are to explain under the random origin scenario. The most difficult to explain are tandem repeats.
- Paul Davies about genomes
"If genomes are information-rich, then they have to be random (or almost so). If biological organization is random, its genesis should be easy. [!] The vast majority of possible sequences in a nucleic-acid molecule are random sequences. Only a tiny, tiny fraction of all possible random sequences would be even remotely biologically functional. A functioning genome is a random sequence, but it is not just any random sequence. It belongs to a very, very special subset of random sequences" (40)
So Senapathy is right about genomes being mathematically random, but their production is not easy, because only a tiny, tiny fraction of all possible sequences is functional. Without knowing it, Davies refuted Senapathy.
Searching for exons, discarding introns and non-coding DNA
Let's ignore here that the average eukaryote has a genome size 1000 times larger than an average prokaryote (246, p.52), and let's ignore all other problems (see § 6: DNA) and focus on introns and exons. Senapathy searches for exons (protein coding DNA) and ignores introns (non-protein-coding DNA) in his random genomes.
However, if one cannot ignore introns, Senapathy's strategy would fail. Are introns just random noise as he assumes in his computer simulations? It is important to distinguish eukaryotic spliceosomal introns (which are not self-splicing and do not code for proteins) from Group I (which are self-splicing) and Group II introns (which may code for proteins). Recently it has been discovered that introns do sometimes have identifiable functions (41). Also, comparative sequence analysis has revealed that the sequence of some introns is highly conserved, suggesting that functional constraints operate. Some of the observed conservation can be attributed to those sequences required for RNA splicing (150). In that case Senapathy is not justified in discarding introns and they should be included in his computer simulation of virtual genomes. But then his search for genes (exons+introns) in random DNA certainly would fail, just as it fails for prokaryotic genes.
What if introns are not random? updated 28 Feb 11
Senapathy's analysis is based on the idea that exons are non-random (functional) and introns are random DNA (non-functional). Maybe this is because it was generally assumed that the sequence of any given intron is junk DNA with no biological function. An indication of non-functional random introns is an intron length that is not divisible by three nucleotides (3n). The random character is also supported by Patthy (1999) based on data from Li & Graur (1991) about substitution rates of spliceosomal introns. More recently, however, this is being disputed. For example, a point mutation in intron 7 of the human gene TPH1 is highly correlated to the development of the psychiatric disorder schizophrenia (wiki). Some introns are known to enhance the expression of the gene that they are contained in by a process known as intron-mediated enhancement (IME). One of the most important roles of introns currently under investigation is the transcription of the introns to small regulatory RNA, such as a type of RNAs called miRNA (microRNA). These small single-stranded RNAs regulate the expression of genes (wiki).
Additionally, introns need to be removed from mRNA, so they require precise recognition sites. The budding yeast Saccharomyces cerevisiae has a highly stringent seven-nucleotide branch-point sequence requirement (163). Nearly all eukaryotic nuclear introns begin with the nucleotide sequence GT, and end with AG.
Introns also encode a variety of untranslated RNAs including microRNAs, small nucleolar RNAs (snoRNAs) and guide RNAs for RNA editing (171). Some introns are known to enhance or be necessary for normal levels of mRNA transcription, processing and transport (171). Introns also contain many highly conserved elements: there are 100 ultraconserved elements shared by human, mouse and rat (171).
There are 154 highly conserved intronic sequences in the chimpanzee genome (198).
However, this is only a minority if the total number of introns is more than 100,000 as in humans. Nonetheless, those highly conserved intronic sequences are nonrandom sequences just as exons.
So, what if (some) introns are not random? If some introns are not random, they are just as improbable as exons, and cannot be ignored. In fact, the whole idea that one can ignore introns while searching for genes is wrong, because genes have a nonrandom number of introns and a nonrandom position within a gene (relative to the triplets). If human genes have on average 8 introns, plant genes 4 introns, fish and insects 3 introns (165), then Senapathy has to explain that this nonrandom pattern could originate by independent origin of genes from scratch. He failed to do so.
What about small introns?
Senapathy's theory predicts the same average intron length for all animals and plants based on statistics alone.
However, the eukaryote Oikopleura genome contains introns which are very small (peak at 47 bp, only 2.4% > 1kb) (132). In the unicellular ciliate P. tetraurelia most introns in the genome (>96%) are very short (<34 nucleotides) (239). Strikingly, all introns in intron-poor genomes of unicellular eukaryotes are short, with nearly uniform, apparently tightly controlled lengths and conserved, optimized splice signals at exon-intron junctions (246,p.235). In general, in vertebrates there are relatively long introns and short exons, whereas in lower eukaryotes, introns are short and exons are long (209). See also: insects with short introns: (344). This is a problem for Senapathy, because he needs long introns and short exons in every organism.
Identical intron positions not predicted
The positions of many introns are identical in orthologous genes of animals and plants (259). A small number of identical intron positions could be expected by chance, but too many is very unlikely. A random origin scenario should produce arbitrary intron positions in exons. Any position is acceptable as long as introns are removed precisely and reliably. So, identical intron positions refute the origin of introns-exons from random DNA.
It has been found that introns account for at least 30% of the human genome and may be a significant, perhaps major, source of regulatory noncoding RNAs (116). An experiment concerning the relationship between introns and coded proteins provided evidence that some non-coding DNA is just as important as coding DNA. This experiment consisted of damaging a portion of noncoding DNA in a plant which resulted in a significant change in the leaf structure because structural proteins depended on information contained in introns (113).
Genomes don't just encode protein-coding RNAs. They also give rise to various groups of noncoding RNAs that can regulate gene expression. Short RNAs that form from enhancer sequences might be one such class of regulatory RNA.
The concomitant increase in non-coding content of the genome with organismal complexity supports the proposition that evolutionary innovations and expansion of regulatory RNAs were fundamental to the genetic programming of complex eukaryotes (293). As far as I know Senapathy ignored promoters (near a gene's transcription start site) and enhancers (far away). Without regulatory sequences a DNA sequence is not a genome.
Genes in the introns of other genes
There are 25 candidates for single-exon genes that are located within an intron of a gene on the same strand (125). Examples are the gene for neurofibromatosis type I (NF1) which contains 3 genes in the opposing strand of the intron; intron 22 of the Factor VIII gene contains 2 other genes; intron 17 of the retinoblastoma susceptibility gene RB1 contains another gene (167).
A substantial fraction of genes in complex eukaryotic genomes is contained within introns of protein-coding genes. In C. elegans only ~2.5% of protein-coding genes are so nested, whereas nearly 50% of non-protein-coding RNA genes are found in introns (230).
Chlamydomonas reinhardtii (a unicellular green alga) has at least 70 small nucleolar RNA (snoRNA) gene clusters within introns of protein-coding genes. This alga has the highest number of intronic snoRNA gene clusters among eukaryotes, which shows the functional importance of introns in a single-celled organism (231).
Conclusion: Senapathy's method is based on the idea that introns are meaningless random DNA. Now, we know this is wrong. So, according to his own logic the eukaryotic genome cannot arise in the primordial pool.
What if there is no absolute difference between introns and exons?
From the point of view of protein coding potential there is no difference between introns and exons. The previous examples of genes (in the opposite strand) in introns of genes (overlapping genes) are a demonstration of the principle. Also, intron splicing sites can be deleted or created de novo quite easily by mutation (179) as long as they are in frame. Also, different cell types of the same individual can interpret the same sequence of a pre-mRNA either as an exon or as an intron (alternative splicing) (305). Furthermore, the definition of exons and introns is not absolute because there are weak and strong splice sites, and also exonic splicing enhancers and exonic splicing silencers.
Exonization is the process through which an intron becomes an exon (179), and intronization is the process through which an exon becomes an intron (195). Human Alu elements located in introns can be exonized (191). In humans introns <100 bp in length are retained in 95% of the genes (186). Intron retention is most common in lower metazoans and is also common in fungi and protozoa (exonization). The prevalence of exon skipping gradually increases further up the eukaryotic tree (209). This means also that there is no absolute, static difference between introns and exons (intronization).
Figure from: 195.
Knowles and McLysaght (2009) demonstrate for the first time that human genes have arisen de novo from noncoding DNA since the divergence of the human and chimpanzee genomes (190). "In humans 2-5 % of the genes have been reported to retain introns" (187). Because of alternative splicing, the distinction between exons and introns is no longer absolute (179). Every sequence could be a useful RNA or protein (if translated). The difference between extant exons and introns is that exons (in combination with splicing machinery and splice sites) have been tested by natural selection for their usefulness in the organism. In 2007 it was found that almost 10% of alternatively spliced human genes involve the retention of an intron (181). High levels of intron retention (30% of alternatively spliced genes) are reported in the plant Arabidopsis thaliana (189).
De novo origin of introns: several examples were found in human genes in which insertion of Alu, a primate-specific retro element, into an exon created a new intron in the 3' UTR (untranslated region) (209). This means that the whole idea that introns just occur in random DNA is wrong. They can arise by insertion.
What if there is no absolute difference between coding and non-coding DNA? new 19 Jan 13
Any sequence of DNA that is (1) transcriptionally active, and (2) has a translatable open reading frame could be a protein coding gene. This looks good for Senapathy. However, 'reading frame' implies a 'reader'. What is the reader? Where does it come from? Pioneering research in 2006 clearly showed that new genes could originate from non-coding sequences in Drosophila. Levine et al. identified five novel genes in Drosophila melanogaster that were derived from non-coding DNA (307). Again, the point is: those novel genes don't do anything unless they are sitting in a cell. So, Senapathy is begging the question: where does that cell come from?
Noncoding DNA: junk DNA?
Senapathy is searching for protein coding genes (exons) in random DNA and ignores noncoding DNA. The most surprising discovery about the human genome is that the majority of the functional sequence does not encode proteins (271).
Protein-coding sequences, which comprise only ~1.5% of the genome, are dwarfed by functional conserved non-coding elements (CNEs) which constitute 6% of the human genome (271). Furthermore, about 80% of the cell's DNA showed signs of being transcribed into RNA (134, 287). The transcriptome of the fruit fly Drosophila melanogaster reveals that some 75% of the organism's genome is transcribed at one stage or another, in line with the widespread transcription observed in other species (203). In animals several hundred thousand up to 3 million strongly conserved noncoding regions (CNCs) with a mean length of 28 base pairs are scattered throughout vertebrate genomes (221). Furthermore, a considerable fraction of the junk DNA could be involved in chromatin structure maintenance and remodeling, such as scaffold/matrix attachment regions (SARs/MARs) (117). Furthermore, in non-coding regions important transcription-factor binding sites (TF-binding sites) are found. Much of this has only recently been discovered. Senapathy could not know this. However, it invalidates his theory nonetheless.
What about long non-coding RNA?
Long noncoding RNAs are transcribed RNA molecules greater than 200 nucleotides. The ENCODE project revealed in 2012, that 75% of the human genome is transcribed into non-coding RNA, and that there may be between 10,000 and 200,000 long non-coding RNA (lncRNA). Scientists have shown that these can activate gene expression and silence genes. Do the lncRNAs contain stop codons? base triplets? introns? (311).
Eukaryotes with intronless genes, prokaryotes with introns
19 JUL 13
Let's ignore all other problems (see § 6: DNA) and focus on introns and exons. Senapathy claims he can find eukaryotic genes in random DNA because they are interrupted with random non-coding DNA (introns). What if eukaryotes have intronless genes too? Interestingly, many eukaryotic histone and GPCR genes are predominantly 'intronless'. A number of vertebrate 'intronless' genes have been compiled. The human genome report identified 901 single exon genes (source). Recently, intronless genes or single exon genes (SEG) have been discovered in eukaryotes (45). The process by which these genes are produced is making a DNA copy of processed intronless mRNA (called retroposition). The genes are called retrogenes. It is based on reverse transcription of mRNA (see: Steele review). These intronless retrocopies were long thought to be doomed to decay and were routinely classified as processed pseudogenes because of the expected lack of regulatory elements and the presence of deleterious mutations in many copies. Nevertheless, individual functional retrocopies (retrogenes) have been discovered since the late 1980s (118). Retroposition is an important mechanism of gene copying and produced a large number of functional genes in mammalian genomes. Proof is the existence of human intronless retrogenes and their parental intron-containing homologs (274). Retroposition produced approximately 1000 functional intronless genes in humans. In addition to the functional intronless genes, geneticists have found no less than 644 processed (intronless) pseudogenes on the human X chromosome (57).
Intronless genes in eukaryotes are more widespread than previously thought and do not necessarily depend on retroposition. A famous example is the SRY gene (Sex-determining region Y) which has only a single exon of 850 base pairs and no intron. In mammals 6% of the genes are intronless (196). This is more than enough to exclude the independent origin of mammalian genomes. Sakharkar et al. (143) have identified 2017 expressed intronless genes in the mouse genome. About 5% of human genes lack introns; there would be at least 602 intronless human genes (182). Examples are: histones, olfactory receptor genes and G protein-coupled receptors (GPCRs) (more than 90% of mammalian GPCRs are intronless 182). According to other researchers, humans have 6229 intronless genes or 16.7% of the total number of genes (253). The spider mite Tetranychus urticae has 2966 intronless genes and the fruit fly Drosophila melanogaster has nearly 4500 intronless genes (241). Evolutionary analysis reveals that 56 intronless genes are conserved among the three domains of life: bacteria, archaea and eukaryotes. Jain et al. (144) report the presence of 11,109 (19.9%) and 5,846 (21.7%) intronless genes in rice and Arabidopsis genomes. A total of 301 and 296 intronless genes from rice and Arabidopsis, respectively, are conserved among organisms representing the three major domains of life, i.e., archaea, bacteria, and eukaryotes. The yeast species Saccharomyces cerevisiae and Candida albicans (eukaryotes) are devoid of introns in >90% of their genes. Nearly all the genes (99.5%) of a red alga species (unicellular eukaryote) are intronless (49). Mammalian G-protein-coupled receptor (GPCR) genes (this protein family is one of the largest in the mammalian genomes) are characterized by a large proportion of intronless genes or a lower density of introns when compared with GPCRs of invertebrates (166).
A very small minority of human genes lack introns and are generally very small genes, examples being histone genes, many small RNA genes, various neurotransmitter and hormone receptor genes and autosomal processed copies of intron-containing X-linked genes (167).
Introns are very rare in animal mitochondrial genomes and animal mitochondria lack group II introns (153). The sequence of human mitochondrial genome shows extreme economy in that the genes have none or only a few noncoding bases between them (154).
Conclusion: many eukaryotic genomes cannot be found in a random piece of DNA using Senapathy's search strategy and consequently his theory of independent origin would fail. Of course, Senapathy could not have known the extent of this phenomenon when he wrote his book (46), but it nevertheless refutes his theory of independent origin of many eukaryotes.
Wide variety of intron-density in eukaryotic genomes
Recently, attention has been drawn to eukaryotic genomes with very few introns (142). Unicellular eukaryotes with compact genomes have only a few introns (149). The number of introns in annotated eukaryotic genomes varies by more than three orders of magnitude: from 140,000 introns in the human genome (intron density 8.4 introns per gene) to only 15 introns in the microsporidian Encephalitozoon cuniculi (171). Eukaryotes dramatically differ in their intron densities, ranging from only a few introns per genome in many unicellular forms to over 8 introns per gene in vertebrates as well as some invertebrates like the sea anemone (258). Only a single spliceosomal intron has been found in the intestinal parasite Giardia lamblia (164). The average number of introns per gene in most multicellular species is between 4 and 7, whereas the average number for most unicellular eukaryotes is less than 2 (164). Additionally, species with compact genomes also have small introns: the insect Belgica antarctica has a mean intron length of 333 bp. Compare this with Drosophila melanogaster (955 bp) and Aedes aegypti (3728 bp). So, the introns of B. antarctica are more than 10 times smaller than those of A. aegypti and almost 3 times smaller than those of D. melanogaster (344).
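The size of these differences follows directly from the quoted numbers. The short calculation below uses only figures given in the paragraph above; the variable names are illustrative.

```python
import math

# Figures quoted in the text (171, 344).
introns_human = 140_000
introns_e_cuniculi = 15
mean_intron_bp = {
    "Belgica antarctica": 333,
    "Drosophila melanogaster": 955,
    "Aedes aegypti": 3728,
}

# Range of intron counts per genome, expressed in orders of magnitude.
span = math.log10(introns_human / introns_e_cuniculi)
print(f"intron counts span {span:.1f} orders of magnitude")

# Mean intron length of each insect relative to B. antarctica.
reference = mean_intron_bp["Belgica antarctica"]
ratios = {species: bp / reference for species, bp in mean_intron_bp.items()}
for species, ratio in ratios.items():
    print(f"{species}: {ratio:.1f}x the B. antarctica mean")
```

The intron counts differ by nearly four orders of magnitude, and the mean intron of A. aegypti is roughly eleven times longer than that of B. antarctica.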
But according to Senapathy, all eukaryotes should necessarily have the same intron statistics because all those genomes are random DNA sequences. There should be no systematic difference between unicellular and multicellular eukaryotes. Also the gene densities (genes/Kb) of unicellular eukaryotes and vertebrates differ by a factor 500 (246,p.233). This is not compatible with random sequences. Furthermore, the intron-poor compact eukaryotic genomes have close to the same low probability for random origin as prokaryotes, and thus could also not originate in the primordial pond for this reason alone.
Prokaryotes with introns
Group I and II introns are both found in some bacterial and organellar genomes (152).
Group I introns interrupt rRNA, mRNA and tRNA genes in bacterial genomes (wiki). Group I introns have highly conserved structure and function across all species in which they are found (150). Some group I introns encode homing endonuclease (HEG), which catalyzes intron mobility (151). Group II and III introns are similar and have a conserved secondary structure. Group II introns are ribozymes. Group II introns are now being found in unexpected numbers in bacterial genomes (here). Group II introns are found in ~25% of the sequenced bacterial genomes (165). Group II introns are also known in Archaebacteria. Group II introns are catalytic RNAs (ribozymes) that can self-splice in the absence of protein. A phylogenetic tree can be made for introns. Some introns code for a Reverse Transcriptase (RT) (here). Group II introns are complex elements that encode a large protein containing a reverse transcriptase domain and several accessory domains, and have a (nearly) uniform size of approximately 2.5 kb (149). Remarkably, almost all introns identified so far encode reverse transcriptase ORFs. So they are not random pieces of DNA.
All this means that Senapathy needs to revise his theory. Concerning introns there is no absolute distinction between prokaryotes and eukaryotes. The relevant distinction for his theory is not prokaryote/eukaryote, but number and length of introns of a species. And that results in a gradual scale from intronless, via intron-poor to intron-rich species. His theory must predict a minimum number and length of introns and Senapathy needs to specify that minimum. (See also: § 28).
Introns are spliced in the nucleus 19 Jul 13
Introns are spliced out in the nucleus of the eukaryotic cell. So, without a nucleus and cytoplasm there can be no intron splicing. So, Senapathy has to explain the origin of the eukaryotic cell first. Explaining the origin of DNA with introns does not help to solve the origin of life. First, there has to be a cell. Senapathy overlooked an elephant in the room. See the complete elephant.
A static versus a dynamic genome
In chapter 3 Senapathy argues that 'The genome of every distinct organism is closed to evolutionary change'. In Senapathy's theory genomes are born directly from the basic DNA building blocks and they do not change thereafter. His genomes are immutable. This static view conflicts with reality. Real genomes show characteristics of change. Real genomes are dynamic. Nearly 45% of the human genome is made up of jumping genes, Transposable Elements (TEs), retrotransposons or mobile elements. The first transposons were discovered in maize (Zea mays), by Barbara McClintock in 1948, for which she was awarded a Nobel Prize in 1983. Transposable elements are relatively short sequences of DNA which can copy themselves and insert themselves randomly elsewhere in the genome. For example, our genome contains some 1.4 million copies of an element of about 300 base pairs (called Alu). Many of these Alu elements are continuing to multiply and insert themselves in new locations in the genome at a rate of about one new insertion per every 100 to 200 human births (59). More recent estimates are 1 Alu insertion for every 20 births in humans (242). Alu elements are the most successful Transposable Elements in the human genome. The human genome also includes more than 500,000 dead copies of L1 Transposable Elements and roughly 100 active L1 copies that are able to spawn new L1s that jump to new chromosomal locations (206). A few short repetitions can be expected by chance, but not more than a million repetitions. A randomly generated genome does not contain such highly repetitious patterns. Senapathy's random genome model fails to explain this pattern. Furthermore, the active mobile elements are not random sequences: they contain open reading frames with genes which allow them to copy, move, and insert themselves.
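The claim that a million repetitions cannot be expected by chance can be made quantitative. The sketch below estimates how often one specific 300-base sequence would occur in a random genome. It assumes independent, equiprobable bases and a round genome length of 3 billion bases (an assumed figure, not taken from the cited sources), and it counts exact matches only, which is a simplification because real Alu copies differ somewhat from one another. The function name is hypothetical.

```python
import math

def log10_expected_hits(genome_length: int, k: int) -> float:
    """log10 of the expected number of exact occurrences of one specific
    k-base sequence in a random genome (independent, equiprobable bases)."""
    positions = genome_length - k + 1          # possible start positions
    return math.log10(positions) - k * math.log10(4)

GENOME_LENGTH = 3_000_000_000   # assumed round figure for the human genome
ALU_LENGTH = 300                # from the text
ALU_COPIES = 1_400_000          # from the text

expected = log10_expected_hits(GENOME_LENGTH, ALU_LENGTH)
print(f"log10(expected exact hits) = {expected:.1f}")
print(f"log10(observed copies)     = {math.log10(ALU_COPIES):.1f}")
print(f"bases in Alu copies: {ALU_COPIES * ALU_LENGTH:,}")
```

Under these assumptions the expected number of chance occurrences is smaller than one by well over a hundred orders of magnitude, whereas the observed copy number exceeds a million. Working with logarithms avoids floating-point overflow for a number as large as 4 to the power 300.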
Unexpected variability in human genomes exists. Not only do we carry different copy numbers of parts of our DNA, we also have varying numbers of insertions-deletions (INDELs) and other major rearrangements in our genomes. There are at least 297 places in the genome where different individuals have different forms of these major structural variations (62). More than 800,000 of the small INDELs map to human genes, including 2,123 small INDELs that mapped to the coding exons of these genes, and more than 39,000 INDELs in the promoter regions of genes (205).
Somatic Genome Mosaicism: But even within an individual, genomic variation exists. When individual cells of an organism have mutations this is called 'Genome Mosaicism'. For example: Somatic mosaicism can be caused by L1 transposition during embryogenesis (315).
Polyploidization: Another dynamic aspect of genomes is polyploidization. Example: about 100 million years ago the genome of a yeast ancestor duplicated, doubling the number of chromosomes from 8 to 16 (222). Most plant species have experienced at least one genome doubling early in their history (223) and about 35% of vascular plant species are recent polyploids ("neopolyploids": having formed since their genus arose) (224). The bread wheat (Triticum aestivum) genome is hexaploid, meaning that it contains 6 sets of chromosomes, which derive from three different diploid genomes (302).
The record-holder is Celosia argentea which is a dodecaploid (twelve sets; 12x). Think about it: all polyploids could not originate in the primordial pond. Only the original (unduplicated) genome could (ignoring all other problems with the theory).
All this is evidence of an ever changing genome and it refutes the idea that genomes arise only once in the primordial pond and are static and immutable. Remarkably, when criticizing Darwinism, he knows about the dynamic nature of the genome, but when constructing his theory of immutable genomes Senapathy forgets the dynamic genome completely.
A DNA sequence is not a genome
13 Mar 13
"We shall see how the random combinations of genes in a primordial pond could lead to the assembly of numerous genomes." (p.9).
But 'genes' or 'genomes' do not exist without a genetic code and the universal genetic code is the strongest argument against independent origin, because truly independent origin predicts different genetic codes for each independently born organism. This prediction fails spectacularly. Before I can explain this, I need to point out a few characteristics of the genetic code.
Summary of the argument of this paragraph
- A DNA sequence is not a genome: a genetic code is required
For protein coding genes a genetic code is required. Without a genetic code a DNA sequence itself is just a meaningless polymer.
The genetic code specifies how DNA is translated into proteins. Given that there are 4 different bases in DNA and that they are read in triplets, it follows that there are 4×4×4 = 64 different triplets. Those 64 triplet codons are translated into 20 amino acids. That is the genetic code. The most remarkable fact is that all living creatures on earth have the same genetic code! One of the most profound questions one can ask is: Why is there a nearly (162) universal genetic code when there are millions of possible ways to connect 64 triplet codons with 20 amino acids? To be precise: 1.5 × 10^84 possible codes (127, p.163). But all these possible code assignments assume 64 triplets of A,T,C,G and the same 20 amino acids. But why assume this? We must distinguish between DNA in general and the genetic code:
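The order of magnitude of the number of possible codes can be reproduced with a short count. The sketch below assumes that a "code" means assigning each of the 64 codons one of 21 meanings (20 amino acids plus 'stop') such that every meaning is used at least once. That this is the definition behind the figure in (127) is an assumption, and the function name is hypothetical.

```python
from math import comb

def count_codes(codons: int = 64, meanings: int = 21) -> int:
    """Number of ways to give each codon one meaning such that every
    meaning is used at least once (inclusion-exclusion over the
    meanings that are left unused)."""
    return sum(
        (-1) ** k * comb(meanings, k) * (meanings - k) ** codons
        for k in range(meanings + 1)
    )

n = count_codes()
print(f"{n:.2e}")   # exact integer arithmetic, printed in scientific notation
```

Under this definition the count comes out at roughly 1.5 × 10^84, in line with the figure quoted above. Python integers have arbitrary precision, so the sum is exact even though the individual terms are far larger than any machine word.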
DNA in general:
- Why only 4 bases while more than 100 nitrogenous bases are possible? (160). Why not use 20 different bases to code for 20 amino acids? (111).
Why are there 2 types of bases: purine and pyrimidine bases? The problem of prebiotic synthesis of the DNA bases is the same for independent origin and evolution, except that life did not start with DNA (but with RNA or something even more simple).
However, the fact that the 4 bases in DNA are common to all life is only a problem for independent origin.
- Life universally uses the same 20 amino acids. Why only 20 amino acids while 150 amino acids are possible in nature? (160). Why exactly those 20 amino acids and not others? (the universal use of the same 20 amino acids points strongly to common descent of all life).
Only 12 of the 20 amino acids can be synthesized prebiotically, the other 8 can only be synthesized by living organisms (208). So independently arisen genomes coding for 20 amino acids cannot function because 8 amino acids are missing.
- Why does the DNA backbone universally consist of deoxyribose and phosphate? And why does RNA use ribose instead of deoxyribose and Uracil instead of Thymine? The sugar deoxyribose is harder to make, and in present-day cells it is produced from ribose in a reaction catalyzed by a protein enzyme, suggesting that ribose predates deoxyribose in cells (260). Alternatives for DNA are chemically possible (256). Senapathy a priori excludes any evolution from simpler bases and sugars to the current ones.
The genetic code:
- Why 3 bases (triplet) as the unit of the genetic code? A code with only two bases and a codon length of 4 (quadruplet codons) is also possible (111).
- Why a commaless triplet code? (126)
- Why a non-overlapping code? (126)
- Why a degenerate code? (redundant coding): why are there 61 meaningful triplet codons? 20 triplets would be enough, so 41 codons are redundant. What is the problem if 41 'codons' are unused? Is it necessary that all 61 codons are used? No, certainly not. Yes, we know now that all 61 codons are used, but the independent scenario must explain this fact.
- Why does the code have polarity? (code must be read in a fixed direction)
- Why choose 3 'stop' codons (but see: elephant in the room!) UAA, UAG, UGA? Why not 1, 2 or 4 'stop'-codons? Genetic codes with only 2 stop codons do occur naturally in eukaryotes (335) and prokaryotes (338), (204). In some ciliates, like Paramecium, only UGA is a stop codon. This changes stop-codon statistics from 3/64 to 2/64 or 1/64. Quite a difference! Senapathy does not give a justification for using 3 'stop-codons'.
Another question: Why is there a strong bias for 'stop' codon usage (frequency) in different species? For example, in Escherichia coli W3110 the codon UAG has a frequency of 0.2 per thousand, compared with 53.1 per thousand for CUG (source). Furthermore, the non-random assignment of the 3 stop codons (they start with U, and two start with UA) conflicts with the random origin theory.
- Why only one, nearly universal, start codon (362) (AUG = MET = methionine)? A start codon is necessary because bases must be read in triplets and the correct base must be identified as the first of the triplet. The start codon must hold throughout the whole genome. Misidentifying the first base of a triplet would result –if translated at all– in different amino acids and consequently in a different protein (just as in the case of a one-base deletion or insertion).
- The degeneracy (redundancy) of the code is not random. The codons for an amino acid are clustered.
This is the structure of the genetic code.
Info: a redundant code is a logical consequence of 64 triplet-codons coding 20 amino acids and 3 stop codons. So, on average an amino acid is coded by 3 codons, the real number varies from 1 - 6. Mostly, the codons for a particular amino acid have the same first two letters. The third letter varies most. This is not random! Furthermore:
- Why does the genetic code minimize the chemical consequences of amino acid mistakes? Optimal code, Steve Freeland (234).
- Neighboring triplets in the genetic code tend to specify biochemically similar amino acids, so that single-nucleotide substitutions rarely lead to radical amino acid replacements (238). This is a non-random code!
- Why is there a codon usage bias? (see also: codon usage database). Some codons for the same amino acid are used very frequently, others rarely. Codon usage should be random in the independent origin scenario.
Info: this is not about the structure of the genetic code, but about how often the codons are used in a genome. Statistically, one would expect that all codons occur in about the same frequency in DNA. This is not so. For example, there are differences in the frequency of occurrence of synonymous codons (139). For the amino acid LEU the most frequently used codon is used 140 times more often than the least frequently used codon (33). In the bacterial kingdom the C+G percentage varies from 25% to 75% (35).
- Why is methionine the starting amino acid in the synthesis of most protein chains if DNA arose spontaneously?
(Initiation Codons, eukaryotes)
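The redundancy figures in the list above (61 sense codons, 20 amino acids, between 1 and 6 codons each) can be checked directly against the standard code table. The following is a minimal sketch, not taken from the source: it builds the standard genetic code from the conventional U, C, A, G ordering and counts the codons per amino acid. The names `build_standard_code` and `degeneracy` are hypothetical.

```python
from collections import Counter
from itertools import product

# Conventional ordering of the standard genetic code table:
# first, second and third base each run through U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_standard_code():
    """Map each of the 64 RNA codons to a one-letter amino acid ('*' = stop)."""
    codons = ("".join(triplet) for triplet in product(BASES, repeat=3))
    return dict(zip(codons, AMINO_ACIDS))


def degeneracy(code):
    """Count how many sense codons specify each amino acid."""
    return Counter(aa for aa in code.values() if aa != "*")


code = build_standard_code()
counts = degeneracy(code)

print("codons in table:", len(code))
print("sense codons:", sum(counts.values()))
print("amino acids:", len(counts))
print("codons per amino acid:", sorted(counts.items(), key=lambda kv: kv[1]))
```

In this table methionine and tryptophan have a single codon each, while leucine, serine and arginine have six, which is the 1 to 6 range mentioned above.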
A lot of freedom of choice! If all these variations are chemically possible, why is there only one genetic code? Every Origin of Life theory has to explain this restriction, but:
If organisms originated really independently, there should have been as many variations of the code as independently born organisms. (161)
The reason is: each independent origin is a new trial. The probability is zero that in a few million trials (= the number of species) the outcome would always be the same result, considering the 1.5 × 10^84 possible genetic codes alone. And that is assuming 64 triplets of A,C,T,G and 20 amino acids. Let alone all the variations above. So, independent origin spectacularly fails.
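The text does not say how the number of possible codes was derived. One reading that reproduces the cited order of magnitude is to count every assignment of the 64 codons to 21 meanings (20 amino acids plus 'stop') in which each meaning is used at least once. That interpretation is an assumption, and the function name `count_surjections` is hypothetical. A minimal sketch using inclusion-exclusion:

```python
from math import comb


def count_surjections(n_codons: int, n_meanings: int) -> int:
    """Count assignments of codons to meanings in which every meaning
    is used at least once (inclusion-exclusion over the unused meanings)."""
    return sum(
        (-1) ** k * comb(n_meanings, k) * (n_meanings - k) ** n_codons
        for k in range(n_meanings + 1)
    )


# Assumption: 64 codons, 21 meanings (20 amino acids + stop).
all_assignments = 21 ** 64                    # meanings may be left unused
complete_codes = count_surjections(64, 21)    # every meaning used at least once

print(f"all assignments : {all_assignments:.2e}")
print(f"complete codes  : {complete_codes:.2e}")
```

Under this assumption the count of complete codes comes out at roughly 1.5 × 10^84, in line with the figure cited from (127). Whatever the exact counting convention, the number of alternatives is vastly larger than the number of species.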
- Chemical necessity?
The only possible escape would be that the genetic code is a chemical necessity. In that case the same genetic code would be produced automatically again and again. Is the code a chemical necessity? It now seems likely that only 25% of the codons can be explained by a chemical affinity of amino acid and codon-RNA, and 75% of the codons are arbitrarily assigned (127, p.174). The 25% follows from the universal laws of chemistry and thus will be the expected outcome on any planet. However, 75% of the genetic code must be explained by common descent. That is an inherited and 'frozen' accident.
Independent origin cannot explain universal arbitrary features of the genetic code because there cannot be millions and trillions of trials with exactly the same outcome!
Unless one particular 'frozen accident' is inherited by all descendant species. This is called evolution by common descent and modification. Of course the theory of evolution needs to explain how the genetic code originated, but the fact that every species has the same genetic code can only be explained by common descent. The frozen accident theory may not be the most elegant theory, but at least it is not ruled out by probability theory! Any talk about a 'common pool of genes' destroys a truly independent origin of life.
Conclusion: the universal genetic code is the strongest argument against independent origin I can think of (128).
- Inheritance of the code
Additionally, and crucially, despite being largely the result of an accident, the code is encoded in DNA itself. The encoding of a codon to its amino acid is a result of the aminoacyl tRNA synthetase which adds the aminoacyl group to its allocated tRNA. This embodies the connection between DNA triplets and amino acids (82). The DNA sequence of an organism could be random, its translation cannot be random. The translation must be a fixed key, otherwise it cannot even be called a 'translation'. This fact alone destroys the random origin of a genome. The rest of this page constitutes additional evidence.
- Evolution of the code
"The important point to realize is that in spite of the genetic code being almost universal, the mechanism necessary to embody it is far too complex to have arisen in one blow. It must have evolved from something much simpler. Indeed, the major problem in understanding the origin of life is trying to guess what the simpler system might have been". Francis Crick (1981) Life Itself, p.71 (184).
Is gradual evolution of the genetic code compatible with independent origin? Independent origin of complete eukaryotic genomes must assume the full set of 20 amino acids, because the current genetic code codes for 20 amino acids. Fewer than 20 will prevent the production of functional proteins. It cannot start with a subset of amino acids and later add additional ones (as in evolutionary scenarios). Evolutionary scenarios make the problem easier. For example, Wong proposed the evolution of the genetic code in two phases. In the first phase 6 - 10 amino acids were assigned to 61 codons. The code then expanded by the addition of phase-2 amino acids (145a). Any gradual evolution of the genetic code is incompatible with Senapathy's scenario. Not only the genetic code but also the genome would have to evolve. Both are (and must be) fixed in Senapathy's scenario. So, the burden to explain the abrupt origin of a full-blown 64/20 genetic code rests on Senapathy's shoulders. He did not address this difficulty (see also § 27: PLOS One article). See for other proposals of the evolution of the genetic code: (158).
The Standard Genetic Code contains 18 fragile codons (red) that can be changed into a STOP codon by a single point-mutation and whose mistranscription can therefore generate nonsense errors. The remaining 43 sense codons are "robust" to such errors.
Six amino acids are encoded exclusively by fragile codons ("fragile amino acids", shaded), ten amino acids are encoded exclusively by robust codons ("robust amino acids", unshaded) and four amino acids can be encoded either by robust or fragile codons ("facultative amino acids", hatched shading).
We observe an 8% depletion of fragile codons in single-exon genes in the human genome that is highly significant.
After Brian P. Cusack et al (240)
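The counts in this caption follow from the structure of the standard code and can be reproduced with a few lines of code. The sketch below is an illustration, not the method of Cusack et al.: it marks a sense codon as fragile if any single-base substitution turns it into UAA, UAG or UGA, then classifies each amino acid. The helper names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid, '*' marks a stop codon.
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def single_substitutions(codon):
    """Yield the 9 codons that differ from `codon` at exactly one position."""
    for pos, base in product(range(3), BASES):
        if base != codon[pos]:
            yield codon[:pos] + base + codon[pos + 1:]


def is_fragile(codon):
    """A sense codon is fragile if one point mutation can create a stop codon."""
    return CODE[codon] != "*" and any(
        CODE[neighbour] == "*" for neighbour in single_substitutions(codon)
    )


sense = [c for c in CODE if CODE[c] != "*"]
fragile = {c for c in sense if is_fragile(c)}
robust = set(sense) - fragile


def classify(amino_acid):
    """Label an amino acid by the kind of codons that encode it."""
    codons = {c for c in sense if CODE[c] == amino_acid}
    if codons <= fragile:
        return "fragile"
    if codons <= robust:
        return "robust"
    return "facultative"


classes = {aa: classify(aa) for aa in set(AMINO_ACIDS) - {"*"}}
print(len(fragile), "fragile codons,", len(robust), "robust codons")
for label in ("fragile", "robust", "facultative"):
    print(label, sorted(aa for aa, c in classes.items() if c == label))
```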
Considering all these problems with the spontaneous origin of the genetic code, one wonders why the first genomes did not consist of DNA genomes with only non-coding RNA genes which do not need a genetic code? There are 8,801 small RNAs and 9,640 long non-coding RNAs (lncRNAs) (287) totalling 18,441 RNA genes. One step further is the question:
- Why a DNA genome?
Why are the first genomes made of DNA? What's wrong with RNA genes or RNA genomes? Double-stranded RNA viruses do exist.
RNA genomes would be simpler because they could consist of RNA genes which do not require a genetic code to be present, since no proteins have to be produced.
- A sequence constructed with any 4 'letters' is not DNA
A sequence constructed with 4 different 'letters' is not DNA because the letters must be able to chemically pair specifically and reliably. They must consist of pairs. For example: A must pair with T, C with G. So not any 4 letters will do. A restriction applies which cannot be ignored. Furthermore, they must not only be able to pair, they must be able to separate temporarily for DNA replication. This pairing is completely absent from all computer simulations. Strangely, it seems irrelevant for computer simulations. But it cannot be ignored if one wants to solve the origin of life because DNA cannot exist without pairing. "All organisms, from bacteria to humans, face the daunting task of replicating, packaging and segregating up to two metres (about 6 × 10^9 base pairs) of DNA when each cell divides." (340). See: cell division (below).
- A single-stranded linear sequence of 4 letters is not DNA
A sequence constructed with 4 different letters is not DNA because prokaryotic and eukaryotic DNA is a double-stranded helix. Single-stranded DNA (ssDNA) viruses exist. Double-stranded DNA (dsDNA) creates a problem. In nature usually only one strand of a particular region in DNA (sense strand, or positive sense, coding strand) is translated into proteins. The other non-sense or antisense strand ('non coding strand', 'complementary strand', 'antisense') is not translated. Senapathy's example in figure 1 is a virtual single-stranded DNA sequence and is supposed to be the coding strand. He ignores that both strands could be used as a code. How is it decided what is the coding and what the complementary strand?
In most organisms, the strand of DNA that serves as the coding template for one gene may be noncoding for other genes within the same chromosome (341). Amazingly, there are several protein coding genes encoded in the opposing strand of another gene! (Overlapping genes). These are genes within genes. An example is the gene for neurofibromatosis type I (NF1), which contains 3 genes in the opposing strand of the intron (167). If genomes are random sequences, then Senapathy should predict that genes are located randomly on either strand. If there is an excess of genes on one strand, the hypothesis is falsified. The problem is that his approach is incapable of incorporating all these complications because he uses only single-stranded sequences.
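The pairing rule and the two-strand problem can be made concrete. The sketch below is an illustration under simple assumptions, not Senapathy's procedure: it derives the complementary strand from the A-T and C-G pairing rules and lists all six reading frames (three per strand) that a double-stranded sequence offers. The function names are hypothetical.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read in its own 5'-to-3' direction."""
    return seq.translate(COMPLEMENT)[::-1]


def six_frames(seq: str):
    """Split both strands into codons for each of the three offsets."""
    frames = []
    for strand in (seq, reverse_complement(seq)):
        for offset in range(3):
            usable = (len(strand) - offset) // 3 * 3  # drop incomplete codon
            frame = strand[offset:offset + usable]
            frames.append([frame[i:i + 3] for i in range(0, len(frame), 3)])
    return frames


sequence = "ATGAAATAG"  # toy example, not a real gene
for number, codons in enumerate(six_frames(sequence), start=1):
    print(number, codons)
```

A simulation that scans only the given string covers three of these six frames; the other three exist only because every base has a partner on the opposite strand.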
- A linear sequence of A,T,C,G is not a gene
A sequence of the bases A,T,C,G of arbitrary length is not a gene (or exon) because the length must be a multiple of 3. The reason: the code is a triplet code. This objection is related to the genetic code (above).
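The triplet constraint can be expressed as a simple check. The sketch below is a hypothetical helper, not from the source: it accepts a DNA string as a candidate open reading frame only if its length is a multiple of 3, it starts with ATG, it ends with a stop codon and it has no stop codon in between.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # DNA spelling of UAA, UAG, UGA


def looks_like_orf(seq: str) -> bool:
    """Check the minimal formal requirements for a coding sequence."""
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False  # cannot be read as whole triplets
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    starts_ok = codons[0] == "ATG"
    ends_ok = codons[-1] in STOP_CODONS
    no_internal_stop = not any(c in STOP_CODONS for c in codons[1:-1])
    return starts_ok and ends_ok and no_internal_stop


print(looks_like_orf("ATGAAATAG"))     # whole triplets, start and stop present
print(looks_like_orf("ATGAAAATAG"))    # one extra base breaks the frame
print(looks_like_orf("ATGTAAAAATAG"))  # premature stop codon
```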
- Double-stranded DNA does not form spontaneously 1 Feb 2012
The spontaneous formation of double-stranded DNA has never been observed. Double-stranded DNA is always formed by semi-conservative replication on the basis of a pre-existing double-stranded DNA polymer (Meselson-Stahl, 1958). So, of each double-stranded DNA molecule one strand is inherited as it is from the mother DNA molecule and the other strand is newly synthesized.
- DNA requires a 'DNA synthesizer'
Even if we assume statistics allows for a complete eukaryotic genome in a random sequence of A,T,C,G, then still those DNA sequences must be synthesized from chemical building blocks: bases, deoxyribose, phosphate. It turns out that this is extremely difficult abiotically (83). Enzymes are required: Deoxyribose is generated from ribose 5-phosphate by ribonucleotide reductases. Nucleotides, the building blocks of DNA, have never been produced in any prebiotic synthesis experiment (365). The enzyme dihydrofolate reductase is required for making nucleotides. At the same time enzyme inhibitors prevent DNA synthesis. This is fatal for de novo DNA synthesis. Finally, DNA polymerase III is well-known for its blazing catalytic speed of ~ 1000 base-pairs per second (360), so replicating DNA without any enzymes would take an eternity. Even in the lab it is difficult to synthesize a small eukaryotic chromosome, let alone a complete genome (329), (330). Mainstream science has the RNA world and the pre-RNA world, but these are unavailable in Independent Birth of Organisms.
- DNA synthesis costs energy
"Each building block needs to be activated, or chemically charged, before it can be incorporated into a polymer. Activation requires a preexisting source of chemical energy." (256). "It has been known for nearly 20 years that chromatin assembly is an ATP-dependent process" (146). There are two energetic components: the costs of nucleotide synthesis and the polymerization cost needed to make a DNA or mRNA molecule (188). Finally, DNA synthesis costs more energy than RNA synthesis.
- Even the smallest genome is too long 2 Aug 2012
The genome of the urogenital bacterial parasite Mycoplasma genitalium is 582,970 base pairs long (525 genes), making it one of the smallest genomes of any independently dividing cell – for comparison, the gut bacterium Escherichia coli has 4.6 million base pairs and around 4,200 genes (282). Compare with the smallest eukaryotic genome of Encephalitozoon intestinalis: 2.25 million base pairs. Despite this 'small' genome size, the genome of Mycoplasma genitalium is too long to originate spontaneously (apart from all other requirements for a genome).
- A genome is not an arbitrary collection of genes (1) 6 Aug 2012
Even if all eukaryotic genes are produced abiotically, on statistical grounds it can be excluded that random assembly of those genes will produce a viable eukaryotic genome, let alone a eukaryotic organism. The number of combinations of genes and their interactions that need to be probed (by natural selection!) is astronomically large, so testing them all would take an impractically long time. "It is not feasible to understand evolved organisms by exhaustively cataloging all interactions in a comprehensive, bottom-up manner". For only 10 genes there are already 115,975 possible interactions (283). And then we ignore when (development), where (tissue, organ), at what level genes are expressed and when they are shut down.
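The figure of 115,975 coincides with the Bell number B(10), the number of ways to partition a set of 10 items into non-empty groups. Reading the cited "possible interactions" as set partitions is an assumption, since the counting method of (283) is not given here. A minimal sketch that computes Bell numbers with the Bell triangle (the function name is hypothetical):

```python
def bell_number(n: int) -> int:
    """Return the n-th Bell number using the Bell triangle.

    Each new row starts with the last entry of the previous row; every
    further entry is the sum of its left neighbour and the entry above it.
    The first entry of row n is the Bell number B(n).
    """
    row = [1]
    for _ in range(n):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


for genes in (5, 10, 20):
    print(genes, "genes:", bell_number(genes), "ways to group them")
```

The count grows faster than exponentially with the number of genes, which is the point of the argument above.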
- A genome is not an arbitrary collection of genes (2) 23 Jun 2016
"Animals are more than the sum of their genes – it is the regulated expression of genes across space and time that helps to differentiate egg from embryo, leg from wing or bat from fly.". (356)
- A genome is not an arbitrary collection of chromosomes 29 Sep 2012
Aneuploidy is a wrong number of chromosomes in a cell. Carcinogenesis has been shown to be initiated by random aneuploidy (292). Aneuploid embryos usually die.
- RNA genes are ignored 24 Sep 2012
Genes that code for proteins have start and stop codons and are translated into proteins following the Genetic Code table. However, by mid-2009 evidence for at least 6000 human RNA genes had been obtained. RNA genes are difficult to identify using computer programs: there are no open reading frames (ORFs) to screen for (274, p. 262). Senapathy's search for genes depends on Open Reading Frames (sequences without STOP codons). This method fails to find RNA genes. The ENCODE project found 18,441 RNA genes (290). RNA genes that were known when Senapathy wrote his book are Transfer RNA (tRNA) (mentioned on page 556) and Ribosomal RNA (rRNA) genes (not mentioned).
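Because the ORF approach rests on the frequency of stop codons, it is sensitive to how many codons count as 'stop'. The sketch below is an idealized model, assuming all four bases are equally likely and independent (an assumption, not a property of real genomes). It gives the chance that a codon is a stop, the mean number of sense codons before the first stop, and the chance that a stretch of a given length contains no stop at all. The function names are hypothetical.

```python
from fractions import Fraction


def stop_probability(n_stop_codons: int) -> Fraction:
    """Chance that a random codon is a stop codon (64 equally likely codons)."""
    return Fraction(n_stop_codons, 64)


def mean_sense_run(n_stop_codons: int) -> Fraction:
    """Mean number of sense codons before the first stop (geometric model)."""
    p = stop_probability(n_stop_codons)
    return (1 - p) / p


def prob_open_frame(n_codons: int, n_stop_codons: int) -> float:
    """Chance that n_codons consecutive random codons contain no stop."""
    return float((1 - stop_probability(n_stop_codons)) ** n_codons)


for stops in (3, 2, 1):
    print(
        stops, "stop codon(s):",
        "mean run =", float(mean_sense_run(stops)), "codons;",
        "P(200 codons open) =", prob_open_frame(200, stops),
    )
```

Under this model, going from three stop codons to one roughly triples the expected length of a random open reading frame, so the choice of code changes the outcome of any such simulation.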
- A DNA sequence is not a chromosome
Animal and plant cells do not contain naked DNA. For example, the diploid human genome contains 6 billion base pairs of DNA per cell, packaged into 23 pairs of chromosomes. Because each base pair is around 0.34 nanometers long (a nanometer is one-billionth of a meter), each diploid cell therefore contains about 2 meters of DNA. This creates the DNA Packaging Problem (supercoiled DNA): How is all of that DNA packaged into chromosomes and into the nucleus (320), and how is it unpackaged and unwound in order to read genes?
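The two-meter figure follows directly from the numbers in the paragraph above. A minimal sketch of the arithmetic, using only the base-pair count and base-pair length quoted there (the constant and function names are hypothetical):

```python
BASE_PAIR_LENGTH_M = 0.34e-9   # about 0.34 nanometers per base pair
DIPLOID_BASE_PAIRS = 6e9       # about 6 billion base pairs per diploid cell
CHROMOSOMES_PER_CELL = 46      # 23 pairs


def naked_dna_length_m(base_pairs: float) -> float:
    """Length of fully extended DNA in meters."""
    return base_pairs * BASE_PAIR_LENGTH_M


total = naked_dna_length_m(DIPLOID_BASE_PAIRS)
per_chromosome = total / CHROMOSOMES_PER_CELL

print(f"total DNA per cell : {total:.2f} m")
print(f"average chromosome : {per_chromosome * 100:.1f} cm")
```

The per-chromosome value is only an average; real chromosomes differ considerably in size.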
Human chromosomes under a scanning electron microscope. ©Nature2017
"A DNA sequence isn't enough; to understand the workings of the genome, we must study chromosome structure. Far from being the random result of packing 2 metres of DNA into a sphere perhaps 10 micrometres across, the structures vary across cell types and exert an as-yet-mysterious influence on gene expression." (176).
"We usually think of genomes abstractly as one-dimensional entities that are purely defined by their linear DNA sequences. Reality, of course, is far more complex. The DNA helix is folded hierarchically into several layers of higher-order structures that eventually form a chromosome" (177).
"Comparing the length of metaphase chromosomes to that of naked DNA, the packing ratio of DNA in metaphase chromosomes is approximately 10,000:1 (depending on the chromosome). This can be thought of as akin to taking a rope as long as a football field and compacting it down to less than half an inch. This level of compaction is achieved by repeatedly folding chromatin fibers into a hierarchy of multiple loops and coils." (320)
A DNA sequence is not a chromosome because important chromosomal structures must be present:
At least these structures are present in linear chromosomes in eukaryotes, not in the circular chromosomes of most bacteria. Although linear chromosomes probably have advantages, they come with problems: replication of linear chromosomes presents a problem as it leads to gradual loss of the terminal regions, the telomeres.
telomeres (white dots), centromeres (red dots)
(Science 22 April 2011)
- Telomeres are structures at the ends of chromosomes that contain a series of non-coding DNA repeats, and which become shorter themselves but protect the coding regions from damage. Human telomeres are several kilobases of repeated sequences of DNA bound by specialized protective proteins. A peculiarity of the DNA-replication mechanism causes telomeres to shorten as cells divide. Sometimes the enzyme telomerase can replenish the lost DNA. If telomeres get too short, through aging or because telomere maintenance goes awry, cells can stop dividing. The protection conferred by telomeres is a fundamental biological mechanism present in nearly all animals and plants (119). 'Independently born' organisms (if they exist) have at best telomeres with a random length, simulating cells and chromosomes of random age –from old, average to young. That means many will not be able to complete replication.
- Centromeres are defined as pieces of non-coding repeated DNA with variable DNA sequence that function as the site of spindle attachment at cell division (mitosis and meiosis). They are essential for equal chromosome separation during cell division. Remarkably, centromeres are inherited epigenetically. That means DNA sequence does not determine the identity of the centromere. This implies that ultimately the centromere is inherited from a previous generation. There is no previous generation in the Independent Birth of Organisms hypothesis.
- Histones are proteins found in eukaryotic cell nuclei that package and order the DNA into structural units called nucleosomes. Without histones human DNA could not fit in the nucleus because it would be too long. DNA without histones could certainly not enter mitosis. Histones are highly conserved in eukaryotes, so random origin is virtually excluded. Histones have also a function in gene regulation. Along with histones come more than 20 histone modifiers, chaperones and other regulators. Important question: Where do histones come from in the independent origin scenario? It does not help that histones are encoded in the DNA, because how are they supposed to be transcribed into mRNA and translated into protein? (see below)
- Nucleosomes consist of DNA wrapped two times around small, globular histone octamer particles. They form the fundamental repeating units of eukaryotic chromatin, which is used together with condensin (359) and topoisomerase II to pack the large eukaryotic genomes into the nucleus while still ensuring appropriate access to it. Already in 1987 a publication appeared pointing out the relation between chromosome structure and gene expression.
"Since 1968, we have learned that DNA wraps around histones, packing ~10^2 base pairs into the 10^-8 m nucleosome. We also know that individual chromosomes occupy distinct subnuclear volumes called chromosome territories which pack ~10^8 base pairs into 10^-6 m" (243).
©Science 28 Jul 2017: ChromEMT: Visualizing 3D chromatin structure
and compaction in interphase and mitotic cells.
- sex chromosomes: eukaryotic genomes come in two forms: male and female genomes. Males have a unique chromosome that females don't have: the Y-chromosome. Because males have only one X-chromosome and females two X-chromosomes, a mechanism is necessary to ensure genes on the X-chromosome are expressed in the right levels. Dosage-insensitive genes are those that function perfectly well when present as a single copy. By contrast, two copies of dosage-sensitive genes are required for normal health. (dosage compensation, X-inactivation).
- Linear chromosome: Here is the fundamental problem for the theory of independent origin: why should the most complex chromosome type, the linear chromosome, with all its problems, arise spontaneously instead of the circular chromosome? For example: why don't humans have 46 circular chromosomes? Or why don't humans have one big circular or non-circular chromosome? On the other hand, evolution theory needs to explain the evolution of the linear chromosome (120).
- A DNA sequence is nothing without proteins:
Ribosome Recycling Factor,
DNA repair enzymes,
splicing regulatory proteins,
Here is the paradox in a nutshell:
To produce a protein from DNA, specific proteins are needed. It does not help that those proteins are encoded in DNA. Surely, every necessary protein could be encoded in DNA. The point is that DNA needs to be transcribed and translated (250). That requires enzymes to be present in the first place. 'The central dogma of molecular biology' states that genetic information encoded in DNA is transcribed to mRNA (by RNA polymerases (321)), and mRNA is translated to protein (by ribosomes). Since RNA polymerase is a protein itself, it needs to be present before it can be produced (275). This is certainly impossible for the first 'independent organism'. This alone is fatal for the theory of independent origin. This is why the origin of any organism from pure DNA is impossible! This is also why scientists concluded that there must have been an RNA world before a DNA-protein world. This is also why even giant viruses with all the genes for mRNA synthesis, etc. still depend on living cells for their reproduction (See: here).
Here are three examples:
A DNA segment is copied into RNA by RNA polymerase. Gene expression is jointly controlled by two classes of protein: transcription factors and epigenetic regulators. Transcription factors act by binding directly to DNA, whereas epigenetic regulators can influence gene expression in various ways, for example by altering histones. There are more than 1000 transcriptional activator proteins encoded in the human genome. Even in the simplest well-studied eukaryotes (yeast) there are perhaps 200 activators (228). 11 Oct 11. See: Transcription.
"Translation is probably the most complex biochemical cellular process, needing more than 120 different molecular elements ranging from messenger RNA to ribosomes and their many protein and RNA accessories. According to Fraser and coworkers, even the smallest of cellular organisms (Mycoplasma genitalium) need a minimum of 90 different proteins for translation and about 30 for DNA replication." (185), (250).
The eukaryotic ribosome is composed of 79 ribosomal proteins. "More than 200 assembly factors and small RNAs are needed to synthesize ribosomes in the nucleolus. Ribosomes are absolutely essential for life, generating all cellular proteins required for growth. Complete loss of any single ribosomal protein often leads to death of the embryo in mice" (317). See: Translation.
Every cell division requires DNA replication. DNA does not 'self-replicate'. Replication enzymes (DNA polymerases) are required. At least 15 DNA polymerases operate in human cells. Human DNA polymerases are 900-1000 amino acids long (2700 - 3000 base pairs). The minimum set of proteins required to initiate DNA replication in eukaryotes (Saccharomyces cerevisiae) is 16 proteins (351). "The work of a generation of biochemists, notably Arthur Kornberg, has shown that it takes dozens of protein complexes, each involving many proteins to accomplish this [replication]. They can be thought of as complex components of several giant molecular machines, which synthesize the new DNA, check it for errors, and pass it on for further interactions which package it in chromosomes." (264). The polymerases responsible for replicating nuclear DNA are at least 100-fold faster and nearly 1,000-fold more accurate than polymerase η which is involved in DNA repair caused by ultraviolet light (278). See: Replication.
- A DNA sequence is nothing without RNA 18 Feb 12
DNA can do nothing without RNA. In fact, DNA cannot even replicate without the prior formation of an RNA primer (270, p.66).
- DNA specifies amino acids but does not synthesize amino acids 1 Feb 12
DNA specifies the sequence of amino acids in proteins but does not synthesize amino acids.
So, when there are no amino acids available (for example essential amino acids), protein synthesis is impossible.
To synthesize 'non-essential' amino acids the cell needs specific enzymes. Those enzymes can only be produced when the right amino acids are available. This is because enzymes are a sequence of amino acids.
- DNA sequence is not sufficient to produce proteins 13 Jul 11
About one quarter to one third of all proteins require metals to carry out their functions (metalloproteins: iron, copper, magnesium, cobalt, zinc, molybdenum, vanadium, manganese, nickel, selenium). For example, iron and copper are present in virtually all enzymes and in some proteins that interact with oxygen (215). However, metals are not coded by DNA! So, protein specification by DNA is incomplete. Furthermore, most of the Mg2+ in a cell is bound to DNA, to RNA, to the cellular energy carrier ATP or to enzymes, and acts as an essential cofactor for these molecules. Mg2+ is not in the Sequence.
- The Transcription Code 18 Feb 11
The genetic code is not the only code. The processing of the information of DNA starts with transcription, which requires:
Where does the Transcription Code come from? It is not random. Furthermore, there is a transcription factor binding code within (!) the protein coding part of a gene (324).
- each gene has its own promoter element (contains a conserved gene sequence called the TATA box) and enhancer element(s)
- initiation codon: AUG is an initiation codon or start codon. The next gene does not start right after the stop codon, so an initiation codon is needed. Senapathy ignores this.
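- termination codon: UAA is a stop codon. Strictly speaking it ends translation of the mRNA; transcription itself is ended by a separate terminator signal.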
- The Translation Code 30 Mar 12
These are all nonrandom sequences and therefore lower again the probability of spontaneous origin.
- Kozak consensus sequence is a non random sequence which occurs on eukaryotic mRNA and plays a major role in the initiation of the translation process.
- ribosomal binding site (RBS) is a non random sequence on mRNA that is bound by the ribosome when initiating protein translation.
- Internal ribosome entry site (IRES) is a nucleotide sequence that allows for translation initiation in the middle of a messenger RNA (mRNA) sequence as part of the greater process of protein synthesis.
- The Histone Code 4 Oct 11
The histone code is a hypothesis that the transcription of genetic information encoded in DNA is in part regulated by chemical modifications to histone proteins, primarily on their unstructured ends. Together with similar modifications such as DNA methylation it is part of the epigenetic code. (see Epigenetics below)
- The Splicing Code 18 Feb 11, 9 Mar 11
The very existence of introns requires splice site recognition: the border between intron and exon. So, Senapathy simply assumes splicing machinery when talking about introns and exons. This is the splicing code. Where does it come from? How is it done? Why that specific code? Why the universal splicing code? The splicing sequence is in DNA, but that is only 'a code' if specific proteins (splicing regulatory proteins) recognize that sequence. So, it does not solve the problem of the origin of split genes to search for splicing codes in random DNA. Of course one will find it in random DNA. The point is, however, that the splicing code is not random, so where does it come from? The spliceosome is a very complex ensemble of five snRNAs and about 200 proteins, so cannot arise from scratch. Furthermore, alternative splicing implies different proteins in different cell types. Furthermore, it has been shown that more than 20,000 unique Single Nucleotide Variants, SNVs, likely affect splicing (349). In addition to splicing, eukaryotes possess elaborate mRNA surveillance mechanisms, in particular nonsense-mediated decay (NMD), to assure that only correctly processed mature mRNAs are translated (170).
There are two splicing codes: exonic splicing sites and intronic splicing sites. The exonic splicing sites imply a dual code because a piece of DNA encodes a protein and a splicing signal at the same time (303). There may be a third function: nucleosome positions bias certain synonymous codons.
All this effectively blocks the origin of splicing from scratch.
- The Poly(A) Code 18 Feb 11
The information in DNA is modified before it is used. No equivalent to poly(A) or the caps are in DNA and these are added to the mRNA (Polyadenylation). Therefore, a DNA sequence is not enough. The mRNA is modified. This is crucial for exporting mRNA from the nucleus to the cytoplasm. Alternative polyadenylation can also shorten the coding region, thus making the mRNA code for a different protein. Again: the information in DNA is not enough.
- The Epigenetic Code: imprinting
"The major problem, I think, is chromatin. What determines whether a given piece of DNA along the chromosome is functioning, since it's covered with the histones? What is happening at the level of methylation and epigenetics? You can inherit something beyond the DNA sequence. That's where the real excitement of genetics is now."
(James D. Watson: 30).
When the human genome was first fully sequenced, it was often described as the recipe for making a person. In reality, the genome is more like an entire cookbook that can produce hundreds of different cell types depending on which genes are switched on and off. That switching is accomplished using a vast suite of epigenetic marks (284).
Epigenetics is defined as the chemical modification of DNA that affects gene expression but does not involve changes to the underlying DNA sequence. As the emphasis in biology is switching away from 'The Sequence' and towards the mechanisms by which gene expression is controlled, epigenetics is becoming increasingly popular (104). Cell differentiation is associated with selective DNA methylation.
"Imprinting reflects competition between a mother's interests and a father's when it comes to gestating the offspring. A mother wants a fetus that doesn't grow too big, so she can survive the pregnancy. A father wants the opposite: a fetus that becomes a strapping baby and, later, a strapping adult who hoards resources and spreads his genes to new progeny. Essentially, imprinting means that in some places along the human genome –about 100 genes in all– the way DNA behaves depends on which parent passes it to the offspring." (357). This is all absent when a genome hypothetically originates de novo from a primary pond.
'Writers' add chemical marks to the DNA or to the histone proteins that DNA wraps around. ©Nature (285) (modified)
Epigenetic processes are essential for packaging and interpreting the genome, are fundamental to normal development and are increasingly recognized as being involved in human disease. Epigenetic mechanisms include, among other things, histone modification, positioning of histone variants, nucleosome remodelling, DNA methylation, small and non-coding RNAs.
(Nature, 7 Aug 2008).
Much of a cell's identity is determined by modifications to chromatin, which comprises DNA and the proteins that bind and package it. Epigenetic instructions, in the form of chemical marks that cling to chromatin, tell cells how to interpret the underlying genetic sequence, defining a cell's identity as, say, blood or muscle. The marks serve as instructions that are passed down as cells divide, providing a sort of cellular memory to ensure that skin cells beget other skin cells (285).
Methylated cytosine (5-methylcytosine), often referred to as DNA's fifth base, makes up a subset of nucleotides in the mammalian genome (121). It can regulate tissue-specific gene transcription, without affecting the genetic blueprint. Cytosine methylation may function as a memory module of cell identity and developmental state.
Two epigenetics examples:
DNA of human sperm is highly methylated and that of eggs moderately so. There is a massive loss of DNA methylation from most of the zygote genome immediately after fertilization in human embryos: the erasure of epigenetic memory (337). However, specific DNA methylation appears to be obligatory in plants and vertebrates (eukaryotes!) (298). That's the end of independent origin theory. Furthermore, DNA methylation is used for repressing expression and preventing further expansion of repetitive DNA elements.
- DNA methylation is essential for the survival of the embryo. Two studies of mouse embryogenesis now show that transmission of DNA methylation from gametes is predominantly maternal. Mouse embryos need maternal imprints for normal development.
- FBHM (Familial Biparental Hydatidiform Mole) is a recessive disorder in humans that results in repeated pregnancy loss due to a failure to establish maternal imprints at multiple loci throughout the genome (227).
- A DNA sequence does not survive
"Any theory postulating that genes [!] may have emerged randomly and then waited to be used are fundamentally wrong, especially
in a world dominated by the deleterious effects of the second law of thermodynamics. Genes had to have a functional meaning
from the very beginning or they would have vanished soon after they emerged." (53).
Please note, this applies to genes. Let alone to genomes. Furthermore, phenotypic effects of DNA are ignored: RNA-editing changes the sequence, and so the phenotypic effect. With no RNA-editing in the primary pond, the DNA sequence would probably not survive for this reason alone.
- 1 Aug 2014 A DNA sequence is not chemically stable outside a living cell
DNA is stable enough to function as a carrier of heredity. However, DNA degrades naturally after an organism dies: it breaks down into smaller fragments. Environmental factors such as sunlight, heat, and humidity can increase the rate of degradation. In addition, cytosine deamination takes place. DNA outside a living cell has a half-life of 521 years. That means that after 521 years, half of the bonds between nucleotides in the backbone of a sample would have broken; after another 521 years half of the remaining bonds would have gone; and so on (339). This has negative implications for the independent origin of DNA.
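The half-life arithmetic can be made explicit with a short sketch. It assumes simple exponential decay at a constant rate, using the 521-year figure quoted above; the function name and the chosen time points are illustrative, and real decay rates depend on temperature and other conditions.

```python
HALF_LIFE_YEARS = 521  # figure quoted in the text (339)

def fraction_intact(years, half_life=HALF_LIFE_YEARS):
    """Fraction of backbone bonds still intact after `years`,
    assuming simple exponential decay (an idealisation)."""
    return 0.5 ** (years / half_life)

for years in (521, 1042, 10_000, 100_000):
    print(f"{years:>7} years: {fraction_intact(years):.3g} of bonds intact")
```

Under this assumption only about two bonds in a million survive 10,000 years, which is short on the time scale that a primordial pond scenario would need.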
- 7 Oct 2015 A DNA sequence is not chemically stable inside a living cell
In the early 1970s, scientists believed that DNA was an extremely stable molecule, but Tomas Lindahl demonstrated that under physiological conditions DNA decays at a rate that ought to have made the development of life on Earth impossible. This insight led him to discover a molecular machinery, base excision repair, which constantly counteracts the collapse of our DNA. Before Lindahl, nobody really considered the idea that DNA requires active engagement by a set of housekeeping processes to keep it in a stable state (353).
- A DNA sequence is nothing without a nucleus
The defining property that sets eu-karyotic cells apart from pro-karyotic cells is the nucleus (wiki).
"Genomes are more than linear sequences. We usually think of genomes abstractly as one-dimensional entities that are purely defined by their linear DNA sequences. In addition to the complex arrangement of the genetic information itself, the cellular factors that read, copy, and maintain the genome are organized in sophisticated patterns within the cell nucleus. Specific nuclear processes such as transcription and replication occur at spatially defined locations in the nucleus." (177), (178).
- A DNA sequence is nothing without a cell
Remarkably, in the Appendix Senapathy knows:
"Usually, an organism starts its growth from a single cell" (p. 536)
He seems to forget this important fact in his theory. A sequence of DNA is useless without a cell: membrane, mitochondria, ribosomes, nucleus, nuclear membrane, centrosome, ATP (energy). (See further details: par 7 Genome-centered approach).
- Cell division (mitosis): from the first animal or plant genome a body must be created. That means millions of cell divisions. Chromosomes do not only create an organism, they are duplicated just before each cell division. "Chromosome segregation must be executed with high fidelity so that the mother cell and the daughter cell that arise from division receive precisely the same DNA content". Otherwise aneuploidy will result. Comparing the length of metaphase chromosomes to that of naked DNA, the packing ratio of DNA in metaphase chromosomes is approximately 10,000:1 (Nature). Therefore, central to the problem of segregation is the issue of packaging (Nature). Mitosis is a complicated process in which about 625 genes are involved (Nature, 2010). In 2017 the number of genes known to have a role in cell division was 1295 (361). The spindle apparatus is the structure that separates the chromosomes into the daughter cells during cell division. Without a spindle there is no cell division. The kinetochore is a complex machinery composed of more than 100 proteins through which chromosomes attach to the microtubules that form the spindle apparatus, which allows chromosome segregation (see above: 'A DNA sequence is nothing without proteins'). Proteins involved: separase, cohesin, etc.
- Embryonic Genome Activation:
A human develops from a single cell, a fertilized egg. Only on the third day, at the 8-cell stage, does the human embryo start reading its genome (Embryonic Genome Activation). Before that moment transcripts can be detected in the embryo, indicating maternal deposition. But that is impossible for 'independently born organisms', because they do not have a mother and consequently no maternal RNA (352). 11 Jun 2015
- Synthetic genomes. Lessons from synthetic genomes: the first synthetic genome (a bacterial genome) was created from scratch in the lab by Craig Venter in May 2010.
- The first lesson: this work didn't create a truly synthetic life form, because the genome was put into an existing cell (129). (the original genome was carefully removed before it received the new genome).
- The second lesson: it is very difficult to create a 1-million-base genome. Blue Heron (The Gene Synthesis Company) has mastered the art of synthesizing relatively long, entirely accurate sequences, and stringing them together to create gene-sized fragments on the order of hundreds to thousands of bases. However, the human genome is 3000 times bigger! To assemble a one-million-base genome from 1,000-base pieces, the process involved, according to Venter, "invention after invention after invention of new ways to do things" and "There were literally thousands of hurdles that had to be overcome" (201).
- The third lesson: errors! Initially the attempt failed because the artificial genome failed to take control of the cell. The cause was a single-base mistake which delayed the project 3 months (129).
- A DNA sequence is nothing without microbes: "Animals grow up under the influence of their microbes, not just the blueprints encoded in their genomes. Microbes play a role in development. The bodies and immune systems of animals ranging from tsetse flies to mammals mature properly only after exposure to bacteria. The larvae of some marine worms metamorphose into adults only when they encounter bacterial molecules." (350). 15 Jan 2015
- A DNA sequence is nothing without an organism: parts of the DNA sequence (genes) must be expressed at the right time and place in the right quantities in order to develop an adult from one cell and maintain life as an adult. To achieve this, proteins that control such processes have to bind to specific places in the genome. There is evidence that ageing entails a gradual drift towards more random patterns of gene expression, which might cause organ/tissue failure. There are millions of ways to express genes at the wrong time, place, quantity or sex. Furthermore:
- other genes in the genome influence the function of a gene
- the function of the gene product must be in accord with the laws of biochemistry (energy production, protein folding, catalysis, prevention of protein aggregation). Some proteins cannot fold without the help of other proteins, called chaperones, so that means: genes cannot function in isolation, they need other genes.
- DNA must be protected from damage: right from the start DNA must be protected from damage (mutation). Indeed: a very complicated DNA-repair system exists. Furthermore, DNA must be faithfully duplicated (replication) with the help of specialized enzymes. It is highly improbable that they arose by chance.
- a sequence of DNA is meaningless without the correct ecological context, that is biological and nonbiological (the physical conditions of the earth: temperature, atmosphere, climate, gravity, water). See: The genome is blind.
- The interaction and interdependency of nuclear and mitochondrial genome: Interactions between the nuclear genome and mitochondrial DNA are essential for proper cellular functioning, but incompatibilities between the two can lead to compromised development and fitness. Despite having their own genomes, mitochondria don't make many of their own proteins; most are synthesized in the cytosol by cellular equipment encoded in the nucleus. Thus, the interactions of mitochondrial and nuclear DNA are critical to cellular life (318) 26 Sep 2013.
- A DNA sequence is not an organism: a eukaryotic organism contains multiple separate genomes. See: separate page. Explaining the nuclear genome is not enough.
- A computer generated DNA string is a virtual thing, see: here.
- The limits of a Genome-centered approach: § 7
The problem of the origin of life can be expressed in different ways:
- origin of DNA: a chemical problem
- origin of genomes: a biological problem
- origin of information: a mathematical problem (statistics)
The concept of information is the most abstract. There is no free-floating information in nature. It is always embedded in genomes. But the 'genome' concept is also abstract: there are no free-floating genomes in nature. Genomes are always embedded in cells and organisms. Organisms are always part of populations, and populations form a species. Species are always embedded in ecosystems. Ecosystems are geographically and historically located.
Senapathy merges the problem of the origin of life (an unsolved problem) with the origin of species (solved in principle by Darwin). Senapathy knows that Darwin did not address the question of the origin of life in his Origin of Species (p.199). Following Darwin, evolutionary biologists focus on the origin of species, and leave the origin of life to specialists (chemists). When evolutionary biologists study the evolution 'from microbe to man', they are not handicapped by their ignorance of the origin of life. However, Senapathy has to solve both problems at the same time! That means he cannot ignore the origin of the genetic code. He has to analyse what the very problem of the origin of life is (see also paragraph What is Life?). Today scientists argue that an RNA-world preceded the DNA-world, mainly because DNA does not self-assemble prebiotically.
Senapathy reduces the problem of the origin of life to the origin of a genome. Next he reduces that problem to finding the sequence or 'the origin of biological information'. Next he reduces that problem to a statistical problem. What we see is a stepwise narrowing down of the original problem. In the end he claims to have solved the original problem, while he has only tried to solve an extremely restricted form of it. Neither the origin of life nor the origin of genomes has been solved. The most interesting problem is: what genes are required for a successful genome?
The origin of life is a chemical problem and the origin of species is a biological problem. I am afraid that Senapathy thinks that as soon as he has solved finding the Sequence, he has solved all the problems of biology! However, a sequence without cellular machinery is like software without a computer! (333). Please note Senapathy uses computers for his research.
The genome-centered and information-centered approach
Senapathy's thinking is gene-centered, genome-centered and information-centered. Gene-centered means that the most important elements in genomes are genes, and the rest of the DNA, including introns, is junk. Information-centered means that the sequence of 4 symbols is the most important aspect of genomes. Here is an example of genome-centered thinking:
“Thus, the genome is the master of the cell and the organism.” (p.551, Appendix - Genetics Primer)
Senapathy is not alone. Recently Richard Dawkins wrote: “Replicators are the units that survive (or fail to survive) through the generations. Vehicles are the agents that replicators programme as devices to help them survive. Genes are the primary replicators, organisms the obvious vehicles.” (291) and this can be found on the pages of Nature: “DNA is famous as the instruction manual of life — the multi-billion-base-pair data tape that directs how a fertilized egg turns into the specific cells, tissues and organs” (294). Computational genomics researcher Eugene Koonin is also very genome- and sequence-centered (358), but that is the only similarity between them and Senapathy.
DNA does not produce Life
The pervasive effect of the discovery by Watson and Crick of the double-helix structure of DNA, and of the Central Dogma, is the genome-centered view of life.
However, DNA only codes for protein. DNA does not create carbohydrates, fats, energy, nutrients. Energy and nutrients are external to DNA and the cell. DNA may encode a cell's potential, but the RNA molecules present dictate the activities that define a cell's state at any particular moment (252). Introns and splicing are central to Senapathy's theory. Splicing occurs at the level of RNA, not DNA.
Senapathy's approach is genome-centered and information-centered. Life is first reduced to genomes and then DNA is reduced to information which a computer can handle. The Central Dogma is indeed central to biology, but that by no means implies that it was involved in the origin of life. DNA does not produce Life. A related problem is genetic determinism. Niche-construction theorists, like developmental biologists, view phenotypes (and hence their environmental modification) as underdetermined by genes (247). A phenotype cannot be predicted from a genotype.
A quick glance at the internal structure of the eukaryotic cell is enough to see the limitations of the genome-centered approach:
A few necessary cellular components: The ribosome is an RNA-protein complex performing protein synthesis in all living cells. The emergence of the ribosome constituted a pivotal step in the evolution of life. This event happened nearly four billion years ago. The centrosome (55,56) is inherited from a mother cell. Upon cell division, each daughter cell receives one centrosome. The mitochondrion is inherited from the mother(!) and there is an interdependency of mitochondrial genome and nuclear genome. Where does the mitochondrion come from in Senapathy's theory? How could the interdependency be explained?
(see wikipedia for explanation)
According to Senapathy, p.552
Please note: except mitochondria, there are no internal structures in the cell and no indication of DNA in mitochondria.
See: Endosymbiosis theory refutes Senapathy, § 28 and 30.
A DNA sequence with introns needs splicing sites, the splicing out of introns requires a spliceosome (229), which uses five small nuclear RNAs and hundreds of proteins.
Senapathy has a gene- and genome-centered approach to explaining life. Probably, that was the common wisdom in 1994 when he wrote his book. Indeed, genes determine most of the differences between humans and other species. But Senapathy does not know its limitations when explaining the origin of life. The big mistake is to believe a naked genome is technically capable of creating a human being. This is succinctly described by the historian Jan Sapp:
"Critics of gene theory continue to emphasize that only a cell can make a cell, and that plant and animals emerge from eggs, not genes" (38)
Of course the cell must be equipped with a genome. Microbiologist Carl Woese wrote that the
"strange claim by some of the world's leading molecular biologists that the human genome is the holy grail of biology is a stunning example of a biology that has no genuine guiding vision"
It is an unremarkable fact of biology that all animals start life as a single cell and that animals and plants have two complete sets of chromosomes (diploid). In this context however, it is a highly significant fact, because that single cell resulted from the fusion of two haploid cells and the two sets of chromosomes directly came from two parents. In Senapathy's theory, there are no parents! William Harvey said: Omne vivum ex ovo: 'All life from eggs'.
"One of the main arguments I will make in this book is that structures resembling microscopic soap bubbles were an absolute requirement for life to begin, as essential to the process as the assembly of genes and proteins". (David Deamer (2011) 215, p.3.)
Life is cellular. So, any theory trying to explain the origin of life needs to explain the origin of the cell. A cell is defined by a membrane. A membrane is made neither of DNA nor of proteins, but of phospholipids. Phospholipids are the result of a long evolutionary process, and their synthesis requires enzymatically catalyzed reactions that were not available for the first forms of cellular life (86). Contemporary phospholipid-based cell membranes are formidable barriers to the uptake of amino acids, metal ions, etc. Modern cells therefore require sophisticated protein channels and pumps to mediate the exchange of molecules with their environment (87). That is a perfect and sufficient reason why modern animals and plants cannot arise out of a primordial pond, not even single-celled organisms. Senapathy did not give any reason why his primordial pond would not be filled with random DNA sequences until the primordial pond became depleted of DNA building blocks. The process would stop there. All Senapathy has to say is "and the membranes that surround the cell were also available" in the primordial pond (p.308). His theory says they were available! That is his 'explanation'. He has no explanation. So, we can forget about introns and exons. The membrane is a crucial argument against independent origin of present-day life.
What is life?
The genome-centric view of life largely ignores the question: what is life? See: § What is life?
Some criticisms of genome-centered view in the scientific literature
It is true that some genomics researchers still believe that "The human genome encodes the blueprint of life" (290), but most disagree. Here follow a number of illustrative quotes from books and articles expressing the idea that the genome is not enough (please note most appeared after Senapathy's book):
Not by genes alone
Do genes code the organismal form? Not quite, says evolutionary biologist Massimo Pigliucci:
"Genes by themselves do literally nothing. Organisms do not begin with a bunch of genes that generate everything else:
they need a set of environmental conditions, as well as the inheritance of materials and extra-genetic information from the
previous generation. From the point of view of causal analysis, genes may be said to be a necessary but far from sufficient
condition for the development (and evolution) of organisms."
(Original link: life.bio.sunysb.edu/ee/pigliuccilab/bookclub/ does not exist anymore.)
Not by DNA alone
"Today, the view that biological information is transmitted from one generation to the next by the DNA sequence alone appears untenable.
There is increasing awareness that non-genetic information can also be inherited across generations."
Étienne Danchin et al (2011) Beyond DNA: integrating inclusive inheritance into an extended theory of evolution, Nature Reviews Genetics 12, 475-486 (July 2011).
A DNA sequence is meaningless
"A DNA sequence, by itself, is meaningless. The information in the double helix is interpreted through its interactions
with the rest of the cell."
Barton et al (2007), EVOLUTION, p.381.
A DNA sequence is an archive
"An organism is not a linear script in a DNA language we have learned to read. In fact, such a simplification is a shocking distance from the truth."
"Without RNA, a cell would be all archive and no action".
Michael Yarus (2010) Life from an RNA World, Harvard University Press, p. 97.
A DNA sequence is dead
"Despite its obvious importance to life, biological energy receives far less attention than it deserves. According to molecular biologists, life is all about information. (...) Life without energy is dead"
Nick Lane (2005) Power, Sex, Suicide, p.68.
Genes don't do anything
"Critical to my appreciation of genetics was the understanding that by and large genes don't actually do anything at all."
Lisa Seachrist Chiu (2006) When a Gene makes You Smell Like a Fish and Other Tales about the Genes in Your Body, p.2.
DNA is an inert database
"DNA isn't life. It doesn't even leave the nucleus of the cell. A whole army of proteins is needed to unpack, edit, and execute the information it contains. Without this apparatus, DNA is but an inert database, full of errors and repetitions. To grasp the nature of life, we must move away from our obsession with genes alone."
front flap text of Denis Noble (2006) The Music of Life. Biology Beyond the Genome (216)
Every gene needs an environment
"We now know that there is no such thing as a gene that acts in isolation and that every gene needs an environment
--whether the environment is the presence of molecules made by other genes, signals generated internally within
the developing nervous system, or electrical activity transduced from the external world.
The genes of brain development are impressively environment- and experience-dependent."
Mriganka Sur (2008) NEUROSCIENCE: The Emerging Nature of Nurture, Science 12 December 2008: Vol. 322. no. 5908, p. 1636
Mysteries of the Cell
"We live in the golden age of genetics, but the fundamental unit of biology is still arguably the cell."
John Travis, Mysteries of the Cell, Science 25 November 2011
Processes of Life
"Beyond doubt, Dupré emphasizes, the perpetuation of life from one generation to the next requires much more than simply the passage of DNA. He concludes that genomes do not merely store information. Because of their constant dynamic interaction with other constituents of the cell, their capacities depend not only on their sequence of base pairs. More important, those capacities are determined by the systems of which the DNA molecules are only part."
Review of Processes of Life. Essays in the Philosophy of Biology by John Dupré, Oxford University Press, 2012 in Science 17 Aug 2012.
DNA is not just a database
"It is tempting to regard this famous molecule as just a database containing the algorithm for constructing an organism. But DNA is also a physical object that constantly bends, twists, and interacts with other biomolecules."
Philip C. Nelson: Spare the (Elastic) Rod, Science 31 August 2012
DNA does not 'play itself'
Does this sheet music play itself?
Of course not.
Sheet music is coded music, but it needs an interpreter to become music.
Even a pianola roll does not play itself!
You need a pianola to play the music encoded in the paper roll.
A pianola roll without pianola is dead.
Also, DNA does not 'play itself'.
It needs cell machinery to 'play itself'.
DNA without a cell is dead.
Would a paper roll with random holes encode 'random music'? That is certainly possible, but still a pianola is required to play it. And the pianola itself does not spontaneously arise out of random parts.
"Scientists have long since abandoned any concept of biological determinism. It has now been proved beyond doubt that although our genes are fixed, their expression is highly dependent on what our environment throws at us."
Nature editorial, 'Life stresses', 11 Oct 2012
22 Nov 2012
A DNA-centric viewpoint
"After the Second World War, biology in the West moved away from thinking of the cell in physicochemical terms, towards a reductionist molecular-biology approach, with a DNA-centric viewpoint. (...)
Not surprisingly, cold-war divisions led many US scientists to dismiss Oparin. The Nobel laureate Hermann Muller, who thought that life originated as a gene, criticized the poor status of DNA within Oparin's picture of early life."
'In Retrospect: The Origin of Life', Nature Books and Arts, 22 Nov 2012
13 Aug 2013
"In this light, the phylosymbiotic microbiome (*) can be understood as an addition to the coadapted genomes of a host organism rather than an arbitrary amalgam. Linking the microbiome and host genome underscores the hologenome as a unit of evolution and blurs the lines between what biologists typically demarcate as the environment and the genotype of a species. Based on the mounting evidence for speciation by symbiosis, it is becoming clearer that a unified theory of evolution that considers the nuclear genome, cytoplasmic organelles, and microbiome as interacting components in the origin of new species is an emerging frontier for biology." (316).
*) the microbial community relationships that recapitulate the phylogeny of their host.
27 Nov 2013
The Philosophy of Biology
"There is more in biology than nucleotide sequences, as there is more in language than letter sequences. ...
Development is a complex process of which DNA is an important, but not the only, factor. ...
Biology education should make clear that life requires not only DNA but also a complex cellular machinery."
11 Dec 2013
Tibor Gánti: The Principles of Life
"Consequently, a living system should necessarily comprise at least two systems, of which one is the controlling unit and the other is the controlled part. (...) It is naïve to believe that the genesis of life can be clarified by studying whether a program can be developed by itself. It cannot." (p.15.)
"It has already been mentioned that the cytoplasm and the nucleus can only develop together: no multicellular organism can develop either from a nucleus alone or from the cytoplasm of an ovum without a nucleus. ... [similarly] an embryo cannot develop from the ovum if either its cytoplasm or its nucleus is missing, and no cell can operate without these either." (p. 17)
"This book is a polemic essay. A polemic essay against the onesided idea of biology having the genes in the center. ... Fate brought that I have to oppose again a dogma, against the dogma of omnipotent genes." (Contra Crick, 1989)
30 Jul 2014
The dual nature of life: information and energy
"Life is not just a genetic entity. Genes by themselves do nothing more than salt crystals. Life is an open, cycling system organized by the laws of thermodynamics."
Eric D. Schneider and Dorion Sagan (2005), Into the Cool. Energy Flow, Thermodynamics, and Life, p.24 paperback.
9 Aug 2014
Richard Lewontin: a critic of the DNA-centric view
"The trouble with general scheme of explanation contained in the metaphor of development is that it is bad biology. If we had the complete DNA sequence of an organism and unlimited computational power, we could not compute the organism, because the organism does not compute itself from its genes. (...)
Of course it is true that chimps look different from humans because they have different genes. And a satisfactory explanation for the differences need not involve other causal factors."
Richard Lewontin (2000) The Triple Helix. Gene, Organism, and Environment, p.17.
1 May 2015
A review of: In Search of Cell History by Franklin M Harold
"One theme in the book, which I happen to be partial to, is the implication that biologists have been overly fixated on DNA. We tend to think that, because variation in DNA maps onto variation in phenotype, genes control all aspects of living cells. However, the conversion of information in DNA into cell structure depends on the cell itself reading and interpreting the genetic information. And the cell has aspects of organization, for example membrane-bound structures and long-lived protein complexes, that are passed down from generation to generation without direct encoding in DNA. Genetic software requires cellular hardware (or should it be 'gelware'?). For these reasons, we should not be overly gene-centric when thinking about cell evolution, but should also give due weight to biochemical, energetic, and structural aspects of living cells. For example, I agree with Harold that a simplistic gene-first (RNA-world) model for the origin of cells is flawed. RNA molecules cannot replicate themselves–they need to be embedded in a chemical system that allows their information to be copied. Explaining the origin and perpetuation of this system cannot be laid simply at the feet of the RNA molecules themselves". (reviewed by David Baum)
The logic of the genome-centered view
Why would the primordial pond not produce independent nerve cells, brain cells, muscle cells, kidney cells and heart cells which combine into organisms? Why would the primordial pond produce multicellular organisms by means of single cells developing into multicellular organisms? Why would a genome-centered theory predispose to our well-known embryological development program? Why would a random DNA sequence not produce cancerous or aneuploid cells? Would the DNA of a hypothetical genome by accident be in the (epigenetic) state necessary for the start of embryological development? What if the DNA were in an adult (differentiated) epigenetic state (see: Induced pluripotent stem cell)? The problem is urgent because Senapathy claims independent origin of multicellular organisms (eukaryotes), not single-celled organisms (prokaryotes).
Information centered approach
This is what I wrote in a review years ago: "Information is the ultimate explanation of life. Information is the secret of life. This view of life is an oversimplification" (202). Senapathy is a victim of this oversimplification because he thinks a computer simulation of statistical properties of DNA is enough to explain the origin of life (eukaryotes). However, Senapathy is not the only one. Famous ultra-Darwinist Richard Dawkins described organisms as information carriers based on the idea that bodies are containers for DNA, which itself is just a code and storage format from an informational point of view.
Another aspect of the information- and DNA-centered view is the assumption that DNA contains the information that is faithfully transcribed and translated into proteins. This view is wrong. One of the main reasons is RNA-editing: the non-random replacement of one base by another in the RNA transcript (for example A-to-I, which is read as G, or C-to-U), a change that ends up in the protein sequence (214). This means that the DNA sequence alone is not sufficient for producing proteins.
Furthermore, RNA-editing does not make sense if the information of DNA originated from random DNA sequences. If the endproduct of RNA-editing is useful for the organism, why would DNA sequences representing the protein sequence not be selected in the first place? That would be the more likely event.
Furthermore, Senapathy ignores DNA methylation which adds extra information to the primary DNA sequence (epigenetics) and it controls transcription factor binding sites and regulatory regions.
The shortcomings of the DNA-centered approach are: it ignores the cell and the cell membrane; it ignores that a cell arises from fertilization; and it ignores that genes need to be regulated, while a true understanding of gene regulation requires the study of epigenetics and a lot more. Furthermore, RNA-editing changes the information transcribed from DNA. Therefore, any DNA-centered theory of the origin of life is incomplete and wrong, certainly if eukaryotic life is claimed to be the first form of life.
Even within the genome-centered view two different views are possible: 'the regulatory code' versus 'the protein code'. Senapathy exclusively pays attention to protein-coding sequences (exons). However, many of the observed differences between species likely stem from when and where the products of the genes are made, in other words: 'the regulatory code'.
I collected additional technical reasons from developmental biology, genetics and ecology on the page: Independent Origin and the facts of life.
Ironically, the study of genomes, called genomics, is precisely the research field that enables biologists to construct
the tree of life! The use of genomics to construct Darwinian trees of life is called phylogenomics. An example is:
'A Phylogenomic Study of Birds Reveals Their Evolutionary History' published in Science, 27 Jun 2008. According to the theory of Senapathy there can be no true tree of all birds, because there is no common descent of all birds. See also: Elizabeth Pennisi (2008) 'Building the Tree of Life, Genome by Genome', Science 27 June 2008: 1716-1717.
See also: Phylogeography from Wikipedia, the free encyclopedia.
Everything you always wanted to know about sex, but were afraid to ask
Sexual reproduction is the most common form of reproduction in 'higher' organisms. But it is not the only method of reproduction: asexual reproduction does exist (in prokaryotes such as bacteria, and in some fungi such as Penicillium). Sexual reproduction is a most important fact against independent origin, because it is far more complicated than asexual reproduction. A theory of independent origin should predict asexual haploids because both asexual and haploid are simpler and therefore more likely to originate. On top of that, most sexual species are also eukaryotes, which is another complication (see: Did prokaryotes arise from eukaryotes?). The independent origin of a female and a male of the same species is extremely improbable. The emergence of a hermaphrodite is slightly less improbable than two sexes because it does not involve two different individuals (see: hermaphrodite?). The improbability applies to all sexually reproducing organisms. Asexual reproduction is rare among animals (65). As a starting point, I list the following uncontroversial facts. One does not need to be an evolutionist or a Darwinist to accept them.
- Human body cells are diploid: one complete set of chromosomes from the father, plus one complete set from the mother.
- sperm and egg cells are haploid: a single set of chromosomes (half the diploid number).
- a zygote is a fertilized egg: haploid egg + haploid sperm = diploid zygote.
- humans have a diploid set of 23 pairs or 46 chromosomes in total
- females have 23 matching (homologous) pairs, including an X-chromosome pair, excluding the Y-chromosome
- males have 22 matching (homologous) pairs and one non-matching pair of an X-chromosome and a Y-chromosome.
The Y-chromosome is about 1/5 the size of an X-chromosome (see photograph, and diagram: Fig. 2).
- Egg cells always have one X chromosome and sperm cells have either one X or one Y chromosome.
- additionally female egg cells contain many mitochondria each containing 37 genes.
The mitochondria are transmitted exclusively via the female egg cell.
- The number of genes in the human genome is about 22,000 genes (estimates vary, but that is not relevant for my argument).
So far the facts of life. Now Senapathy claims that
"The genomes were directly assembled into single seed cells, analogous to the fertilized eggs of sexually reproducing
"male and female 'seed cells' lead to male and female individuals";
"Both male and female seed cells can be assembled";
"One can infer that it is not difficult to segregate the genes for a male or a female into a specific chromosome
and in two different sex cells". (p. 358)
"Not difficult"! This is an astonishing and outrageously careless statement. The first quote proves that Senapathy does not distinguish between 'haploid' and 'diploid' cells (18). This is fatal for his theory. Although he knows about X and Y chromosomes in another context (p. 588), in this context he forgets about it. Just look how different the X- and Y-chromosome are! He just flatly states that it is "not difficult" without any evidence! Analogous to the fertilized eggs? But a fertilized egg is the result of fertilization! And one needs a female and a male for fertilization. Furthermore, human fertilization requires an interaction between two proteins: Izumo1, which is produced by sperm, and Juno, its receptor on eggs (355).
Senapathy jumps from genes to genomes, thereby completely ignoring that chromosomes exist and have a structure. However, when he criticizes evolution, he knows about chromosome structure and histones (p.125, 143). He ignores the problems of isogamy, internal versus external fertilization, hermaphrodites versus separate males and females, egg-laying versus live birth, etc. Senapathy does not specify any details. Indeed, part of the trouble is that his scenario is extremely vague at crucial points. See the following quote for an entertaining, charming and unscientific scenario:
"Perhaps among the organisms produced in the primordial pond, some had only secondary sex organs, but no genital organ to copulate; whereas other organisms would have the latter but not the former. Both the above situations may or may not have had the reproductive cycles of sperm/egg production. There could have been many seed cells producing individuals, with wrong combinations of male and female sex organs and secondary sex characteristics. Rarely, some seed cells will process all the three sets of genes for all these three functions - attractiveness by secondary sex features, copulation by genital organs, and reproduction by sperm/egg cycles. This is analogous to many seed cells giving rise to individuals with improper or incomplete organs, which will not survive. Only those individuals with the absolutely right organs will survive. Therefore, only one out of myriads of seed cells may form a viable organism. This may explain why it would have taken geological time for seed cells to be formed with genomes capable of producing viable organisms." (page 358-359.) (my emphasis) (97)
This text could have been written by Greek philosophers Empedocles or Epicurus. There is no science in it. I will consider two possible scenarios with all the necessary details: the haploid and the diploid scenario.
Can a male or female arise from haploid cells?
In 1692 Richard Bentley asked what the probability would be that a male and a female of the same species should each arise by chance (78). This is exactly Senapathy's question. That is: the question he ought to ask. If the goal is to produce a male, one needs an egg that is fertilized by a sperm carrying a Y-chromosome. For a female one needs an egg that is fertilized by a sperm carrying an X-chromosome. So one needs 4 haploid cells to produce one male and one female (3).
Fig 2. Human sex
What is the probability that those 4 cells arise from the primordial pond? Let us start with the production of one egg cell from random assemblage of genes in the primordial pond. A human egg cell contains approximately 32,000 genes (minus Y-specific genes) distributed over 23 chromosomes. For example, chromosome 1 contains an estimated 2700 genes, the X chromosome 1600 and the Y chromosome 250. I will ignore a lot of complicating factors: a chromosome is more than naked DNA (19), has a centromere and telomeres (74), and an egg cell is more than a bag of genes. All those genes have exact locations on the chromosomes characteristic for the species. For a given species, the same genes are on the same chromosomes and in the same order. Chromosomes themselves have no order (free floating). If one wants to produce a fertile individual that is able to reproduce, then a requirement is that all genes on the corresponding chromosomes of the egg and the sperm are in exactly the same locations. Therefore, the probability of the genomes of one egg cell and one sperm cell equals the probability of 32,000 genes ending up in exactly the same positions on 23 chromosomes in two independent trials. Please note that I am not estimating the probability of random assembly of the human genome in one trial. I am estimating the repeatability of the event. In this scenario we need the repeatability because in the end we need 4 (haploid) genomes of the same species. The problem can now be formulated as: how many permutations are there of 32,000 genes? To get an idea of the magnitude of the problem: the number of permutations of only 29 genes exceeds 10^30 (which is the number of DNA sequences in Senapathy's primordial pond). It is clear from this that it is impossible that a second cell that matches the first will be produced with this method, let alone that 4 genomes could be produced in this way.
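The permutation figures above can be checked with a few lines of Python. This is only a sketch of the arithmetic: it treats the genes as one ordered list (ignoring the split over 23 chromosomes), and the names `POND_SEQUENCES`, `GENE_COUNT` and the two helper functions are my own labels, not anything taken from Senapathy's book.

```python
import math

POND_SEQUENCES = 10**30   # number of DNA sequences in the pond, as quoted above
GENE_COUNT = 32_000       # gene count used in the estimate above


def smallest_n_with_factorial_above(limit):
    """Return the smallest n for which n! (the number of orderings) exceeds limit."""
    n, fact = 1, 1
    while fact <= limit:
        n += 1
        fact *= n
    return n


def log10_factorial(n):
    """Return log10(n!) via the log-gamma function, avoiding huge integers."""
    return math.lgamma(n + 1) / math.log(10)


n_min = smallest_n_with_factorial_above(POND_SEQUENCES)
exponent = log10_factorial(GENE_COUNT)

print(f"smallest number of genes whose orderings exceed 10^30: {n_min}")
print(f"orderings of {GENE_COUNT} genes: about 10^{exponent:.0f}")
```

The first number confirms that 29 genes are already enough to exceed the size of the pond; the second shows that the exponent for 32,000 genes runs to six digits.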
Degenerate polyploid genomes
The situation is even worse. Many species, plants in particular, are polyploid, having duplicated genomes. For example, the yeast Saccharomyces cerevisiae is a tetraploid and its genome consists of four roughly identical genomes (the diploid number is 4N). Because of the accumulation of mutations during the history of the species, the originally identical genomes start to differ. These are called degenerate tetraploids. It is highly unlikely that degenerate tetraploids originated spontaneously from a pool of random DNA segments (especially with all mutations being neutral).
Male bees, wasps, and ants develop from unfertilized haploid eggs, so are haploid. Would that help Senapathy's theory? No, because a diploid female has to produce the egg. An egg can start developing without being fertilised (parthenogenesis) in some insects, snakes (boa constrictor), lizards (including Komodo dragons) and even sharks (75), and parthenogenesis has been reported in about 70 vertebrate species (roughly 0.1%), but not in mammals. Early mammalian development has a strict requirement for genetic contributions from a male and a female parent (174) and fully parthenogenetic human embryos cannot develop to term (4). The embryos die after a few days. Maternal and paternal genomes are both necessary for normal development in mice, and this is believed to account for the absence of parthenogenesis [=development without fertilization by a father] in mammals (31). An embryo that did not have a sperm involved in its formation cannot make a placenta (the organ that keeps the fetus alive) and so cannot be born (54).
In plants the formation of asexual seeds is called apomixis and it leads to populations that are genetically uniform maternal clones (apomixis is found in more than 400 species of flowering plants). Even if parthenogenesis worked in humans, it would only produce females, so it would not explain the origin of males. This applies to all animal species with an XX-female/XY-male sex-chromosome system. There is no Y-chromosome in a female cell, therefore a male cannot be produced parthenogenetically.
Apart from the Y-chromosome, all sexually reproducing animals simply have two parents. Asexually reproducing species (making clones of themselves), like bacteria, have only one parent.
Senapathy's computer experiment shows a single sequence (Fig 1). The analogy with a haploid genome is obvious. (I am ignoring the fact that DNA is a double helix for the moment). I guess he based his theory of independent birth on the idea of a haploid genome and assumed it was no problem to produce diploid organisms. Regrettably, the haploid method fails to produce the first human male and female. The haploid method can only produce asexually reproducing bacteria and other prokaryotes. The funny thing is that Senapathy knows about X and Y chromosomes when he wants to refute neo-Darwinism, but does not realize that they are an insurmountable obstacle for his own theory. Nobody would deny that simple genomes have a higher probability than complex genomes. Therefore, a haploid organism should be the expected outcome of the primordial pond. Multicellular haploid organisms do exist: males of the honeybee are haploid. Senapathy is confronted with the amazing but inevitable question:
why on earth are not all species haploid?
The above problem points to a wider problem: Senapathy does not distinguish between the origin of an individual and a species. How to proceed from the first individual to a population of interbreeding individuals? (Note: only a few genes can prevent interbreeding of individuals of the same species: incompatibility genes, hybrid sterility, reproductive isolation. Self-incompatibility is of special importance because of obligate outcrossing. Self-pollination in plants requires only one individual, but is still sexual reproduction, and few plants actually self-pollinate.)
I suppose Senapathy could come up with the following objections:
(1) maybe millions of different genomes could produce humans. – This is not relevant. What is relevant is that any genome must be produced in a male and a female form. That makes it impossible.
(2) genes do not need to be in same position as the current positions to produce a human, so the probability to produce a human genome from random DNA sequences is very much higher. – Theoretically it seems quite possible that genes which are ordered in a different way on the human chromosomes would still produce a human being. However, paternal and maternal genes (alleles) have identical positions on chromosomes. That situation must be explained.
(3) there are many variations of a gene that produce the same protein and many protein sequence variants produce the same enzymatic function. – This is right but not relevant because I only considered the positions of genes on chromosomes. Those chromosomal positions must match.
(4) sex organs are generated by genes just as all other organs, so it should not be difficult (p.353). – Of course sex organs are generated by genes. That is not the point. The point is: what is the probability that male-specific genes come together on chromosome Y and that the sex-genes are expressed at the right time and the right place in the right sex, and that both genomes are identical apart from sex-specific genes, and both genomes are able to fuse and create a new individual?
Can a male or female arise from diploid cells?
17 Oct 2011
A diploid cell (= a pair of each chromosome) must be somehow produced because human body cells are diploid. Instead of the spontaneous origin of 4 suitable haploid gamete-like cells (2 sperm and 2 egg cells), theoretically, replicating each chromosome of a haploid cell and skipping cell division could produce a diploid cell. This doubling method escapes the huge improbability of generating a human genome twice. However, immediately two problems arise. The first is that the 'doubling method' fails to produce a diploid male cell because it does not produce a Y chromosome. An XY-pair can never be produced by doubling. Without a male cell, this scenario fails to produce a male and a female. Therefore, it fails to produce what could be the start of a new species.
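The first problem can be made concrete with a toy sketch in Python. It reduces a genome to its sex chromosome only, which is a deliberate simplification, and the names `EGGS`, `SPERM`, `fertilize` and `double` are hypothetical labels of mine. It compares the sex-chromosome pairs that fertilization can produce with those that the 'doubling method' can produce.

```python
from itertools import product

EGGS = ["X"]          # egg cells always carry one X chromosome
SPERM = ["X", "Y"]    # sperm cells carry either one X or one Y chromosome


def fertilize(egg, sperm):
    """Fuse two haploid cells; return the resulting sex-chromosome pair."""
    return "".join(sorted(egg + sperm))


def double(haploid):
    """Replicate the single sex chromosome without cell division."""
    return haploid * 2


by_fertilization = {fertilize(e, s) for e, s in product(EGGS, SPERM)}
by_doubling = {double(h) for h in set(EGGS) | set(SPERM)}

print("fertilization can give:", sorted(by_fertilization))
print("doubling can give:", sorted(by_doubling))
```

Doubling can only copy what is already there, so it yields matching pairs and never the mixed XY pair of a male cell.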
The second problem is that even if the 'doubling method' produced a diploid female cell, it would be 100% homozygous: all pairs of chromosomes would be identical. However, individual human genomes are diploid in nature, with half of the homologous chromosomes being derived from each parent. Therefore, they are different: heterozygous. Normally, in a diploid cell each gene has two versions called alleles. A completely homozygous chromosome pair in which the two chromosomes were identical would be a recipe for trouble, as the effects of any bad gene would be felt to their fullest. This is the same problem that genome researchers would encounter if they tried to create the extinct woolly mammoth from its DNA (110). Normal diploid cells are heterozygous to a significant degree, because they originate from two parents (70).
Two kinds of cells - two kinds of cell division
From a genetic point of view all animals and plants have two different kinds of cells: diploid body cells and haploid sex cells. These cells are created by two fundamentally different processes: mitosis which creates diploid cells and meiosis (91) which creates haploid cells. Both processes are complex because they must guarantee that daughter cells receive the correct set of chromosomes. At the beginning of mitosis, the process of cell division, chromosomes are organized randomly - like jigsaw puzzle pieces spread out on the floor. During the process of mitosis the two halves must be oriented such that they will be pulled in opposite directions into two newly forming daughter cells. Mechanisms must exist to eliminate wrong configurations while selecting the right ones (52).
| mitosis produces: | diploid cells (2n) |
| meiosis produces: | haploid cells (n) |
| fertilization produces: | n + n = 2n (diploid) |
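The bookkeeping in this table can be written out as a minimal Python sketch. It counts chromosome sets only and says nothing about the machinery involved; the function names are my own, and `meiosis` is simplified to return four haploid products.

```python
def mitosis(ploidy):
    """One cell divides into two daughter cells with unchanged ploidy."""
    return [ploidy, ploidy]


def meiosis(ploidy):
    """One cell yields four products, each with half the chromosome sets."""
    if ploidy % 2 != 0:
        raise ValueError("meiosis needs an even number of chromosome sets")
    return [ploidy // 2] * 4


def fertilization(egg_ploidy, sperm_ploidy):
    """Fusion adds the chromosome sets of egg and sperm."""
    return egg_ploidy + sperm_ploidy


zygote = fertilization(1, 1)    # n + n = 2n
body_cells = mitosis(zygote)    # two diploid daughter cells
gametes = meiosis(zygote)       # four haploid products

print(zygote, body_cells, gametes)
```

The cycle closes only because the three steps fit together: without meiosis the ploidy would double at every fertilization.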
Further, there is an important difference between mitosis and meiosis: in contrast to mitosis, gametogenesis eliminates age-induced cellular damage and resets life span (212).
Meiosis is even more complex and is controlled by meiosis-associated genes. In normal female meiosis in plants and animals, only one of the four products forms an egg nucleus while the other three are discarded into polar bodies. Why? Even the loss of a single chromosome can be lethal and can contribute to birth defects and cancer. Explaining these two highly complex and highly conserved processes with randomness explains precisely nothing.
Then there is a third process called fertilization which is the fusion of a haploid sperm and a
haploid egg. Fertilization creates the first diploid cell of an individual: the zygote. After that, multicellular organisms develop clonally by mitosis from a single cell. Senapathy knows that "In sexually-reproducing animals, development always begins with a single cell called the zygote." (p.307). He knows that the zygote is diploid and originates from the fusion of haploid sex cells (p. 307). He next renames 'zygote' to 'seed cell' (a verbal trick) and claims it could arise spontaneously from the primordial pool without sufficient evidence (see elsewhere on this page).
Fertilization is 'simpler' than meiosis, because it is just adding two genomes together. However, things can go wrong too. The fusion must add precisely two and not three or more genomes. Furthermore, it appears that many highly specific proteins are involved in fertilization
(71), and these proteins must be encoded by the genome. Furthermore, sperm must deliver a pair of centrioles: "When the nematode C. elegans egg is fertilized the sperm delivers a pair of centrioles. These centrioles will form the centrosomes which will direct the first cell division of the zygote and this will determine its polarity." (226). So the fertilization process is not much simpler than meiosis.
Plants have a double fertilization: one sperm fertilizes the haploid egg cell, which becomes the diploid embryo, and the other sperm fertilizes the diploid central cell, generating the triploid endosperm. It is extremely unlikely that these complicated processes occur by chance. The endosperm nourishes the embryo during seed development. That is another clue that plant seeds need a mother plant. The endosperm genome is not transmitted to the next generation.
Evolutionary theory starts with relatively simple haploid cells which reproduce without sex and without meiosis. They are in fact clones. On the other hand, diploid sexually reproducing organisms are much more complicated, because they use both mitosis and meiosis. The transition from asexual clones to sexual reproduction is one of the 8 major transitions of life (John Maynard Smith, 32), and Senapathy lets sexually reproducing organisms originate by random forces just as easily as single-celled organisms.
Senapathy claims that his theory predicts eukaryotic genomes. I don't see how his theory could predict diploid organisms in the first place. Given the fact that there are less complicated and therefore more likely ways to reproduce, his theory certainly would not predict a complicated process like meiosis (37). However, meiosis is even more complicated than that. Meiosis produces in the end four haploid germ cells. In males, all four give rise to sperm. In the female however, only one of those four develops into an egg cell, while the other three eventually die. So, additionally, there is also a sex difference in the meiosis process.
Could the first organism be a hermaphrodite? In a hermaphrodite species all individuals have both male and female reproductive organs.
The possible advantage would be that there is no need to produce two individuals (males and females) which differ in the DNA that determines sexual characteristics, but have otherwise equal genomes. Therefore, the origin of the hermaphrodite species could start with just one self-reproducing individual (snails, plants). Apart from all other objections, if successful, it would only explain hermaphrodite species, not the majority of species with males and females. Furthermore, it would not even explain all hermaphroditic species: many flowering plant species are obligate outcrossers that cannot self-fertilize because of self-incompatibility: they recognize and reject their own pollen.
A male without a Y chromosome?
Males of grasshoppers and aphids ('plant bugs') do not have a Y chromosome, they are described as XO. Females of those species have two X-chromosomes; they are XX (36). Are females of grasshoppers and aphids perhaps candidates for independent origin? Theoretically they could produce a species. However, apart from the absence of the Y-chromosome problem, the problem of producing a female and male version of the same genome still exists. Furthermore, the production of males requires a rather unusual form of meiosis. Apart from this, if successful, it would only explain XO species, not the majority of species with a Y-chromosome!
Conclusion: both the haploid and the diploid methods fail to produce the first male and female. It is impossible to produce humans from a pool of genes even if all the necessary human genes were swimming around in duplicate. The funny thing is that Senapathy knows that 'eukaryotic organisms usually contain two of each gene, and a haploid genome contains only one copy of each gene', but he does not realize that this is fatal to his theory.
Diploidy and sexual reproduction are tightly interconnected. But even when one allows for the independent origin of diploid organisms, then it is still not necessary that they should have sexual reproduction with such a complicated process as meiosis. Indeed, why don't all diploid species use some form of asexual reproduction (common among plants; rare among animals)? Why not produce diploid children directly from a diploid cell, thereby circumventing meiosis? In general: when there are two solutions for a problem in nature, the theory of independent origin should predict the most probable solution, that is the simplest of the two alternatives.
Postscript: in the Bible, God instructs Noah to take pairs of each kind of animal onto the ark. In those days people knew that one needs a male and a female.
Did prokaryotes arise from eukaryotes?
Fossil evidence refutes Senapathy
Senapathy's theory states that eukaryotes originated first because genes with introns can be found in computer generated random sequences of bases, whereas prokaryote intronless genes cannot. Did prokaryotes (114) arise from eukaryotes? Please note that this very idea is evolution and not independent origin! Apart from that, Senapathy still needs to show how this happened. I know of one other publication claiming that "a plausible, albeit controversial, case has even been made that prokaryotic cell architecture is a simplified derivative of that of eukaryotes" (115).
The standard textbook view is that the first organisms found in the fossil record are 3,500 million years old and are prokaryotes (bacteria). For the next 800 million years life on earth consisted of prokaryotes.
Another source states that for the first 1.5 billion years, life consisted of aquatic microbes (51). The first indirect evidence of eukaryotes appeared 2,700 million years ago and the first fossil eukaryotes appeared 1,700 million years ago. Another source states that eukaryotes emerged perhaps as many as 2 billion years later than prokaryotes.
Senapathy claims eukaryotes emerged first. It's clear from these data: the fossils say NO!
Endosymbiosis theory refutes Senapathy
Mitochondria are organelles in all eukaryotic cells; they are crucial for the energy supply of the eukaryotic cell; they multiply independently within eukaryotic cells in a simple asexual fashion (just like bacteria!); their DNA is circular (just like bacteria!); the genes are intronless (just like bacteria!); have little non coding DNA and few intergenic regions between genes (just like bacteria!); are haploid (just like bacteria!); are exclusively inherited from the mother (maternal transmission) and have their own DNA (37 genes in vertebrate animals) which is autonomously copied.
Human mtDNA encodes only 13 proteins, as well as the 22 tRNA and 2 ribosomal RNA genes required for their translation. The mosaic composition of human mitochondria is evident in the organelle's replication and translation machinery, with the ribosome closely resembling its bacterial counterpart and the DNA polymerase resembling that of a viral (bacteriophage) ancestor (299). None of these facts is controversial.
These facts support the hypothesis that mitochondria were once free living single-celled prokaryotes. This hypothesis is called endosymbiosis theory and was proposed by Lynn Margulis in 1970 (Senapathy knows this on pages 231, 598). Initially rejected by biologists as too speculative, the theory was accepted by evolutionary biologist John Maynard Smith as early as 1975 in his The Theory of Evolution. That was 19 years before Senapathy published his book (1994). The entire DNA sequence of the human mitochondrial genome - 16,569 nucleotides - was determined in 1981 (249), 13 years before the publication of Senapathy's book. It was relatively easy to identify the bacterial ancestors of the mitochondria in 1985 (246,p.177). The theory is now the standard view in biology and evolution textbooks (5). Nobel Prize winner Christian de Duve said about the proofs for the bacterial origin of mitochondria:
"In the opinion of the vast majority of investigators, these proofs are conclusive." (6).
Graur and Li (2000) in Fundamentals of Molecular Evolution state "the molecular evidence is now overwhelmingly in favor of the endosymbiotic theory". What Senapathy has to say is this:
"Some scientists have suggested that eukaryotes were formed by "endosymbiosis"...
Although there exists some resemblance between mitochondrion and bacterial cells, the origin of the nucleus in the
eukaryotic cell is still considered to be a total mystery." (p. 231).
Senapathy only devotes two short paragraphs to an issue of crucial importance to his theory. In the quote he switches from the problem of mitochondria to another issue: the nucleus. He conveniently omits that mitochondria have their own DNA (which is present in John Maynard Smith, 1975). Clearly, he wants to get rid of the theory because it undermines his own theory. The main problem for his theory is that it is extremely unlikely that a dual genetic system originated independently in a million eukaryotic species. Further, he cannot explain the simultaneous origin of a prokaryote (mitochondrion) and a eukaryote, because prokaryotes (mitochondria) have intronless DNA.
Additionally, it is even more unlikely that the mitochondrial genomes of all species always contain the genes for the critical electron-transport proteins for respiration, along with the necessary machinery to produce those proteins (13 mRNAs, 22 tRNAs and 2 rRNAs) (64).
Finally, if nuclear genes originated from random DNA, then how is it possible that more than 1,000 nuclear genes encode mitochondrial proteins? Genes from the nuclear and mitochondrial genomes must work in concert to generate a functional oxidative phosphorylation (OXPHOS) system. This system cannot have originated independently.
The most plausible explanation is that the dual genetic system arose only once and was inherited from the first eukaryote (63). One of the most irritating facts in Senapathy's book is that he dismisses a theory without a careful examination of the facts. The presence of mitochondria in eukaryotes is not an insignificant fact. It is now recognized that eukaryotic life on earth became the dominant form of life on earth because mitochondria caused gender (5).
See further: § 28 (endosymbiosis). Another organelle: peroxisome.
Energetics refutes Senapathy
Mitochondria bestowed upon their eukaryotic host 10^5-10^6 times more power per gene than a prokaryote. Mitochondria allowed their host to evolve, explore and express 200,000-fold more genes with no energetic penalty. Eukaryotes harbor approximately 12 genes per Mb, bacteria 500-1,000 genes per Mb. The high gene density and small protein size of bacteria can be explained in bioenergetic terms. "Prokaryotes cannot have evolved from eukaryotes, because the energy per gene required to bring forth the complex eukaryotic starting point for prokaryotic evolution under such views requires a prokaryotic endosymbiont to begin with" (131).
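To get a feeling for the gene densities quoted above, here is a small Python sketch that turns them into a ratio and into an average amount of DNA per gene. It uses only the figures from the text; the constant and variable names are my own, and "DNA per gene" is just the simple reciprocal of density, not a measured gene length.

```python
EUKARYOTE_GENES_PER_MB = 12            # approximate figure quoted above
BACTERIA_GENES_PER_MB = (500, 1000)    # lower and upper figure quoted above

BASES_PER_MB = 1_000_000

# How many times denser bacterial genomes are, for the low and the high figure.
density_ratios = tuple(b / EUKARYOTE_GENES_PER_MB for b in BACTERIA_GENES_PER_MB)

# Average stretch of DNA available per gene (reciprocal of the density).
eukaryote_bases_per_gene = BASES_PER_MB / EUKARYOTE_GENES_PER_MB
bacteria_bases_per_gene = tuple(BASES_PER_MB / b for b in BACTERIA_GENES_PER_MB)

print(f"bacteria are {density_ratios[0]:.0f} to {density_ratios[1]:.0f} times denser")
print(f"eukaryote: about {eukaryote_bases_per_gene:,.0f} bases per gene")
print(f"bacteria: about {bacteria_bases_per_gene[1]:,.0f} to "
      f"{bacteria_bases_per_gene[0]:,.0f} bases per gene")
```

On these figures a bacterial genome packs its genes some 40 to 80 times more tightly than a eukaryotic one.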
Termites refute Senapathy
Termites mostly feed on dead plant material, generally in the form of wood, leaf litter, soil, or animal dung. This is a problem if termites originated independently in a primordial pond. To survive they immediately need other organisms (plants, animals, in other words: an ecosystem). Furthermore, to digest wood (cellulose) termites (eukaryotes) need bacteria (prokaryotes) in their hindguts. A termite could never originate and survive in the primordial pond, because prokaryotes could not originate in the primordial pond according to Senapathy.
Immune system of eukaryotes 28 jul 11
The immune system of eukaryotic animals is a problem for Senapathy. The function of immune systems is to combat bacteria. Bacteria are prokaryotes. According to Senapathy eukaryotes arose from the primordial pond and prokaryotes later developed from eukaryotes.
What's the point of having an immune system if there are no bacteria? Individuals without an immune system would survive, and since genomes are fixed, eukaryotes could not 'evolve' the necessary genes (HLA genes) later when encountering pathogenic bacteria.
Simple is easy, complex is difficult.
The problem of the origin of prokaryotes and eukaryotes is part of a more general problem of simplicity and complexity.
In the theory of evolution simple organisms are the easiest to explain and complex organisms are the most difficult to explain. That's why evolution starts with single cells, and ends with complex multicellular creatures. Remarkably, in the Appendix Senapathy knows of "a simple bacterial cell [prokaryote], and a more complex cell [multicellular organisms]" (p.559). In Senapathy's theory simple and complex creatures have the same probability of originating from the primordial pond. A single cell is as likely as a whale, elephant or human. Maybe this is caused by the genome-centered view of life. Do genomes of simple and complex species differ only in length? They do not. In contrast to eukaryotic genomes, prokaryotic genomes are usually nearly completely devoid of mobile elements and introns and have genes with very simple regulatory structures, often transcribed as operons with negligible leader and trailer sequences. There are 30 differences between eukaryotes and prokaryotes (17).
Mitochondria contain DNA
"Take the enzyme cytochrome oxidase, for example, which handles the final step of cell respiration. In mammals, the complex is composed of 13 subunits, 3 of which are encoded by mitochondrial DNA, and 10 by nuclear genes. If the subunits of cytochrome oxidase don't work together properly, electrons are not passed to oxygen and respiration fails, triggering the death of the cell."
"The mitochondrial and nuclear genes adapt to each other within a population, and the process must happen quickly because the mutation rate is so high in mitochondrial DNA."
Nick Lane, Nature 19 Nov 2009
Chloroplasts contain DNA 15 Apr 2012
In the late 19th century several researchers proposed, on the basis of microscopic study of plant cells, that chloroplasts evolved from bacteria (endosymbiosis) (246, p.44). The idea was proposed again in 1905 and 1907 (244). In 1962 Hans Ris and Walter Plaut demonstrated DNA in chloroplasts in plants (245). This is not an obscure publication: it has been cited at least 15 times between 1964 and 2011 (it is not known in wikipedia). In 1967 Lynn Margulis, under the name L. Sagan, concluded that not only chloroplasts but also mitochondria evolved from endosymbiotic bacteria (246, p.44). In 1970 she published her paradigm-changing book Origin of Eukaryotic Cells. In 1978 Robert Schwartz and Margaret Dayhoff proved the endosymbiosis theory (248). Senapathy could and should have known all this in 1994.
Land plant chloroplast genomes typically contain around 110-120 unique genes. Some algae have retained a large chloroplast genome with more than 200 genes (source).
Second argument against Senapathy: chloroplasts are derived from free living cyanobacteria. That means those cyanobacteria must have existed before eukaryotic plants. This contradicts Senapathy's scenario in which bacteria evolved from eukaryota.
Bacteria and Archaea
To complicate matters further, the Woesian revolution established that prokaryotes, far from being a homogeneous group, actually consist of two genetically very different groups: Bacteria and Archaea (73, 114). Although Archaea superficially resemble bacteria (being single-celled and lacking a nucleus), Archaea have a distinctively different metabolism, cell wall, and transcription machinery. That means in Senapathy's theory both Bacteria and Archaea are supposed to have originated from eukaryotes. This is very unlikely.
Size matters 19 Jan 2017
Bacteria are small. Because of that there are millions and billions of them. Bacteria multiply fast. Because they are fast and numerous, they evolve fast. That is an advantage for the origin and evolution of life. Evolution can try out millions of solutions for the problem of staying alive. This could be a matter of life or death. The origin of life on earth could have failed. Multicellular life is big and not so numerous. Therefore it evolves more slowly. It is logical that bacteria originated first, not animals and plants. They are too complicated, too big and too slow.
All mammals require a mother
© Lennart Nilsson (1990)
Human embryo with umbilical cord. The umbilical cord is the link between mother and embryo
"When we consider the case of the independent birth of mammals, it is reasonable to think that a conglomeration of a large number of cells and biochemicals in the primordial pond could have formed an environment akin to that of the placenta and uterus of mammals. There, a seed cell can differentiate into an embryo and a full-grown offspring". (Senapathy, p.309)
A human baby without a mother? Surely, you're joking, Mr. Senapathy! (313). All mammals have internal gestation in contrast to egg-laying animals, like sea urchins, frogs, fish and worms (66). Could a human baby develop in a primordial pond? Could it survive without placenta, without mother? That would be a miracle (29). There is no constant supply of food and oxygen during nine months. There is no protection against pathogens or predators. The placenta comprises two components: a fetal portion and a maternal portion. So, how could a placenta and umbilical cord form without a mother? How could the first human individual have a navel?
Warm-blooded animals with a constant body temperature require a tenfold increase in energy expenditure above cold-blooded animals (101). Does the primordial pond have a temperature of precisely 37 °C? Birds have a body temperature of 38 - 43 °C. How does the primordial pond handle that?
Ignoring all the problems discussed above, the very idea that an isolated human genome could produce a human fetus contradicts established knowledge of many reciprocal adaptations of fetus and the pregnant mother. For example: Would the first human genome include genes for implantation, placenta, umbilical cord? What's the point?
Implantation occurs during week 2 of development in humans. A synchronized dialog between maternal and embryonic tissues is crucial. Without implantation the embryo cannot survive.
Even a 100% accurate and complete human genome sitting in a cell does not have a higher probability of survival than any random genome. It could survive only once it is self-supporting, that is, once it can get its own food (as an adult). Until then it really is a kind of parasite. The human genome is simply not designed to be self-supporting in a primordial pond at the one-cell stage and for many years after that.
Viviparity means the embryo develops inside the body of the mother. The best example is placental mammals. Another group is the pouched marsupials like the kangaroos and koalas of Australia. At birth, the baby kangaroo is no larger than a peanut, a blind, pink, hairless fetus-without-a-womb that must crawl on its own through the mother's fur into the pouch. It drinks milk from a teat (122). However, scorpions, some sharks, some snakes, and velvet worms also are viviparous. Roughly 20% of non-avian reptile species (lizards, snakes) give birth to live offspring (viviparity).
Other problems: if the human embryo is in the water of the primordial pond during nine months, how does the sudden transition to air breathing (usually called: 'birth') happen? Who helps the baby out of the water to prevent death by drowning? After normal birth fetal haemoglobin (a crucial oxygen-carrying protein) drops off and the adult version kicks in. Organs, such as the kidneys and lungs, which do not function in the womb, must all switch on at the same time (called 'birth'). How is all this regulated and coordinated?
Babies have average birth weights around 7.5 pounds. In the primordial pond, the baby could grow to any size before 'birth', because it does not have to pass the birth canal of its mother. Females have wide hips and a large enough pelvic opening that enable babies with big brain sizes to be born. A real genome 'knows' to start the delivery at the right size of the baby. How does a human genome in the primordial pond know about the nine months? Timing of birth in humans is under genetic control (207). Does the mouse genome know that it is sitting in a smaller animal and needs to deliver in a shorter time?
The human baby is born premature when compared with chimpanzees. The head of the fetus is still small enough to pass through its mother's birth canal. One of the consequences is that humans at birth are utterly helpless (42). The human brain doubles in size in the first year of life and achieves 95% of its adult size by the age of 5 (although white matter grows at least to age 18).
It takes roughly 13 million calories to rear a human baby from birth to nutritional independence at around age 18 or older (168). Big brains are so metabolically expensive that primates must postpone the age of reproduction in order to build them. High fecundity requires at least an extended family with fathers and grandmothers around to help provision and care for the young (109).
The first teeth of the human baby typically appear between six and nine months. It can take several years for all 20 teeth to complete the tooth eruption. How does the independently born baby get food without teeth and without a mother?
An essential amino acid or indispensable amino acid is an amino acid that cannot be synthesized de novo by the organism (usually referring to humans), and therefore must be supplied in the diet. Non-essential amino acids can be synthesized by the organism, provided the necessary cellular machinery is present. Also, omega-3 phospholipids (essential for the brain) are primarily obtained from the diet. Nearly all animals must obtain vitamins, carotenoids (metabolically expensive chemicals!), etc. through their diet because their genomes have no genes for producing them.
Amazingly, in order to be a healthy individual, in Senapathy's scenario the first female genome would not need hormones for ovulation, menstruation, womb, pregnancy, and lactation. If no hormones are necessary, then genes for hormones are unnecessary. The genomes would be different.
Genetically, a fetus is half mother, half father. Why isn't the fetus rejected during pregnancy? Why is the immune system tolerating the fetus, which contains antigens that the maternal immune system recognizes as foreign because they are the products of genes inherited from the father? During pregnancy the foreign antigens of the developing fetus and the placenta come into direct contact with cells of the maternal immune system, but fail to evoke the typical tissue rejection response seen with organ transplants. The cause is the silencing of chemokine genes in the decidua, the specialized structure that encases the fetus and placenta (276). A hypothetical 'independently born fetus' has no father and mother, so does not have to solve these problems, but as soon as it tries to reproduce it will have to. But it does not have the genes for it, because it was independently born. DNA has no foresight.
The fetus has internal and external sex organs which are useless in the womb. Furthermore, sex organs would be unnecessary for survival of the first individual. There is no reason to expect that the primordial pond would produce complete male and female genomes. Sex organs simply do not contribute to the health and survival of the first individual.
Why should the first female have a pair of breasts which grow considerably during puberty? A pair of breasts does not contribute in any way to the health and survival of the individual possessing them. They are a burden and a risk (breast cancer). Additionally, how does Senapathy explain that only 50% of the individuals have breasts? The first individual needs food, needs to escape disease and predation to survive, not sex. So why is the primordial pond not producing sexless individuals forever? (25).
Why are parents (especially mothers) motivated to care for their young? It certainly does not help the survival of the parents themselves.
Returning to the issue of complete genomes: what is a complete genome? If spontaneous generation of genomes were nature's method for producing animals and plants, then a healthy sexless individual is viable and complete. As an illustration: only one missing gene can make healthy male or female mice sterile (27). On the other hand one needs a few hundred genes, I guess, to add maleness and femaleness. To evaluate 'independent birth' we need to eliminate our deeply rooted prejudices about the necessity of sexual reproduction. Senapathy should take his primordial pond seriously and reason from the point of view of the primordial pond, and resist relying on the benefit of hindsight.
In general, in bird and mammal biology altricial species ("requiring nourishment") are those whose newly hatched or born young are relatively immobile, have closed eyes, lack hair or down (naked), and must be cared for by the adults. Altricial young are born helpless and require care for a comparatively long time. Among birds, these include, for example, herons, hawks, woodpeckers, parrots, owls and most passerines. Rodents and marsupials are altricial, as are cat, dog, fox, lion (they are carnivores) and humans. Altricial species usually have relatively short gestation periods. Altricial individuals, if ever 'born independently', don't survive.
The best known form of imprinting is filial imprinting, in which a young animal learns the characteristics of its parent.
It is most obvious in nidifugous birds, who imprint on their parents and then follow them around. Konrad Lorenz demonstrated how incubator-hatched geese would imprint on the first suitable moving stimulus they saw within what he called a "critical period" of about 36 hours shortly after hatching. Most famously, the goslings would imprint on Lorenz himself (wiki). In Senapathy's theory there are no mothers. In the absence of a mother the hatchling would follow the first creature in sight: a crocodile, a bat, or a Boa constrictor. In other words: the hatchling is doomed to die.
©Antal Festetics, 1983
Common descent versus independent origin
So far, I did not use common descent as an argument against Independent Origin. I restricted myself to the facts that make Independent Origin implausible, improbable, and impossible. This is more than enough to refute Independent Origin. However, contrasting both theories is helpful for understanding the origin of species.
Rejecting common descent comes at a huge cost: it equals reinventing the wheel a million times! All the combined adaptations that produce successful flight must be reinvented for each bird. All the combined adaptations that enable survival in the sea must be reinvented for each fish. It just seems crazy to reinvent a dog-like type repeatedly to explain wolf, fox and coyote. Small modifications of a basic dog-like type would suffice. Creationists and other critics of common descent must have suspected this problem and proposed a limited form of common descent for similar organisms. Compare the two diagrams below. The first is from Senapathy and the second from intelligent design creationist Paul Nelson. Remarkably, both accept a limited form of common descent ('microevolution'). A dog-like 'basic type' produces the dog, hyena, fox, wolf, and coyote species.
Fig 3. Senapathy, p.462. Diagram labels: "similar species of a distinct creature"; "millions of distinct independently-born creatures".
Fig 4. Paul Nelson (7). Diagram labels: "pheasants, ducks, dogs, cats, horses"; "creation of basic types".
"many species within a genus usually connectable by evolution and many families within an order are sometimes connectable by evolution" (p.461)
However, this implies all the mechanisms for large-scale evolution such as mutation, selection, genetic drift, and the generation of new species! As the name of his theory suggests the 'independent birth' of organisms is the most important aspect of his theory.
Unexpectedly, Senapathy's theory is not a theory of independent origin! Further evidence comes from the primordial pond: numerous creatures originated from
"a common pool of genes in the same primordial pond" (p.455).
A common pool of genes denies the independent origin of genes. We have now a violation of independent origin at three levels: (1) common pool of genes, (2) microevolution (Fig 3), and (3) prokaryotes evolved from eukaryotes. Therefore, it is misleading to label his theory as 'independent origin'. That's cheating! Even worse: to claim simultaneously "That Evolutionary Theories Are Fundamentally Incorrect" (book title) is simply dishonest. Apart from the label, the amount of common genes is left unspecified. Probably because he has no theoretical reasons for their existence. I am afraid that random origin of DNA sequences predicts unique sequences, not multiple occurrences of the same sequence.
There is a practical implication of the hypothetical "common pool of genes": a common pool requires that there is only one pool on the earth. Where was it located? How big was it? How long ago? How long did it exist? (25) Was it fresh or salt water? All unanswered questions! Furthermore, both mechanisms (independent and dependent) can be arbitrarily invoked to explain any pattern of similarities and dissimilarities in nature. Similarly, it could 'predict' any pattern. Far from being an advantage of the theory, it is actually a disadvantage. It is an ad hoc 'explanation'.
The scientific value of his theory becomes still worse (but still more comfortable for Senapathy), when he allows for arbitrary genome mixing:
"slightly changed creatures could also be produced in the primordial pond by mechanisms of genome mixing and genome alteration and or restructuring" (p.455)
"We stated in the new theory that an already successful genomes could be used, in part or full, in the construction of the genomes of later born organisms." (p.419).
In other words: stealth common descent! Stealth evolutionary reasoning! Please note: "in part or full"! My objections are:
- This is again contradicting independent origin;
- If 'genome mixing' completely mimics common descent, then there is no observational difference between his theory and common descent;
- 'genome mixing' and 'in part or full' is arbitrary, too vague, too 'cheap', too 'easy', giving a maximum of freedom;
- it does not make sense that anything goes at the moment that genomes originate and millions of years thereafter all genomes are frozen (immutable);
- It is impossible to refute such a theory. When shown evidence that refutes independence, Senapathy can always claim 'my theory can explain this by a common pool of genes and genome mixing'. We do not learn anything new about nature.
Finally, let us not forget that in Senapathy's theory prokaryotes derived from eukaryotes, thereby contradicting independent origin again. It is an odd aspect of Senapathy's theory that bacteria, the simplest living organisms, did not arise directly from the primordial pond, but from more complex organisms!
See also: The final refutation of independent origin
On the other hand, in a sense evolutionary theory involves 'independent origin' of, for example, eyes and sex chromosomes (332). However, this is not independent origin of organisms from scratch, but of parts of organisms within the context of common descent.
The role of randomness and improbability
"When the number of random events are large enough, the unbelievable will certainly happen" (Senapathy, p.332) (124).
Randomness is the single most important explanatory principle in Senapathy's theory. This is because his theory is based on random genomes. Ironically, randomness is very important for some evolutionists too: "the probability of even an extremely unlikely event happening is actually quite high" (100). The difference is that for evolutionists natural selection is an addition to unlikely events, while for Senapathy unlikely events are the only explanation for life on earth. Because he relies solely on random events, the available time is crucial. The available time is not infinite, but finite. The universe has existed for 13.8 billion years and the earth for 4.54 billion years. So, not every possible event happens. On top of that, major mass extinctions destroyed much of what had been achieved and reset life to an earlier phase.
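The dependence of "the unbelievable will certainly happen" on the number of trials can be made concrete with a small calculation. The sketch below is an illustration only and is not taken from Senapathy's book: it assumes independent trials that each succeed with the same probability p, and the values of p and n are hypothetical numbers chosen to show the shape of the argument. The function name is likewise an assumption.

```python
import math

def prob_at_least_once(p: float, n: float) -> float:
    """Probability that an event with per-trial probability p
    happens at least once in n independent trials: 1 - (1 - p)**n."""
    # Computed via log1p/expm1 so that a very small p is not rounded away.
    return -math.expm1(n * math.log1p(-p))

# Hypothetical per-trial probabilities and trial counts (assumptions).
for p, n in [(1e-9, 1e6), (1e-9, 1e9), (1e-9, 1e12), (1e-30, 1e12)]:
    print(f"p = {p:g}, n = {n:g}: P(at least once) = {prob_at_least_once(p, n):.3g}")
```

The event becomes likely only when the number of trials is of the same order as 1/p. With a finite amount of time, and therefore a finite n, a sufficiently small p stays practically impossible.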
For the moment I will ignore that Senapathy introduces natural selection and micro-evolution in disguise through the back door. In the next sections I will discuss what the effect is of rejecting natural selection, mutation, adaptation, and time.
The role of natural selection
See also: Adaptation, Ecology.
Consider the seed of a dandelion (Taraxacum officinale) in the picture above. Its physical properties are marvelously adapted to the physical properties of air to enable dispersal of the seed by air (67). Consider the number of hairs: too many or too few, too short or too long, too thin or too thick would fail to make it airborne. The properties of the hairs ultimately depend on the density of air. The density of dry air at sea level is approximately 1/800th the density of water, but as altitude increases, the density drops dramatically. The density further depends on temperature and humidity. The density of air ultimately depends on its composition: roughly 78% nitrogen, 21% oxygen and 1% argon. So where does the perfect match of number of hairs, total weight of a seed and the density of air come from? Could this be a lucky accident following from the random assembly of DNA nucleotides in the primordial pond? (see also: The genome is blind).
Senapathy needs natural selection because not every assembled genome survives.
There is nothing in Senapathy's theory that tells us how many genome trials are needed to produce a human genome. One? A hundred? A thousand? A million? A billion? A trillion? Humans appeared in the fossil record 4.5 billion years after the origin of the earth. Why did it take so long?
If selection is a negligible factor, then the origin of (human) life could be a matter of hours!
The fundamental question here is how easy is it to produce a genome? (24)
The point of Fred Hoyle's Boeing-747 story (14) is that building blocks are not enough to produce complex systems. Essentially Senapathy believes that a tornado in a junkyard produces a Boeing-747 (probably with a few selection steps).
Creationists and Darwinists reject the possibility that a complex system can arise by chance in one trial.
According to evolutionary biologists, numerous selection steps are needed.
According to creationists, 'intelligence' is needed.
Strickberger (15) compared the very low chance of getting the word 'EVOLUTION' in one trial (figure 5) with the high probability of getting it in successive small steps (figure 6). Although some details of Strickberger's illustrations are confusing, it is clear that the method of figure 5 is essentially Senapathy's method. Therefore, Senapathy chooses the most difficult method (20). Senapathy's mechanism is a whole-genome test. The Darwinian mechanism is a test of a small modification of a genome.
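The contrast between the two methods can be sketched in a few lines of code. This is a toy model and not a reconstruction of Strickberger's figures: it assumes a 26-letter alphabet, and it assumes that in the stepwise version every letter that matches the target is kept while the remaining positions are redrawn at random. The function names are hypothetical.

```python
import random
import string

ALPHABET = string.ascii_uppercase  # assumed 26-letter alphabet
TARGET = "EVOLUTION"

def expected_single_trial_draws(target: str, alphabet: str = ALPHABET) -> int:
    """Average number of whole-word draws needed to hit the target
    when all letters are drawn at once (the one-trial method)."""
    return len(alphabet) ** len(target)

def cumulative_selection_rounds(target: str, rng: random.Random,
                                alphabet: str = ALPHABET) -> int:
    """Rounds needed when correct letters are retained (selection)
    and only the incorrect positions are redrawn each round."""
    current = [rng.choice(alphabet) for _ in target]
    rounds = 1
    while "".join(current) != target:
        for i, wanted in enumerate(target):
            if current[i] != wanted:
                current[i] = rng.choice(alphabet)
        rounds += 1
    return rounds

rng = random.Random(0)  # fixed seed so the run is repeatable
print("one trial, expected draws:", expected_single_trial_draws(TARGET))
print("stepwise, rounds needed:  ", cumulative_selection_rounds(TARGET, rng))
```

Under these assumptions the one-trial method needs on the order of 26^9 (about 5.4 trillion) draws on average, while the stepwise method typically finishes within a few hundred rounds. The difference comes entirely from retaining partial successes, which is exactly what a whole-genome test cannot do.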
Another important difference between the Senapathy type of selection and Darwinian selection is that Senapathy's selection applies to unique genomes, while Darwinian selection applies to individuals of a species. The death of a Senapathy genome equals extinction, while Darwinian selection means that very similar genomes of the same species survive and can be improved.
The power of selection comes from endlessly repeated cycles of magnification of the successful genomes in populations of very similar individuals. Lucky accidents are magnified. This crucial feature is completely absent in the theory of Independent Birth.
Suppose a healthy human 'male' originated from the primordial pond, but missing just one gene: the SRY gene, which makes the unlucky individual completely infertile. That means no descendants and 100% selection against that individual. It means that this extremely rare and nearly perfect genome is extinct forever. The odds that the same genome with an intact SRY gene will arise for the second time are astronomically low. Compare this with an endlessly repeated cycle of small improvements based upon successful individuals of previous generations. It will become clear that the intensity of selection in Senapathy's scenario is huge when compared to selection in the Darwinian scenario. The power of common descent is the accumulation of inventions and the power of natural selection is selection of small variations of proven successful individuals. I only realized the powerful advantages of common descent and natural selection when I compared them with independent origin.
Can we test whether genomes have a random origin? Of course. Senapathy should have given statistical tests of randomness of real genomes (they were available already in 1952, see box). For example, the frequencies of A,T,C and G should be equal if genomes have a random origin. Any deviation from randomness can only be explained by mutation and selection.
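As an illustration of the simplest such test, the sketch below computes a chi-square goodness-of-fit statistic for the hypothesis that A, C, G and T are equally frequent. It is a minimal example, not a test that Senapathy or the sources cited here performed: the two toy sequences are invented, the function names are hypothetical, and the only outside number used is the standard 5% critical value for 3 degrees of freedom (7.815).

```python
from collections import Counter

# Chi-square critical value for 3 degrees of freedom at the 5% level.
CHI2_CRITICAL_DF3_P05 = 7.815

def base_counts(sequence: str) -> dict:
    """Count A, C, G and T, ignoring case and any other symbols (e.g. N)."""
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    return {base: counts.get(base, 0) for base in "ACGT"}

def chi_square_equal_frequencies(sequence: str) -> float:
    """Chi-square statistic against the hypothesis of four equal frequencies."""
    counts = base_counts(sequence)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    expected = total / 4
    return sum((observed - expected) ** 2 / expected
               for observed in counts.values())

# Invented toy sequences, for illustration only.
balanced = "ACGT" * 25
skewed = "A" * 70 + "C" * 10 + "G" * 10 + "T" * 10
for name, seq in [("balanced", balanced), ("skewed", skewed)]:
    stat = chi_square_equal_frequencies(seq)
    verdict = "reject" if stat > CHI2_CRITICAL_DF3_P05 else "cannot reject"
    print(f"{name}: chi2 = {stat:.1f} -> {verdict} equal frequencies (5% level)")
```

Equal base frequencies are only the simplest check. Passing it would not by itself show that a sequence is random, since the order of the bases is not examined at all.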
As far as Senapathy is concerned, a genome could have originated yesterday. His genomes are timeless fixed creations. Senapathy genomes do not contain any history.
Finally, any amount of selection after the creation of a genome destroys the whole idea of organisms arising directly and simultaneously from the primordial pond.
Common descent and Natural Selection are both central theories of Darwinism. Senapathy smuggles in downgraded versions of both and at the same time triumphantly claims that Darwin's theory is 'fundamentally incorrect'.
Fig. 7. Later Senapathy produced this figure on the internet, demonstrating the extensive involvement of natural selection (although he disingenuously used different names such as 'failed trials', 'filter', 'window', 'pinhole'). He also uses the term 'natural selection', contradicting his book title "...Showing That Evolutionary Theories Are Fundamentally Incorrect."
Earth, Ecology and Climate
Natural Selection and ecology are connected issues. Example: The disappearance of vast tracts of tropical forest some 305 million years ago led to an explosion in the global diversity of reptiles and amphibians, thanks to the emergence of many new, fragmented habitats. During that period, climate change dried up equatorial rain forests in the land mass that later became Europe and North America. Many of the species that lived across these forests became extinct, and were replaced by a wealth of different types of reptile and amphibian that were particular to isolated habitats. Amphibians, which depend on aquatic environments, fared less well than reptiles, which were able to adapt to a drier world (Nature, 9 dec 2010). This cannot be predicted from the properties of spontaneously arising genomes because those properties derive from the laws of chemistry and did not change 305 million years ago. The only thing that changed was selection pressure.
Another counterexample: the exploitation of the newly arisen angiosperms approximately 66 – 86 million years ago by early mammals triggered the diversification of mammals and a shift towards increased herbivory (273). Herbivores, frugivores, granivores, root- and bark-eaters, egg-eaters, insectivores, carnivores and omnivores all point to the dependence on plants, insects, reptiles. The primordial pond is completely blind to these ecological factors.
Circadian cycle: gene activity shows a circadian cycle: a 24-hour clock (364).
See also: Adaptation, Adaptation and Ecology: how to start an ecosystem which discusses the importance of ecology.
The role of mutation: exploring DNA sequence space
"Mutations in a genome can only lead to normal individual variations, or to genetic defects, which are absolutely (28) useless for organismal evolutionary change"(p.46)
and having said something about the effects of mutations, he goes on to declare the immutability of organisms:
"the genome of every independently born creature is unique and unchangeable into that of another unique creature, and therefore is essentially immutable." (p.6)
Senapathy's use of the concept 'immutability' is very confusing. He does not clearly distinguish between (im)mutability of an individual genome and that of its descendants; and between an individual and the population as a whole.
Evolutionary changes happen during millions of years, not during the lifetime of an individual.
Neutral mutations are the stepping stones towards useful mutations.
Of course an individual has a unique genome. The cause of this uniqueness is mutation.
Recently, the DNA sequence of James D. Watson revealed 3.3 million single nucleotide mutations, of which 10,654 cause amino-acid substitution. In addition, insertions and deletions of 2 to 40,000 base pairs (bp) were detected, as well as copy number variation resulting in the large-scale gain and loss of chromosomal segments ranging from 26,000 to 1.5 million base pairs (77). This means that there is a huge reservoir of genetic variation in a population of individuals. That is the material for natural selection to act upon.
"a snail can give rise to many different snail varieties, but never to a crab or a sea star." (p.46).
Exploring DNA sequence space
His 'immutability' concept introduces two more serious difficulties for his own theory.
Mutations are steps in genome space. If mutations are worthless, then steps in genome space are worthless.
This makes individuals isolated islands in genome space.
If genomes are essentially fixed, and cannot use mutations to explore genome space, then how does the primordial pond find those rare viable genomes?
How does it avoid those unsuccessful genomes? This is impossible.
Furthermore, if there are no viable intermediates in 'genome space' and viable genomes are rare, then this is a problem for both independent origin and gradual evolution. There is only one world, therefore both theories have to deal with the same genome space.
Potentially, 'Independent Origin' has the advantage that it does not need to explore genome space by a limited number of trajectories through genome space, but can hit isolated sequences that are inaccessible for an evolutionary step-by-step process.
However, Senapathy needs either a huge amount of luck or a huge amount of selection.
A huge amount of luck is unsatisfactory and a huge amount of selection contradicts his own claim that selection is unimportant.
Senapathy postulates a very resourceful primordial pond ("The number of genes in it must have been several times more than that contained in all creatures that ever have lived on earth", p.312).
Several times? Could you be more precise?
His claim that "mutations are absolutely useless for organismal evolutionary change" is in conflict with his statement:
"many species within a genus usually connectable by evolution and many families within an order are sometimes connectable by evolution" (see: Common descent versus independent origin). If one accepts common descent up to the level of families and orders, how could this be achieved without advantageous mutations? One cannot have common descent on such a large scale based exclusively on harmful mutations! One cannot create even one new species based on harmful mutations.
The role of adaptation: random perfection
"At the time of the birth of organisms, "random perfection" of organisms filtered the meaningful organisms from among the myriad mostly meaningless independently-born organisms. Those creatures that fit well with the physical environment survived while others perished. Among the physically fit immutable organisms, ecological fitness occurred by chance." (p. 204) (81).
One of the main functions of flowers is to attract animals. Why? How did this happen?
Even Goethe felt compelled to explain the origins of floral structures.
For Senapathy the answer is: Random perfection! Why would anybody opt for such a desperate 'explanation'?
If all adaptations are the direct result of randomly assembled genomes, then we cannot ask any further questions about
those adaptations. We cannot make any progress in our understanding of adaptation.
'Random perfection' caused by random genomes is the final answer. Don't ask any further questions.
In fact, every property of an organism must be explained by random genomes according to Senapathy's theory, since
mutation and natural selection are excluded (81).
This implies that we never will understand the big questions in biology: the origin of adaptations like
the brain, eye, ear, nose, heart, lung, digestion, photosynthesis, meiosis, respiration, blood circulation, warm-bloodedness,
sexual dimorphism, parental care and bird migration, let alone the interrelations between them
(61). This is an unacceptable drawback for a professional biologist.
Senapathy is forced to accept 'random perfection'. He has no alternative. He has no choice.
Fig. 8. Is the match between the extremely long spur of the orchid and the extremely long tongue of the moth an
© image Sinauer 2005 (60)
Senapathy misses a number of crucial points here: a few trials are not enough to determine whether an
individual is 'ecologically fit'. Genomes cannot be tested in isolation from other species, because species are each other's environments!
(See this page).
Senapathy's theory reduces organisms to isolated individuals. We need a theory that lets species originate, evolve and
adapt to their local environments including other species. Additionally, the origin of species is completely
unconnected to the geological context (geographical differences, continental drift, ice ages, meteorite impacts, climate
changes, etc). In Darwinism the environment is an important external causal factor (externalism).
Furthermore, genomes cannot be tested at one point in time only, because that leaves unexplained how species are able to
adapt to ever changing environments.
Fig. 9. Which genome originated first:
the hummingbird genome or the flower genome?
How could they survive
and reproduce without each other?
Is this mutual adaptation merely a genome accident?
© image Sinauer 2005 (60)
Timema poppensis partially camouflaged on its host,
coast redwood Sequoia sempervirens, California.
Could both the genomes of these organisms originate
randomly and yet the organisms look very similar?
Senapathy describes in several pages the complexities and the diversity of the eye in the animal
world and claims Darwinism cannot explain this. He ignores that his theory implies that the eye has to be reinvented a
thousand times in mammals which all have the same type of eye. According to his theory, the eye has been independently
produced by the genomes of the rabbit, squirrel, mouse, bat, tiger, lion, leopard, deer, bear, giraffe, buffalo, dolphin,
rhinoceros, elephant, monkey, ape, human, etc, etc.
Creationists frequently claim that evolution relies exclusively on randomness, but in fact randomness is
an adequate characterization of Senapathy-genomes. For Senapathy, life is a 'genome accident'.
The genome is blind
"The genome is blind and cannot visualize the existing niches and environments. Therefore, millions of bizarre phenotypes
must be produced in a species for the selection of one useful structure." (p. 89, see also p. 75; my emphasis).
This looks like a devastating argument against Independent Origin. Surprisingly, these words are written by Senapathy himself. He is perfectly aware of the problem that genomes are blind. Remarkable for someone rejecting natural selection, he uses the word 'selection' in the above quote!
He writes: "the genome of the reptile or the wingless invertebrate did not "know" that there was a medium called air in which the animal could fly if it developed a wing for its host" (p.89). "To the genome of an animal that lacks a wing, the new genes that code for the feathers have no meaning." (p.75). Surprisingly, he turns this into a problem for Darwinism by stating that an almost infinite number of random mutations should occur in order to arrive at a wing. 'Infinite' is exaggerated, but in principle 'Independent Origin' has a hypothetical advantage over Darwinism: it sidesteps the need to modify existing structures, because all organisms originate de novo. Viewed from the genomic perspective: the potential advantage of 'Independent Origin' is that genomes are not restricted by paths or trajectories in genome space, while evolution is strongly constrained by accessible evolutionary paths.
Consider for example the origin of land animals. The vertebrate transition from water to land requires changes in a variety of functional systems including feeding, respiration, support and locomotion. The key question is: is it easier to produce an organism from chemicals or to modify an existing organism? How many random genomes are needed to produce a land animal or a flying animal from scratch compared with modifying aquatic animals or non-flying animals? That must be billions of times more. In the theory of evolution the origin of flight starts with a fully functional reptile or insect (evolution is cumulative). Senapathy has to produce a fully functional animal plus a pair of fully functional wings from a random genome. Which of the two is the more difficult task? He is also fully aware of the fact that birds have additional unique properties. Again: the probability of producing all those features from scratch must be very much lower than that of adding them one by one to an existing animal. Even if Darwinists had no idea about the genes and mutations involved, adding features to an existing design must still be orders of magnitude easier than developing a complete animal from scratch. Birds inherit all their features (metabolism, anatomy, cell structure, behavior, reproduction) from reptiles, tetrapods, multicellular organisms, eukaryotes, and single-celled ancestors. The primordial pond has to reinvent everything as many times as there are species. Rejecting common descent comes at a huge cost: it equals reinventing the wheel a billion times!
See also: Common descent versus independent origin and: The role of natural selection.
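The difference between producing everything from scratch and adding features one by one can be made concrete with a toy calculation. The sketch below is only an illustration, not a biological model. It assumes that each of k features arises independently with the same probability p per trial; the values of p and k and the function names are hypothetical.

```python
from fractions import Fraction


def expected_trials_from_scratch(p: Fraction, k: int) -> Fraction:
    """Expected number of random genomes needed before one has all k
    features at once (every trial starts from nothing)."""
    return 1 / (p ** k)


def expected_trials_cumulative(p: Fraction, k: int) -> Fraction:
    """Expected number of trials when features are added one at a time
    and every feature already gained is kept (cumulative change)."""
    return k / p


# Hypothetical numbers, chosen only for illustration.
p = Fraction(1, 1000)  # assumed chance of one feature per trial
k = 5                  # assumed number of required features

scratch = expected_trials_from_scratch(p, k)
cumulative = expected_trials_cumulative(p, k)

print("from scratch:", scratch)
print("cumulative:  ", cumulative)
print("ratio:       ", scratch / cumulative)
```

Under these assumptions the from-scratch route grows exponentially with the number of features, while the cumulative route grows only linearly. That is the sense in which starting from an existing, functional animal is so much cheaper.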
Further examples: 'magnetic compass' (directional information, which enables an animal to maintain a consistent heading, for example towards the north or south) and 'magnetic map' (a few animals can also derive positional information from Earth's field). Magnetic sensitivity is phylogenetically widespread; it exists in all major groups of vertebrate animals, as well as in some molluscs, crustaceans and insects. The list includes groups such as flies, chickens and mole rats, none of which migrate (Nature). The molecular and genetic basis is provided by cryptochromes, which in migratory birds are thought to enable sensing of Earth's magnetic field. What does a blind genome know about the earth's magnetic field?
The role of time: the chronological order of life
Time does not play a big role in Senapathy's theory. Only 3 periods in earth's history are distinguished:
Figure 11.1. The chronology and time table of the independent birth of organisms.
Chapter 11: 'A New Look at The Fossil Record', page 497.
Current view of the Cambrian Explosion (541 million to 515 million years ago, a period of 26 million years) indicated by red box. © Science (319)
The figure shows that the Cambrian Explosion was preceded by the Ediacaran Biota (EB) and that a huge number of Classes, Orders, Families, Genera and Species originated after the Cambrian Explosion.
Senapathy is careful to exclude absolute dates in the figure. However, from the legend of the figure it appears that the primordial pond existed from 600 - 595 million years ago. The start of the chemical evolution is dated at 4 billion years ago, which agrees with orthodox science. Furthermore, in the text of the chapter he claims that the beginning of the primordial pond coincides with the Cambrian explosion starting at 533 million years ago and lasting 5 - 10 million years (p.496) (94). In the figure he indicates 'the end of the birthing activity' but again no date is given. In the text he writes that the primordial pond existed for a few tens of millions of years (p. 504). On page 505 he writes that the fertile period in the history of the earth lasted 50 - 100 million years. On page 204 he writes that the primordial pond 'became barren millions of years ago' which does not help very much in pinning down the precise date. Ignoring contradictions in his data, the period of the existence of his primordial pond is from about 600 to 500 million years ago. Thereafter, no new organisms are born (only extinction). This is unfortunate for his theory because the following species appeared in the fossil record after the primordial pond became barren (98):
The reader is advised to have a look at: TimeTree: The Timescale of Life or: Deep Time. Interactive Infographic or: GSA Geologic Time Scale or: Geological time scale (wikipedia).
- first humans appeared 6 - 7 million years ago
- Old World Monkeys: 25 - 33 million years ago
- first bats, dogs, weasels, elephants: 30 - 40 million years ago
- first placental mammals (rabbits, whales, rodents): 60 - 65 million years ago
- first mosquitoes, honeybees: 65 - 70 million years ago
- first turtles: 65 - 190 million years ago
- first butterflies (Lepidoptera): 150 million years ago
- first birds: 155 - 165 million years ago
- first frogs, crabs: 135 - 190 million years ago
- first flowering plants appeared about 135 million years ago; grasses appeared around 94 million years ago
- first sex chromosomes (XY) appeared some 200 million - 300 million years ago
- first amniotes (land-living vertebrates): 310 million years ago
- first amphibians: 360 million years ago
- first tetrapods: 375 - 363 million years ago
- first insect fossil appeared 412 million years ago
- first land plants appeared 465 - 470 million years ago; plants with leaves appear with a delay of 40 million years; trees: 475 million years ago.
- fish with jaws appeared: 416 - 359 million years ago
- jawless fish appeared: 500 - 435 million years ago
- first multicellular animals: 635 million years ago
- eukaryotes: 1,000-1,300 million years ago
- cyanobacteria: 2.15 billion years ago
- stromatolites: 3.4 billion years ago
- putative fossilized microorganisms that are at least 3,770 million and possibly 4,280 million years old
- Potentially biogenic carbon preserved in a 4.1 billion-year-old zircon
The reader will search in vain for human fossils in chapter 11, 'A New Look at The Fossil Record'. Senapathy forgot to discuss human fossils. That's a pity, because I would like to learn why there are no human fossils known from the Cambrian period and the whole primordial pond period. If a human fossil were found in the Cambrian period, the theory of evolution would be falsified and the find could be compatible with the theory of independent origin.
Insect evolution © Nature 2 Aug 2012 (281)
Furthermore, many forms of life appeared in the fossil record before the start of Senapathy's primordial pond (600 Mya). There is solid evidence that life was present on the earth more than 3 billion years ago. Conclusion: life appeared at least 2400 million years before the start of Senapathy's primordial pond, and new forms of life kept appearing during the 500 million years after its end. Secondly: the first fossil animals found in the oldest layers were creatures that lived in the sea (trilobites, brachiopods); only later are animals and plants living on land found. Why? The first animals on land were amphibians and reptiles. Reptiles dominated the earth before mammals appeared. Mammals appeared much later. This was known before Darwin (1859). There was little overlap between "the age of reptiles" and "the age of mammals". Traces of humanity did not appear until the very end of the record. This is chronology.
Anyone proposing a non-evolutionary explanation for the origin of life and species must start with explaining the fossil record as known in 1859 and everything that has been learned since then. Senapathy did not do this. This fossil record cannot be explained by a primordial pond that produces every species at the same time.
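The mismatch with the fossil record can be checked mechanically. The sketch below takes a handful of the first-appearance dates listed above (using the older bound where a range is given) and the pond period of about 600 to 500 million years ago, and sorts each group into before, during or after the pond. The dictionary and function names are only illustrative.

```python
# Approximate period of Senapathy's primordial pond, in millions of years ago.
POND_START_MYA = 600
POND_END_MYA = 500

# A selection of first appearances from the list above (older bound of each range).
first_appearance_mya = {
    "stromatolites": 3400,
    "cyanobacteria": 2150,
    "eukaryotes": 1300,
    "first multicellular animals": 635,
    "first land plants": 470,
    "first insect fossil": 412,
    "first amphibians": 360,
    "first birds": 165,
    "first flowering plants": 135,
    "first humans": 7,
}


def classify(age_mya):
    """Place a first-appearance date relative to the pond period."""
    if age_mya > POND_START_MYA:
        return "before pond"
    if age_mya < POND_END_MYA:
        return "after pond"
    return "during pond"


# Keep only the groups whose first appearance falls outside the pond period.
outside = {
    name: classify(age)
    for name, age in first_appearance_mya.items()
    if classify(age) != "during pond"
}

for name, where in outside.items():
    print(f"{name}: {first_appearance_mya[name]} Mya ({where})")
```

Every group in this selection falls outside the pond period, either long before it started or long after it became barren.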
Biological and physical necessity of chronology
The chronology of appearance of different forms of life is not only a blind fact of the fossil record, but also biologically and physically necessary. According to Senapathy the primordial pond produced complex eukaryotic organisms at the same time as single-celled eukaryotes. But how can it be that only after 3 billion years, about 600 million years ago, organisms emerged from the microscopic world, became larger, and shortly thereafter developed skeletons and shells? (159).
Life got big after nearly 3 billion years of microbial evolution. Soft-bodied organisms of centimeter- to meter-scale first appeared 579 million years ago (180). Why the delay if every genome (mono-cellular and multi-cellular) was produced?
We humans are only here because of a remarkable series of revolutions in Earth history, each revolution built on the previous ones. The Earth was not ready for 'higher' forms of life. By a process of niche construction the Earth prepared itself for land animals and plants. The first forms of life derived their energy from simple chemicals coming from the Earth ('chemo-litho-autotrophs'), and not from sunlight or by consuming organic material (325).
Chronology fact: plants
The first plants on land were small and leafless. Plants with leaves appeared 40 million years later (72). Why don't leafless and leaved plants appear randomly in the fossil record? Why were plants (green algae) present in oceans 500 million years before plants colonized the land? Why do only land plants (especially trees) possess lignin? (277). What would be the use of enzymatic lignin decomposition gene in a genome of an organism born in the primordial pond? Why the chronological order of appearance? In Senapathy's genome-centered view the chronological and environmental context is absent. There is no possibility to answer questions why certain fossils are where they are.
All organisms are born in the Primordial Pond and move out (see cover illustration of his book). But what about plants? Rooted and unable to flee from the Pond, plants drown and die. How could seedlings survive in the primordial pond? Could seeds germinate in a pond at all? And how do you get from naked DNA to a seed anyway?
Example: the desert-dwelling tobacco species Nicotiana attenuata has an array of mechanisms to survive severe environmental stresses including fire, herbivores and drought. However, could it survive in a pond? How does it move from the Primordial Pond to the desert?
Chronology fact: photosynthesis
There are two photosynthesis methods: C3 and C4 (used by 7500 plant species,
mostly subtropical grasses, maize, sugarcane).
The C3 method is optimal for high atmospheric carbon dioxide levels (100 times today's level) and the C4 method is optimal
for low carbon dioxide levels.
However, there is only one atmosphere and one primordial pond in Senapathy's scenario.
Data support the view that the C3 system arose more than 2800 million years ago under high carbon dioxide levels,
whereas the C4 system arose as an adaptation to low carbon dioxide levels about 30 million years ago (72).
If both systems were produced by the primordial pond, then the primordial pond must have existed from 3 billion years ago
up to 30 million years ago. Unlikely as it is, it still does not explain why it produced the systems in that chronological
order and why with such a huge time interval.
See also: § 22 Incompatible primordial ponds.
Chronology fact: Great Oxygenation Event
The Great Oxidation Event (GOE: the appearance of free oxygen O2 in Earth's atmosphere around 2.4 billion years ago) is
believed to have followed the development of oxygenic photosynthesis by ancestors of modern cyanobacteria. Cyanobacteria have played a central role in the evolution of life on Earth, both by producing oxygen as a photosynthetic byproduct and by generating organic carbon, the major ecological energy commodity.
DNA sequences from extant organisms bear an imprint of the GOE. Enzymes that bind molecular oxygen are more likely to appear in organisms that emerged after the Great Oxidation Event (155).
However, it takes time to fill the earth's atmosphere with oxygen. To be precise: nearly 2 billion years! It was not until oxygen levels rose even higher, around half a billion years ago, that the oceans could support large multicellular eukaryotes that got their energy by burning food. In Senapathy's genome-centered view the chronological and environmental context is absent.
See: Great Oxygenation Event (wikipedia)
Chronology fact: Nitrogen cycle: bacteria first
Proteins and DNA contain Nitrogen (N). The atmosphere of the earth contains enough nitrogen (78%), but remarkably, animals and plants cannot use it. Only nitrogen-fixing bacteria (prokaryotes!) can use nitrogen from the air. (Some fixation occurs in lightning strikes). This imposes a chronological order: first bacteria, then plants, then animals. So, independent origin of animal and plant genomes is impossible. In Senapathy's genome-centered view the chronological and ecological context is absent. 2 Jun 11
See: Nitrogen cycle (wikipedia)
Chronology fact: chronology of sex
Why on earth was there no sexual reproduction in the first half of the history of the earth? These are facts from the geological record. One does not need to be an evolutionist to accept them. On the theory of independent origin such groups of organisms should appear randomly in the history of the earth or why not all at the same time?
How many primordial ponds?
Why not propose thousands or millions of primordial ponds instead of one?
It could make independent origin much easier! Maybe Senapathy needs
"a common pool of genes in the same primordial pond"? (see: §13).
That suggests he needs just one pool.
If necessary he claims a separate primordial pond ('Burgess pond', p. 324, 328) without discussing what it means for the
universal genetic code.
Why not propose a primordial pond that lasts millions of years?
In the Preface he notes that numerous organisms might have been assembled more or less simultaneously within the
primordial pond (page x).
Amazingly, Senapathy did all this! He claimed that there exists a single primordial pond (p.8), two distinct ponds
(p.500), many (p.502) and 'millions of small and large ponds must have existed on the primitive earth' (p.214).
Please note that millions of separate ponds make the existence of a universal genetic code an unsolved mystery, to put it mildly.
Extinctions and recoveries
The independent birth theory does not predict a specific order of appearance or extinction in time because genomes
are randomly generated in time. However, the fossil record shows for example that "the extinction of dinosaurs at the Cretaceous/Paleogene boundary was the seminal event that opened the door for the subsequent diversification of terrestrial mammals" (133). For the first 140 million years of their evolutionary history, mammals were small (up to 15 kg).
Such a pattern should not exist according to the theory of independent birth.
How does the theory of independent origin deal with the big extinctions and recoveries? The figure shows 5 major mass extinctions.
Each is followed by recovery of the number of species. Overall, there is a clear increase in the number of species from 600 million years ago to today. This would require that primordial pond(s) must have been active continuously from 600 million years ago up to today.
Figure: Nicholas Barton et al (2007) Evolution, p. 283.
Relation between macro evolution and regulatory innovation:
Macroevolutionary trends in animal diversity and gene regulation show that there is a chronological order in the appearance of animal groups and regulatory sequences contradicting expectations of random origin in a primordial pond. Source: (221).
Conclusion: It is OK to reject neo-Darwinism. However, the first appearances of animal and plant groups in the fossil record are the raw data which any theory of the origin of species has to explain. Both Senapathy's theory and neo-Darwinism have to explain exactly the same data of the fossil record. It does not help to deny the existence of 'missing links': the chronology of fossils is a fact.
Introns and exons do not change the facts of the fossil record.
The role of place: the biogeography of the origin of species
According to Senapathy, life originated in the primordial pond. Where on earth is the primordial pond located? Senapathy does not tell us.
Biogeography reminds us that the word "origin" denotes both a process and a place—that the great variety of life did not just arise in some indistinct and misty nowhere. Instead location matters. When we study distributions we begin to associate the evolution of plants and animals with a particular setting, thus providing a tangible background to the birth and development of species. The Earth is not merely the cradle of life; it is its womb.
Imagine life evolving on a planet covered by a single, uniform ocean—of constant depth, stable temperature, and few currents, and you have imagined a planet where life would very likely remain simple and relatively homogeneous (123).
Endemism: a species is unique to a defined geographic location, such as islands (Hawaii, Galápagos Islands, Socotra, Tasmania), isolated areas such as the highlands of Ethiopia, or large bodies of water like Lake Baikal. So, does every island have its own primordial pond?
Transition from sea to land
There was a time that there was not much land (see illustration above). There was a time when there were no animals or plants on land.
Animals and plants originated in water and colonized land when land became available. Hemichordates are exclusively marine animals. The green alga Chlamydomonas reinhardtii represents early photosynthesizers, confined to water and never expanding beyond a simple single cell. Bryophytes (mosses, liverworts, and hornworts), which evolved 450 million years ago, were among the first plants to colonize land. For the moss Physcomitrella patens, the climb to shore required new genes for surviving dry spells and temperature swings, and that resulted in a more complex genome, with expanded families of genes (210). Later came woody plants. For animals a similar story can be told (see: Neil Shubin). Keywords: place and time.
See also: Biogeography.
The clumpiness of morphospace
"What can be more curious than that the hand of a man, formed for grasping, that of a mole for digging, the leg of the horse, the paddle of the porpoise, and the wing of the bat, should all be constructed on the same pattern, and should contain the same bones in the same relative proportions?" (300).
Most organisms are well adapted to their immediate environments, but also built on anatomical ground plans that transcend
any particular circumstance. Why should structures adapted for particular ends, root their basic structure in homologies
that do not have a common function? Why should this be so, if all organisms arose independently?
Genome sizes are not randomly distributed over species.
This feature of life on earth is called 'the clumpiness of morphospace': the inhomogeneous occupation of all possible forms of extant or extinct animals. This clumpiness must be explained. In the theory of evolution, the cluster of cats exists primarily as a consequence of homology and historical constraint. All cat-like animals (lion, tiger, puma, leopard) share a basic morphology because they arose from the common ancestor of all cat-like animals.
- Why are genome sizes not randomly distributed over all living species? Why are they clumped? Why do genome sizes of birds and mammals not overlap?
- Why do all backboned animals have four fins or limbs, one pair in front and one pair behind? Why are all land vertebrates 'tetrapods' (4 legs), while none have six or eight legs?
- With a few exceptions, all mammals and birds are warm-blooded, and all reptiles, insects, arachnids, amphibians and fish are cold-blooded (218). Why is this, if they are independently born?
- Why do birds (of prey) have no teeth? Teeth would be advantageous for larger pieces of food which cannot be swallowed in one piece. Why don't birds have a third pair of limbs to handle their food (since they can't use their wings for that task)? Humans, apes and squirrels can use their hands for handling food. Why aren't there birds with internal gestation like mammals? Why are the chicks of all passerines (songbirds) altricial (blind, featherless, and helpless when hatched from their eggs)? Why are most passerines smaller than typical members of other avian orders? Why does the foot of all passerines have three toes directed forward and one toe directed backward? Why do birds embryologically start with paired oviducts, but one or the other side fails to develop (together with the corresponding ovary), and only one functional oviduct develops?
- What is the matter with genomes: birds have remarkably small genomes, averaging 1/2 to 1/3 of the size of typical mammalian genomes. Why should that be when all eukaryotic genomes arose independently? Small genome size is intriguingly correlated with flight. Bats, compared to other mammals, have small genomes, and flightless birds, compared to other birds, have larger genomes (220). Why are introns and intergenic regions of birds half the size of those of mammals? (348).
- Why do only birds and insects fly, not frogs and mammals (except bats)? Why does not one of the almost 40,000 species of spiders fly? Why do all spiders have eight legs? Why do all spiders produce silk? Why are all spiders predatory and not herbivorous?
- What is the matter with genomes that there are twenty thousand species of birds and only twenty species of crocodile? (Mike Benton)
- What is the matter with genomes that insects are more numerous than any other type of animal, accounting for 80% of species?
- What is the matter with genomes that, unlike most animals (females: XX, males: XY), female birds are the heterogametic sex, having the equivalent of a human Y chromosome, called the W chromosome (females: ZW, males: ZZ), except ostrich and emu? Why is this property not randomly distributed over animals and birds?
- Why does the domain of mammalian carnivores contain a large cluster of cats, another of dogs, a third of bears, leaving so much unoccupied morphological space between? This is not expected on the random genome origin scenario.
- Why do each of the more than 1,000 species among one group of centipedes have an odd number of leg-bearing segments?
In a world of independent origin, a world without history, where all features of organisms express their initially created state, why does homology exist at all? If organisms arose independently, they would show more structural variation, and not be morphologically clustered as varied
manifestations of 'archetypes' (47). Senapathy cannot use historical developmental constraints, because Independent Origin is an unhistorical or even anti-historical theory. Senapathy cannot use limitations of genome production either, because if genomes are random, then any genome is possible.
The primordial pond is a free lunch
There is only one primordial pond (see: flap text and here). The primordial pond is the birthplace of all species. It must have been a very busy place with millions of 'species'. Water is the natural home of fishes. Predatory fishes (for example tuna) prey on other fishes.
As soon as a predatory fish originates in the primordial pond, it starts eating. It swallows everything it can get. Therefore, a predatory fish easily causes the extinction of every 'species' it can swallow, because the method of 'independent birth' does not produce species but single unique individuals. Thus before a single unique individual can multiply and become a population, it has been swallowed by a predatory fish. Likewise, plankton feeders will exterminate all plankton. The primordial pond is a free lunch until the predators die of starvation when all prey has been eaten. Likewise, pathogenic bacteria, viruses (213) and fungi (286) responsible for infectious diseases will kill their hosts. (Fungi are eukaryotes; bacteria and viruses are not eukaryotes and are supposed not to originate in the primordial pond.) That is why Darwin, Oparin and Haldane already argued that life could have emerged only on a sterile, lifeless planet (50).
On a molecular level, chemical inhibitors will prevent DNA synthesis, replication, transcription and translation or any other enzymatic reaction. For example, the peptide alpha-Amanitin is an inhibitor of RNA polymerase II.
Incompatible requirements for a primordial pond
The primordial pond is the 'birthplace' of all species. But organisms have incompatible environmental demands: some require oxygen, others require anoxic (anaerobic) conditions. Some bacteria require hydrogen sulfide as a source of energy. Some require high temperatures (thermophilic), others require low or very cold temperatures. Some are acid-loving, others are acidophobic. Some plants and animals require salt water (sea), while freshwater flora and fauna require freshwater (lakes, rivers). (Examples of halophiles: the alga Dunaliella salina, which survives up to 23% salt, and the extremely halophilic archaeon Haloquadratum walsbyi.) Fishes in the tropics (no seasons) reproduce during the whole year, while in temperate regions they reproduce at the end of spring. All those conditions cannot be combined in one and the same primordial pond.
Furthermore, how could the primordial pond produce anything else but organisms that can live in water and extract oxygen from the water? (fishes, mollusks). A pond is deadly for air-breathing animals except those that live near the surface (whales, crocodiles, otters, etc) or live only partly in water (sea lions, seals, etc).
Furthermore, if born, why and how do organisms migrate from the primordial pond to all those different locations? Some organisms only survive in the deep sea: hydrothermal vent communities are found at depths ranging from 1,500 to 3,200 m. Giant tube worms (chemoautotrophs), in the absence of sunlight, subsist on hydrogen sulfide found in the warm waters surrounding vent communities. There is evidence for living prokaryotic cells in sediments 1626 meters below the sea floor that are 111 My old and at 60° to 100°C (79). The bacterium D. audaxviator lives at 2.8 kilometers depth in a South African gold mine and lacks a complete system for oxygen resistance, suggesting long-term isolation from O2. That means it is damaged by oxygen (102).
Emperor penguins live in probably the most extreme conditions endured by any warm-blooded animal on earth. They even breed in the depths of the Antarctic winter at temperatures of -30°C (-22°F). They have so many cold adaptations that in warmer weather, overheating can be a problem. So, if the primordial pond is located in moderate or tropical regions they will die from overheating.
Eggs incompatible with water
In the primordial pond organisms develop from 'egg cells'. Please, have a look at the cover illustration again: not accidentally, there is no bird or mammal creeping out of the primordial pond. Bird eggs do not survive in water (apart from the requirement of incubation). Saltwater crocodiles, marine iguanas and sea turtles, although marine, lay eggs on land. Please have a look at the cover illustration: a turtle is coming out of the primordial pond! However, all turtles lay eggs on land. Also, reptiles cannot successfully lay eggs under water because gas exchange across the eggshell is much slower in water than in air.
Bar-headed goose. © Chalto Digital Images
At the other extreme are bar-headed geese (Anser indicus), the world's highest-altitude migrants. They fly from their winter feeding grounds in the lowlands of India, sometimes even directly above Mount Everest (29,000 feet or 8,800 meters), on their way to their nesting grounds on the Tibetan plateau (where only a third of the oxygen available at sea level is present). They have a special type of hemoglobin that absorbs oxygen very quickly when the birds are at high altitudes; as a result, they can extract more oxygen from each breath of rarefied air than other birds can. The most plausible explanation for this migratory behaviour is the geological history of the region (80). How do these geese, if born in the primordial pond anywhere in the world (ignoring all other problems), know to fly to the Tibetan plateau?
If one wishes to escape from incompatible requirements, why not propose thousands or millions of primordial ponds instead of one? It could make independent origin much easier! For example, it would make it easier to explain the unusual breathing system of the Saurischian dinosaurs, a group that lived at a time when the oxygen level at the surface of the earth was only 10 percent (103). Alternatively, why not propose a primordial pond that lasts millions of years? (see: §18 The role of time: the chronological order of life). The secret could be that Senapathy needs "a common pool of genes in the same primordial pond"! (see: §13 Common descent versus independent origin).
How big is the primordial pond?
Would a 90-tonne blue whale (Balaenoptera musculus) fit in the primordial pond? or only a juvenile? Just asking...
The final refutation of independent origin
Conserved chromosome segments between human and mouse are the final refutation of independent origin. If all genomes arose independently from the primordial pond and if the distribution of genes over chromosomes were random, then genes of related species should not have the same linear order on their chromosomes. If a great number of genes nevertheless appear in the same order in different species, this cannot be explained by pure chance. This is exactly what geneticists found when they compared the genomes of mouse and man (8,9,10). A segment of roughly 90.5 million bases on human chromosome 4 is similar to mouse chromosome 5 (11). Almost all human genes on chromosome 17 are found on mouse chromosome 11 (12), and human chromosome 20 appears to be entirely orthologous to the bottom half of mouse chromosome 2, apparently in a single segment (13). That means that thousands of genes are in the same order in mouse and man. A few genes might be expected to be in the same order by pure chance, but not thousands. This can only be explained by common descent of mouse and man. If all species were independently born, then the probability of finding similarities in a human-mouse comparison should equal the probability of finding them in, say, a human-turtle, a human-fish or a human-mushroom comparison.
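To give a feeling for "a few, but not thousands", here is a minimal sketch of a toy model. It assumes, purely for illustration, that the order of n shared genes on a chromosome is a uniformly random arrangement, so that one particular order turns up with probability 1/n!. The function name is my own, and real genomes are of course not shuffled decks of cards; the point is only how fast the chance explanation collapses as n grows.

```python
import math

def log10_prob_same_order(n_genes: int) -> float:
    """log10 of 1/n!, the chance of one given order of n genes
    under a uniformly random arrangement (toy assumption)."""
    # lgamma(n + 1) equals ln(n!), which avoids computing huge factorials
    return -math.lgamma(n_genes + 1) / math.log(10)

for n in (3, 10, 100, 1000):
    print(f"{n:>5} genes: probability about 10^{log10_prob_same_order(n):.1f}")
```

Three genes in the same order is unremarkable (a chance of 1 in 6), but for a thousand genes the exponent already runs into the thousands.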
Of course, Senapathy could not have known all these facts in 1994, but conserved chromosome segments are now the most impressive refutation of independent origin. This evidence alone is sufficient to refute independent origin. No theory of independent origin can survive this evidence. The above argument is only about genomics. Anatomy also has a story to tell: Neil Shubin (2008) Your Inner Fish: A Journey into the 3.5-Billion-Year History of the Human Body.
Her response was: "Do you really think that an insect or a rat simply came about as it is?" I simply answered "Yes, I do!"
Today, I would point out that the most crucial and unequivocal fact against Senapathy's theory of independent origin is the fact that DNA cannot spontaneously originate from building blocks, or the universality of the genetic code, or the fact that DNA needs proteins to be transcribed or translated.
The formal refutation of independent origin was the publication 'A formal test of the theory of universal common ancestry' by evolutionary biologist Douglas L. Theobald (130).
The origin of life
3 Sep 13
Senapathy's theory is a modern version of the theory of spontaneous generation. The Greek philosopher Aristotle believed in spontaneous generation of life. As late as the seventeenth century philosophers believed that mice, frogs, and eels could emerge from garbage, mud, and river water. According to Alec Panchen "Lamarck rejected common descent. Lamarck's theory was of continuous events of spontaneous generation with descent from generated organisms of innumerable parallel evolutionary lines. I know of no 20th-century evolutionist who accepts this view." (69). Indeed, Senapathy is no evolutionist. Louis Pasteur gave the final deathblow to Spontaneous Generation of bacteria. A very useful history of Spontaneous Generation is given by Iris Fry in The emergence of life on Earth (chapters 2,3,4).
In modern science the Origin of Life field has grown into a separate research field with strong connections to astrobiology, organic chemistry and geochemistry. According to the most recent textbook of Evolution (89) there are seven critical steps in the origin of life:
- the generation of simple organic molecules from inorganic molecules
- chemical "evolution" to produce more complex organic molecules and primitive metabolic networks
- the origin of self-replication and the creation of "genotypes"
- compartmentalization and the creation of cells
- the linking of genotype and phenotype
- the origin of the genetic code
- the takeover of early replication systems by one involving DNA
Please note that here DNA is the final step! Senapathy starts with DNA and ignores or denies the necessity of all previous steps!
But even starting with DNA there are 3 separate problems: 1) the origin of the double helix (4 bases internal, phosphate-sugar backbone external), 2) the origin of the sequence, 3) the origin of the genetic code (translation problem, mRNA, tRNA, ribosome). Senapathy wants to solve these 3 problems at the same time. No organic chemist believes that it is possible for DNA bases or sugars (ribose) to form spontaneously and assemble into DNA in a prebiotic world (263). But Senapathy proposes a theory of the origin of life! He thinks all (eukaryotic) genes are assembled from scratch and so nothing is inherited from a previous phase (RNA-world). So, he also ignores or rejects the RNA-world hypothesis.
Genes–First versus Metabolism–First
Another way to look at the origin of life is as two competing models: "genes first" ("replication first") versus "metabolism first". In the "genes first" model, replicating DNA or RNA sequences arise first; in the "metabolism first" model, inorganic catalysts convert simple and abundant inorganic compounds, such as carbon dioxide, into more complex organic molecules. Obviously, Senapathy represents the "genes first" model.
The RNA World
The discovery that RNA molecules can act as catalysts provides a possible solution to a long-standing 'chicken and egg' dilemma:
- DNA encodes the genetic information of proteins
- DNA replication and transcription require proteins
- proteins cannot self-replicate (except prions)
- proteins cannot encode the information in DNA (Crick's 'Central Dogma')
In other words: the interdependent world of nucleic acids and proteins which forms the basis of all modern life (345).
If RNA can serve both as:
- a repository of information (in its sequence of nucleotides)
- a catalyst
then the dilemma is solved. This provides the basis for the hypothesis that life began as RNA, the so-called RNA World (183): an RNA-based genetic and catalytic system. The unexpected observation that deoxyribonucleotides are synthesized from ribonucleotides in cellular pathways supports the notion that DNA arose later in cellular evolution than RNA (254). Further evidence: Michael Yarus points out that the evidence for the RNA-world is actually scattered throughout modern-day biochemistry and cell biology ('the ancestor within': mRNA, tRNA, rRNA, microRNA). Senapathy completely overlooked the DNA-protein dilemma, and why the RNA-world is the solution to that dilemma. And there must have been a Pre-RNA-world, or proto-RNA-world (346). Consequently, his theory about random DNA is pointless. Intriguingly, in Figure 11.1 (see above § 18) there is a pointer at the base to 'Start of chemical evolution'! What would that mean?
Multiple origins of life hypothesis
Raup and Valentine (140) propose multiple origins of life: "The probability of survival of life is low unless there are multiple origins, and given survival of life and given as many as 10 independent origins of life, the odds are that all but one would have gone extinct, yielding the monophyletic biota we have now." This mainstream hypothesis does not contradict common descent of all life. It only proposes that there must have been many origins, but they did not survive except one which formed the universal tree of life. Senapathy proposes independent origin of all species.
What is life?
Senapathy claims to explain the origin of life. But, what is life? If one has a wrong idea of what life is, then the theory to explain 'life' is useless. So, what is life? According to chemical engineer Tibor Gánti (43) life consists of 3 subsystems (see figure):
- a chemical motor (metabolism) that supplies energy to synthesize compounds necessary for the other 2 subsystems and is stable
- a membrane which keeps the other 2 subsystems together, protects against dilution and is itself stable
- an information-carrying subsystem (for example DNA) which enables reproduction of the 3 subsystems
Figure: LIFE as three chemical subsystems, among them the chemical boundary system.
Together these 3 subsystems are a living system. Senapathy's theory is concerned with the information-carrying subsystem (DNA) only. So he has a mistaken view of what life is. Therefore, his theory is useless. Furthermore, he got the order of origin of the 3 subsystems wrong. Several scientists believe that metabolism originated first, and that the information-carrying subsystem arose later (as a by-product). The reason is that whereas the abiotic synthesis of amino acids is easy, the abiotic synthesis of nucleotides is difficult (44). Whatever the order of appearance, the point is that according to Gánti the genetic code and the reading machine can only function together: they originate and function together. If there is an order, it is the machine first, because a machine can exist without program control, but the program cannot exist without the machine (43, p. 16).
According to John Maynard Smith (92) "entities are alive if they have the properties of multiplication, variation, and heredity". Senapathy's theory does not say a word about why independently born organisms should have the property of multiplication (reproduction). Since organisms could be produced by the primordial pond indefinitely, why did the primordial pond not produce organisms lacking the power to reproduce themselves? Why such an improbably complex feature as reproduction? Theoretically, organisms could be produced that live forever. DNA is necessary for building a body and keeping that body alive; reproduction is extra.
Why and when did the primary pond cease to exist?
Energy. Without energy no life. Energy is the chemical motor and is the first subsystem of life (Gánti). We consume carbohydrates and fats, combining them with oxygen that we inhale, to keep ourselves alive. Microorganisms are more versatile and can use minerals in place of the food or the oxygen. In either case, the transformations that are involved are called redox reactions. They entail the transfer of electrons from an electron-rich (or reduced) substance to an electron-poor (or oxidized) one (68).
By defining the origin and the evolution of life as the same problem, Senapathy has to show that the origin of organisms that consume carbohydrates and fats (which are of biological origin) is as plausible as the origin of organisms requiring only minerals.
Furthermore, multicellular animals need an order of magnitude more energy and so must use aerobic respiration (76). But oxygen levels on Earth rose only gradually to the current level. How would primordial ponds know when to produce small mono-cellular or large multi-cellular animals?
Rerun the tape of life
There are more fundamental questions, which are not yet solved by any biological theory.
If we would rerun the tape of life, or if life evolved on a million earth-like planets, would we see the same survival
Would we see lions, mushrooms, eagles, and HIV again?
Would we see the same animal body plans? Five-digits on each hand and leg?
Again the same haemoglobin molecule to transport oxygen?
The same genetic code? Photosynthesis? We do not know.
However, if Senapathy is right and life originated really independently a million times on this earth,
then the universal genetic code must be the predictable outcome of the laws of nature.
Moreover, all genes and proteins common to all species on earth
must be natural and inevitable. How else could they be common to all life?
Evolution is a mix of accident and necessity. For Senapathy all common features of life must be the inevitable outcome
of the laws of nature (including statistical laws).
PLOS ONE article
3 Nov 08
5 Nov 08
On 21 October 2008 I received a very kind email from Senapathy to notify me that he published
an article in PLOS ONE (105).
It is a huge article with many data (graphs, tables) to support his theory of independent origin
(now called ROSG model). It is best viewed in pdf version.
"This project is purely an academic project, fulfilling the academic interest of the corresponding author".
Indeed, one could justly say that it is a lifelong interest.
I tried to decipher the logic of his argument in the PLOS article. Although I have studied his argument for years, it is not easy. I guess he does not want to defend the idea that the present-day human genome is random with respect to the frequency of stop codons. No scientist would want to do that. An arbitrary random genome could not produce a human being.
On the other hand, nobody can argue against the claim that "The presence of three stop codons for every 64 codons limits the average ORF [Open Reading Frame] length to about 60 bases in random DNA" ('random DNA' is a computer generated random string of four different symbols). What he seems to argue is that, despite the predominant non-random nature of present-day genomes, exon length still has a random signature.
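The "about 60 bases" figure can be checked with a short sketch. It assumes what the quotation itself assumes: random DNA is a string of four equally likely, independent symbols, read in a single frame with the standard three stop codons, and an ORF is the stretch between two consecutive in-frame stop codons. Under that assumption ORF lengths follow a geometric distribution with p = 3/64, so the mean is (61/64)/(3/64) = 61/3 codons, or 61 bases. The function names are mine and serve only as illustration.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code

def expected_orf_bases() -> float:
    """Mean stop-to-stop length in bases for uniform random DNA."""
    p_stop = len(STOP_CODONS) / 64
    # mean number of non-stop codons between two stops, times 3 bases
    return 3 * (1 - p_stop) / p_stop

def simulated_orf_bases(n_codons: int = 200_000, seed: int = 1) -> float:
    """Estimate the same mean by reading random codons in one frame."""
    rng = random.Random(seed)
    lengths, run, seen_stop = [], 0, False
    for _ in range(n_codons):
        codon = "".join(rng.choice("ACGT") for _ in range(3))
        if codon in STOP_CODONS:
            if seen_stop:          # only count complete stop-to-stop runs
                lengths.append(run)
            seen_stop, run = True, 0
        else:
            run += 1
    return 3 * sum(lengths) / len(lengths)

print(round(expected_orf_bases(), 1), round(simulated_orf_bases(), 1))
```

Both numbers come out close to 60 bases, which is why nobody disputes the statistic itself; the dispute is about what it means.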
The best evidence of what he is really after is:
Figure 7A. The ROSG model. Origin of an eukaryotic gene from primordial random DNA.
ORF = Open Reading Frame, the DNA sequence between two stop codons.
Red: coding sequence is an exon. Stop codons occurred too frequently to allow functional proteins to be encoded in random DNA.
That's why processing is necessary. Please note there are no start codons.
"It is remarkable that all the characteristics of random DNA are still essentially present in the split genes of present day intron-dense large genomes such as those in the human."
This is his goal and conclusion. Please note: 'essentially' and 'present day'. I sifted through the article several times to find the most clear example of the logic of his reasoning. This is the most succinct example:
"The average exon length from the intron-rich genomes is about 170 bases whereas that expected from random ORF lengths is 60 bases. This may indicate that there has been a selection for longer exons within the allowed maximum ORF length of 600 bases for optimizing the frequency of suitable exon lengths."
Obviously, when your model predicts 60 bases and you find 170 bases (211), your model is wrong. If that is not enough, his statement "small minority (~2%) of exons were >750 bases" should refute his model.
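How rare should long reading frames be in random DNA? A minimal sketch, under the same assumption of uniform, independent bases read in one frame: the fraction of stop-to-stop stretches that reach at least n codons is (61/64)^n. Treating an exon as equivalent to such a stretch is my simplification, and the function name is hypothetical.

```python
def fraction_orfs_at_least(bases: int) -> float:
    """Fraction of random stop-to-stop stretches that are at least
    `bases` long (uniform random DNA, single reading frame)."""
    codons = bases // 3
    return (61 / 64) ** codons   # every one of these codons must be non-stop

for length in (60, 170, 600, 750):
    print(f">= {length:>3} bases: {fraction_orfs_at_least(length):.2e}")

# Compare the random expectation at 750 bases with the ~2% quoted above
print(0.02 / fraction_orfs_at_least(750))
```

In this toy model stretches of 750 bases or more occur a few times per million, so the quoted 2% is thousands of times more than randomness allows.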
He sees the contradiction, because he suggests "there has been a selection for longer exons".
Selection? If there has been natural selection, then the original signal is destroyed.
The difficulty is that Senapathy has no independent evidence of the first random DNA sequences, nor independent evidence of subsequent processing. If you allow for any amount of processing and selection, then any exon length can be 'explained' simply because his model does not specify restrictions on the amount of processing. (However, there is one escape: long noncoding RNAs (ncRNAs) could be closer to randomness because they are not translated into proteins, so stop-codon statistics do not constrain them; indeed, any codon could be random. Even the necessity of triplets is absent.)
Senapathy does not distinguish clearly between the timing of the different events: 1) the origin of the very first DNA sequences, 2) the processing of those sequences, 3) subsequent genome evolution during 2 billion years. But prebiotic environments are completely different from those of a living organism. Irrespective of when (106) this processing occurred, after 'splicing together short coding pieces', exon lengths, gene lengths and genomes are not random anymore. Any selective processing of random DNA makes it non-random. Any non-random removal of stop codons changes the statistical properties of the sequence. If present-day exons are a combination of shorter pieces joined together, then by definition current exon lengths are not random.
His figure shows the combination of 4 small exons into one large exon, but any exon length can be explained in this
way. An arbitrary number of exons with arbitrary lengths can be joined to an arbitrary number of new exons with arbitrary
lengths. Also, an arbitrary number of exons may stay unmodified. Exon length distribution is the main thing Senapathy wants to explain and he introduces a mechanism that changes them in an unspecified way. Why do introns still exist? Why are there still so many introns in our genes? Why did his mechanism not eliminate all introns? Senapathy observes present-day exon lengths and postulates a hypothetical mechanism that produces exactly the exon sizes of today. That does not add anything to our knowledge.
My own suggestion would be: it is true that stop codons would limit gene length, but a far simpler solution would be the elimination of stop codons by a one-base mutation of the stop codon in the context of living organisms. That is far simpler because it does not require complicated splicing machinery, which is certainly unavailable under prebiotic conditions where no functional enzymes are present.
He did not give evidence that this complicated processing is possible in prebiotic chemistry. He stated that functional proteins cannot be encoded in genes with average ORF length of about 60 bases.
In addition to the contradiction of his own model with his own data (called 'non-confirming' data by Senapathy), two types of evidence contradict his hypothesis: too many short exons and too many long exons. Humans have 170 exons of length up to 25 bp (107), which is significantly shorter than the expected size of 60, and exon sizes up to 2087 bp exist (108), which is far outside the predicted maximum. The length of an average human exon is 126 bp, which is more than twice the expected 60 bp.
Possibly, his ideas and data about the origin of splice signals (Figure 7B, not shown) are interesting. I would suggest submitting it to scientific journals such as Genomics.
Finally, the idea that the very first DNA sequence must have been random, is plausible only if that idea is part of a plausible 'DNA-first' theory of the origin of life.
The Nature Precedings articles
5 jan 11
23 Jan 11
On 13 December 2010 Senapathy posted 3 articles on the Nature Precedings website (141). This is a permanent, citable archive for non-peer-reviewed pre-publication research and preliminary findings and is run by the publishers of the famous Nature journal. The article 'Origin of biological information' appears to contain more modest claims than his previous writings. For example, this article does not contain the words 'Darwin' or 'Darwinism', which means that Senapathy does not openly attack Darwinism anymore (subtitle of his 1994 book!). Some of my criticisms are addressed (so it seems), but errors I pointed out are repeated. New is the calculation that genes + regulatory + splicing sequences "can occur within one milligram (~10^19 bases) or so of pre-biotic random DNA"! (p. 8). For comparison: it takes the largest Gene Synthesis Supplier in the USA a year to synthesize 54 million (54 × 10^6) base pairs and it will cost you $18 million (156).
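The orders of magnitude in this comparison can be made explicit with a little arithmetic. The sketch below reads the quoted figure as 10^19 bases and assumes an average mass of about 330 g/mol per nucleotide, a common rule of thumb for single-stranded DNA that is my assumption and not a number from the article. The synthesis rate and price are the ones cited above (156).

```python
AVOGADRO = 6.022e23            # molecules per mole
NUCLEOTIDE_G_PER_MOL = 330.0   # assumed average mass of one nucleotide

def bases_in_mass(grams: float) -> float:
    """Approximate number of nucleotides in a given mass of DNA."""
    return grams / NUCLEOTIDE_G_PER_MOL * AVOGADRO

bases_per_mg = bases_in_mass(1e-3)

QUOTED_BASES = 1e19            # "one milligram (~10^19 bases) or so"
SUPPLIER_BASES_PER_YEAR = 54e6 # cited synthesis capacity
SUPPLIER_COST_PER_YEAR = 18e6  # cited cost in US dollars

years_needed = QUOTED_BASES / SUPPLIER_BASES_PER_YEAR
cost_needed = years_needed * SUPPLIER_COST_PER_YEAR

print(f"bases in 1 mg: {bases_per_mg:.1e}")
print(f"years at the cited rate: {years_needed:.1e}")
print(f"cost at the cited price: ${cost_needed:.1e}")
```

On these assumptions one milligram holds somewhat fewer than 10^19 bases, which fits the "or so" in the quotation, and synthesizing that amount at the cited rate would take vastly longer than the age of the universe.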
The Conclusion of the article starts with: "This work does not claim to provide historical details of early evolution." (p. 9). You can focus your Origin of Life theory on any aspect you are interested in. Maybe you are not interested in the history of life. That's OK. However, the moment you want to test your theory, you cannot ignore history. Simply because your theory could conflict with the fossil record. And it does. Eukaryotes include vertebrates like birds, whales, elephants, horses and humans. They easily fossilize. If they are produced in the primordial pond, why are those fossils not found during the 'Cambrian explosion of multicellular organisms'? Where are the fossils? See: §18: the chronological order of life.
Some of his claims seem to be more modest than earlier claims: "Our findings demonstrate that complete split genes encoding complex proteins could have arisen within a minute amount of pre-biotic random DNA, explaining the origin of biological information and serving as the basis for the evolution of the very first genome." (p. 2. my emphasis).
And in the conclusion he writes: "that these genes could have been used in the self-assembly process to create countless eukaryotic genomes" (p.9 my emphasis). So, Senapathy does not explain genomes (strong claim), but genes (weak claim). The word 'evolution' in a non-evolutionary theory? Self-assembly of genomes? This is pure magic!
He provides no details, no mechanism, no probabilities, no evidence for self-assembly of genomes. That means that the most important part of his theory is left unspecified. Again: explaining genes does not explain genomes. Genomes are not arbitrary collections of arbitrary genes (See: § 6). What determines whether a specific collection of genes is a genome? Sooner or later one must invoke selection of viable collections of genes, and death of inviable collections of genes. So, the crucial step to genome formation in the independent origin scenario would invoke the Darwinian principle of natural selection! Question: if genome formation depends on self-assembly of isolated genes, could it be that the larger the genome, the lower the probability of successful self-assembly of that genome? It must be.
The stopcodon statistics story, which was an important part of the PLOS article, is not defended explicitly in this article, although he refers to it. The stopcodon issue is part of the genetic code problem (see 'The elephant in the room').
The genetic code problem, the main ingredient of the origin of life problem, is stated and dismissed in a shallow way in a two-sentence paragraph. This is very disappointing from a scientific point of view. (See: § 6)
Figure 1. The common ancestor of eukaryotes, bacteria, and archaea may
have been a community of organisms.
+M = endosymbiosis of mitochondrial ancestor.
Kurland et al, ©Science
Senapathy's genome-centered view of life explains life by reducing life to genomes, and reducing genomes to the statistics of 4 symbols (1/4 x 1/4 x 1/4 etc). The existence of organelles in the eukaryotic cell (see: § 7) is troublesome and annoying for the genome-centered view of life, because organelles (such as mitochondria) cannot be explained by calculating probabilities in the same way (157). It is disappointing that Senapathy tries to dismiss the endosymbiosis theory in a single paragraph (p. 8): "Even after decades of research, no consensus framework for the evolution of a eukaryotic cell from bacterium-like cells has emerged" (p. 8).
He refers among others to Kurland et al (147). Although Kurland et al suggest that "eukaryotes are a unique primordial lineage", they certainly do not claim that prokaryotes descended from eukaryotes, nor that eukaryotes were the first forms of life. Furthermore, in Fig. 1 Kurland et al clearly show endosymbiosis of a mitochondrial ancestor (+M) from Bacteria to Eukarya. Whatever the origin of the mitochondrion, it is -almost by definition- present in eukaryotes! That is a fact of life which Senapathy ignores. Controversy about the evolution of the eukaryotic cell does not deny that all eukaryotic cells harbor mitochondria. If eukaryotes arose directly from random primordial DNA, then Senapathy has to explain where mitochondria came from; why mitochondria have a circular chromosome with 37 genes; and why mitochondrial DNA lacks introns, as is the case in the human mitochondrial genome (wiki).
Remember, according to his own theory, the absence of introns makes independent origin impossible because intronless genes cannot be found in reasonable amounts of random DNA. That's why Senapathy is forced to conclude that Prokaryotes descended from Eukaryotes (172). Furthermore, figure 1 of Kurland et al shows that cellular life predated the common ancestor of Bacteria, Eukarya and Archaea, contradicting again the idea that Eukarya were the first forms of life.
An overview of possible relationships:
| # | Relationship | Hypothesis | Proposed by |
| 1 | — P —> E | Eukaryotes descended from Prokaryotes | mainstream |
| 2 | — E —> P | Prokaryotes descended from Eukaryotes | (148) |
| 3 | E <— —> P | Eukaryotes and Prokaryotes had a common ancestor | Kurland, Darnell |
| 4 | E —> P | Eukaryotes = first life, Prokaryotes from Eukaryotes | Senapathy |
Senapathy also refers to a mainstream publication (148) proposing the hypothesis
that 'prokaryotes might be derived from eukaryotes'. This is a remarkable proposal, and seems to support his own theory, but it has not been established, is proposed for quite different reasons, and won't explain the origin of eukaryotes themselves.
There is an even more remarkable mainstream publication by J.E. Darnell (1978), not cited by Senapathy as far as I know, which claims "that eukaryotes evolved independently of prokaryotes" for exactly the same reason as Senapathy: "noncontiguous sequences in eukaryotic DNA" and that eukaryotic DNA "may reflect an ancient, rather than a new, distribution of information in DNA" (169). Martin and Koonin (170) say about this:
"James Darnell submitted similar ideas at a time when the issue in early evolution was how to generate long coding sequences from scratch" (my emphasis). From scratch? (This is exactly Senapathy!). At a time? Was it a mainstream issue at the time? But has become irrelevant? Because of what? Mysterious! Whatever, Darnell did not claim that complex multicellular eukaryotes could have originated from prebiotic random DNA.
Anyway, it is a logical mistake to think that gaps in knowledge of eukaryotic origins is evidence for any alternative theory. Senapathy needs positive evidence for his theory. (See also: § 11).
Senapathy cites the literature in support of his views, but the cited publications do not support them at all. For example: "Even after the knowledge that eukaryotic split genes may have been the very first genes became widespread, ..." (p.7, my emphasis). This is a serious misrepresentation of the literature. What the publication (153) says is: "the proposal that introns arose before the origin of genetically encoded proteins and DNA," (my emphasis), which refers to an RNA world (self-splicing RNAs). Nobody in the literature claims that complex eukaryotic genomes (like the human genome!) were produced at the time of the origin of life. There is simply no evidence of vertebrates in the Cambrian fossil record. Nobody claims that the "last eukaryotic common ancestor had an intron-dense genome" means that these were the first forms of life.
Senapathy's interpretation of the relevant scientific literature appears to be idiosyncratic, unorthodox, and often erroneous. His references are so numerous, that it is a huge undertaking to check all his references.
Introns-Early — Introns-Late
Senapathy's theory is beyond the Introns-Early (IE) versus Introns-Late (IL) controversy. It would not be correct to label his theory as 'Intron-Early'. If anything, it could be called 'Intron-First' (IF), (but the context of IF is the RNA world).
Introns-Late is certainly incompatible with his theory. In Senapathy's theory introns and exons simply do occur in random DNA. It is a static phenomenon. Introns do not have biological causes. Introns did not invade genomes. The mainstream view, whether Introns-Early or Introns-Late, is that introns are a dynamic phenomenon: they invade existing genomes like parasites. In Senapathy's theory it makes no sense to talk about 'mechanisms of intron loss and gain'. They were just there from the beginning.
Mainstream science agrees with Senapathy on one point: introns evolved extremely early:
"introns that currently reside in eukaryotic genes, after all, do derive, through an uninterrupted lineage of selfish elements, from primordial genetic elements." and "introns have evolved extremely early, very likely, earlier than cells themselves." (Koonin 165). Although Koonin uses the suggestive phrase "primordial pool of genetic elements" (233), big differences are that "the primordial genetic pool is believed to have evolved from a pure RNA world to a RNA-protein system to the modern world of the Central Dogma (DNA-RNA-protein)" (165), and that the method was descent with variation (tree of life). Ford Doolittle: "Really we know nothing about how genes arose, and to suppose that they sprang full blown and full length from noncoding polynucleotides seems to me more of a stretch than to imagine that they were cobbled together from smaller oligopeptide-encoding modules" (165).
Intron — genetic code
A fundamental problem with Senapathy's view of introns is that the very word 'intron' defined as "intervening random sequences" (p. 10) has no meaning without 'exon', defined as "split coding sequences corresponding to the split protein sequences" (p. 10). But 'coding for proteins' implies the genetic code. Senapathy is confronted with the genetic code problem again. The issue is not statistics. From the point of view of coding capacity there is no fundamental difference between introns and exons, because intron splicing sites can be deleted or created de novo quite easily by mutation (179).
(see also: § 3, 4).
28 Feb 2018
SUMMARY OF THE CONCLUSION: the idea that random DNA could contain eukaryotic genes is refuted by the fact that the predicted Open Reading Frames (ORFs) are on average less than 21 codons (63 bases), which is far too short for a gene.
The elephant in the room
For a long time I did not see clearly that even the statistics of stop codons is not pure statistics, but tacitly assumes something very important. I got distracted by the details of exons and introns and the subtle way in which he –unintentionally– confuses the reader. What I did not see clearly is that even these 'pure' statistical predictions about the frequency of stop codons and exon lengths simply assume the presence of a full-blown canonical genetic code. Even the word 'stop codon' alone is meaningless without the full-blown canonical genetic code. A 'stop codon' only means something if a DNA sequence is translated into the amino acid sequence of a protein. The word 'codon' itself implies a genetic code. The concepts 'stop codon', 'exon', 'intron', 'triplet', 'codon degeneracy', 'Open Reading Frame' (ORF) make only sense in the context of a DNA-protein world. These concepts are not justified in the context of the origin of life. They all depend critically on the presence of the complete transcription and translation machinery which cannot be assumed at the origin of life. That's the elephant in the room!
Viewed from the standpoint of information theory: there is no difference between 'message' and 'noise' without an interpreter (327). So, even the concept 'noise' is inapplicable to random genomes. Even worse: even in the DNA-protein world (the genome of a human, a dog, or a mouse) the occurrence of a big nonconserved ORF is not proof of a protein-coding gene unless the protein is demonstrated experimentally (326).
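The dependence on an interpreter can be made concrete with a small sketch: the same DNA string yields a different 'first stop codon' depending on which code table the reader assumes. The example sequence and the function name are invented for illustration. The two stop-codon sets are those of the standard code and of the vertebrate mitochondrial code, in which TGA encodes tryptophan and AGA/AGG act as stops.

```python
# Stop-codon sets for two real genetic codes.
STANDARD_STOPS = {"TAA", "TAG", "TGA"}
VERTEBRATE_MITO_STOPS = {"TAA", "TAG", "AGA", "AGG"}

def first_stop_index(dna, stops, frame=0):
    """Return the codon index of the first stop in the given frame, or None."""
    for i in range(frame, len(dna) - 2, 3):
        if dna[i:i + 3] in stops:
            return (i - frame) // 3
    return None

# Hypothetical sequence: ATG GCA TGA AAA AGA TTT TAA
seq = "ATGGCATGAAAAAGATTTTAA"

print(first_stop_index(seq, STANDARD_STOPS))         # stop read at TGA
print(first_stop_index(seq, VERTEBRATE_MITO_STOPS))  # stop read at AGA
print(first_stop_index(seq, set()))                  # no code table: no stops at all
```

With no code table there is nothing to count. The 'ORF length' is a property of the sequence plus its interpreter, not of the sequence alone.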
To give an arbitrary example: the idea of a 'stop codon' implies a release factor, a protein that terminates translation by recognizing the stop codon in an mRNA. So, to produce one molecule of the release factor (from DNA), the release factor must already be present. That's a vicious circle (314).
In Senapathy's view, life started with DNA and genetically encoded protein synthesis. But it is widely accepted that an RNA world, with neither DNA nor genetically encoded proteins, was a necessary stage during the origin of life (173). So, the origin of genetically encoded protein synthesis, and the steps before it, is the most important part of the whole origin-of-life problem! And Senapathy, without mentioning it, assumes it is there. But if the DNA-world was preceded by an RNA-world, then it makes no sense to generate random DNA from scratch. In the RNA-world there is no genetic code, there are no triplets, no stop codons, no exons, no introns.
In 1994 Senapathy wrote a 600-page book, Independent Birth of Organisms, to refute evolution (200 pages) and to argue that all life forms except bacteria and viruses, but including humans, arose independently from a single primordial pond. It is a non-religious (112) alternative to evolution. A naturalistic alternative to evolution is rare. In my opinion Neo-Darwinism and the mainstream theories of the origin of life are not necessarily true (88). Independent origin is not false simply because 'everybody knows that evolution is true', but because of specific facts. However, let's first list some facts everybody can agree with:
- science should seek a natural explanation of the origin of life
- prokaryotes very rarely have introns, and have high gene densities and low-entropy, compact, relatively small genomes (246, p.53, p.229)
- eukaryotes have many large introns, low gene densities, and high-entropy, noisy, large genomes (246, p.53, p.229), (310); but see (287), (288)
- it is mathematically certain that the sequence of any eukaryotic gene can be found in computer-generated random strings of A, T, C, G if the string is long enough (262), (326)
- stop-codon statistics in random computer DNA follow logically from the premise that there are 3 stop codons among 64 codons
- it has been found empirically in Drosophila melanogaster that 59.9% of random 800-bp intergenic sequences were associated with a single-exon Open Reading Frame (ORF) of ≥ 150 bp (328), which could theoretically produce a peptide 50 amino acids long
- some mainstream scientists use the concept "primordial gene pool" ... (233)
- recently a genome researcher has claimed that a hundred-million-base chromosome of entirely random DNA will be transcribed, bound by DNA-binding proteins, and have its chromatin marked (289)
- random DNA does occur in the human genome (intergenic regions, junk DNA)
- any random DNA sequence of sufficient length will contain transcription-factor binding sites (309)
- recently, it has been demonstrated for the first time that three human genes have arisen de novo from noncoding DNA (308), but the human genome is not in a primordial pond!
But then the agreements end. Senapathy starts by simplifying, and thus distorting, the facts that his theory has to explain. Among the many facts are: eukaryotic chromosomes, not naked DNA (265); the chimeric nature of eukaryote genomes, etc, etc, etc (see all above). His primary argument is that it is possible to find the sequence of any eukaryotic gene in computer-generated random DNA. This is true by definition if the random sequence is long enough. Regrettably, this brings us nowhere near the origin of life.
Senapathy's theory of the origin of life suffers from severe reductionism: he overlooks crucial aspects:
- reduction of life to eukaryotes: neglecting the origin of prokaryotes
- reduction of males and females to a sexless genome, ignoring male and female genomes and chromosomes
- reduction of eukaryotes to DNA sequences: genome-centric view of life, ignoring cells and (sex-)chromosomes
- reduction of genomes to nuclear genomes, ignoring mitochondrial genomes (animals, plants) and chloroplast genomes (plants)
- reduction of genomes to a random collection of genes, ignoring the probability of a complete viable genome
- reduction of genomes to protein-coding genes ('protein-centric view'): ignoring everything else (an abundance of short and long noncoding RNAs, promoter regions, enhancer regions, intergenic regions, histone modification, DNA methylation, chromosome-interacting regions, transposons, repetitive DNA)
- reduction of genes to exons: discarding introns
- reduction of exons to a sequence of 4 'symbols': reduction of biology to statistics, ignoring chemistry and thus the origin of the double helix
The first major error: the argument assumes that if a gene (genome) can be generated by a computer, it can also be generated naturally (prebiotically), and moreover without highly specific enzymes. The origin of life is a chemical problem. It is true that (random) polymers must have been synthesized prebiotically. But assuming that the polymer must have been DNA and that it could spontaneously reach a length of billions of bases, and then building your whole theory on it, is very risky. Very risky indeed, because he published his book one year before the first complete eukaryotic genome was sequenced in 1995 (301), so he could not calculate the probability of a whole genome sequence. Yes, it is true that DNA is crucial for life, but that does not necessarily imply that DNA was involved in the origin of life. DNA does not form spontaneously and even if it did, it would be 'dead': unable to do anything. Furthermore, DNA can only survive for a long time under very favorable conditions that protect it from degradation, such as permafrost or ice (see: ancient DNA).
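To give a feeling for the numbers behind 'if the sequence is long enough', here is a rough sketch of how the chance of an exact match shrinks with length. It assumes uniform, independent bases and demands one specific sequence, both simplifications made here for illustration. The function name and the three example lengths are likewise chosen for the sketch, not taken from Senapathy.

```python
import math

def log10_exact_match_probability(length_bp, alphabet_size=4):
    """log10 of the chance that a random string of this length equals one given sequence.

    Assumes uniform, independent bases: p = alphabet_size ** (-length_bp).
    Logarithms are used because p underflows ordinary floats very quickly.
    """
    return -length_bp * math.log10(alphabet_size)

examples = [
    ("63-base ORF", 63),
    ("1,000-base gene", 1_000),
    ("6-billion-base diploid genome", 6_000_000_000),
]
for label, n in examples:
    print(f"{label}: p = 10^{log10_exact_match_probability(n):.1f}")
```

The exponent grows linearly with length, so the amount of random sequence needed to expect a single occurrence grows exponentially. Because many different sequences may be functionally equivalent, an exact match overstates the difficulty; the sketch only shows the scaling.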
The second major error (I became aware of this only after nearly 10 years) is that even a 100% accurate human DNA sequence without a cellular context has no more meaning than a random DNA sequence of the same length. Even if a correct human diploid sequence of 6 billion base pairs were found in a random sequence, including all introns, exons, splice sites, regulatory sequences, promoters, enhancers, etc.; even if it were present as a double helix in a diploid heterozygous state and correctly distributed over 46 units (simulating human chromosomes; 2n=46, XY or XX pair); even if all sequence characteristics of the chromosomes were present (telomeres, centromeres); even admitting that 95% of the genome is 'junk DNA'; and even admitting that it is identical to a human genome, the sequence would be as dead as a doornail! The human genome sequence could not even produce a human. The sequence would not even be 'living' (see: here). Even if the simplest prokaryote (a bacterium) had contained sufficient introns, it could not originate from naked DNA in a primordial pool. Worse, it is completely irrelevant whether there are introns or not. The reason is the same as why a DNA virus is unable to reproduce outside a living cell (331): a naked DNA sequence is unable to do anything outside a living cell.
There is a profound reason for this error: the genome-centered view of life ('genetic determinism' 219):
This is wrong if applied to the origin of an individual (216). But Senapathy got carried away by the idea, took it too literally, and transformed it into:
- (1) THE SEQUENCE IS NECESSARY AND SUFFICIENT TO CREATE AN ORGANISM
- (2) THE SEQUENCE IS NECESSARY AND SUFFICIENT TO EXPLAIN THE ORIGIN OF LIFE
The idea that The Sequence is necessary and sufficient to create an organism is an absolute requirement for independent origin. For, if factors other than The Sequence were required, The Sequence itself could never explain the origin of life. The next step is: let the Sequence be created abiotically by the Primordial Pond and the origin of life would be explained. Both scenarios (1) and (2) fail for the same reason that a virus could not be the explanation for the origin of life: the Sequence depends on other factors to do anything. Both naked DNA and a virus need a living cell to do anything. A virus, despite containing genetic information, cannot be called 'living' because it does not and cannot do anything itself (see: Ganti). A virus is parasitic. Even the notion of 'genetic information' in a virus completely depends on the cell's interpretation machinery. Without a cell we would not be justified in claiming that a virus carries genetic information. The same holds for naked DNA. The same holds for a sperm, despite the fact that a sperm has a complete human genome. A sperm itself cannot produce a human. It needs an egg cell. An egg cell is more than DNA.
These objections are so fundamental that no future developments can change that situation.
Possibly, there is a role for randomness in the origin of life. Possibly, the origin of catalytic RNA molecules from random RNA sequences (217) plays a role in the origin of life. But nobody has demonstrated the spontaneous origin of DNA sequences billions of bases long. Even a bacterium like E. coli is too complex to originate in a primordial pond. Origin of Life researchers aim at something 1000 times less complex than E. coli, while Senapathy aims at something a thousand times more complex than a bacterium. Possibly, there is a role for random DNA sequences in the origin of new genes in existing organisms (279, 280). Possibly, there is a role for computer simulations of random networks (see: Stuart Kauffman), but all computer simulations require empirical support.
The origin of life requires a bottom-up approach starting from chemical building blocks, and using the laws of chemistry. Senapathy assumes the complete transcription and translation machinery of a cell. Importantly, since eukaryotic introns are not self-splicing, one needs the complex eukaryotic splicing machinery. The origin of life is not solved by simply assuming all these things.
Surprisingly, and contrary to his claims, his theory is not truly independent birth of organisms, because it involves micro-evolution, mutation and natural selection. Even more clearly contradicting independent origin is the idea that prokaryotes are not independently born, but evolved from eukaryotes. This contradicts his revolutionary claim "That Evolutionary Theories Are Fundamentally Incorrect" (subtitle of book!). Significantly, the only non-supernatural alternative to Darwinism invokes microevolution and a lot of selection.
Nevertheless, to put it mildly, I am not convinced by the parts of his theory that do claim a truly independent origin. His theory creates many more problems than it solves, if it solves any problem at all. The trouble starts already with his own data (which he produced in copious amounts) and his interpretation of the data.
thinking too far outside the box...
Anyway, the origin of life is the most difficult problem in biology (for an introduction see wiki article). The best scientists in the world have not yet solved it, but Senapathy states that "the new theory is able to explain the origin and diversity of complex creatures" (p.8).
Apart from computer simulations and statistics, Senapathy did a lot of primordial-pond-story-telling and wishful thinking. He did not address the hard problems of the origin of life. He isn't even aware of the hard problems. That makes his theory a stillborn baby.
The only other scientist I know who defends naturalistic independent origin is biochemist Christian Schwabe (16). An important difference is that Schwabe is a biochemist and approached the origin of life as a biochemical problem. The difference with creationists is that Senapathy almost certainly has no religious motives. In this review I used a lot of information not available to Senapathy in 1994. However, Senapathy is unaware of too many biological facts known in 1994 and he continues the same approach in recent publications (141). He underrates the profoundness and complexity of the origin of life problem, and overrates the power of his own theory. He never doubts. At the same time he is much more critical of the theory of evolution, than of his own theory. In contrast to Darwin, Senapathy did not include a chapter 'Difficulties of the theory'. On the contrary. It seems he is unable or unwilling to write such a chapter.
Senapathy is a baffling personality and an outside the box thinker. On the one hand he had a job at the National Institutes of Health in Bethesda, he had been the director of a software company, is very ambitious, published in scientific journals such as Science, Genomics, Bioinformatics, PNAS (194), and was cited in books (175), in the New Scientist (236) and other journals (195), but on the other hand he ignores basic biological facts, and often behaves like a crank. However, this review has not been written to attack the person, but because I was curious how he explained the origin of life.
What I learned
Studying this and religious alternatives to evolution convinced me that it is extremely hard to develop a consistent non-evolutionary alternative for the history of life on earth which does not contradict the facts of life. At the same time this alternative theory stimulates thinking far more than creationism does. Further, I learned that evolution theory is certainly not a shallow idea that is dogmatically and mindlessly defended by evolutionary biologists. I am impressed and a little bit surprised that evolution theory escapes so many of the traps Periannan Senapathy fell into. I am equally surprised how thousands of seemingly neutral facts turn out to be evidence against 'Independent Origin'. This endless stream of facts continues to inspire me and give me deep insights into the fundamental properties of life on earth and the structure of the theory of evolution. And that makes it very rewarding. The best scientists admit that they do not understand every fact, that they cannot solve every problem, and that their favorite theory is not perfect and does not explain everything. Therefore, it is always very unwise and unscientific to say: my theory easily explains everything!
Appendix: Genetics Primer ( added: 22 Dec 2011 )
Senapathy's book contains a large 30-page 'Appendix: Genetics Primer'. With this section we can compare 1) What did Senapathy know in 1994? 2) What was known by scientists in 1994?
A Eukaryotic cell according to Senapathy (Fig 7 Appendix)
In the section 'The Genome' (p.543) he describes the structure of 'the genome' in general, but this appears to be only the eukaryotic or human genome. As if prokaryotes do not exist. Only on page 551 are prokaryotes introduced. The information is correct, but he omits a crucial fact: eukaryotic mitochondria contain DNA with prokaryote-like characteristics. It is also not indicated in figure 7 of the Appendix (p. 552). There is no mention of the endosymbiosis theory. Chloroplasts are not mentioned. Only on the last page ('Some anecdotes', p. 566) do we learn that human DNA is about 600 times larger than that of the bacterium E. coli. The concept 'zygote' is mentioned on page 559 but it is not explained that a zygote originates by fusion of a haploid egg cell and a haploid sperm cell. 'Transfer RNA' and 'Ribosome' are mentioned (page 556) but not ribosomal genes.
Assessment: only somebody who read the appendix would notice that his theory implies the very counterintuitive idea that a simple bacterium cannot arise spontaneously in the primordial pond, whereas the human genome, which is according to his own data 600 times bigger, easily does! Furthermore, in the context of gene regulation he mentions "a simple bacterial cell [prokaryote], and a more complex cell [multicellular organisms]" (p.559). Additionally, eukaryotes mainly have sexual reproduction which is more complex than asexual reproduction (prokaryotes such as bacteria).
It certainly is intriguing that vertebrate genomes have low gene and information densities and high intron densities compared to bacteria (246, p.233), but at the end of the day it does not help in the slightest degree. By consciously or unconsciously omitting several important facts which cause trouble for his theory, he makes his Genetics Primer a biased presentation of genetics, even considering what was known in 1994.
Notes & References
- Senapathy: "When I initially tried to explain my theory to my wife, I said, All the organisms could have come about just as they are, independently from the primordial pond. Her response..." (p.295). Please note, she forgot to ask about humans.
- Source of chromosomes. These chromosomes are not from Senapathy's book. There are no chromosomes in his book. That is the point.
- Only male honeybees hatch from unfertilized eggs [are haploid], but female honeybees hatch from fertilized eggs [are diploid], therefore that won't help independent origin theory very much. See: Olivia Judson(2002) Dr. Tatiana's sex advice to all creation, p.18 . Hermaphrodites (organism with both female and male sex organs) usually need another individual to reproduce.
- Helen Pearson: Human genetics: "Dual identities", news feature, Nature 417, 10-11 (2002), 2 May 2002.
- See for a full exposition for the non-specialist Mark Ridley (2000) Mendel's Demon. Gene Justice and the Complexity of Life (review), Chapter 6 "Darwinian merger and acquisition" is about the far reaching implications of having mitochondria in the cell.
- Christian de Duve (2002) Life Evolving, Oxford University Press, p. 141.
- Robert Pennock (2001) Intelligent Design Creationism and its Critics, p. 685.
- S.G. Gregory et al (2002) A Physical map of the mouse genome. Nature AOP, published online 4 August 2002.
- Carina Dennis and Richard Gallagher (2001), The Human Genome, p.120: "The largest apparently contiguous conserved segment in the human genome is on chromosome 4, including roughly 90,5 Mb of human DNA that is orthologous to mouse chromosome 5."
- Similarities found in mouse genes and human's. Nicholas Wade, New York Times Science, 5 Dec 2002.
- Comparison of Human chromosome 4 and Mouse chromosome 5.
- Comparison of Human chromosome 17 and Mouse chromosome 11.
- Comparison of Human chromosome 20 and Mouse chromosome 2.
- A memorable misunderstanding on this site.
- Evolution (Third Edition) on this site.
- A Chemist's View of Life: Ultimate Reductionism & Dissent on this site.
- Ernst Mayr (2001) What Evolution is, p.46. See also: Lynn Margulis(1998) Five Kingdoms. An illustrated guide to the phyla of life on earth, third edition. p.12.
- Senapathy has a short paragraph What is a "seed cell"? (p.307), in which he uses 'haploid' and 'diploid', but there he neither explains what a seed cell is, nor how a diploid cell arises out of the primordial pond.
- Portrait of a molecule by Philip Ball. This is a good article for those who think that a genome is just naked DNA. Nature 421, 421 - 422 (2003) (free). Have a look at the beautiful diagram of the 3-D structure of the chromosome showing that a genome is more than just the sequence of the bases! Looking at this image it is clear that Senapathy's discovery about split genes in random DNA is almost irrelevant. He did not explain the massive amounts of highly specialised proteins (histones), which form the complex 3-D structure of the eukaryotic chromosome.
- Richard Dawkins used the now famous weasel computer experiment to demonstrate the difference between one-step and cumulative selection in The Blind Watchmaker, chapter 3. See also my Spetner review.
- David Foster attributed this argument to Thomas Huxley (see review of Foster's book).
- Senapathy states (p.222) that the occurrence of the uninterrupted text of Shakespeare is improbable.
- Senapathy easily contradicts his own theory: "Thus it is possible for the prokaryotic genome to have been derived directly from contiguous genes in the open primordial pond". (p.238)
- In fact, this question is wrong. There is no such thing as "the human genome". Only female and male human genomes exist.
- "the primordial pond could have been productive for a very long geological time" (p.345).
- "identifying exon-intron borders is a notoriously difficult task", Antoine Danchin (2002) "The Delphic Boat", p.238.
- Charles Spruck (2003) Requirement of Cks2 for the first metaphase/anaphase transition of mammalian meiosis. Science 300 (5619):647. 25 Apr 2003.
- Whenever Senapathy is uncertain, he says "absolutely". The word occurs 138 times in his book!
- Even Jesus, the son of God, had a mother. Significantly, this is claimed by people who otherwise accept miracles. However, Adam and Eve did not have a mother and father, but were created as adults. Senapathy's creatures also did not have a father and mother, but at least did not start as adults!
- A Conversation with James D. Watson, Scientific American, March/April 2003.
- David Haig (2002) Genomic imprinting and kinship, Rutgers University Press, p.11. A further reason for the absence of parthenogenesis in animals is that the sperm also contributes the centrosome to the egg which is essential for initial divisions of the fertilized egg (Christiane Nüsslein-Volhard, 2006, p.15).
- John Maynard Smith & Eörs Szathmáry (1999) The Origins of Life. From the Birth of Life to the Origin of Language. Furthermore, the members of higher levels are composed out of members of lower levels.
- Graur and Li (2000) Fundamentals of Molecular Evolution. Second edition. p. 136.
- Donald Forsdyke (2001) The Origin of Species Revisited, p. 103. (see review on this site).
- Syozo Osawa (1995) Evolution of the Genetic Code, p. 45.
- Michael Majerus (2003) Sex wars. - Genes, bacteria, and biased sex ratios, p.63,66.
- Actually, meiosis is more complex. In males, the products of meiosis are four sperm, each sex chromosome in the original diploid cell being present in two of the products. In females of most species, however, only one egg is produced for each parent cell that undergoes meiosis, the other three haploid products together giving rise to the yolk of the ensuing egg. See also review of Mendel's Demon (Unexpected predictions and explanations).
- Jan Sapp (2003) Genesis. The Evolution of Biology, Oxford University Press, paperback, p.x (Preface). This is elaborated in the chapter "Beyond the Genome".
- M. Lynch (2002) 'Intron evolution as a population-genetic process' Proc. Natl. Acad. Sci. U.S.A. 99, 6118 (2002)
- Paul Davies (1999) The Fifth Miracle. The Search for the Origin and Meaning of Life. p.119. Very important book!
- What is known about the function of introns? , Scientific American, Ask the experts/Biology, 1999. Of course Senapathy could not have known this in 1994.
- Louis Berman (2003) The Puzzle. Exploring the evolutionary puzzle of male homosexuality, p.478
- Tibor Ganti (2003) The Principles of Life, Oxford University Press. (review). Furthermore, Ganti writes: "A living organism can never be developed from genetic material alone", "An egg, a seed, or a spore must always contain the substances of the cytoplasm". p.126.
- Freeman Dyson (1999) "Origins of Life", second edition. p.18.
- J.J. Emerson et al (2004) "Extensive Gene Traffic on the Mammalian X Chromosome", Science 303, nr 5657, 23 Jan 2004, pp. 537-540.
- However, retrogenes have been known for a long time. Examples of intronless retrogenes are: PGK (1987), calmodulin gene (1987), globin gene (1987), actin gene (1985). See Wen-Hsiung Li (1997), p.347.
- S. J. Gould (2002) "The Structure of Evolutionary Theory", pp 252-253, 325, 527-528, 1174 (slightly adapted).
- Solving the origin of life without the origin of species is difficult enough. However, even the origin of life itself is difficult enough because it commonly includes the origin of the genetic code. Hungarian chemist Gánti simplified the question by distinguishing between the origin of life and the origin of the genetic code.
- Motomichi Matsuzaki et al (2004) 'Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D', Nature 428, 653-657. 08 Apr 2004. This species has 5331 genes, only 26 genes have introns.
- Iris Fry (2000) The emergence of life on earth, p.56, 170.
- Paul G. Falkowski and Colomban de Vargas (2004) Shotgun Sequencing in the Sea: A Blast from the Past? Science, 304, 58-60, 2 Apr 2004.
- Iain Cheeseman and Arshad Desai (2004) Cell division: Feeling tense enough?, Nature, 428, 32-33, 4 March 2004.
- Radu Popa (2004) "Between Necessity and Probability: Searching for the Definition and Origin of Life", p. 95-96. [ 18 June 2004 ]
- David Bainbridge (2000) Making babies. The science of pregnancy, page 35-36.
- Philip Ball (2004) "Synthetic Biology: starting from scratch", Nature, 431, 624-626 (7 Oct 2004). "Bacterial genomes are within the range of current DNA-synthesis technology", says John Mulligan, president of the DNA-synthesizing company Blue Heron Technology. "But bacterial genomes must be embedded within a cell and its attendant biochemical machinery, making them much harder to synthesize than viruses." [ 9 Oct 2004 ]
- Why are stem cells so important? Stem-cell biology is the second pillar of twenty-first-century biology. If a genome were enough, why are stem cells so important for medicine? See: Ann Parson (2004) The Proteus Effect: Stem Cells and their Promise for Medicine. [ 24 Oct 2004 ]
- Mark T. Ross et al (2005) "The DNA sequence of the human X chromosome.", Nature, 434, 17 Mar 2005, 325-337.
- Christian de Duve (2002) Life Evolving, p.38.
- Gil Ast (2005) The Alternative Genome, Scientific American March/April 2005 pp 40-47.
- Douglas Futuyma (2005) Evolution, Sinauer Associates, page 53 and 440. (figures adapted for the web).
- This is similar to explaining everything by saying 'God created the perfect fit between organism and environment'.
- Large genomic differences explain our little quirks, Nature, 19 May 2005, 252.
- Patrick Forterre (France) argues "that bacteria probably evolved more recently, and that LUCA was in fact a eukaryote", NewScientist 3 Sept 2005, p.28. (LUCA=Last Universal Common Ancestor). So Forterre seems to suggest that bacteria evolved from eukaryotes, but the difference with Senapathy is that Forterre does not claim that all eukaryotes arose independently. So although Forterre's view is highly unorthodox and implausible, Senapathy's view is a million times more unlikely.
- Nick Lane (2005) Power, Sex, Suicide. Mitochondria and the Meaning of Life, p.143.
- Asexual reproduction in animals is rare: the freshwater polyp Hydra reproduces by budding and some insects like aphids show life phases of quick multiplication through diploid eggs that form large, genetically identical clones. But in difficult times, even these animals reproduce sexually. (Christiane Nüsslein-Volhard, 2006, p.21).
- Please note that the primordial pond illustration on the cover shows a turtle, a frog, a crab, a butterfly, a worm, a fern, but no mammal and no human being. Please note that the illustration suggests frogs develop directly from DNA, but in reality they develop from tadpoles. Similarly, butterflies develop from larvae and pupae (not in water!).
- Blowing In The Wind. Seeds & Fruits Dispersed By Wind. (a beautifully illustrated page about all kinds of seed dispersal).
- Robert Shapiro (2007) 'A simpler Origin for Life', Scientific American, june 2007, page 28.
- Alec Panchen (1993) Evolution, p.175.
- Reduced fitness in individuals due to homozygous deleterious alleles is known as "inbreeding depression". See: Scott Freeman and Jon Herron (2007) 'Evolutionary Analysis', page 270.
See also: "the analysis found more than 4 million variants between Venter's maternal and paternal chromosomes. This suggests that humans differ by 0.5%, not 0.1%, as suggested by earlier estimates." Jon Cohen (2007) Venter's Genome Sheds New Light on Human Variation, Science, 7 Sep 2007. On the other hand inbreeding (of dogs) can result in extremely long stretches of identical DNA common in different individuals of the same breed -millions of bases long compared to the typical tens of thousands of bases in humans. This is artificial selection with probably high costs (Science 21 September 2007). It is no accident that DNA of dog breeds are now investigated for genes for 18 diseases including four cancers, four inflammatory disorders, and three heart diseases.
- Catherine Jessus & Olivier Haccard (2007) 'Fertilization: Calcium's double punch', Nature 449, 297-298 (20 September 2007).
- David Beerling (2007) The Emerald Planet: How Plants Changed Earth's History, pp.180-183. There are regions on earth (Athi Plains in Kenya) where C3 and C4 plants coexist, so that would be the place for Senapathy's primordial pond!
- See: Carl Woese (review).
- Catherine Brady (2007) Elizabeth Blackburn and the Story of Telomeres: Deciphering the Ends of DNA, The MIT Press.
- Erika Check Hayden (2008) 'Evolution: Scandal! Sex-starved and still surviving', Nature, 10 April 2008. "Bdelloid rotifers reproduce entirely without males: females package a complete copy of their DNA into eggs that develop, sans fertilization, into the next generation. Asexual reproduction certainly isn't unheard of in the animal world: parasitic bacteria force some insects to reproduce without males and female sharks kept alone in captivity have surprised their keepers by giving birth to baby sharks."
- Vaclav Smil Energy in Nature and Society, MIT Press: 2008. 512 pp. reviewed in Nature, 10 april 2008.
- David A. Wheeler et al (2008) 'The complete genome of an individual by massively parallel DNA sequencing', Nature, 452, 872-876 (17 April 2008)
- Elliott Sober (2008) Evidence and Evolution. The logic behind the science, p.116.
- Erwan G. Roussel et al (2008) 'Extending the Sub-Sea-Floor Biosphere', Science, 23 May 2008.
- For this amazing story see: Audubon Magazine. For a map see here. If the primordial pond is at sea level, do these geese survive at that level? How does the bizarre migratory behavior originate? Random? That's a helpful explanation! Senapathy never tells us when and where the primordial pond existed!
- In fact when one looks closely to the quote from page 204, Senapathy is describing random mutation followed by natural selection!!!
- See for explanation 'Vicious circle' box in my review of Hubert Yockey. This is a devastating obstacle for Independent Origin for the same reason that any mutation in the tRNA genes is lethal. See also: 'Does life look unlike evolution?' in my review of Walter Remine.
- "RNA nucleotides have never been synthesized from scratch, in spite of decades of focused effort" (Robert M. Hazen (2005) Genesis. The scientific quest for life's origin, p.219). Also: "Nucleotides, the building blocks of DNA have never been produced in any prebiotic synthesis experiment" (Barton, p.95). Senapathy can never solve this problem by supposing that nucleotides were synthesised under genome control, because without nucleotides no genomes!
- But not impossible. It is quite funny that Senapathy did not claim that with enough time prokaryote genomes could originate.
- Gordon Campbell in: M.R. Wright (2000) 'Reason and necessity'.
- Pier Luigi Luisi (2006) The Emergence of Life. From chemical origins to synthetic biology, page 208.
- Sheref S. Mansy et al (2008) 'Template-directed synthesis of a genetic polymer in a model protocell', Nature, 3 Jul 2008.
- "Neo-Darwinism is in fact falsifiable, for there are many empirically testable claims made, for example within modern genetics which currently explains the core principle of inheritance. However, if it were to be falsified a new theory would have to replace it, in order to explain design in a non-theological fashion, and this would have very many features in common with neo-Darwinism simply because of the explanatory burden such a theory would have to carry." Thomas E. Dickins in: Evolutionary Psychology, 2005. 3: 79-84.
This is very interesting: not only does any alternative to evolution need to explain the same set of facts, it is also expected to have much in common with neo-Darwinism. This is exactly what Senapathy is doing: he copies natural selection and common descent into his theory!!! What he does not do is use the same set of data as neo-Darwinism. Furthermore, Intelligent Design theorist Michael Behe incorporates natural selection and common descent into his ID theory!
- Nicholas H. Barton, Derek E.G. Briggs, Jonathan A. Eisen, David B. Goldstein, Nipam H. Patel (2007) Evolution, Cold Spring Harbor Laboratory Press, hardback 833 pp. (review).
- See for the probability of the spontaneous origin of a well-designed body: Richard Dawkins (1991) The Blind Watchmaker, Penguin books 1991 paperback edition, page 146.
- Additionally, meiotic recombination shuffles the genome, so each generation inherits a new combination of parental traits. How does Senapathy's theory explain the origin and continued existence of such a widespread, complex and costly process as meiotic recombination? Meiotic recombination contradicts the idea that genomes are essentially fixed. What is the purpose of a recombination of parental genomes?
- John Maynard Smith (1999) The Origin of Life, page 3.
- John Maynard Smith (1995,1997) The Theory of Evolution, Cambridge University Press paperback. p.110.
- I overlooked this figure and his claim that the primordial pond coincides with the Cambrian explosion. However, it only makes my argument stronger and clearer.
- Senapathy appears to know that chromosomes occur in pairs. On page 588 in note 109 he mentions homologous chromosomes. He even knows that one chromosome of a homologous pair of chromosomes is from the father, the other from the mother.
- However, he seems to adopt an infinite universe of DNA sequences which does the trick for him.
- Of the myriad problems of this scenario I mention a simple error:
"Only those individuals with the absolutely right organs will survive" is not correct for reproductive organs!
The situation is not analogous.
One does not need sex organs to survive, while one does need heart, lungs, kidneys, liver, mouth, teeth, stomach, intestines,
and anus to survive. Infertile people don't die. See: 'The female and male genome' on this page.
- A good overview is: chapter 17 in: Stephen Stearns and Rolf Hoekstra (2005), second edition.
- page 8. This is a revealing and charming description of his naive way of thinking.
- Daniel Fairbanks (2007) Relics of Eden. The powerful evidence of evolution in human DNA, p.154.
- John Allman (2000) Evolving Brains, Scientific American Library, page 86
- Dylan Chivian et al (2008) 'Environmental Genomics Reveals a Single-Species Ecosystem Deep Within Earth', Science 10 October 2008.
- Science Daily (2003) Ultra-low Oxygen Could Have Triggered Die-offs, Spurred Bird Breathing System, Oct. 31, 2003.
- Helen Pearson (2008) 'Outcry at scale of inheritance project', Nature, 10 October 2008
- Periannan Senapathy et al (2008) 'Origination of the Split Structure of Spliceosomal Genes from Random Genetic Sequences', Plos One, October 20, 2008. Open Access.
- When did this processing occur?
Some evidence suggests that Senapathy means prebiotic processing, because he writes:
"Stop codons occurred too frequently to allow functional proteins to be encoded in random DNA" (my emphasis).
Life could not have started with too short genes and proteins, that would be incompatible with life.
Life requires functional proteins.
On the other hand, Senapathy also provides evidence that a substantial amount of processing occurred after the origin
of random DNA:
"The average exon length from the intron-rich genomes is about 170 bases whereas that expected from random ORF lengths is 60 bases. This may indicate that there has been a selection for longer exons within the allowed maximum ORF length of 600 bases for optimizing the frequency of suitable exon lengths." (my emphasis). Further evidence supports this interpretation: "mRNA splicing evolved to overcome the problem of the frequent occurrence of stop codons in primordial random DNA"; "RNA splicing evolved to circumvent the problem of short ORFs"; "According to the ROSG model, mRNA splicing evolved to overcome the problem of the frequent occurrence of stop codons in primordial random DNA that severely restricted ORF lengths." This is vague. Senapathy is not clear about it. A good scientific theory should be clear.
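The figure of about 60 bases for a random ORF can be checked with a small calculation. The sketch below is only an illustration and is not Senapathy's own program: it assumes the standard three stop codons, equal frequencies of the four bases, and independent positions. Under those assumptions the distance to the first stop codon in a reading frame follows a geometric distribution with a mean of 64/3 codons, or 64 bases counting the stop codon itself. The function names are hypothetical.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def expected_orf_length_bases(n_stop=3, n_codons=64):
    """Mean number of bases read in one frame up to and including the first stop."""
    return 3 * n_codons / n_stop


def prob_orf_longer_than(bases, n_stop=3, n_codons=64):
    """Chance that a frame runs for more than `bases` bases without a stop codon."""
    return (1 - n_stop / n_codons) ** (bases // 3)


def simulated_mean_orf_length(n_bases=300_000, seed=1):
    """Scan one reading frame of random DNA and average the stop-to-stop distances."""
    rng = random.Random(seed)
    # Uniform, independent bases: an assumption, real genomes are biased.
    dna = "".join(rng.choice("ACGT") for _ in range(n_bases))
    lengths, run = [], 0
    for i in range(0, n_bases - 2, 3):
        run += 3
        if dna[i:i + 3] in STOP_CODONS:
            lengths.append(run)
            run = 0
    return sum(lengths) / len(lengths)


if __name__ == "__main__":
    print("expected mean (bases):", expected_orf_length_bases())
    print("simulated mean (bases):", simulated_mean_orf_length())
    print("P(ORF > 600 bases):", prob_orf_longer_than(600))
```

Under the same assumptions, an uninterrupted frame longer than 600 bases is very rare, which is the restriction on ORF length that the quotes above refer to. The calculation says nothing about when any processing took place.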
- Computational Discovery of Internal Micro-Exons
- Stewart Scherer (2008) A Short Guide to the Human Genome, p.32.
- Ann Gibbons (2008) 'The Birth of Childhood', Science 14 November 2008.
- Henry Nicholls (2008) 'Darwin 200: Let's make a mammoth', Nature 456, 310-314 (2008).
Let's make a mammoth from its DNA is a perfect analogy with Senapathy's project to create an animal from its genome!
All the problems and obstacles to create a mammoth from its DNA appear in Senapathy's scenario!
- updated 13 Mar 2013. It is chemically possible to have 6 bases (3 base pairs), 8 bases (4 base pairs), 12 bases (6 base pairs) (Nicholas Barton et al (2007) Evolution, page 561) or maybe even 20 bases (10 pairs). In that case each base codes for one amino acid and the length of DNA would be reduced to a third of the original length. Shorter DNA molecules would be more efficient: it costs less energy and fewer building blocks to synthesize and replicate. However, the difficulty would be to find bases that are able to form pairs with a constant diameter to enable a stable, regular double helix structure. The disadvantage would be that 20 different bases must be synthesized, coded for and maintained.
On the other hand, organisms could do with only CG pairs or only AT pairs. Information in DNA can still be coded with one base pair. Each strand would have a sequence such as GCCGGGGC..., and each amino acid could be coded by four or five bases instead of three. Replication with only one base pair would be most accurate because only G or C (say) would need to be distinguished. The disadvantage would be that the length of DNA to encode proteins would be much longer, almost double. Example of using the four-base codon CGGG in a cell-free translation system: FRET analysis of protein conformational change through position-specific incorporation of fluorescent amino acids.
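The trade-off between the number of bases and the codon length can be made explicit with a small calculation. The sketch below is an illustration under a simplifying assumption: a code only has to distinguish 20 amino acids, and stop signals and redundancy are ignored. The function name is hypothetical.

```python
def min_codon_length(alphabet_size, meanings=20):
    """Smallest codon length n such that alphabet_size**n >= meanings."""
    if alphabet_size < 2:
        raise ValueError("at least two distinguishable bases are needed")
    n = 1
    while alphabet_size ** n < meanings:
        n += 1
    return n


if __name__ == "__main__":
    # 2 = a single base pair read on one strand (only G or C, say)
    for bases in (2, 4, 6, 8, 12, 20):
        n = min_codon_length(bases)
        print(f"{bases:2d} bases -> codon length {n}, "
              f"DNA length relative to triplets: {n / 3:.2f}")
```

With two letters, four positions give only 16 combinations, so on this count five positions are needed to cover all 20 amino acids, and the coding DNA becomes 5/3 as long. With 20 bases one position suffices, which is the reduction to a third mentioned above. If a stop signal is counted as a 21st meaning, 20 bases are no longer enough for one-letter codons.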
- In an interview Jerry Coyne says: "I don't know of any challenge to evolution that's ever come
from a non-religious person. Personally I've never experienced one.". Senapathy is such a non-religious critic of evolution.
- Noncoding DNA
- Actually Carl Woese showed that 'Prokaryotes' must be replaced with 'Archaea' and 'Bacteria' or
alternatively with 'microorganisms'. See also: Norman R. Pace (2009) 'It's time to retire the prokaryote', Microbiology Today, May 2009, 85-87.
- Kurland CG, Collins LJ, Penny D (2006) Science 312:1011-1014. Quoted by Lynch
- John S. Mattick and Igor V. Makunin (2006) 'Non-coding RNA', Human Molecular Genetics 2006 15.
- Eugene V. Koonin (2009) Darwinian evolution in the light of genomics, Nucleic Acids Research, 2009, Vol. 37, No. 4 1011-1034. free full text.
- Henrik Kaessmann (2009) More Than Just a Copy, Science.
- Carol Greider, Elizabeth Blackburn and Jack Szostak received the Nobel prize 2009 for their work on telomeres. Nature
- Fuyuki Ishikawa and Taku Naito (1999) 'Why do we have linear chromosomes? A matter of Adam and Eve',
Mutation Research/DNA Repair Volume 434, Issue 2, 23 June 1999, Pages 99-107:
"Bacterial circular chromosomes have sporadically become linearised during prokaryote evolution".
- Dirk Schübeler (2009) 'Epigenomics: Methylation matters' Nature 462, 296-297 (19 November 2009)
- Dennis McCarthy (2009) Here be dragons. How the study of animal and plant distributions revolutionized our views of life and Earth, p.100.
- Dennis McCarthy (2009) 'Here be dragons', p.186 and 191.
- See also: "As Simpson himself pointed out, -any event that is not absolutely impossible ...
becomes probable if enough time elapses" (Nature, 4 Feb 2010) about Madagascar species. What we need is quantification.
- Stewart Scherer (2008) A Short Guide to the Human Genome, p.41.
- Explanation of basic principles: Reading the Genetic Code (Nature Education).
- Michael Yarus (2010) Life from an RNA world, Harvard University Press.
- This very profound objection to independent origin occurred to me only 7 years after I started this review.
- Elizabeth Pennisi (2010) 'Synthetic Genome Brings New Life to Bacterium', Science, 21 May 2010.
- Douglas L. Theobald (2010) 'A formal test of the theory of universal common ancestry', Nature 465 219-222 (13 May 2010). But see: The common ancestry of life.
- Nick Lane & William Martin (2010) 'The energetics of genome complexity', Nature, 467 929-934 21 October 2010
- France Denoeud (2010) 'Plasticity of Animal Genome Architecture Unmasked by Rapid Evolution of a Pelagic Tunicate', Science, Published Online 18 November 2010.
- Felisa A. Smith et al (2010) 'The Evolution of Maximum Body Size of Terrestrial Mammals', Science, 26 November 2010.
- Elizabeth Pennisi (2010) Shining a Light on the Genome's 'Dark Matter', Science 17 Dec 2010
- 'meaningless/meaningful': is relative to the language. This also holds for the genetic code language! Senapathy would have found genes from an arbitrary genetic language in a random piece of computer DNA, because meaningful genes can be produced by any genetic language. The point is however, that in a computer experiment it's easy to use the same language during the experiment, while in natural experiments there is nothing that ensures the same genetic language during the production of even one genome, let alone all genomes.
- O.B. Ptitsyn (1984) 'Random sequences and protein folding', Journal of Molecular Structure: THEOCHEM Volume 24, Issues 1-2, July 1985
- V. S. Pande et al (1994) 'Nonrandomness in protein sequences: Evidence for a physically driven stage of evolution?' PNAS Vol. 91, pp. 12972-12975, December 1994.
- Anthony D. Keefe & Jack W. Szostak (2001) 'Functional proteins from a random-sequence library', Nature 410, 715-718.
- Codon usage in E. coli and: codon usage.
- D M Raup and J W Valentine Multiple origins of life PNAS May 1, 1983 vol. 80 no. 10 2981-2984
- Senapathy uploaded 3 papers to Nature Precedings (Documents on Nature Precedings are not peer-reviewed, but any visitor can comment):
- Origin of biological information: Inherent occurrence of intron-rich split genes, coding for complex extant proteins, within pre-biotic random genetic sequences 13 December 2010 07:04
- The inherent occurrence of complex intron-rich spliceosomal split genes, including regulatory and splicing elements, within pre-biotic random genetic sequences 13 December 2010 07:27
- Parallel genome assembly from pre-biotic split-genes: A solution for the mosaic genome conundrum 13 December 2010 07:53
I submitted 3 comments to the first document, but only one was posted. Remarkably, comments are blocked while simultaneously submitting 3 large articles is no problem at all! Only later did the second one appear. And on 28 February 2011 Senapathy replied.
- T. Mourier and D. C. Jeffares (2003) 'Eukaryotic Intron Loss', Science p.1393, 30 May 2003.
- Sakharkar KR (2006) 'Functional and evolutionary analyses on expressed intronless genes in the mouse genome', FEBS Lett. 2006 Feb 20;580(5):1472-8. Epub 2006 Jan 31.
- Jain M et al (2008) 'Genome-wide analysis of intronless genes in rice and Arabidopsis', Funct Integr Genomics. 2008 Feb;8(1):69-78. Epub 2007 Jun 20.
- Syozo Osawa (1995) Evolution of the Genetic Code, Oxford University Press. page 150.
- Dmitry V. Fyodorov & James T. Kadonaga (2002) 'Dynamics of ATP-dependent chromatin assembly by ACF', Nature 418, 896-900 (22 August 2002)
- C. G. Kurland, L. J. Collins, D. Penny (2006) Genomics and the Irreducible Nature of Eukaryote Cells, Science 19 May 2006
- T. M. Embley, W. Martin (2006) 'Eukaryotic evolution, changes and challenges', Nature 440, 623 (Mar 30, 2006):
"In recent years, even that has been called into question, as some phylogenies have suggested that prokaryotes might be derived from eukaryotes".
- Eugene V. Koonin (2009) 'Intron-Dominated Genomes of Early Ancestors of Eukaryotes', J Hered (2009) 100 (5): 618-623.
- N. H. Barton et al (2007) Evolution, Cold Spring Harbor Laboratory Press, p.220.
- Group I catalytic intron in Wikipedia. Homing endonuclease recognition sequences are long enough to occur randomly only with a very low probability: approximately once every 7×10^10 bp.
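The order of magnitude of such a figure follows from simple counting. The sketch below assumes four equally frequent, independent bases, so that a specific site of n bases is expected about once every 4^n positions. The site length of 18 bp used in the example is my assumption for illustration; it is not stated in the note above. The function name is hypothetical.

```python
def expected_spacing_bp(site_length, alphabet_size=4):
    """Average number of base pairs between chance occurrences of one specific site."""
    return alphabet_size ** site_length


if __name__ == "__main__":
    print(expected_spacing_bp(6))   # a short 6 bp site
    print(expected_spacing_bp(18))  # an assumed 18 bp recognition sequence
```

For an assumed 18 bp site this gives 4^18 = 68,719,476,736, roughly 7×10^10 bp, whereas a 6 bp site turns up by chance about once every 4,096 bp.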
- Scott William Roy, Walter Gilbert (2006) The evolution of spliceosomal introns: patterns, puzzles and progress, Nature Reviews Genetics Volume 7 March 2006 211
- Francisco Rodríguez-Trelles, Rosa Tarrío, Francisco J. Ayala (2006) 'Origins and Evolution of Spliceosomal Introns', Annu. Rev. Genet. 2006. 40:47-76. Very important article (introns-first hypothesis).
- Sequence and organization of the human mitochondrial genome, Nature (1981) 290: 457-65. Also: The mouse mitochondrial genome displays exceptional economy of organization, protein-coding genes with zero or few noncoding nucleotides between coding sequences.
- Lawrence A. David, Eric J. Alm (2011) 'Rapid evolutionary innovation during an Archaean genetic expansion', Nature, 469, 93-96. 06 January 2011
- GenScript (accessed 6 Jan 2011)
- "Our findings thus show that any split gene that can encode complex proteins which form the structure of the spliceosome or any other eukaryotic cellular organelle," (p.8). This amounts to the claim that organelles were designed.
- Brian Hall, Benedikt Hallgrimsson (2008) Strickberger's Evolution, Fourth edition, page 139.
- Lee R. Kump (2010) 'Earth's Second Wind', Science 10 December 2010
- Radu Popa (2004) Between Necessity and Probability: Searching for the Definition and Origin of Life, Springer, p. 67-68.
- In previous versions I stated only that every OOL theory needs to explain the restricted choice of the genetic code in the face of the huge freedom of choice, but later I realised that precisely this freedom of choice forces any theory of independent origin to predict a huge diversity of genetic codes (11 Jan 11).
- "The non-universal genetic codes are not produced randomly, but are derived from the universal genetic code as the result of a series of non-disruptive changes" (Syozo Osawa (1995) 'Evolution of the Genetic Code', p.171). Senapathy's theory would predict that non-universal codes are (1) random, (2) ubiquitous, (3) originated in the beginning. All predictions are wrong.
- Michael Lynch (2002) 'Intron evolution as a population-genetic process', Proc Natl Acad Sci U S A. 2002 April 30.
- Michael Lynch, John S. Conery (2003) 'The Origins of Genome Complexity', Science 302 21 Nov 2003.
- Eugene V Koonin (2006) 'The origin of introns and their role in eukaryogenesis: a compromise solution to the introns-early versus introns-late debate?', Biology Direct. Koonin seems to be a little bit genome-centered too! (with response from Ford Doolittle).
- D. Fridmanis et al (2007) 'Formation of new genes explains lower intron density in mammalian Rhodopsin G protein-coupled receptors', Mol Phylogenet Evol. 2007 Jun; 43(3):864-80.
- Tom Strachan, Andrew Read (2000) Human Molecular Genetics Second Edition, p.150-152.
- Sarah Blaffer Hrdy (2009) Mothers and Others, p. 101.
- JE Darnell, Jr (1978) Implications of RNA-RNA splicing in evolution of eukaryotic cells, Science 22 December 1978:
Abstract: "The differences in the biochemistry of messenger RNA formation in eukaryotes compared to prokaryotes are so profound as to suggest that sequential prokaryotic to eukaryotic cell evolution seems unlikely. The recently discovered noncontiguous sequences in eukaryotic DNA that encode messenger RNA may reflect an ancient, rather than a new, distribution of information in DNA and that eukaryotes evolved independently of prokaryotes."
- Martin and Koonin (2006) Introns and the origin of nucleus-cytosol compartmentalization, Nature 440, 41-45 (2 March 2006)
- Daniel C. Jeffares, Tobias Mourier, David Penny (2006) 'The biology of intron gain and loss', TRENDS in Genetics Vol.22 No.1 January 2006
- In theory Senapathy could argue that prokaryotes originated independently and had as many introns as eukaryotes but lost them completely, since several publications argue that clear instances of massive intron loss make the case for complete intron loss in prokaryotes plausible. This implies strong Darwinian selection.
- Anthony Poole, Daniel Jeffares, David Penny (1999) Early evolution: prokaryotes, the new kids on the block, BioEssays 21:880-889, 1999.
- Morgan Ryan (2011) Unauthorized reproduction not prohibited, American Scientist, Vol 99, p. 30.
- M.B. Shapiro, P. Senapathy (1987) "RNA splice junctions of different classes of eukaryotes" is cited in: László Patthy (1999) Protein Evolution, p. 148.
- Monya Baker (2011) 'Genomics: Genomes in three dimensions', Nature, 470, 289-294 (10 February 2011)
- Tom Misteli (2007) Beyond the Sequence: Cellular Organization of Genome Function, Cell 128, 787-800, February 23, 2007
- Tom Misteli (2011) The Inner Life of the Genome Scientific American Feb 2011.
- Dan Graur, W H Li (2000) Fundamentals of Molecular Evolution, p.289 and 296.
- Guy M. Narbonne (2011) Evolutionary biology: When life got big, Nature 470, 17 February 2011
- Eddo Kim (2007) Different levels of alternative splicing among eukaryotes, Nucleic Acids Res. 2007 January; 35(1): 125-131.
- Aubrey E. Hill, Eric J. Sorscher (2006) The non-random distribution of intronless human genes across molecular function categories, FEBS Letters.
- Francis Crick (1981) Life Itself. Its Origin and Nature, Simon and Schuster.
- Monroe Strickberger (2000) Evolution, Third edition, p.143.
- Michael IP (2005) Intron retention: a common splicing event within the human kallikrein gene family. Clin Chem. 2005 Mar;51(3):506-15.
- Gene Splicing Overview & Techniques.
- Andreas Wagner (2005) Energy Constraints on the Evolution of Gene Expression, Molecular Biology and Evolution, 22, 6 Pp. 1365-1374
- Ner-Gaon H (2004) Intron retention is a major phenomenon in alternative splicing in Arabidopsis, Plant J. 2004 Sep;39(6):877-85.
- Knowles DG, McLysaght A (2009) Recent de novo origin of human protein-coding genes. Genome Research
- Maren Krull, Jürgen Brosius and Jürgen Schmitz (2005) Alu-SINE Exonization: En Route to Protein-Coding Function, Molecular Biology and Evolution 22, 8 1702-1711
- Manfred Eigen (1992) Steps towards Life. A perspective on Evolution, Oxford University Press. (paperback edition: 1996) See also wikipedia article Error threshold.
- Richard Dawkins (1986) The Blind Watchmaker, chapter 3.
- P Senapathy (1986) Origin of eukaryotic introns: a hypothesis, based on codon distribution statistics in genes, and its implications, PNAS April 1, 1986 vol. 83 no. 7 2133-2137. The abstract is a beautiful succinct summary (written 8 years before his book) of his whole theory!
- Francesco Catania, Xiang Gao and Douglas G. Scofield Endogenous Mechanisms for the Origins of Spliceosomal Introns, J Hered (2009) 100 (5): 591-596. Quote: "Spliceosomal introns have also been suggested to directly arise from random primordial and canonical ancestral gene sequences (the 'split-gene model' and the 'proto-splice site model', respectively). In particular, Senapathy (1986)". Further Senapathy (1986) is quoted in: Masaru Tomita et al (1996) 'Introns and Reading Frames: Correlation Between Splicing Sites and Their
Codon Positions', Mol. Biol. Evol. 13(9):1219-1223.
- Lewin's GENES X, 2011, p. 86.
- Donald R. Forsdyke & James R. Mortimer (2000) Chargaff's legacy, Gene (2000) 261, 127-137
- Cory Y. McLean et al (2011) Human-specific loss of regulatory DNA and the evolution of human-specific traits, Nature 10 Mar 2011
- Manyuan Long and Carl Rosenberg (2000) Testing the 'Proto-splice Sites' Model of Intron Origin: Evidence from Analysis of Intron Phase Correlations, Mol Biol Evol (2000) 17 (12): 1789-1796. Another way of measuring intron phases is expressing exon length as multiples of 3, 3N+1, 3N+2.
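The two ways of describing intron phase mentioned in this note can be related with a few lines of code. The sketch below is an illustration only and is not taken from the cited paper. It assumes that exon lengths count coding bases only, starting at the first base of the start codon. The function names and the example lengths are hypothetical.

```python
def intron_phases(exon_lengths):
    """Phase (0, 1 or 2) of the intron following each exon except the last.

    Phase 0: the intron lies between two codons.
    Phase 1: after the first base of a codon; phase 2: after the second base.
    """
    phases, coding_bases = [], 0
    for length in exon_lengths[:-1]:
        coding_bases += length
        phases.append(coding_bases % 3)
    return phases


def exon_class(length):
    """Express an exon length as 3N, 3N+1 or 3N+2."""
    return ("3N", "3N+1", "3N+2")[length % 3]


if __name__ == "__main__":
    exons = [90, 31, 62, 120]  # made-up coding lengths in bases
    print(intron_phases(exons))
    print([exon_class(n) for n in exons])
```

An internal exon flanked by two introns of the same phase necessarily has a length that is a multiple of 3, which is why the two descriptions carry the same information.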
- Compare with W F Doolittle: 'Genes in pieces: Were they ever together?' Nature 1978, 272:581-582.
- Jeffrey M. Perkel (2011) Synthetic Genomes: Building a better Bacterium, Science, 25 Mar 2011
- See my review of Information theory and molecular biology by Hubert Yockey.
- Eileen E. M. Furlong (2011) Molecular biology: A fly in the face of genomics, Nature 471, 458-459 24 March 2011
- S Ragsdale (2011) Biochemistry: How two amino acids become one, Nature, 31 Mar 2011 shows that the stop codon UAG from methanogenic Archaea encodes the new amino acid Pyrrolysine. So it only has two stop codons.
Tetrahymena species recognize only UGA as a stop codon, while Euplotes species recognize only UAA and UAG as stop codons (Joe Salas-Marco et al, 2006).
- Ryan E. Mills et al (2011) 'Natural genetic variation caused by small insertions and deletions in the human genome', Genome Research April 1, 2011
- Gretchen Vogel (2011) 'Do Jumping Genes Spawn Diversity?', Science, 15 April 2011.
- Jevon Plunkett et al (2011) An Evolutionary Genomic Approach to Identify Genes Involved in Human Birth Timing, PLoS Genetics April 2011
- Kevin Plaxco and Michael Gross (2006) Astrobiology. A Brief Introduction, page 71: "These amino acids were almost certainly introduced by biochemistry after the origins of life".
- Hadas Keren, Galit Lev-Maor & Gil Ast (2010) Alternative splicing and evolution: diversification, exon definition and function, Nature Review Genetics, 11, 345-355 (May 2010)
- Elizabeth Pennisi (2011) 'Green Genomes', Science 17 Jun 2011
- "It is worth noting that the average length of metazoan exons (125 - 165 bp) is similar to the length of DNA that wraps around a nucleosome (147 bp), which suggests that nucleosome occupancy might confer purifying selection on exon length. However, the length of an average human exon is only 126 bp." From ref 209.
- Elçin Ünal et al (2011) Gametogenesis Eliminates Age-Induced Cellular Damage and Resets Life Span in Yeast, Science, 24 Jun 2011.
- "The oceans are teeming with viruses — typically, there are 100 billion viral particles per litre of water in the top 50 metres of most marine ecosystems. With an average of ten viruses for each bacterial cell, these parasites impose a tight control over the composition of marine microbial communities. The 'arms race' hypothesis holds that the selective pressure exerted by viruses continuously triggers adaptive mutations in the bacterial genomes, with counteracting genetic adaptations occurring at a similar pace in the parasites." Science 30 jun 2011.
- Mingyao Li et al (2011) Widespread RNA and DNA Sequence Differences in the Human Transcriptome, Science, 1 Jul 2011: "We have uncovered thousands of exonic sites where the RNA sequences do not match those of the DNA sequences, including transitions [changes a purine nucleotide to another purine] and transversions [substitution of a purine for a pyrimidine or vice versa]".
- David Deamer (2011) First Life. Discovering the Connections between Stars, cells, and How Life Began. University of California Press, p. 183.
- Denis Noble (2006) The Music of Life. Biology Beyond the Genome, Oxford University Press. See my short description on the Introduction page. The first four chapters are the most important, they explain what is wrong with genetic determinism.
- See: David Deamer (note 215) page 214: "Can genetic information really appear out of nowhere, by chance?" where he reports the experiments of Bartel and Szostak (1993) 'Isolation of new ribozymes from a large pool of random sequences', who began by synthesizing many trillions of different random RNA molecules 300 nucleotides long and found RNAs with catalytic activity. Furthermore, selection and amplification are involved, which are absent from Senapathy's scenario. Deamer concludes "The inescapable conclusion is that genetic information can appear out of random mixtures, as long as there are populations containing large numbers of polymeric molecules with variable sequences of monomers and a way to select and amplify specific property" (p.216). The conditions in this claim are extremely important! The problem here is: how do you get long polymers in the first place? Long enough to be of catalytic value?
- Warm and Cold-Blooded
- The Central Dogma of molecular biology certainly reinforces genetic determinism because the direction of the flow of information is from DNA to RNA to proteins. This strongly suggests DNA is in control and there is no feedback.
- Which came first, the bird or the smaller genome? 30 August 2007
- Craig B. Lowe et al (2011) Three Periods of Regulatory Innovation During Vertebrate Evolution, Science, 19 August 2011.
- A story of chromosome number, Nature 477, 9 1 September 2011
- Y. Jiao et al (2011) Ancestral polyploidy in seed plants and angiosperms. Nature 473, 97 (2011).
- T. E. Wood et al (2009) The frequency of polyploid speciation in vascular plants. Proc. Natl. Acad. Sci. U.S.A. 106, 13875 (2009).
- Steven A. Frank (2009) Somatic evolutionary genomics: Mutations during development cause highly variable genetic mosaicism with risk of cancer and neurodegeneration, PNAS January 26, 2010 vol. 107
- Centrosome (24 Sep 2011)
- Kyle Vogan (2011) Maternal imprinting defect, Nature Genetics 43, 928
- James Darnell (2011) RNA. Life's Indispensable molecule, p. 243.
- Senapathy knows about the existence of the spliceosome in the Genetics Primer (p. 547, 555), although it does not occur in the index.
- Evolution of nested gene arrangements, the ruvinsky lab.
- Chun-Long Chen et al (2008) Genomewide Analysis of Box C/D and Box H/ACA snoRNAs in Chlamydomonas reinhardtii Reveals an Extensive Organization Into Intronic Gene Clusters, Genetics May 2008 vol. 179 no. 1 21-30.
- Michael B. Clark et al (2011) The Reality of Pervasive Transcription PLoS Biology July 2011.
- Eugene V Koonin, Tatiana G Senkevich, Valerian V Dolja (2006) 'The ancient Virus World and evolution of cells', Biology Direct 2006, 1:29. The authors use the words: "the primordial pool of primitive genetic elements"; "the primordial genetic pool", "The primordial gene pool", and "existence of a complex, precellular, compartmentalized but extensively mixing and recombining pool of genes"; "viral origin from the primordial genetic pool". However, "In this pool, RNA viruses would evolve first, followed by retroid elements, and DNA viruses.". So that is different from Senapathy: he starts with complete eukaryotic genomes. (I need to investigate this important publication! It seems that the authors make the mistake of 'protein-coding genes' without transcription and translation machinery!)
- Stephen J. Freeland, et al (1999) Early Fixation of an Optimal Genetic Code, Mol Biol Evol (2000) 17 (4): 511-518.
- A. D. Ellington (2009) Evolutionary origins and directed evolution of RNA, Int J Biochem Cell Biol. 2009 Feb;41(2):254-65. (See also Stuart A. Kauffman about random DNA libraries)
- "A long explanation for introns", New Scientist, June 26, 1986 (see google books); "Exon, introns, and evolution", New Scientist, March 31, 1988 (see google books).
- Martin Ackermann, Lin Chao (2006) DNA Sequences Shaped by Selection for Stability, PLoS Genetics, February 2006.
- Woese, C. R. On the evolution of the genetic code. Proc. Natl Acad. Sci. USA 54, 1546–1552 (1965).
- Tobias Warnecke, Laurence D. Hurst (2011) Error prevention and mitigation as forces in the evolution of genes and genomes, Nature Reviews Genetics 12, 875-881 (December 2011)
- Brian P. Cusack, Peter F. Arndt, Laurent Duret, Hugues Roest Crollius (2011) Preventing Dangerous Nonsense: Selection for Robustness to Transcriptional Error in Human Genes, PLoS Genetics October 2011
- Miodrag Grbic et al (2011) The genome of Tetranychus urticae reveals herbivorous pest adaptations, Nature 479, 487–492 (24 November 2011) Supplementary information Figure S2.4.2.
- Richard Cordaux, Mark A. Batzer (2009) The impact of retrotransposons on human genome evolution, Nature Reviews Genetics 10, 691-703 (October 2009) is a very useful overview.
- Erez Lieberman Aiden (2011) Zoom! Science 2 December 2011
- William A. Wells (2005) There's DNA in those organelles, The Journal of Cell Biology, March 14 2005
- Hans Ris and Walter Plaut (1962) Ultrastructure Of DNA-Containing Areas In The Chloroplast Of Chlamydomonas, The Journal of Cell Biology, 1 Jun 1962.
- Eugene V. Koonin (2011) The Logic of Chance. Pearson Education, hardback.
- Kevin N. Laland, Kim Sterelny, John Odling-Smee, William Hoppitt, Tobias Uller (2011) 'Cause and Effect in Biology Revisited: Is Mayr's Proximate-Ultimate Dichotomy Still Useful?', Science 16 Dec 2011.
- RM Schwartz and MO Dayhoff (1978) 'Origins of prokaryotes, eukaryotes, mitochondria, and chloroplasts', Science 27 January 1978: 395-403.
- Structure of the Mitochondrial Genome, Genetic Origins website.
- "Translation in Escherichia requires the coordinated and complex interactions of at least 100 gene products." from: Ravi Jain, Maria C. Rivera, and James A. Lake (1999) Horizontal gene transfer among genomes: The complexity hypothesis, Proc. Natl. Acad. Sci. USA Vol. 96, pp. 3801–3806, March 1999
- Stuart A. Kauffman (2011) Approaches to the Origin of Life on Earth, Life 2011, 1, 34-48 (Open Access)
- Joan A. Steitz (2012) RNA Rejoice! Review of RNA Life's Indispensable Molecule by James Darnell, Science 6 January 2012
- Ming Zou, Baocheng Guo, Shunping He (2011) 'The Roles and Evolutionary Patterns of Intronless Genes in Deuterostomes', Comparative and Functional Genomics, Volume 2011.
- Brian K. Hall, Benedikt Hallgrimsson (2008) Strickberger's Evolution Fourth Edition, p. 134.
- Dennis W. Grogan (2002) Hyperthermophiles and the problem of DNA instability, Molecular Microbiology.
- Scott Freeman and Jon Herron (2007) 'Evolutionary Analysis', page 657.
- Martin Egli (2006) Uncovering DNA's 'sweet' secret. One particular curiosity: how did DNA and RNA come to incorporate five-carbon sugars into their "backbone" when six-carbon sugars, like glucose, may have been more common?
- Miklos Csuros, Igor B. Rogozin, Eugene V. Koonin (2011) A Detailed History of Intron-rich Eukaryotic Ancestors Inferred from a Global Survey of 100 Complete Genomes, PLoS Computational Biology September 2011.
- Igor B. Rogozin, Yuri I. Wolf, Alexander V. Sorokin, Boris G. Mirkin, and Eugene V. Koonin, (2003)
Remarkable Interkingdom Conservation of Intron Positions and Massive, Lineage-Specific Intron Loss and Gain in Eukaryotic Evolution, Current Biology, Vol. 13, 1512–1517, September 2, 2003.
- NCBI The RNA World and the Origins of Life.
- Lindberg J, Lundeberg J. (2009) The plasticity of the mammalian transcriptome, Genomics 2010 Jan;95(1):1-6.
- Furthermore, E. Koonin notes that the emergence of spurious (weak) transcription initiation sites in random DNA sequences is relatively easy (given the existence of Transcription factors of course) (p. 240, Note 246).
- Aaron E. Engelhart and Nicholas V. Hud (2010) Primitive Genetic Polymers, Cold Spring Harbor Perspectives in Biology, 12 May 2010.
- Aaron Klug (2004) 'The Discovery of the DNA Double Helix', Journal of Molecular Biology, Volume 335, Issue 1, 2 January 2004, Pages 3-26 (available as: klug-DNA.pdf)
- The standard laboratory technique to isolate DNA includes digestion with proteinase which removes all proteins (histones).
- How Many People Have Ever Lived On Earth? Population Reference Bureau, accessed 9 Feb 2012.
- Monya Baker (2012) Functional genomics: The changes that count, Nature 482, 257–262 09 February 2012
- Kerstin Lindblad-Toh et al (2011) A high-resolution map of human evolutionary constraint using 29 mammals, Nature 478, 476–482 (27 October 2011)
- Jurka J, et al (2007) Repetitive sequences in complex genomes: structure and evolution, Annu Rev Genomics Hum Genet. 2007;8:241-59.
- Christian de Duve (2005,2006) Singularities. Landmarks on the Pathways of Life, p.81.
- Eric S. Lander (2011) Initial impact of the sequencing of the human genome, Nature, 470, 187–197 (10 February 2011)
- James Collins (2012) Synthetic Biology: Bits and pieces come to life, Nature 483, S8–S10 (01 March 2012)
- Gregory P. Wilson, et al (2012) Adaptive radiation of multituberculate mammals before the extinction of dinosaurs, Nature 483, 457–460 (22 March 2012)
- Tom Strachan, Andrew Read (2011) Human Molecular Genetics. Fourth Edition, on p. 274 is a table with 7 examples. (Info).
- In theory, a protein or enzyme could originate spontaneously from amino acids. However, non-essential amino acids, such as glycine, must be synthesized from precursors (in this case from serine) by enzymes (in this case by serine hydroxymethyltransferase, SHMT). Even if all amino acids were present, the precise order of the amino acids would be too unlikely to originate by chance. One needs genes for that.
- Pregnancy: Why Mother's Immune System Does Not Reject Developing Fetus as Foreign Tissue, Sciencedaily, 7 Jun 2012.
- Chris Todd Hittinger (2012): "As plants invaded land, lignin provided the rigidity necessary for vascular plants to grow above their rivals and move water and nutrients over long distances. Lignin is a dizzying web of polymerized phenylalanine derivatives with dozens of combinations of modifications and cross-links that make wood structurally sound", Science 29 June 2012
- Kenneth A. Johnson (2012) 'Biochemistry: DNA replication caught in the act', Nature 487, 177–178 (12 July 2012)
- Anne-Ruxandra Carvunis et al (2012) Proto-genes and de novo gene birth, Nature 19 Jul 2012.
- Dong-Dong Wu et al (2011) De Novo Origin of Human Protein-Coding Genes, PLOS Genetics, November 10, 2011. "Here we identify 60 new protein-coding genes that originated de novo on the human lineage since divergence from the chimpanzee, supported by both transcriptional and proteomic evidence."
- William A. Shear (2012) Palaeontology: An insect to fill the gap, Nature, 488, 34–35 (02 August 2012)
- Mark Isalan (2012) Systems biology: A cell in a computer, Nature, 2 Aug 2012
- Christof Koch (2012) Modular Biological Complexity, Science 3 August 2012. Koch discusses the complexity of living systems, which are characterized by large numbers of highly heterogeneous components, be they genes, proteins, or cells. He applies his analysis to the human brain, but it can equally be applied to the origin of a eukaryotic organism.
- Alla Katsnelson (2010) Epigenome effort makes its mark, Nature 467, 646 (2010) 6 October 2010
- Amy Maxmen (2012) Cancer research: Open ambition, Nature News feature 8 August 2012. (a story about cancer drug discoverer Jay Bradner). "Such control systems generally involve three types of protein: 'writers', 'readers' and 'erasers'. Writers attach chemical marks, such as methyl groups (to DNA) or acetyl groups (to the histone proteins that DNA wraps around); readers bind to these marks and influence gene expression; erasers remove the marks".
- Kai Kupferschmidt (2012) Attack of the Clones, Science 10 August 2012: "Fungi have long been seen as the least interesting pathogens, but two catastrophes in the animal world have changed that view." "In just the past 5 years, scientists have discovered fungi affecting rattlesnakes, land crabs, avocado trees, cultured abalone, and the eggs of sea turtles".
- The ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome, Nature 489, 57–74 (06 September 2012): "These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions.". One of the more remarkable findings is that 80% of the genome contains elements linked to biochemical functions, dispatching the widely held view that the human genome is mostly 'junk DNA' (Genomics: ENCODE explained). "Comparative genomic studies suggest that 3–8% of bases are under purifying (negative) selection and therefore may be functional".
- Elizabeth Pennisi (2012) 'ENCODE Project Writes Eulogy for Junk DNA', Science 7 September 2012: "the Encyclopedia of DNA Elements (ENCODE), has found that 80% of the human genome serves some purpose". "The latest protein-coding gene count is 20,687, with hints of about 50 more, the consortium reports in Nature. Those genes account for about 3% of the human genome [including introns], less if one counts only their coding regions." So, if introns are excluded, the percentage of DNA consisting of genes is probably about 0.3% of the total DNA in the human genome. "As a result of ENCODE, Gingeras and others argue that the fundamental unit of the genome and the basic unit of heredity should be the transcript–the piece of RNA decoded from DNA–and not the gene".
- Laurence Moran 'The Random Genome Project' and the original source is: Sean Eddy (8 Sep 2012): "The experiment that I'd like to see is the Random Genome Project. Synthesize a hundred million base chromosome of entirely random DNA, and do an ENCODE project on that DNA. Place your bets: will it be transcribed? bound by DNA-binding proteins? chromatin marked? Of course it will." Sunday, September 09, 2012. Addition: a random sequence may bind a transcription-factor, but that may not result in transcription.
- The ENCODE Project Consortium An integrated encyclopedia of DNA elements in the human genome, Nature 489, 06 September 2012
- Prospect, May 24, 2012.
- Maurizio Zanetti, Navin R. Mahadevan (2012) Immune Surveillance from Chromosomal Chaos? Science 28 September 2012
- Tim R. Mercer et al (2009) Long non-coding RNAs: insights into functions, Nature Reviews Genetics 10, 155-159 (March 2009)
- Vivien Marx (2012) 'Epigenetics: Reading the second genomic code', Nature 1 Nov 2012.
"But DNA works with many partners, including 'epigenetic' factors, which influence gene expression in ways that don't involve changes to the underlying sequence" and: "... to control the activity of particular genes". Genes do not control gene expression? Who does the controlling? “This development has heightened awareness about the good technologies needed to study how the genetic code is put into action says Adam Petterson ! ... BET proteins belong to a class of epigenetic reader that targets histones, recruits multi-protein complexes to the spot where they attach and instructs cellular processes involved in reading genetic information ... So far, 96 histone methyltransferases have been identified in humans...”
- The 1000 Genomes Project Consortium (2012) An integrated map of genetic variation from 1,092 human genomes, Nature 1 Nov 2012.
- We now know that the haploid human genome has 3,080 (male) or 3,022 (female) million base pairs.
- Nina V. Fedoroff (2012) Transposable Elements, Epigenetics, and Genome Evolution, Science 9 Nov 2012 (free access)
- Dirk Schübeler (2012) Epigenetic Islands in a Genetic Ocean, Science 9 November 2012
- Scott B. Vafai, Vamsi K. Mootha (2012) Mitochondrial disorders as windows into an ancient organelle, Nature 491, 374–383 (15 November 2012)
- Charles Robert Darwin (1809–1882). Origin of Species. XIV. Mutual Affinities of Organic Beings: Morphology–Embryology–Rudimentary Organs.
- See section genome sequencing and mapping in Wikipedia. In 1996 the first eukaryotic genome, that of the budding yeast Saccharomyces cerevisiae, was completed.
- Peter Langridge (2012) Genomics: Decoding our daily bread, Nature, 491, 678–680 (29 November 2012)
- Warnecke T, Weber CC, Hurst LD. Why there is more to protein evolution than protein function: splicing, nucleosomes and dual-coding sequence, Biochem Soc Trans. 2009 Aug;37(Pt 4):756–61. (very useful, educational powerpoint presentation!).
- Wenqing Fu et al (2012) Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants, Nature, Published online 28 November 2012
- Panagiotis Papasaikas, Juan Valcárcel (2012) Splicing in 4D, Science, 21 Dec 2012
- Joshua B. Plotkin, Grzegorz Kudla (2011) 'Synonymous but not the same: the causes and consequences of codon bias', Nature Reviews Genetics 12, 32-42 (January 2011)
- Dong-Dong Wu, David M. Irwin, Ya-Ping Zhang (2011) De Novo Origin of Human Protein-Coding Genes, PLoS Genetics 7(11)
- David G. Knowles and Aoife McLysaght (2009) Recent de novo origin of human protein-coding genes, Genome Res. 2009. 19: "This is the first evidence for entirely novel human-specific protein-coding genes originating from ancestrally noncoding sequences. We estimate that 0.075% of human genes may have originated through this mechanism leading to a total expectation of 18 such cases in a genome of 24,000 protein-coding genes."
- Dan Graur et al (2013) On the immortality of television sets: "function" in the human genome according to the evolution-free gospel of ENCODE, Genome Biology and Evolution , 20 Feb 2013.
- The difference between prokaryotes and eukaryotes correlates well with effective population sizes, genome size and the ability of natural selection to remove "junk DNA". Large genomes belonging to species with small effective population sizes should contain considerable amounts of junk DNA. See: Note 309. So, Senapathy found "junk DNA" in eukaryotes and not in prokaryotes and concluded that eukaryotes looked more like random DNA than prokaryotic DNA. In fact the cause of the difference is population size.
- Amy Maxmen (2013) RNA: The genome's rising stars, Nature (04 April 2013)
- Gert Korthof (2012) New origin of life model fatally requires a nonrandom protein, 24 Dec 2012. (blogpost)
- Is a variation on: "Surely You're Joking, Mr. Feynman!": Adventures of a Curious Character. (this has been the original title of the review...)
- The human release factor eRF1 gene is 50,768 base pairs long, and the protein is 437 amino acids long and has a specific, non-random sequence. So, it cannot arise spontaneously. Mathematically speaking, every sequence can arise in a series of trials that is long enough. But that is not the point. The abiotic synthesis of those specific DNA or protein sequences is the problem to solve.
- James R. Lupski (2013) 'Genome Mosaicism–One Human, Multiple Genomes', Science 26 July 2013
- Robert M. Brucker, Seth R. Bordenstein (2013) 'The Hologenomic Basis of Speciation: Gut Bacteria Cause Hybrid Lethality in the Genus Nasonia', Science 9 August 2013.
- Kathleen L. McCann (2013) Mysterious Ribosomopathies Science 23 August 2013
- Mitochondria Versus Nucleus, The Scientist, February 15, 2013
- M. Paul Smith, David A. T. Harper (2013) Causes of the Cambrian Explosion, Science, 20 Sep 2013.
- DNA Packaging: Nucleosomes and Chromatin, Nature Scitable.
- "RNA polymerases are intricate molecular machines that transcribe DNA into RNA, combining RNA synthesis with the precise movement of a DNA template across their active site. Eukaryotic cells (those of animals, plants and fungi) have several RNA polymerases, each dedicated to the production of specific RNAs. RNA polymerase I (Pol I) synthesizes the ribosomal RNA component of the cell's protein-producing factories and so is crucial for cell survival, growth and proliferation; malfunction of Pol I can cause cell death or support the unrestrained proliferation characteristic of cancer cells". Nature 31 Oct 2013.
- Kostas Kampourakis (Editor) (2013) The Philosophy of Biology: A Companion for Educators (History, Philosophy and Theory of the Life Sciences), Springer, Introduction, p. 3.
- Tibor Gánti (2003) The Principles of Life, Oxford University Press, hardback. page 17. These are very beautiful insights from Gánti decades ago!
- Robert J. Weatheritt, M. Madan Babu (2013) The Hidden Codes That Shape Protein Evolution, Science 13 Dec 2013:
"The authors determined that ~14% of the codons within 86.9% of human genes are occupied by transcription factors. Such regions, called "duons", therefore encode two types of information: one that is interpreted by the genetic code to make proteins and the other, by the transcription factor-binding regulatory code to influence gene expression. This requirement for transcription factors to bind within protein-coding regions of the genome has led to a considerable bias in codon usage and choice of amino acids, in a manner that is constrained by the binding motif of each transcription factor."
- Tim Lenton, Andrew Watson (2011) Revolutions That Made the Earth, Oxford University Press. 448 pp. (Reissue edition: paperback 2013). (Chapter 4 and 4.6 Summary: the whole system view.)
- Michele Clamp et al (2007) Distinguishing protein-coding and noncoding genes in the human genome PNAS December 4, 2007. A random GC-rich sequence (50% GC) of 2 kb has a ≈50% chance of harboring an ORF ≈400 bases long. See Supporting Information Figure 4 for the graph of ORF length statistics.
- Kepa Ruiz-Mirazo and Alvaro Moreno (2006) 'On the Origins of Information and Its Relevance for Biological Complexity',
Biological Theory 1(3) 2006, 227–229. However, single-stranded RNA could contain 'information' because it can fold. But, Senapathy is only concerned with DNA, not RNA.
- Li Zhao (2014) Origin and Spread of de Novo Genes in Drosophila melanogaster Populations, Science 14 February 2014
- Narayana Annaluru et al (2014) Total Synthesis of a Functional Designer Eukaryotic Chromosome, Science 4 April 2014: "Here, we report the synthesis of a functional 272,871 base pair designer eukaryotic chromosome, synIII, which is based on the 316,617 base pair native Saccharomyces cerevisiae chromosome III.". Other researchers have synthesized a bacterium's full genome, but the yeast job is orders of magnitude more complex. Furthermore, compare a yeast chromosome with human chromosome 21: 48 million nucleotides!
- Elizabeth Pennisi (2014) Building the Ultimate Yeast Genome, Science 4 April 2014: To increase the genome's stability, they took out mobile DNA elements, such as retrotransposons, introns and other noncoding DNA. It took Codon Devices more than eleven months to deliver a 90,000-base circular chromosome.
- Viruses are intracellular parasites and contain no ribosome, produce no energy and do not divide. See also: Pandoraviruses: Amoeba Viruses with Genomes Up to 2.5 Mb Reaching That of Parasitic Eukaryotes. Pandoraviruses contain more than 1000 protein-genes including 54 DNA-processing proteins and seven virus-encoded amino acid–transfer RNA (tRNA) ligases. The largest known viral genome Pandoravirus salinus contains 2541 protein-genes. The big number of genes does not make them alive; they still depend on living cells for reproduction.
- Andrew G. Clark (2014) Genetics: The vital Y chromosome, Nature 508, 463-465 (24 April 2014): "Most noteworthy is their observation that the sex chromosomes of placental mammals, birds and monotremes had essentially independent origins, which means that patterns of gene loss and of specific retention of classes of genes on their Y (or W) chromosomes can be compared."
- Compare with Daniel G. Gibson and J. Craig Venter (2014): "A biological cell is much like a computer – the genome can be thought of as the software that encodes the cell's instructions, and the cellular machinery as the hardware that interprets and runs the software." Nature 8 May 2014 (Synthetic biology: Construction of a yeast chromosome)
- Jocelyn Kaiser (2014) The Hunt for Missing Genes, Science 16 May 2014. "the average person carries about 100 incapacitated genes–and in 20 of those cases, both the maternal and paternal copies of a gene are missing, creating a complete knockout."
- Already at this point in the story the canonical genetic code is assumed. Or rather: some genetic code, because at that point in the origin of life any genetic code is possible. See: The elephant in the room. As an example: "reassignment of all three stop codons was found" in: Natalia N. Ivanova (2014) Stop codon reassignments in the wild, Science 23 May 2014. The effect of reassigning a stop codon to an amino acid is a longer protein. A stop codon reassignment could be in the nuclear or mitochondrial DNA. [23 May 2014]
- It is easy to forget that introns never get spliced out in the DNA, but only in the mRNA! Remark inserted 25 Jun 2014. Furthermore, catalytic RNAs play a dominant role in this processing, which represents a major involvement of ribozymes.
- Wolf Reik & Gavin Kelsey (2014) Epigenetics: Cellular memory erased in human embryos, Nature 31 Jul 2014.
- Mycoplasma: the codon UGA encodes the amino acid tryptophan instead of serving as the usual stop codon. 31 Jul 2014
- Matt Kaplan (2012) DNA has a 521-year half-life, Nature, 10 Oct 2012 1 Aug 2014
- Bloom K, Joglekar A. (2010) 'Towards building a chromosome segregation machine', Nature, 2010 Jan 28
- Suzanne Clancy DNA Transcription, Scitable, Nature Education.
- However, if 'no other causal factor' than DNA is required to produce a human being, then 'the organism does –in a sense– compute itself from its genes'! So, what's the problem? He contradicts himself. For Senapathy the problem is quite different and serious: he has no cellular machinery for reading DNA. 9 Aug 2014
- Genomes. 2nd edition. Chapter 7 Understanding a Genome Sequence. Stewart Scherer (2008) A short guide to the human genome states 16.995 to 21.461 nucleotides (bases). (p. 25)
- Joanna L. Kelley, et al (2014) Compact genome of the Antarctic midge is likely an adaptation to an extreme environment, Nature Communications, 12 Aug 2014. See Supplementary tables.
- Addy Pross (2012) What is Life? : "All modern life forms depend critically on this interdependence. DNA, the nucleic acid in which all heritable information is coded, cannot replicate without the elaborate involvement of protein enzymes, and those proteins cannot be generated without the prior existence of the DNA molecule, which codes for those enzymes. ... The RNA-world hypothesis appears to resolve this dilemma... " Chapter 5. 26 Aug 2014.
- Emily Singer (2014) Chemists Seek Possible Precursor to RNA, Quanta magazine, Feb 5 2014.
- Thousands of never-before-seen human genome variations uncovered, Science Daily, 10 Nov 2014.
- Guojie Zhang et al (2014) 'Comparative genomics reveals insights into avian genome evolution and adaptation', Science 12 December 2014
- Hui Y. Xiong et al (2015) 'The human splicing code reveals new insights into the genetic determinants of disease ' and editorial: Roderic Guigo, Juan Valcarcel (2015) Prescribing splicing, Science 9 Jan 2015. "These [SNVs] include synonymous changes within protein-coding sequences, generally assumed to be functionally neutral, as well as missense or nonsense changes whose effects on protein expression may be more dramatic than anticipated because of their impact on the splicing process."
- Ed Yong (2015) Microbiology: Here's looking at you, squid, Nature, 14 January 2015
- Michael Weinreich (2015) Molecular biology: DNA replication reconstructed, Nature, News & Views, 26 March 2015. The core replication machinery has been conserved throughout evolution, from yeast to mammals.
- Edward J. Grow et al (2015) Intrinsic retroviral reactivation in human preimplantation embryos and pluripotent cells, Nature, 11 Jun 2015. "transcripts of HERVH, and of its regulatory element LTR7, were detected before EGA ". (Embryonic Genome Activation)
- The cells' toolbox for DNA repair. Nobel Prize Org Press Release 7 October 2015. See also: Popular Information: DNA repair – providing chemical stability for life (pdf), and "Before his (Lindahl) work, "I don't think anybody really considered the idea that DNA requires active engagement by a set of housekeeping processes to keep it in a stable state," says Keith Caldecott" in Nature, 7 Oct 2015.
- Robert J. Weatheritt, M. Madan Babu (2013) The Hidden Codes That Shape Protein Evolution, Science 13 December 2013
- Karsten Melcher (2016) Structural biology: When sperm meets egg, Nature 23 Jun 2016
- David S. Booth & Nicole King (2016) Evolution: Gene regulation in transition, Nature 23 Jun 2016
- Fateful imprints, Science 13 Jan 2017
- Suzan Mazur (2017) Eugene Koonin: The New Evolutionary Biology, Huffington Post, 02/03/2017. [my criticism: Koonin's genomes evolve in a vacuum; they do not live on a planet and are not influenced by down-to-earth factors such as physical, geochemical and climatological factors.]
- Yasutaka Kakui, Frank Uhlmann (2017) Building chromosomes without bricks, Science 23 Jun 2017
- Jeremy L. England (2013) Statistical physics of self-replication, The Journal of Chemical Physics, Volume 139, Issue 12 2013.
- Tejas Dharmaraj, Katherine L. Wilson (2017) How chromosomes unite, Nature NEWS AND VIEWS 29 November 2017
- Senapathy knows of the existence of a start codon: he shows one in figure 10 on page 555 of the Genetics primer. Together with a stop codon, a start codon determines the Open Reading Frame (ORF). Calculation: on the basis of the frequency of stop codons (3 in 64), there are on average 3 ORFs of 21 codons in every sequence of random DNA of length 64 codons. Adding start codons with a frequency of 1 in 64, 1 of the 3 ORFs is truncated at position 10 on average. Taken together: (21 + 21 + 10) / 3 = 52 / 3 ≈ 17 codons. So, the predicted average length of ORFs is 17 codons. This is smaller than predicted on the basis of stop codons alone. 25 Feb 2018
- According to Senapathy the left splice site is 9 bases long and the right one is 4 bases long (page 243). 1 Mar 2018
- Circadian organization of the genome, Science 16 Mar 2018 ("The clock protein Rev-erbα regulates genome folding to establish circadian gene repression.")
- However, according to Karo Michaelian (2017), photochemical pathways to nucleobases exist. (Thermodynamic Dissipation Theory of the Origin and Evolution of Life, p.124.).
- Periannan Senapathy's company Genome Technologies. Note: this no longer seems to be Senapathy's homepage, but that of a commercial firm (noted on 16 Jun 2013). Here is a snapshot of his website genome.com taken in 2001 by the WayBackMachine. Please note the self-assured language Senapathy uses about himself and his achievements.
- Senapathy's new home site ("The page you requested was not found" on 16 Jun 2013).
- Senapathy's immodest autobiography. ("The page you requested was not found" on 16 Jun 2013).
- Independent Birth of Organisms - A New Theory That Distinct Organisms Arose Independently From the Primordial Pond Showing That Evolutionary Theories Are Fundamentally Incorrect (1994). Genome Press, Madison, 635 pages. The book is freely available as an Adobe pdf file (3 MB!). Fast internet connection recommended. The pdf file is full-text searchable, which is extremely handy for research purposes (recommended for any book! Each book should also be published on CDROM!) ("The page you requested was not found" on 16 Jun 2013). Luckily, I have saved a copy of the pdf on my PC.
- Here is a list of publications of Senapathy P (Periannan) from the BioInfoBank Library. More on p.598 of his book.
Please note that peer-reviewed journals are included in the list, but as far as I can see from the abstracts those publications do not mention his theory of independent origin of organisms!
- PRWEB Genome Data Proves False the Theory of Evolution, New Theory Shows Complex Animals and Plants Originated from Prebiotic Chemistry, December 15, 2010
- A layman's summary of the new theory by Jeffrey Mattox. With many links (some dead links). Mattox is an electrical engineer who 'discovered' Senapathy. The page is no longer maintained.
- Double helix: 50 years of DNA (from Nature), containing a collection of overviews celebrating the historical, scientific and cultural impacts of the discovery of the double helix. All content is free.
- Gert Korthof: A Chemist's View of Life: Ultimate Reductionism & Dissent. A review of Schwabe's book.
- Gert Korthof: Independent Origin and the facts of life. Reasons from developmental biology, genetics and ecology (please note that I skipped reasons from evolution biology!).
- Gert Korthof: a review of The principles of Life by Tibor Gánti. (the origin of life)
- Gert Korthof: a review of Lynn Margulis and Dorion Sagan (2002) Acquiring Genomes. A theory of the origin of species. Recommended reading. If anything contradicts independent origin then it is Margulis' now well established symbiosis theory.
- Gert Korthof: The Feathered Onion is a review of Clive Trotman's book. Summary: Is the time span for the origin of life on earth too short?
- How Many New Genes Are There? Science Vol 311 24 March 2006 1709. This article uses computer-generated random (intron-free) cDNA sequences of 2000 bases in length and concludes that by chance 1247 of 20,000 (6.2%) contain Open Reading Frames which could produce proteins of 119 or more amino acids.
- Robert M. Hazen, Patrick L. Griffin, James M. Carothers, and Jack W. Szostak (2007) 'Functional information and the emergence of biocomplexity', PNAS published online May 9, 2007. "Here we explore the functional information of randomly generated populations of Avida organisms." That is, random genomes are generated! This would be a modest but careful way to explore genome space and the probability of random generation of a functional genome.
- Periannan Senapathy et al (2008) 'Origination of the Split Structure of Spliceosomal Genes from Random Genetic Sequences',
PLoS ONE, October 20, 2008. Open Access. ("and that a machinery was required for removing the genetic waste.": the concept 'genetic waste' has meaning only if a functional genome exists. Similarly, 'stop codon' has meaning only if the genetic code is already established.)
Please note that the book reviewed on this page is present in the PLOS article as note 36, but the subtitle
'A New Theory That Distinct Organisms Arose Independently From The Primordial Pond Showing That Evolutionary Theories Are
Fundamentally Incorrect' is omitted.
- T. Ryan Gregory and Niles Eldredge (2008) Spore biology. Surprisingly, this page is useful for Senapathy supporters, because Senapathy's theory is like Spore Biology (need to elaborate this).
- Periannan Senapathy (2013) 'Theory of the origin of complex eukaryotic genomes from pre-biotic random genetic sequences', 14 Mar 2013 - 12:00pm. Lecture in the Evolution Seminar Series (ESS) of the J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison. (reported to me by Tyler Valkoun from University of Wisconsin-Madison).
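Two of the notes above rest on simple arithmetic: the note on the eRF1 release factor gene (how many sequences of that length are possible) and the note on the average ORF length in random DNA (how often stop codons occur). The Python sketch below reworks both under assumptions that are not stated in the notes themselves: the standard genetic code with three stop codons, 20 standard amino acids, and every base or amino acid drawn independently and with equal probability. The variable and function names are illustrative only.

```python
import math
import random
from fractions import Fraction

# --- Note on eRF1: size of the sequence space -------------------------
# Assumption: 20 amino acids / 4 bases, uniform and independent positions.
PROTEIN_LENGTH = 437       # amino acids, figure taken from the note
GENE_LENGTH_BP = 50_768    # base pairs, figure taken from the note

protein_sequences = 20 ** PROTEIN_LENGTH              # exact integer
log10_protein = PROTEIN_LENGTH * math.log10(20)       # order of magnitude
log10_gene = GENE_LENGTH_BP * math.log10(4)           # same for the DNA

# --- Note on ORF length: stop-codon spacing in random DNA -------------
STOP_CODONS = {"TAA", "TAG", "TGA"}                   # standard code assumed
p_stop = Fraction(len(STOP_CODONS), 4 ** 3)           # 3 of 64 codons
mean_stop_spacing = 1 / p_stop                        # 64/3 codons per stop

# The note's own rounded arithmetic: two full segments of 21 codons
# and one segment truncated at position 10 by a start codon.
note_average = Fraction(21 + 21 + 10, 3)


def simulated_stop_spacing(n_codons=200_000, seed=0):
    """Estimate codons per stop codon in uniformly random DNA."""
    rng = random.Random(seed)
    stops = sum(
        "".join(rng.choice("ACGT") for _ in range(3)) in STOP_CODONS
        for _ in range(n_codons)
    )
    return n_codons / stops


print(f"possible 437-residue proteins: about 10^{log10_protein:.1f}")
print(f"possible 50,768-bp sequences:  about 10^{log10_gene:.1f}")
print(f"mean stop spacing (exact):     {float(mean_stop_spacing):.2f} codons")
print(f"mean stop spacing (simulated): {simulated_stop_spacing():.2f} codons")
print(f"note's average ORF length:     {float(note_average):.2f} codons")
```

Under these assumptions the exact stop-codon spacing is 64/3, or about 21.3 codons, which is the "21" used in the note, and the note's average works out to 52/3, or about 17.3 codons. The number of possible 437-residue proteins, 20^437, has 569 digits. The simulation is only a sanity check on the 64/3 figure; it does not model start codons or real genomic base composition.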