There’s a million codes out there. HTML, bar codes, zip codes, Java, English and Chinese.

Out of a million codes, 999,999 are designed by humans.

There’s one code we don’t know the origin of – and that’s DNA. We don’t know of any codes that are not designed. This implies design in DNA.

That’s an unsolved science mystery. So I and a group of Private Equity Investors have formed a company, Natural Code LLC, to offer a multi-million dollar technology prize for Origin Of Information.

Mitchell Hackerman posted a GREAT question about the Evolution 2.0 Prize:

“So you wrote the book Evolution 2.0 and want to know of a code that wasn’t developed by Intelligence?

Well, there’s no way to prove either way; while we may not have codes we know of that haven’t been developed by intelligent life, that doesn’t mean DNA wasn’t formed via some biologic accident.

We can only say that since code is always developed via intelligence, it’s only reasonable to consider human DNA and/or code was formulated by intelligence.

In the end, there is no way to prove either way for certain, so of course, no one can win your prize money. Nor could you win prize money to prove DNA didn’t spontaneously develop. You still don’t have enough information to conclude it can’t or hasn’t developed spontaneously.”

Mitchell, thank you for being so forthright. You could be correct. For all we know, life might have been a spontaneous biologic accident. That is exactly what Richard Dawkins says in his book The Selfish Gene:

“In one sense, it is a bigger gap” and that the origin of life may have been an “extremely improbable event” (p. 135).

There’s only one problem with that approach:

It’s not science.

What is science?


SCIENCE: 1. a branch of knowledge or study dealing with a body of facts or truths systematically arranged and showing the operation of general laws: the mathematical sciences.

2. systematic knowledge of the physical or material world gained through observation and experimentation.

If you cannot test it, reproduce it, falsify it, observe it, validate it from first principles, model it, simulate it on a computer or validate it mathematically, then it’s not science.

If life is something that happened literally accidentally, perhaps only once in the history of the entire universe… then in order to accept that theory, we have to abandon the scientific method. Because none of our experience confirms that accidental events can create information.

If we’re sticking with hard science, no current theory of life’s origin qualifies.

One of my friends is a prominent scientist who simply refuses to talk about Origin of Life, because he’s honest enough to admit that we know next to nothing about where life came from.

So if we’re going to be consistent and insist that we only teach science in science classrooms, then not only should discussions of God be banned, but all the other theories of life’s origin should be banned too.

The creationist believes in God with a capital G.

The atheist believes in Chance with a capital C.

I fail to see the difference. (Except that creationists generally admit their belief is based on faith, and atheists usually don’t.)

In formal scientific literature, the most truthful statement I’ve ever found is from Hubert Yockey, in his book Information Theory, Evolution and the Origin of Life (Cambridge University Press 2005). On page 176 he says:

“I have no doubt that if the historic process leading to the origin of life were knowable, it would be a process of physics and chemistry. Thus the process of the origin of life is possible but unknowable.”

Page 181: “The fact that there are many things unavailable to human knowledge and reasoning, even in mathematics, does not mean that there must be an Intelligent Designer.”

From a scientific perspective, Yockey’s answer is perfectly valid. I salute him for his candor. But it leaves the elephant stomping around in the room. It assumes our absence of knowledge is a brick wall.

But what if this is solvable – scientifically?

It might be solvable. So I am willing and eager to stick to the normal rules of science – methodological naturalism – and not abdicate to a “God Of Gaps explanation” every time we hit a wall in our understanding.

This is VERY important. Why?

Because no working scientist gets to say “God did it, that explains it” then take a 3-martini lunch. Scientists have to earn their paychecks. We must respect their jobs and their profession.

MANY religious people pit theology against science. The way many Christians, creationists and Intelligent Design advocates frame the issue, they’re practically giving scientists the finger.

It took me quite a while to see how big this problem is. But I now see it very clearly. This is not OK.

That’s why it’s vital to search for an Origin Of Life model that is properly scientific. That is the motivation behind the prize.

Creating and funding this prize has been a very complex and expensive undertaking. Forming a corporation, hiring lawyers, conforming to securities laws, pitching investors, etc. etc. etc. Only a person who has formed an entity and legally taken on equity investors, dealt with federal regulations etc. can fully appreciate this.

Some of my friends think this is brilliant. Others think I’m crazy.

I did not create and fund this prize to “give scientists the finger.” I founded this prize so that we can put Origin Of Life on proper scientific footing.

Why? Because there may well be a principle of self-organization in nature, or consciousness, or some unknown law of physics, that explains information.

Origin of Information is one of the most valuable and fundamental questions in the entire history of science. If this is discovered, it will be one of the ten most important discoveries of the 21st century. It may be one of the biggest science discoveries of all time.

I believe there’s a 10% chance of solving this in my lifetime.

Of course we can choose to give the current (non-empirical, non-testable, non-scientific) Origin Of Life theories a free diplomatic bag of immunity. If so, why don’t we just ignore science entirely… and make up whatever we want to believe?

Secular people of all stripes are free to do that. People are welcome to believe life was a “happy chemical accident,” as long as they acknowledge that’s not science.

But they can’t have their cake and eat it too. They can’t claim to “wear the robe of science” as though it somehow supports their skepticism. And they cannot ban God from the debate, embrace a story about warm ponds and lucky lightning strikes, and claim to be fair and honest about science.

By the way, I know many deeply religious people who are also extremely uncomfortable with “God of the Gaps” arguments. They also only accept naturalistic models as real science. The BioLogos foundation is a good example.

Meanwhile, if you want to dismiss “design” in biology, you must solve information first. Until then, the inference to a Designer is still on the table.

Arthur C. Clarke said, “Any sufficiently advanced technology is indistinguishable from magic.”

And I say, “Any sufficiently improbable event is indistinguishable from miracles.”

Therefore I see no empirical advantage that any of the current explanations have, vs. invoking God. Both are faith based.

The only proper scientific approach is to hypothesize that there is an undiscovered principle that explains life and information.

This is why I am totally serious about this prize. Origin Of Information may be solvable. If it is, the discovery will meet the criteria I’ve outlined in the Evolution 2.0 Prize.

May the best man – or woman – win.

  1. rasheed larney says:

    Hi Perry,

    I know that you are genuinely interested in finding a natural code and I agree that it would be an immensely important discovery. However, what if such a thing was logically impossible (not just technologically or even scientifically impossible)? I’ve been thinking about that and would really appreciate your thoughts.

    To elaborate on what I mean, I’ll copy my answer to a question on Quora: “What other coding system has existed without intelligent design? How did the DNA coding system arise without it being created?”
    I was motivated to post an answer mainly in response to someone else claiming that light waves, tree rings, and other natural phenomena are all codes (the common claim I know you’re all too familiar with).

    “What other coding system has existed without intelligent design?”

    All codes are made by intelligent agents. It is for this reason that Intelligent Design proponents argue that the genetic code must have been caused by an intelligent agent. In syllogistic form the argument goes like this:

    All codes are intelligently designed.
    The “genetic code” is a code.
    Therefore, the genetic code was intelligently designed.

    Opponents of ID, on the other hand, typically respond to this by resorting to any of 3 arguments:

    1. The origin of the genetic code is not a mystery. It was created by natural, undirected, non-intelligent processes (such as physico-chemical processes, random mutations, natural selection, evolution, etc).
    2. Not all codes are intelligently designed. There are many naturally occurring phenomena that also qualify as codes, and they are made by natural, non-intelligent causes.
    3. The genetic code is not a real, literal code. Its description as a code in scientific literature is metaphorical only.

    Unfortunately, none of these 3 objections against ID are very convincing.

    1. The origin of the genetic code is not a mystery
    Some people claim emphatically that the genetic code has a natural, non-intelligent origin. However, this claim is based purely on faith in the neo-Darwinian model, not on any scientific evidence. According to origin of life researchers (ie. the actual scientists who investigate this very question) we not only don’t know how the genetic code came to be, but we also haven’t made any significant progress in solving this problem since the code was “discovered”. So, the brute fact is that there is no known natural, non-intelligent cause for the genetic code:

    “Nevertheless, in a close analogy to the situation with theoretical approaches, we are unaware of any experiments that would have the potential to actually reconstruct the origin of coding, not even at the stage of serious planning.”

    “Summarizing the state of the art in the study of the code evolution, we cannot escape considerable skepticism. It seems that the two-pronged fundamental question: “why is the genetic code the way it is and how did it come to be?”, that was asked over 50 years ago, at the dawn of molecular biology, might remain pertinent even in another 50 years. Our consolation is that we cannot think of a more fundamental problem in biology.” (Eugene V. Koonin and Artem S. Novozhilov, Origin and evolution of the genetic code: the universal enigma. 2009)

    From the introduction of “Origin and Evolution of the Universal Genetic Code” in the Annual Review of Genetics (Nov 2017):

    “Why are the codon assignments what they are? In other words, why is it the case that, for instance, glycine is encoded by GGN codons rather than, say, CCN codons (the latter of which encode proline in the SGC)?” “Evolution of the code is intimately linked to the origin and evolution of the translation apparatus itself, and this is one of the most fundamental and hardest problems in all of biology.”

    From the concluding remarks of the same review:

    “Notwithstanding the complete transformation of biology that occurred over these decades, we do not seem to be much closer to the solution. Recalling the list of “why” questions we asked in the introduction, we find that it is hardly possible to answer any of them definitively.” (Eugene V. Koonin and Artem S. Novozhilov, “Origin and Evolution of the Universal Genetic Code”, Annual Review Of Genetics, Vol. 51:45-62 (Volume publication date November 2017))

    In light of this, anyone who claims categorically that the genetic code is explained by natural, physical, chemical processes carries the onus to prove that claim. That is to say the onus of proof doesn’t fall on anyone who denies the claim that the genetic code has a natural origin (or is agnostic towards it), but rather it falls squarely on those who make the claim that it DOES have a natural explanation.

    2. Not all codes are intelligently designed

    Some commenters attempt to show that many naturally occurring phenomena also qualify as codes, just like the genetic code. The suggestion being that those codes have natural, non-intelligent explanations, therefore there is no reason to suppose that the genetic code couldn’t also have a natural explanation. In another answer on this page, for example, there is a claim that lightwaves, sound waves, footprints, topography, tree rings, varves, etc. all “carry coded information”.

    The problem with such claims of naturally occurring codes, however (apart from the fact that no serious scientist would accept them), is that they are based on a glaring misunderstanding of what a code is.

    Obviously you can define the word “code” to mean anything you want so that random natural phenomena qualify as “codes”. However, that definition goes against the commonly accepted meaning of “code” according to any dictionary, as well as its definition according to semiotics and information theory. It also ignores the pertinent features of the genetic code that make us call it a code in the first place. So, let’s consider the meaning of “code”:

    What Is A Code? (from common use)

    Here are some dictionary definitions:

    A set of rules for converting information into another form or representation.

    An encoded representation of a character, symbol, or other entity.

    A system of words, letters, figures, or symbols used to represent others.

    A system of symbols (such as letters or numbers) used to represent assigned meanings.

    A system of symbols, and rules for their association by means of which information can be represented or communicated.

    So a code is a procedural system of symbols, and the rules/conventions related to their association. Eg. Morse code is a conventional system made up of multiple individual symbols (symbolic representations of alphabetical characters). By definition then, a crucial property of a code is that it is made up of symbols (symbolic representations with assigned meanings or functions).
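To make the "system of symbols plus rules" idea concrete, here is a minimal Python sketch that treats a small slice of Morse code as a lookup table. The names `MORSE`, `DECODE`, `encode`, and `decode` are hypothetical, chosen only for illustration, and only four letters are included.

```python
# Illustrative sketch: four letters of International Morse code as a lookup table.
# The table is the "convention"; encode/decode are the rules for applying it.
MORSE = {"S": "...", "O": "---", "E": ".", "T": "-"}

# Invert the table so signals can be mapped back to letters.
DECODE = {signal: letter for letter, signal in MORSE.items()}


def encode(text):
    """Map each letter to its Morse signal, separated by spaces."""
    return " ".join(MORSE[ch] for ch in text.upper())


def decode(signals):
    """Map space-separated Morse signals back to letters."""
    return "".join(DECODE[s] for s in signals.split())


print(encode("sos"))          # ... --- ...
print(decode("... --- ..."))  # SOS
```

Nothing in the dots and dashes themselves picks out a letter. The mapping lives entirely in the table, which is the sense of "convention" used in the definitions above.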

    But if a code is made up of symbols, then what is a symbol? To answer that, let’s refer to the field of semiotics.

    What Is A Symbol? (from Semiotics)

    Everything can be a sign of something else (ie. can be a signifier of something other than itself). Semiotics (or semiology) is the study of sign processes and meaning-making. Its co-founder, Charles Sanders Peirce, distinguished between 3 different kinds of signs:


    An Iconic sign resembles the thing that it signifies, eg. you know that the smiley-face emoji represents “happy” because it resembles a happy face. You also know that the statue of Abraham Lincoln represents Abraham Lincoln because it resembles him.


    An Indexical sign (also called a Natural sign, or Sinsign) signifies in virtue of non-arbitrary existential facts (ie. natural causal relations) between the signifier and the thing that it signifies. Eg. the reason why we can interpret smoke as a sign of fire is because of the natural (physico-chemical) causal connection between smoke and fire. This causal relation between the fire and smoke is what makes the smoke an indicator (from index) of fire.


    A Symbolic sign (also called a Conventional sign, or Legisign) has a crucial signifying element that is not due to existential causal facts, but rather primarily due to convention, habit or rules. Eg. traffic lights, words, gestures, flags, logos, languages, etc. They signify in virtue of the conventions surrounding their use.

    There are no existential, physical, or chemical facts about a red light that could lead you to know that it means “stop”. The only reason you can interpret it to mean “stop” is because you are familiar with the convention or rule that says it means “stop”.

    Note also that there are no existential, physical, or chemical reasons why, instead of a red light, any other coloured light could not be a sign for “stop”. A blue light, for example, could also mean “stop”. So the choice of color is arbitrary in the sense that there is no physical, chemical, or any other natural process that causes red to work as a sign for “stop”, rather than some other color. The color red only becomes the sign for “stop” when it is arbitrarily selected (out of all possible colors) to signify “stop”. So, another crucial difference between a symbol and an index is that a symbol (and the convention surrounding its use) is arbitrary. By definition then, it is arbitrarily chosen to symbolise something else by the creative act of an intelligent agent. To put that another way, the statement “All symbols are intelligently designed” doesn’t require any empirical evidence because it is necessarily true. It is true by definition the same way the statement “All bachelors are single” is true by definition.

    Another important thing to clarify about a symbol is that the agent that interprets the symbol (the interpretant) does not necessarily have to be intelligent. For example, you can speak to your phone and ask Siri “What’s the weather like?” and Siri will respond appropriately upon “hearing” and “interpreting” your spoken words. When your computer gets automatic virus updates there is also a non-intelligent interpretant. Likewise when you press the “s” key on your keyboard the keycode is “interpreted” and mapped to ASCII in your OS and eventually, after many intervening steps, an “s” is displayed on your screen.
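The keyboard example can be sketched in a few lines of Python. This is a deliberate simplification: real keyboard handling goes through scancodes and layout tables before any character code is produced, and the helper names below are hypothetical. It only shows the last step, where a character and a number are linked by the ASCII convention.

```python
# Simplified sketch of the character <-> ASCII code mapping.
# Real keyboards emit scancodes first; that layer is omitted here.

def char_to_code(ch):
    """Return the ASCII/Unicode code point assigned to a character."""
    return ord(ch)


def code_to_bits(code):
    """Return the 7-bit binary form in which an ASCII code is stored."""
    return format(code, "07b")


def code_to_char(code):
    """Return the character assigned to a code point."""
    return chr(code)


print(char_to_code("s"))   # 115
print(code_to_bits(115))   # 1110011
print(code_to_char(115))   # s
```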

    Similarly, the sender of coded or symbolic information also does not necessarily have to be intelligent. You can imagine “it’s raining” being spoken by Siri. Auto-generated emails are another example.

    To clarify further, you (an intelligent interpretant) can interpret a red light from a traffic signal (a non-intelligent sender) to mean “stop” because you know the convention. But you can also invent a car (a non-intelligent interpretant) that is able to detect and “interpret” red lights and stop accordingly.

    The maker of a symbolic convention, however, is always intelligent, and in cases where the sender or interpretant is not intelligent, they are also always made by an intelligent agent. This is true not just as an inference based on observation (ie. because we’ve never yet seen a counter-example). As already shown, it’s true by definition. Creating a symbol involves the ability to perceive and distinguish things, to make representations of things by other things, to decide what things will be represented and what things they will be represented by, etc. The creation of not just a symbol, but a system of symbols, or code, requires an even greater degree of intelligence. As does the creation of non-intelligent agents that are able to send or interpret a symbol.

    So, what can we say about those alleged “naturally occurring codes” like lightwaves, sound waves, footprints, topography, tree rings, varves, etc?

    Firstly, there are no discernible systems of signs involved. Eg. the number of tree rings can be a sign of the age of a tree, but then it would just be one single sign, not a system of signs. This already disqualifies these natural phenomena from being codes because codes are systems.

    Also, insofar as any of these natural phenomena can signify something else (eg. lightwaves provide information about the source of the light, footprints provide information about what animals passed that area, tree rings provide information about the age of a tree, etc), they do so only because of the existential or causal connections between them and the things that they indicate. We know that a lightwave displaying the Doppler effect means that the source of the light is moving relative to us. But we only know this because of the causal relation between the physical properties of the wave and the physical properties of the light source.

    As another example, if you can’t find a station on the radio, then the sound waves will carry static noise to your ears. The static noise might indicate to you that there is no station because of the causal relation between not being tuned to a station and receiving static noise. In this sense the static noise is an index, but other than that it is meaningless. The static noise is not an arbitrarily selected symbol for “there’s no station”.

    On the other hand, if you do find a station and hear a DJ saying “It’s raining”, then the sound waves merely carry meaningful symbolic information to you, or to anyone else who understands the language (ie. the convention) spoken by the DJ. The spoken sounds “it’s raining” are the signifier, not the waves or the properties of the waves. In this case there is symbolic information because what the spoken sound “it’s raining” means depends on a convention, and is arbitrary. The laws of acoustics and the properties of air do not determine which words and sounds are spoken by the DJ, nor what they mean. Likewise, the laws of physics and the chemical properties of ink do not determine which sequence of letters from a-z (eg. k-a-n-g-a-r-o-o) will represent an Australian marsupial.

    To sum up then, lightwaves, sound waves, footprints, topography, tree rings, varves, and all other natural phenomena cited by commenters as examples of naturally created codes are actually all indexes. None of them are examples of codes, or even symbols. Furthermore, even if these various natural phenomena were examples of codes, then we would have to conclude that they were intelligently designed because code systems are made up of symbols, and symbols are, by definition, intelligently designed.

    “…there is NOTHING in the physico-chemical world that remotely resembles reactions being determined by a sequence and codes between sequences.” – (Hubert P. Yockey, Information Theory, Evolution, and the Origin of Life, 2005 Cambridge University Press).

    3. The genetic code is not a real, literal code

    The genetic code resides in the specific sequences of nucleotide bases (ie. codon assignments) that encode (or map to) specific amino acids.

    Now, in the English language a random sequence of letters A-T-C, for example, is meaningless (ie. it specifies nothing), unless you re-arrange the letters to form C-A-T, which then specifies a domesticated feline animal. Similarly, in DNA the specific arrangement of bases: Guanine then Thymine then Cytosine (GTC) “specifies” the amino acid Valine.
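As a rough illustration of what "codon assignment" means in practice, the sketch below stores a small excerpt of the standard genetic code as a Python dictionary, using DNA coding-strand triplets. Only valine, glycine, proline, and methionine are included, and the names `CODON_TABLE`, `amino_acid`, and `codons_for` are hypothetical.

```python
# Excerpt of the standard genetic code (DNA coding-strand triplets).
# Only a few of the 64 codons are listed; this is not a complete table.
CODON_TABLE = {
    "GTT": "Val", "GTC": "Val", "GTA": "Val", "GTG": "Val",
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "CCT": "Pro", "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",
    "ATG": "Met",
}


def amino_acid(codon):
    """Look up the amino acid for a codon; None if not in this excerpt."""
    return CODON_TABLE.get(codon.upper())


def codons_for(residue):
    """List every codon in the excerpt that maps to the given amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == residue)


print(amino_acid("GTC"))   # Val
print(codons_for("Val"))   # ['GTA', 'GTC', 'GTG', 'GTT']
```

The table is many-to-one: four different triplets map to valine. That is the redundancy the Yockey quotation later in this comment refers to.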

    So to determine if the specific sequences of nucleotide bases (ie. codon assignments) are indexical or symbolic we have to determine that either:

    (1) The codon assignments are the result of natural, physico-chemical causal processes or laws, and the genetic code is not a real, literal code. In this case the codon assignments are indexical.


    (2) The codon assignments are the result of an arbitrary convention, not caused by natural, physico-chemical processes or laws, and the genetic code is a real, literal code. In this case the codon assignments are symbolic.

    What is the evidence for (1)?

    As we’ve seen from origin of life research, there are currently no known physical or chemical processes or laws that determine which sequences of nucleotide bases will map to the amino acid Valine. The same is true of all the other amino acids. We don’t know why the four sequences GTT, GTC, GTA, GTG signify/map to/encode/represent Valine, as opposed to all other possible sequences. It seems we also have no way of knowing if we will ever find a physical/chemical explanation, or even if there actually is such a physical/chemical explanation.

    In fact, according to the foremost expert on bioinformatics, Hubert Yockey,

    “…there is NOTHING in the physico-chemical world that remotely resembles reactions being determined by a sequence and codes between sequences.” – (Hubert P. Yockey, Information Theory, Evolution, and the Origin of Life, 2005 Cambridge University Press).

    Yockey concludes from this that, therefore,

    “the process of the Origin of Life is possible but unknowable.”

    Since we have no natural explanation for why the codon assignments are what they are, we have to admit that there is no evidence for (1). This means that any claim that the genetic code is not a real, literal code (no matter who makes the claim and no matter how convincing their arguments are) is based on nothing but personal opinion because the only way to prove it is to show that the genetic code has a natural, non-intelligent origin, which begs the question. We may choose to believe that someday a natural, physico-chemical explanation will be found, but then we should admit that our belief is based on faith, not on scientific evidence.

    What is the evidence for (2)?

    A. The fact that there is no known physico-chemical causal explanation for the codon assignments (despite searching for more than 60 years), and that some researchers regard such an explanation as unknowable, strongly supports the conclusion that the codon assignments are the result of an arbitrary convention, and are not caused by natural, physico-chemical processes or laws.

    B. The fact that the genetic code meets all the rigorous requirements for being a literal code, and is scientifically defined as a code in information and communication theory, means that the codon assignments are symbolic and, therefore, the result of an arbitrary convention and not of natural, physico-chemical processes or laws.

    “Information, transcription, translation, code, redundancy, synonymous, messenger, editing, and proofreading are all appropriate terms in biology. They take their meaning from information theory and are *not* synonyms, metaphors, or analogies.” (Hubert Yockey, Information Theory, Evolution, and the Origin of Life, Cambridge University Press, 2005.)

    C. The fact that the genetic code resembles symbolic code in every crucial way is supporting evidence for (2). Refer to a DNA codon table and compare it to an ASCII code table, or to any symbolic code.

    D. The fact that the genetic code is described, written about, or defined, as a code universally and pervasively in the entire body of biological literature is more supporting evidence for (2).

    E. The fact that it is utterly impossible to describe or explain the genetic code, DNA, or processes involving the genetic code (eg. protein synthesis) in a full and meaningful way solely in terms of physico-chemical processes WITHOUT using terms and concepts from information/communication theory (which has symbolic representation at its foundation) is still more evidence for (2). Eg. from a well-known textbook “the Molecular Biology of the Gene” by Watson & Losick:

    “At the very heart of the Central Dogma is the concept of information transfer from the linear sequence of the four-letter alphabet of the polynucleotide chain into the 20-amino-acid language of the polypeptide chain…the translation of genetic information into amino acid sequences takes place on ribosomes…” (p 573) And from the section THREE RULES GOVERN THE GENETIC CODE (p 582): “The first rule holds that codons are read in a 5’ to 3’ direction…The second rule is that codons are nonoverlapping and the message contains no gaps…the final rule is that the message is translated in a fixed reading frame…”
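The three reading rules quoted from the textbook can be sketched as a short Python function. This is a toy model under stated assumptions: the input is a DNA coding-strand string written 5' to 3', the table is a five-entry excerpt of the standard code, and `translate` and its `frame` parameter are hypothetical names. Codons missing from the excerpt are shown as "?".

```python
# Toy model of the three reading rules: 5'->3' direction, non-overlapping
# triplets with no gaps, and a fixed reading frame.
CODON_TABLE = {
    "ATG": "Met", "GTC": "Val", "GGA": "Gly", "CCT": "Pro", "TAA": "Stop",
}


def translate(dna, frame=0):
    """Read a 5'->3' coding-strand sequence in non-overlapping triplets."""
    residues = []
    # Step by 3 from the chosen frame: no overlaps, no gaps.
    for i in range(frame, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        residue = CODON_TABLE.get(codon, "?")  # "?" = not in this excerpt
        if residue == "Stop":
            break
        residues.append(residue)
    return residues


seq = "ATGGTCGGACCTTAA"
print(translate(seq))           # ['Met', 'Val', 'Gly', 'Pro']
print(translate(seq, frame=1))  # ['?', '?', '?', '?']
```

Shifting the frame by one base changes every triplet, which is why the fixed reading frame matters. In a full table the shifted codons would map to different amino acids rather than to "?".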

    It seems then that the 3 typical objections:

    “The origin of the genetic code is not a mystery”

    “not all codes are intelligently designed”

    “the genetic code is not a real, literal code”

    do not refute the argument that the genetic code was intelligently designed.


    In conclusion, let’s revisit the original syllogistic form of the Intelligent Design argument:

    1. All codes are intelligently designed – (by inference from observation)
    2. The “genetic code” is a code – (by empirical evidence, Yockey)
    Therefore, the genetic code was intelligently designed.

    As we’ve seen, there are 2 ways to refute the syllogism in this form, ie. refute premise (1) and/or premise (2).

    In this form of the syllogism, premise (1) is true by inference from observation (ie. all codes that have been routinely and habitually observed so far, are intelligently designed). That’s why premise (1) can be more accurately stated as “All codes that we know the origin of are intelligently designed”. So, while this is a strong premise, refuting it isn’t logically impossible. That is to say, in so far as premise (1) is only considered as an inference from observation, it is logically possible for it to be refuted (eg. by empirical discovery of a code that is not intelligently designed).

    However, if we take into account what we’ve learned about the definition of a code, a stronger syllogism can be formulated:

    1. All symbols are intelligently designed – (by definition)
    2. All codes are systems of symbols – (by definition)
    3. Therefore, all codes are intelligently designed – (by definition)
    4. The “genetic code” is a code – (by empirical evidence, Yockey)
    Therefore, the genetic code was intelligently designed.

    In this form there is now only 1 way to refute the syllogism, ie. by disproving premise (4): The “genetic code” is a code. However, the ONLY way to disprove this is by discovering a physical, chemical, or other natural process that is the cause for the genetic code; but that is precisely what origin of life scientists have failed to do.

    • …and if someone can do that we will be more than happy to write the check. I appreciate the work you’ve taken to run through the options.

    • Srdan Klikovac says:

      When we find out which natural forces influence and regulate codes such as the genetic code, then we will find out how Life originated. Answering this question will require long and complicated scientific research. There are important shortcuts, but for now they can be discussed only with the best and most serious scientists.

