So the last part of this was a complete doozy, and I’d be impressed if anyone read past the middle part of it, and doubly impressed if you didn’t have a million Google/Wikipedia tabs open while reading through it. I had to get the boring stuff out of the way. Unfortunately, I realized today a flaw in my universe—less a logical flaw, more a constructive flaw.
In constructing the concept of information mutations, I mentioned that two types of mutation are possible: replacements and omissions. However, we can see quite clearly that even as (a special type of) self-replicator, we aren’t all just viroids floating in a warm biological soup right now. How could this gel with the lemma of complexity?
Erratum #1
Simply put, there exists a third type of mutation (how dare I, as a genomics student, forget it): addition (we can also call it, less formally, duplication). Much like its friends, we can model addition as a stochastic and context-free process, meaning simply that we expect it to occur randomly throughout the information set of the self-replicator. This can of course extend the information content of the self-replicator: see “ayy lmao” versus “ayy lmaooooo”. Of course, additions and other mutations are rarely stochastic and context-free in any real-world scenario (in the prior example the o is duplicated by the host on purpose, versus something like “ayy lllllmao”).
Also, since this was probably not formalized by me in Part III, we can assume that the mutation rate applies to a single mutation in the self-replicator’s information — if we were to consider it a string like “cattle”, we would expect a single mutation to add, remove, or replace one character. We can enhance this later on to include cases where this isn’t true (using a suitable probability distribution, like Poisson, for number of mutations).
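As a minimal sketch of the single-mutation assumption, here is one way to apply exactly one context-free mutation to a string like “cattle”. The function name `mutate_once`, the lowercase alphabet, and the uniform choice between the three mutation types are my own assumptions for illustration. Addition here inserts a random symbol rather than duplicating a neighbour.

```python
import random
import string

ALPHABET = string.ascii_lowercase  # assumed symbol set, purely illustrative


def mutate_once(info, rng):
    """Apply exactly one random addition, omission, or replacement to `info`."""
    kinds = ["addition", "omission", "replacement"]
    if not info:
        kinds = ["addition"]  # nothing to omit or replace in an empty string
    kind = rng.choice(kinds)  # uniform choice between types is an assumption

    if kind == "addition":
        pos = rng.randrange(len(info) + 1)  # any gap, including both ends
        return kind, info[:pos] + rng.choice(ALPHABET) + info[pos:]

    pos = rng.randrange(len(info))  # any existing character
    if kind == "omission":
        return kind, info[:pos] + info[pos + 1:]

    # replacement: swap in a different symbol so the string really changes
    options = [c for c in ALPHABET if c != info[pos]]
    return kind, info[:pos] + rng.choice(options) + info[pos + 1:]


rng = random.Random(0)
for _ in range(3):
    print(mutate_once("cattle", rng))
```

Because the position and the symbol are drawn without looking at the surrounding characters, this is context-free in the sense used above. Drawing several mutations per iteration (say, from a Poisson distribution) would just wrap this call in a loop.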
Erratum #2
Furthermore, in the prior post I axiomatically stated that replacement can cause speciation, but not omission. This is almost certainly false: much like a game of Tetris, removing information can certainly change the minimal representation needed (speciation). A good example would be the word “cattle”. If we were to remove “le” from the end, we would get “catt”. Most readers would assume that, with the last two characters omitted, the word we meant to convey was “cat”, thereby producing a significantly smaller minimal representation (although changing the meaning)!
This unfortunately complicates even the toy model used in the lemma of complexity given the cases of the proof.
To confuse you further, we can re-frame the lemma of complexity. While I didn’t formalize this (as a separate statement at least) in Part III, the lemma largely depends on the self-replicator reducing in complexity over time (through replication & propagation) to its minimal representation (below its minimal representation it will cease to be viable, or speciate into a new self-replicator with a smaller minimal representation). This is fairly intuitive given the toy model: if we assume omission must occur sometimes and no process can extend the information set of the self-replicator, its size over time (to infinity) must converge to the minimal representation. Or for the more mathy folks: if we assume a non-zero probability of omission mutation for a self-replicator for every iteration of its replication & propagation, then as the number of iterations tends to infinity, we anticipate that the size of the self-replicator will equal its minimal representation (the Kolmogorov complexity).
However, this doesn’t necessarily hold if we assume a third type of mutation (addition). In fact, if we assume that the type of mutation is also random we can observe the following (let’s assign variable probabilities):
Omission: p
Replacement: b
Addition: 1-(p+b)
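To make the three probabilities concrete, here is a small sketch that samples a mutation type with exactly these weights. The function name `sample_mutation_type` and the example values p = 0.5, b = 0.3 are placeholders I picked, not values the model prescribes.

```python
import random


def sample_mutation_type(p, b, rng):
    """Pick a type with P(omission)=p, P(replacement)=b, P(addition)=1-(p+b)."""
    if p < 0 or b < 0 or p + b > 1:
        raise ValueError("need p >= 0, b >= 0 and p + b <= 1")
    kinds = ["omission", "replacement", "addition"]
    weights = [p, b, 1 - (p + b)]
    return rng.choices(kinds, weights=weights, k=1)[0]


rng = random.Random(42)
counts = {"omission": 0, "replacement": 0, "addition": 0}
for _ in range(10_000):
    counts[sample_mutation_type(0.5, 0.3, rng)] += 1
print(counts)  # the counts should land near a 50/30/20 split
```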
One could probably argue that between the two mutagenic events for the self-replicator these probabilities are not constant, and all sorts of fun things. These writings are more of a basic exploration of the topic, and I may address these cases in further posts/papers depending on how bad my attention span is.
Let’s relax the one-host-one-replicator approach, and simply look at the fitness of a single self-replicator species. We can do this to save our lemma of complexity because, conceptually, we’re trying to understand relative fitness: how easy it is, with identical host and media conditions, for a single self-replicator to spread (as measured by the maximum number of hosts it can infect given infinite time). Therefore, mutations which lead to speciation are effectively dead-ends for our self-replicator, since these create entirely new species which may have a different relative fitness!
Using our existing toy model approach, we can observe that for any given iteration there are really four potential outcomes now (instead of the monotonic case discussed prior):
Speciation: This may occur regardless of mutation type, and is effectively a fitness dead-end. Per our current model, we can effectively consider speciation events equivalent to reaching inviability (let’s pretend for now species cannot re-converge, e.g. an addition creating a new species and a later omission converting it back to the same species).
The self-replicator is no worse off: This is the case for synonymous replacement (although in a competitive self-replicator landscape, this is of course not true for non-synonymous replacement). This is essentially a no-op in this toy model.
The self-replicator is worse off: worse off here means “approaching or at the point of inviability”. This is the case for all omission or replacement events by the prior post’s argument. Simply put, in cases of omission the self-replicator can become either:
inviable (when it is at the minimal representation)
more likely in the future to become inviable (when its complexity decreases and approaches the minimal representation, as we saw in the prior post)
The self-replicator is better off: better off here means “getting farther from the point of inviability”, which simply implies the information content of the self-replicator is moving away from its minimal representation (it is increasing in size). This is slightly non-intuitive, but is best analogized as redundancy: a more redundant self-replicator, in the absence of mutational context, will survive longer than a less redundant one. This is effectively why telomeres exist (I kid). In the case of non-synonymous addition, this is equal to speciation and covered in our first case. In synonymous addition, this improves the longevity of the self-replicator (assuming no medium/host preference for self-replicator size, which I will discuss later on).
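The good news is that because we can ignore speciation and no-worse-off outcomes (each is either a dead-end or a no-op for this iteration), we’ve collapsed our probabilities into simply two. Note that p is reused here for the chance of a worse-off outcome among the iterations that remain, not the raw omission probability from before:
Worse off: p (this occurs in synonymous omission or replacement)
Better off: 1-p (this occurs only in synonymous addition)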
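If you’d rather poke at this than take my word for it, the two-outcome game can be simulated directly. This is only a rough sketch under my own simplifications: every worse-off event removes one symbol, every better-off event adds one, and the replicator counts as inviable the moment its size reaches the minimal representation. The function name, the sizes, and p = 0.6 are made up for the example.

```python
import random


def iterations_until_inviable(size, min_rep, p_worse, rng, max_iters=100_000):
    """Count iterations until `size` first drops to `min_rep`.

    Returns None if that does not happen within `max_iters` iterations.
    """
    if size <= min_rep:
        raise ValueError("start above the minimal representation")
    for step in range(1, max_iters + 1):
        # worse off: lose one symbol; better off: gain one symbol
        size += -1 if rng.random() < p_worse else 1
        if size == min_rep:
            return step
    return None


rng = random.Random(7)
runs = [iterations_until_inviable(30, 10, 0.6, rng) for _ in range(200)]
finished = [r for r in runs if r is not None]
print(len(finished), "of", len(runs), "runs became inviable")
print("mean lifetime:", sum(finished) / len(finished))
```

Re-running with p_worse below 0.5 is a quick way to see some runs wander off upward and never come back within the iteration cap.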
This still sounds kind of hard, right?
Let’s reframe it again.
Imagine we’re flipping a coin instead, where if it is tails I pay you $1, and if it’s heads you pay me $1. With a fair coin, we can easily guess that the expected amount of money either of us makes by the end of the (infinite) game is $0. Assuming each flip is an independent outcome, we can use linearity of expectation to quickly determine that the expected value per turn is (for me) $-1 * 0.50 + $1 * 0.50 = $0, and given any number of turns the expected outcome is $0.
However, this changes if the coin is not fair. By the same process, if the coin is heads 51% of the time, the expected value per turn for me is actually $-1 * 0.49 + $1 * 0.51, or $0.02 of profit per turn. This does not guarantee a profit on every turn (I may lose money for a very long time due to shit luck), but if we stretch this game to infinity, what is my profit?
It is in fact infinity.
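The arithmetic is small enough to check by hand, but here it is as a tiny sketch anyway. The function name `expected_profit` and the `stake` parameter are mine. It just multiplies the per-turn expectation by the number of turns, which is all linearity of expectation is doing.

```python
def expected_profit(p_heads, turns, stake=1.0):
    """Expected profit for the player who wins `stake` on heads, loses it on tails."""
    per_turn = stake * p_heads - stake * (1 - p_heads)
    return per_turn * turns  # linearity of expectation across turns


for turns in (1, 100, 10_000):
    fair = expected_profit(0.50, turns)
    biased = expected_profit(0.51, turns)
    print(turns, fair, round(biased, 2))
```

The fair-coin column stays at zero no matter how many turns we play, while the 51% column grows by two cents per turn without bound.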
What’s useful is this is exactly our self-replicator’s mutational game per turn! We know by our prior argument that the minimal representation of the self-replicator sets a lower bound on viability (although in all honesty it doesn’t matter if it is 0 or 1,000,000 here). What we do know is three things:
1) Assuming our probability of synonymous omission per iteration is greater (by any amount actually) than the probability of synonymous addition, the self-replicator will eventually become non-viable.
2) In the same vein, assuming that the mutation rate is independent of the self-replicator (as we’ve previously argued, it is a function of the host and the medium, not the replicator), this implies that a larger minimal representation is relatively worse. While the point at which the self-replicator hits inviability in this case depends on the information content it starts with (e.g. the redundancy described above), for the same starting size it will, on average, last longer with a smaller minimal representation.
3) This is not the case if the probability of synonymous omission per iteration is less than that of synonymous addition. This is intuitive: if the probability of addition is greater than that of omission, we can quickly observe that the maximal size of our replicator is unbounded. However, it does not imply replicators with a larger minimal representation are more fit. In fact, due to ergodicity (which I can discuss later on), it is still likely (but not guaranteed) that a smaller minimal representation will be relatively more fit than a larger one. This is a whole statistical argument in itself, and applies if the two probabilities are equal as well.
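These three points line up with the textbook biased random walk with an absorbing barrier (the gambler’s ruin problem). As a rough sketch, treat the replicator’s size as a walk that shrinks by one symbol with probability p and grows by one with probability 1-p, and call the gap between its current size and its minimal representation the “slack”. The function names, the one-symbol step, and treating the replicator as inviable the instant it reaches the minimal representation are all my simplifications.

```python
def extinction_probability(p_worse, slack):
    """Chance the walk ever reaches the minimal representation.

    slack = current size - minimal representation (in symbols).
    """
    p_better = 1 - p_worse
    if p_worse >= p_better:
        return 1.0  # drifting toward the barrier, or no drift: it gets hit
    return (p_worse / p_better) ** slack


def expected_lifetime(p_worse, slack):
    """Expected iterations until the barrier is hit; finite only if p_worse > 1/2."""
    p_better = 1 - p_worse
    if p_worse <= p_better:
        return float("inf")
    return slack / (p_worse - p_better)


size = 100
for min_rep in (20, 80):
    slack = size - min_rep
    print(min_rep, expected_lifetime(0.55, slack), extinction_probability(0.45, slack))
```

For the same starting size, the replicator with the smaller minimal representation has more slack, so it lasts longer on average when omission dominates and is less likely to ever die out when addition dominates.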
From this we can finally rederive our lemma of complexity as a more specialized argument relating the probability of various mutational types:
The Lemma of Complexity: Assume an infinite number of hosts in a medium of finite and homogeneous transmissibility (that is, the number of potential hosts per iteration to propagate to is some finite number, and each host in the network has an equal number and random assortment of other hosts to propagate to). Given the combined host and medium mutation rate for a self-replicator (and assuming our handy context-free and stochastic properties), we can observe that if the probability of synonymous omission is strictly greater than the probability of synonymous addition, then the relative fitness (maximal spread) of a self-replicator is inversely related to the size of its minimal representation (the Kolmogorov complexity).
And that’s it for today.
Cheers,
Lily