Let's examine the properties of the sequence:
It is made up only of 1's, 2's, and 3's
We can prove this by contradiction. Because the sequence starts with '1', the only way a '4' can show up is as the first number in a pair (the "run" count, such as the 3 in "31", which reads "three 1s"). After all, that's the only way that the '2' and '3' ever show up. So for a '4' to appear, the previous term must contain a sub-sequence such as "1111", "2222", or "3333". Let's abstract these as "xxxx".
There are two ways "xxxx" can be placed in the sequence: at an even offset or at an odd offset. At an even offset, the first and third 'x's are counts; at an odd offset, the second and fourth 'x's are counts. Let's examine the even offset first. Here "xxxx" reads as "C1xC2x" with C1 = C2 = x, that is, two consecutive runs of the same value. You can't have "C1xC2x" in the sequence, because it should have been encoded as "(C1+C2)x". Similarly, at an odd offset, there must be a count before the first 'x' (we'll call it C1 again), which means we have "C1xxxx" in our sequence. Here the pairs are "C1x" and "xx", again two consecutive runs of the value x, and you can't have two counts in a row for the same value! The subsequence would have to be "(C1+x)x", followed by the remaining 'x', which is the count of the next run.
So there will never be four like values in a row, and thus '4' will never be in this sequence. (You can prove that '2' and '3' WILL be in the sequence.)
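The argument above can also be checked empirically. The following is a minimal sketch, not a proof: it generates the first terms starting from "1" and confirms that only the digits 1, 2, and 3 appear and that no run is longer than three. The names next_term, first_terms, and longest_run are made up for this illustration.

```python
from itertools import groupby


def next_term(term: str) -> str:
    """Encode each run of equal digits as its count followed by the digit."""
    return "".join(f"{len(list(run))}{digit}" for digit, run in groupby(term))


def first_terms(n: int, seed: str = "1") -> list[str]:
    """Return the first n terms of the sequence, starting from seed."""
    result = []
    term = seed
    for _ in range(n):
        result.append(term)
        term = next_term(term)
    return result


def longest_run(term: str) -> int:
    """Length of the longest run of equal digits in term."""
    return max(len(list(run)) for _, run in groupby(term))


# Empirical check over the first 30 terms (a sanity check, not a proof).
for t in first_terms(30):
    assert set(t) <= set("123")
    assert longest_run(t) <= 3
```

Checking 30 terms is an arbitrary cutoff; the proof above is what covers every term.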
Size tradeoffs are minimal
Every span of like values results in a two-character sequence (count and value). 1 value in a row results in a gain of 1 character, 2 values in a row results in no change, and 3 values in a row results in a loss of 1 character. So the change in length from one term to the next is the number of singlets minus the number of triplets. What remains is to examine the sequence and show that triplets are far less common than singlets and couplets.
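To look at how often each run length occurs, a rough sketch like the one below tallies singlets, couplets, and triplets in a term and derives the length change from that tally. The helper names run_length_counts and length_change are hypothetical, and the code only measures; it does not settle how the counts behave in the long run.

```python
from collections import Counter
from itertools import groupby


def next_term(term: str) -> str:
    """Encode each run of equal digits as its count followed by the digit."""
    return "".join(f"{len(list(run))}{digit}" for digit, run in groupby(term))


def run_length_counts(term: str) -> Counter:
    """Map run length (1, 2, or 3) to how many runs of that length occur."""
    return Counter(len(list(run)) for _, run in groupby(term))


def length_change(term: str) -> int:
    """Characters gained when encoding term: each run of length L adds 2 - L."""
    return sum((2 - length) * n for length, n in run_length_counts(term).items())


# Walk the first 25 terms and confirm the bookkeeping matches the real lengths.
term = "1"
for _ in range(25):
    following = next_term(term)
    assert len(following) - len(term) == length_change(term)
    term = following

# Tally for the final term reached by the loop above.
counts = run_length_counts(term)
print("singlets:", counts[1], "couplets:", counts[2], "triplets:", counts[3])
```

Because no run is longer than three, length_change reduces to singlets minus triplets, which is the quantity the paragraph above asks about.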
P.L., P.M., P.O.D, X.S.
How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart