Re^2: Data compression by 50% + : is it possible? by LanX (Archbishop)
on May 12, 2019 at 00:34 UTC
Supposing your input is correct and that it's truly random, then it should be possible to represent each line with ~ 7.356 bytes or ~ 59 bits.
You have 9 groups with 0-3 numbers in the range 2..9.
I.e. each group can be represented by a byte with at most 3 bits set.
There are only 93 = 56+28+8+1 such combinations possible (3, 2, 1 or 0 bits set out of 8).
With 9 independent groups: ln(93**9)/ln(256) = 7.35655366 bytes per line
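The count and the bound can be double-checked with a few lines of code. This is only a sketch, assuming every group is equally likely and the 9 groups are independent; the variable names are mine, not from the original code.

```python
from itertools import combinations
from math import log

NUMBERS = range(2, 10)    # the 8 possible values 2..9
GROUPS_PER_LINE = 9

# Every group is a subset of NUMBERS with 0 to 3 members.
group_states = [c for k in range(4) for c in combinations(NUMBERS, k)]
states = len(group_states)    # 1 + 8 + 28 + 56

# Information content of one line, assuming uniform and independent groups.
bytes_per_line = GROUPS_PER_LINE * log(states) / log(256)
bits_per_line = 8 * bytes_per_line

print(states, round(bytes_per_line, 3), round(bits_per_line, 1))
```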
At the moment you'll need ~2.5 characters per group, which results in ~22.5 chars per line: (56*3+28*2+8*1)/93 = 2.49
7.36 bytes is about one third of that. So even with a non-binary representation you should achieve your 50 percent or better.
This can only be improved if the combinations don't have the same likelihood.
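To show that the bound is actually reachable, here is a rough sketch that packs one line into 8 bytes by treating the 9 groups as digits of a base-93 number. The helper names (`pack_line`, `unpack_line`) and the sample line are made up for illustration; I don't know the real input format, so this assumes a line is already parsed into 9 tuples of numbers.

```python
from itertools import combinations

NUMBERS = range(2, 10)
# Fixed ordering of all 93 possible groups; a group's index is its code.
GROUPS = [c for k in range(4) for c in combinations(NUMBERS, k)]
INDEX = {g: i for i, g in enumerate(GROUPS)}
BASE = len(GROUPS)

def pack_line(groups):
    """Encode 9 groups (hypothetical parsed form) as 8 bytes."""
    value = 0
    for g in groups:
        value = value * BASE + INDEX[tuple(sorted(g))]
    return value.to_bytes(8, "big")    # 93**9 fits in 64 bits

def unpack_line(data, n_groups=9):
    """Inverse of pack_line: recover the groups in original order."""
    value = int.from_bytes(data, "big")
    out = []
    for _ in range(n_groups):
        value, i = divmod(value, BASE)
        out.append(GROUPS[i])
    return out[::-1]

# Made-up sample line, not taken from the original data.
line = [(2, 5, 9), (), (3,), (4, 7), (2,), (6, 8, 9), (), (5,), (3, 6)]
packed = pack_line(line)
print(len(packed), unpack_line(packed) == line)
```

Rounding up to whole bytes costs a bit (8 instead of ~7.36), but it is still well under half of ~22.5 characters.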
I don't wanna dig deeper because I don't trust your code and smell an XY problem here.
I just realised that you are forbidding consecutive numbers in your if condition. I.e. (2,3,9) is never possible.
This will change the math, but the approach is the same.
Roboticus said you need 15 chars on average. 7.4 bytes per line is just an upper bound, so 50% is easily reached.
Don't wanna calculate it again! This would need to be done programmatically.
(But I don't trust your code anyway ;)
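For what it's worth, the recount could look roughly like this. It is a sketch under my own reading of the condition, namely that a group is invalid as soon as any two of its numbers differ by exactly 1; if the real if condition does something else, `allowed` (a name I made up) has to be adjusted.

```python
from itertools import combinations
from math import log

def allowed(group):
    # Assumption: no two members of a group may be consecutive numbers.
    # combinations() yields sorted tuples, so checking neighbours suffices.
    return all(b - a > 1 for a, b in zip(group, group[1:]))

states = sum(
    1
    for k in range(4)
    for g in combinations(range(2, 10), k)
    if allowed(g)
)

# Same formula as before, with the reduced number of states per group.
bytes_per_line = 9 * log(states) / log(256)
print(states, round(bytes_per_line, 2))
```

Fewer allowed states per group means fewer bits per line, so the earlier figure stays an upper bound.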