### Re^2: Data compression by 50% + : is it possible?

by LanX (Archbishop)
 on May 12, 2019 at 10:10 UTC (#1233638)

Hello Roboticus,

I just realized that we had the same ideas (me just later ;).

One difference is that you want to encode each of the 9 groups individually with 6 bits each (6*9 = 54 bits per line), while I'm encoding a whole line as a polynomial

Sum($g($i) * 50**$i) with $i = 0 .. 8

which needs 51 bits per line.
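
A minimal sketch of the base-50 idea, assuming each of the nine group values has already been mapped to an integer in 0..49 and that perl was built with 64-bit integers (`encode_line` and `decode_line` are made-up names for illustration, not code from this thread):

```perl
use strict;
use warnings;

# Pack nine group values (each assumed to be in 0..49) into one integer:
# n = Sum( $g[$i] * 50**$i ) for $i = 0 .. 8, evaluated with Horner's scheme.
sub encode_line {
    my @g = @_;
    my $n = 0;
    $n = $n * 50 + $g[$_] for reverse 0 .. $#g;
    return $n;
}

# Recover the group values by repeatedly taking the remainder modulo 50.
sub decode_line {
    my ($n, $count) = @_;
    my @g;
    for (1 .. $count) {
        my $r = $n % 50;
        push @g, $r;
        $n = ($n - $r) / 50;    # exact division, no rounding
    }
    return @g;
}

# Round trip for one hypothetical line of nine groups.
print join(' ', decode_line(encode_line(3, 49, 0, 17, 25, 8, 41, 2, 30), 9)), "\n";
```

The largest possible value, nine groups of 49, is 50**9 - 1 = 1953124999999999, which is below 2**51; that is where the 51 bits come from.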

But I don't understand why you say

>  That's not quite enough to crunch out half the space

The old encoding needs, on average, more than 14 bytes per line plus a newline.

That's > 15 * 8 = 120 bits

What am I missing?

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery | FootballPerl is like chess, only without the dice

Re^3: Data compression by 50% + : is it possible?
by roboticus (Chancellor) on May 12, 2019 at 12:09 UTC

LanX:

You're not missing anything that I know of. What I was basing my "not quite" phrasing on is the idea of using a single character to encode each group (@c) into a character, so it would use 9 characters (72 bits). Had I thought of just packing the required 51-bit records together, it would have been more than sufficient to get 50% compression, as the file would take 635 bytes to encode 100 records (sans newlines).

(The 51 bits came from: 50 different possibilities for each group of 10 in the inner loop (log2(50) == 5.64.. bits/group) * (9 groups) == 50.79 bits.)
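
The arithmetic can be checked with a few lines of Perl. This is only a sketch; `bits_needed` and `bytes_needed` are hypothetical helper names. It also suggests that the 635-byte figure corresponds to the unrounded 50.79 bits per line, whereas fixed 51-bit records would come to 638 bytes for 100 records:

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Information content in bits of $groups symbols with $symbols possibilities each.
sub bits_needed {
    my ($symbols, $groups) = @_;
    return $groups * log($symbols) / log(2);
}

# Whole bytes needed for $records records of $bits_per_record bits each.
sub bytes_needed {
    my ($bits_per_record, $records) = @_;
    return ceil($bits_per_record * $records / 8);
}

my $exact = bits_needed(50, 9);
printf "%.2f bits/line, %d whole bits\n", $exact, ceil($exact);
printf "100 records: %d bytes unrounded, %d bytes at 51 bits each\n",
    bytes_needed($exact, 100), bytes_needed(51, 100);
```

Either way the result is well under half of the roughly 1500 bytes that 100 lines of about 15 bytes take in the old encoding.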

...roboticus

When your only tool is a hammer, all problems look like your thumb.

>  using a single character to encode each group (@c) into a character, so it would use 9 characters (72 bits)

Oh I see, but I hope you are aware that your approach can easily be packed into 9*6=54 bits and is easier to code than mine.

At 7 bytes = 56 bits per line you already get more than 50% compression.

My approach would require modulo-50 calculations on 51-bit integers; I'm not sure how tricky this is on a 32-bit machine.
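
One way to sidestep the integer-width question is arbitrary-precision arithmetic. The following is a sketch using the core module Math::BigInt; the sub names are invented for illustration, and it trades speed for portability:

```perl
use strict;
use warnings;
use Math::BigInt;

# Same base-50 scheme, but with arbitrary-precision integers, so it does not
# depend on the native integer width. Returns the value as a decimal string.
sub encode_line_big {
    my @g = @_;
    my $n = Math::BigInt->new(0);
    $n->bmul(50)->badd($g[$_]) for reverse 0 .. $#g;
    return $n->bstr;
}

# Split the decimal string back into $count group values (each 0..49).
sub decode_line_big {
    my ($str, $count) = @_;
    my $n = Math::BigInt->new($str);
    my @g;
    for (1 .. $count) {
        my ($q, $r) = $n->copy->bdiv(50);   # list context: (quotient, remainder)
        push @g, $r->numify;
        $n = $q;
    }
    return @g;
}

print encode_line_big((49) x 9), "\n";
```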

So I'd rather "waste" 5 bits/line for a pragmatic solution.
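
The pragmatic 6-bits-per-group variant could look roughly like this. It is a sketch that assumes each group is already an integer in 0..49; `pack_line` and `unpack_line` are hypothetical names. The 54 bits are padded with two zero bits to fill 7 bytes:

```perl
use strict;
use warnings;

# Pack nine values (each 0..49, so each fits in 6 bits) into 7 bytes.
sub pack_line {
    my @g = @_;
    my $bits = join '', map { sprintf '%06b', $_ } @g;   # 9 * 6 = 54 bits
    return pack 'B56', $bits . '00';                     # pad to 56 bits
}

# Reverse the packing: read the first 54 bits and cut them into 6-bit groups.
sub unpack_line {
    my ($bytes) = @_;
    my $bits = unpack 'B54', $bytes;
    return map { oct "0b$_" } $bits =~ /(.{6})/g;
}

print length(pack_line(3, 49, 0, 17, 25, 8, 41, 2, 30)), " bytes per line\n";
```

No big-integer arithmetic is needed, only string and bit-string operations, at the cost of 5 bits per line compared with the 51-bit encoding.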

Btw: I'm reluctant to try implementing Huffman here, because the OP could probably change the parameters in his next post.

Cheers Rolf
