Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
You say you don't know if your numbers will fit into 16 bit, and using 32 bit per number simply doubles storage space.

May I point you to the fact that there is an option in pack/unpack to store numbers with built in compression? They're called "BER-compressed integers" (see perlfunc:pack — use the "w" template), and they use a few bytes as possible. They're slightly less efficient in storage on the maximum number, as only 7 bits per byte are significant. But on the plus side: they don't waste any bytes they don't need, so for not so extreme values, the waste is no more than with the fixed length representation, and often less.

Experimentally I've determined the following ranges for the byte count:

rangebyte count
0 - 1271
128 - 163832
16384 - 20971513
2097152 - 2684354554
268435456 - 4294967295 (?)5

The range for 5 bytes runs at least up to 4294967295, i.e. 2**32-1, but above that, something goes wrong, at least on Indigoperl 5.6.1/Win32. Suddenly, I get a value for 2**32 that doesn't make any sense: it takes 10 bytes, for a start, while 2**32-1 only takes 5. In theory, there's no upper limit for which you could use this kind of representation, so I have no idea on what's going on.

That problem aside, here's an extra way you can save on space: sort the numbers in ascending order, and store the difference between successing numbers — the first value being the smallest number itself, the difference from zero. The more numbers you have, the closer they'll be together, and that way, you might shave of a few bytes.


In reply to Re: Compressing a set of integers by bart
in thread Compressing a set of integers by toma

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others contemplating the Monastery: (6)
As of 2024-06-18 09:32 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?
    erzuuli‥ 🛈The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.