
Using bytecode for object serialization

by tall_man (Parson)
on Sep 26, 2003 at 17:04 UTC (#294498=perlquestion)
tall_man has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to find a way to serialize large data-structure objects which makes them load more quickly than with Storable or Data::Dumper. I have a large perl data structure that does not change often. Here is what I tried:

First, I took Data::Dumper output and wrapped it in a subroutine.

    sub make_byteObject {
        # Trivial example -- the real one was about 50,000 lines.
        my $byteobj = bless( { 'one' => 1 }, 'byteObject' );
    }

Then I created bytecode for this routine using perlcc:

perlcc -B -o bytecode

To use the object, I invoked the bytecode with "do" and called the subroutine:

    do "bytecode";
    my $obj = make_byteObject();

In the example I tried, the byte-coded version loaded a tiny bit faster than the Data::Dumper version, but not significantly. Any suggestions to improve on this idea?

Replies are listed 'Best First'.
Re: Using bytecode for object serialization
by merlyn (Sage) on Sep 26, 2003 at 17:40 UTC
    Data::Dumper will be relatively slow to restore, because it has to first compile the Perl code, and then execute the code to allocate all the appropriate references and scalars.
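
The two-step restore described above can be sketched roughly as follows. This is only an illustration, not the original poster's code: the small structure is a hypothetical stand-in for the real 50,000-line one, and the `Terse` and `Indent` settings are one possible choice for compact output.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for the real (much larger) structure.
my $original = bless( { one => 1, list => [ 1, 2, 3 ] }, 'byteObject' );

# Dump: produce Perl source text that rebuilds the structure.
local $Data::Dumper::Terse  = 1;    # emit a bare expression, no '$VAR1 ='
local $Data::Dumper::Indent = 0;    # compact output, less text to parse
my $source = Dumper($original);

# Restore: perl first compiles $source, then runs it to allocate
# the hash, the array, and the blessed reference.
my $restored = eval $source;
die "restore failed: $@" if $@;

print ref($restored), " one=$restored->{one}\n";
```

Both the compile and the execute step happen inside the string `eval`, which is why the load cost grows with the size of the dumped text.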

    But I'm surprised you say that Storable isn't fast enough! That's doing just about as little as you can and still end up with Perl data structures. If you need something faster, perhaps you'll need to drop into C code that you can call from Inline::C or something.
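
For comparison, a minimal Storable round trip might look like the sketch below. The data and the temporary file are hypothetical; `nstore` and `retrieve` are standard Storable functions, and `nstore` is used here only so the file is portable across byte orders.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempfile);

# Hypothetical stand-in for the real structure.
my $original = bless( { one => 1, nested => { two => 2 } }, 'byteObject' );

# A throwaway file for the example; a real program would use a fixed path.
my ( $fh, $file ) = tempfile( UNLINK => 1 );
close $fh;

nstore( $original, $file );        # write once, in network byte order
my $restored = retrieve($file);    # read back, no Perl source to compile

print ref($restored), " two=$restored->{nested}{two}\n";
```

Since the data rarely changes, the `nstore` call would normally live in a separate update script, and the program that needs the data would call only `retrieve`.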

    Also, if it's really that large, and yet loading time is important, perhaps you are loading it too often, and it belongs in a real database. Consider the use of DBD::SQLite to access just the parts of the data that you need for a particular program invocation.

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

Re: Using bytecode for object serialization
by Ovid (Cardinal) on Sep 26, 2003 at 17:31 UTC

    Two options spring to mind. One is to use YAML for your serialization. It's fast and relatively safe. The other option would be to try Pixie. You would create an object wrapper around your data structure and use Pixie to store it in a database.

        use Pixie;
        my $pixie = Pixie->new->connect(@connection_paramters);
        my $cookie = $pixie->insert($any_object);

    At that point, you just cache your cookie however you want. Later, when you want the object back, retrieve the cookie and...


    You can also bind names to your objects and fetch by name, if you prefer. I don't know if that would be faster, but it can present a clean interface.


    New address of my CGI Course.

      I tried YAML just now and I was very disappointed by it. Not only did it take 14 seconds to load what Data::Dumper can do in .3 seconds, but it is buggy as well. It died on the hash key below until I supplied my own quotation marks around it!
      Trade Route (Eastern Range, Yellow):

      Update: Storable beats everything else so far, among methods that pull the whole data structure into memory:

      Storable: .17 secs
      Data::Dumper, byte encoded: .31 secs
      Data::Dumper: .32 secs
      YAML: 14.2 secs
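
Load timings of this kind could be measured with something like the sketch below. It is not the script that produced the numbers above: the test data is invented, only the Storable and Data::Dumper cases are covered, and the results will differ by machine and by data shape.

```perl
use strict;
use warnings;
use Data::Dumper;
use Storable qw(freeze thaw);
use Time::HiRes qw(time);

# Hypothetical test data; the thread's real structure is not shown.
my %data = map { ( "key$_" => { id => $_, tags => [ 1 .. 5 ] } ) } 1 .. 2000;

# Serialize once with each method, outside the timed region.
my $frozen = freeze( \%data );
my $dumped = do {
    local $Data::Dumper::Terse  = 1;    # bare expression, no '$VAR1 ='
    local $Data::Dumper::Indent = 0;    # compact output
    Dumper( \%data );
};

# Time only the load step, which is what the thread compares.
sub time_load {
    my ($loader) = @_;
    my $start    = time();
    my $result   = $loader->();
    return ( time() - $start, $result );
}

my ( $storable_secs, $from_storable ) = time_load( sub { thaw($frozen) } );
my ( $dumper_secs,   $from_dumper )   = time_load( sub { eval $dumped } );

printf "Storable:     %.4f secs\n", $storable_secs;
printf "Data::Dumper: %.4f secs\n", $dumper_secs;
```

A single run is noisy; repeating each load several times and averaging, or using the core Benchmark module, would give steadier figures.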

Re: Using bytecode for object serialization
by demerphq (Chancellor) on Sep 26, 2003 at 18:23 UTC

    Storable is going to be as fast as you can get, I think. Data::Dumper is a horribly inefficient representation. I'd be interested to see some benchmarks of the performance differences between Dumper, Storable and YAML.

    I'm betting Storable creams the other two big time.


      First they ignore you, then they laugh at you, then they fight you, then you win.
      -- Gandhi

      Although Data::Dumper is by far slower than Storable because it endeavors to dump data structures in human- and perl-readable form, it does offer an advantage in that if you are freezing the structures to a database and/or flat file and the data structure becomes corrupted for some reason, you can visually reconstruct them.

      This is not a hypothetical case. I have been involved in two such situations, one of them because one of the Oracle datatypes likes to strip trailing whitespace from a string. That is a very obscure but, in this case, important detail, and it was not discovered until the code was pushed to the production environment.

        Absolutely Data::Dumper has its place. I swear by it. (And am writing and have written several versions of different flavours, none as good :-) ) By horribly inefficient I meant the process of dumping and of parsing. The fact that it performs so well was surprising to me. The fact that it only took twice as long to load as Storable is a good comment on the quality of perl's parsing.

        However I'm willing to bet good money that the dump times showed Storable doing much better.

        Of course this doesn't address your point about Data::Dumper as a development and debugging tool in the slightest. There it is very useful indeed.



Re: Using bytecode for object serialization
by toma (Vicar) on Sep 28, 2003 at 07:41 UTC
    Since your data doesn't change very often, you could leave the program running so that it doesn't need to reload the data. In effect, you make a server that keeps the data loaded and provides the data to other applications.

    mod_perl is an example of such an approach using the Apache web server as the long-running program. The server retrieves the data at startup using Storable. Restart the server when the data changes.
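
The load-once idea can be sketched without a web server. In the example below the data file, the key, and its value are all hypothetical, and `handle_request` merely stands in for whatever a long-running server (mod_perl or otherwise) would call per request.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempfile);

# Hypothetical data file; in practice a separate job would write it.
my ( $fh, $file ) = tempfile( UNLINK => 1 );
close $fh;
nstore( { 'Trade Route (Eastern Range, Yellow)' => 'open' }, $file );

my $cache;             # lives as long as the process does
my $load_count = 0;    # for illustration only

sub get_data {
    unless ($cache) {
        $cache = retrieve($file);    # paid once per process, not per request
        $load_count++;
    }
    return $cache;
}

# Stand-in for a request handler in a long-running server.
sub handle_request {
    my ($key) = @_;
    return get_data()->{$key};
}

print handle_request('Trade Route (Eastern Range, Yellow)'), "\n" for 1 .. 3;
print "loads: $load_count\n";
```

Restarting the process clears `$cache`, which matches the advice above to restart the server when the data changes.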

    It should work perfectly the first time! - toma
