http://www.perlmonks.org?node_id=691376

I've been making quite a bit of use of stored data structures as of late. Sometimes storing to file as intermediate conversion results, sometimes storing the data in MySQL tables. The data must be generically recoverable from many different systems (with varying and uncontrollable module versions).

Initially I relied on Data::Dumper to serialize the structures. Later I came to prefer Storable (nstore_fd / fd_retrieve). Storable appeared to be a better solution. More compact serialized data, fast de-serialization, no messy eval statements. It was a good choice.
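
For readers who haven't used that pair of functions, here is a minimal sketch of an nstore_fd / fd_retrieve round trip through a temporary file. The sample data and the use of File::Temp are assumptions for illustration, not the original poster's code.

```perl
use strict;
use warnings;
use Storable qw(nstore_fd fd_retrieve);
use File::Temp qw(tempfile);

# Hypothetical sample data standing in for an intermediate conversion result.
my $data = { id => 42, tags => [ 'alpha', 'beta' ], meta => { ok => 1 } };

# Write the structure to an open filehandle in network byte order.
my ($fh, $filename) = tempfile(UNLINK => 1);
binmode $fh;
nstore_fd($data, $fh) or die "nstore_fd failed";
close $fh or die "close failed: $!";

# Read it back through a fresh filehandle.
open my $in, '<', $filename or die "cannot open $filename: $!";
binmode $in;
my $copy = fd_retrieve($in);
close $in;

print "id=$copy->{id} tags=@{ $copy->{tags} }\n";
```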

The other day I applied some updates to my dev machine. Looks like Storable was in the list of updates. Now the data serialized on my dev machine won't de-serialize on the testing machines.

"Storable binary image v2.7 more recent than I am (v2.6)" - bye bye stored data :(

What's the fracking point? What purpose does Storable serve if one can't rely on it to recover the data again? I mean the whole point of storing things is so that you can retrieve them again later for Pete's sake. As it is, I'm not completely committed to Storable and I can switch back to using Data::Dumper with minimum work but this could have easily been a data backup recovery nightmare at some point in the future.

Is Data::Dumper the only truly portable (core) way to serialize data structures?

Re: Burned by Storable
by grinder (Bishop) on Jun 11, 2008 at 06:28 UTC
    What purpose does Storable serve if one can't rely on it to recover the data again?

    Wow, you were using a really old version of Storable, and you upgraded to a not-quite-so-old version! The current version, 2.15, has various switches you can twiddle to configure just how much backward and forward compatibility you want.

    The situation you encountered was a frequent complaint from users, but no-one stepped up for a long time to fix the problem. Code speaks louder than words. I think it was only in the run-up to 5.10 that the issue was dealt with.

    The best Perl solution is to use Data::Serializer, which will allow you to switch the underlying serialisation mechanism used with minimum changes to the client code. And better contenders to the serialisation crown would have to include JSON and YAML, since implementations for both exist in other programming languages.
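
As a rough illustration of the JSON route, the sketch below round-trips a structure through text. It assumes JSON::PP (bundled with recent perls); the post above only says "JSON", so the choice of module and the sample data are assumptions.

```perl
use strict;
use warnings;
use JSON::PP ();

# canonical() sorts hash keys so the same data always yields the same text;
# pretty() adds indentation so the result is easy to inspect by eye.
my $json = JSON::PP->new->canonical(1)->pretty;

# Hypothetical sample data.
my $data = { name => 'example', sizes => [ 1, 2, 3 ], nested => { flag => 'yes' } };

my $text = $json->encode($data);   # plain text, readable from other languages
my $copy = $json->decode($text);   # back to a Perl structure

print $text;
print "round trip ok\n"
    if $copy->{sizes}[2] == 3 && $copy->{nested}{flag} eq 'yes';
```

Note that JSON only carries plain hashes, arrays and scalars; blessed objects and code references need extra handling.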

    • another intruder with the mooring in the heart of the Perl

      You had me going there for a minute :)

      I believe the v2.7 and v2.6 in the error message refer to the binary storage format used by Storable, and not the actual Storable module version. The version of Storable on my dev machine is 2.15 (according to CPAN, 2.18 is now the latest). The version of Storable on my failed test machine is 2.13. The Storable changelog doesn't say anything about the binary format version changes (except for binary format v2.5 in version 2.0).
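
The distinction between module version and binary format version can be checked directly. This is a small sketch that assumes a Storable recent enough to provide read_magic (older releases, such as the ones discussed here, may not have it); the sample data is made up.

```perl
use strict;
use warnings;
use Storable qw(nfreeze);

# Hypothetical sample data, frozen in network byte order.
my $image = nfreeze({ colour => 'red', count => 3 });

# read_magic() looks only at the header of the image and returns a hash
# describing it, or undef if the buffer does not look like a Storable image.
my $info = Storable::read_magic($image)
    or die "not a Storable image";

printf "module version:       %s\n", Storable->VERSION;
printf "image binary format:  %s\n", $info->{version};
printf "written in net order: %s\n", $info->{netorder} ? 'yes' : 'no';
```

Running this on each machine shows which binary format that machine writes, which is the number that appears in the "more recent than I am" message.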

      I don't think it's unreasonable to expect compatibility in a module between versions 2.13 and 2.15.

      Indeed, other than bug fixes, there doesn't appear to be anything in the change log (or reported bugs) that would prevent basic Storable use (I'm storing simple hashes, nothing funky) from being fully compatible all the way back to version 2.0. What's a guy to trust if he can't trust core?
Re: Burned by Storable
by Corion (Patriarch) on Jun 11, 2008 at 06:23 UTC

    I think your problem is discussed in the section "files from future versions of Storable" in the Storable documentation. Basically, there is no generic way to have guaranteed backward compatible generic (de)serialisation - SQLite also has the problem of compatibility between the SQLite versions 2 and 3. If you really, really want a "backward+forward compatible" way of serializing your data, ASCII is your only hope because if all else fails, you can edit it from within vi.
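
That documentation section describes the $Storable::accept_future_minor switch. A minimal sketch of using it on the reading side follows; it only helps if the reading machine's Storable is new enough to have the switch, and the sample data is an assumption.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# When true (the documented default), an image whose minor format version is
# newer than this Storable is still attempted, and Storable croaks only if it
# meets a data type it does not know. When false, it croaks straight away.
local $Storable::accept_future_minor = 1;

# Hypothetical sample data.
my $image = nfreeze({ plain => 'hash', list => [ 1, 2, 3 ] });

# Storable croaks on serious errors, so wrap the read in eval.
my $data = eval { thaw($image) };
if (!defined $data) {
    die "could not thaw image: " . ($@ || 'unknown error') . "\n";
}
print "thawed ", scalar(keys %$data), " keys\n";
```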

    As a general approach, you shouldn't upgrade your machines separately, at least not if you expect them to share their data.

      Basically, there is no generic way to have guaranteed backward compatible generic (de)serialisation
      Well, actually there are at least two ways: First, don't change the serialization format between minor versions. Second, keep the old code around -- it sure *used* to parse that old format. Storable is fast and compact, but I've almost sworn off using it because of this. Data::Dumper plus gzip should be good enough.
      Our only hope is to edit it with vi?? Please tell me that ain't so!
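
The "Data::Dumper plus gzip" suggestion can be sketched roughly as below. The use of IO::Compress::Gzip and IO::Uncompress::Gunzip, and the sample data, are assumptions; any gzip implementation would do.

```perl
use strict;
use warnings;
use Data::Dumper;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Hypothetical sample data.
my $data = { host => 'db1', ports => [ 3306, 3307 ], opts => { ssl => 1 } };

# Terse drops the "$VAR1 =" prefix; Sortkeys gives stable, diff-friendly output.
my $text = do {
    local $Data::Dumper::Terse    = 1;
    local $Data::Dumper::Indent   = 1;
    local $Data::Dumper::Sortkeys = 1;
    Dumper($data);
};

# Compress and decompress in memory.
my ($packed, $unpacked);
gzip(\$text => \$packed)       or die "gzip failed: $GzipError";
gunzip(\$packed => \$unpacked) or die "gunzip failed: $GunzipError";

# eval runs the text as Perl code, so only do this with data you trust.
my $copy = eval $unpacked;
die "eval failed: $@" if $@;

print "ports: @{ $copy->{ports} }\n";
```

The eval is the price of this approach: it is the "messy" part the original poster wanted to avoid, and it must never be fed untrusted input.
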
Re: Burned by Storable
by andreas1234567 (Vicar) on Jun 11, 2008 at 07:08 UTC
    I have had a similar experience with Storable. I used to freeze a data structure, store it as a blob in a (MySQL) table, then thaw it upon use. That worked all well and fine until either:
    • The Storable module was updated.
    • The application was migrated from a 32-bit to 64-bit server, or vice versa.
    • The application was moved to an operating system with a different endianness.
    Any of the above rendered the stored data unusable and meant the data had to be converted, all of which would have been completely unnecessary had we used a textual ASCII representation to begin with. Additionally, one cannot easily view the contents of the data when stored using a binary representation, which makes inspection, debugging and testing much more difficult than it should be.

    I ended up using String::Escape's hash2string and string2hash to (de)serialize the data and store it as text. Easy to use, understand, inspect and test, while completely platform-independent and probably not slower than any binary conversion.

    See also The Importance of Being Textual chapter of Eric S. Raymond's book The Art of Unix Programming:

    Text streams are a valuable universal format because they're easy for human beings to read, write, and edit without specialized tools. These formats are (or can be designed to be) transparent.

    Designing a textual protocol tends to future-proof your system.

    --
    No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]
      Did you use the option to store the data in network order? That should deal with 32/64 bit and endianness just fine.
        No I didn't, in fact I was not even aware of the option. I reckon it would probably take care of some, but not all, of the above challenges. Still, personally I would choose a textual format every time for the sake of transparency.
        --
        No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]
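
For the freeze-to-blob case discussed above, the network order option amounts to calling nfreeze instead of freeze. A minimal sketch with made-up data follows; the database step is left out. As the reply notes, this addresses byte order and integer size, not format version changes.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Hypothetical record that would be stored in a BLOB column.
my $record = { user => 'alice', visits => 17, langs => [ 'perl', 'c' ] };

# nfreeze() writes network byte order, so the image does not depend on the
# endianness or integer size of the machine that wrote it.
my $blob = nfreeze($record);

# thaw() reads the byte order from the image header, so the same call
# handles both native and network order images.
my $copy = thaw($blob);

print "user=$copy->{user} visits=$copy->{visits}\n";
```
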
Re: Burned by Storable
by Anonymous Monk on Jun 11, 2008 at 09:08 UTC
    What's the fracking point? What purpose does Storable serve if one can't rely on it to recover the data again?

    The point is to know the limitations of your tools and to keep your tools current.

    Fri May 17 22:48:59 BST 2002 Nicholas Clark <nick@ccl4.org>

        Version 2.0, binary format 2.5 (but writes format 2.4 on pre 5.7.3)

        The perl5 porters have decided to make sure that Storable still
        builds on pre-5.8 perls, and make the 5.8 version available on CPAN.
        The VERSION is now 2.0, and it passes all tests on 5.005_03, 5.6.1
        and 5.6.1 with threads. On 5.6.0 t/downgrade.t fails tests 34 and 37,
        due to a bug in 5.6.0 - upgrade to 5.6.1.

        Jarkko and I have collated the list of changes the perl5 porters
        have from the perl5 Changes file:

        - data features of upcoming perl 5.8.0 are supported: Unicode hash
          keys (Unicode hash values have been supported since Storable 1.0.1)
          and "restricted hashes" (readonly hashes and hash entries)
        - a newer version of perl can now be used to serialize data which
          is not supported in earlier perls: Storable will attempt to do the
          right thing for as long as possible, croaking only when safe data
          conversion simply isn't possible. Alternatively earlier perls can
          opt to have a lossy downgrade data instead of croaking
        - when built with perls pre 5.7.3 this Storable writes out files
          with binary format 2.4, the same format as Storable 1.0.8 onwards.
          This should mean that this Storable will inter-operate seamlessly
          with any Storable 1.0.8 or newer on perls pre 5.7.3
        - dclone() now works with empty string scalar objects
        - retrieving of large hashes is now more efficient
        - more routines autosplit out of the main module, so Storable should
          load slightly more quickly
        - better documentation
        - the internal context objects are now freed explicitly, rather than
          relying on thread or process exit
        - bugs fixed in debugging trace code affecting builds made with
          64 bit IVs
        - code tidy-ups to allow clean compiles with more warning options
          turned on
        - avoid problems with $@ getting corrupted on 5.005_03 if Carp
          wasn't already loaded
        - added &show_file_magic, so you can add to /etc/magic and teach
          Unix's file command about Storable files

        We plan to keep Storable on CPAN in sync with the Perl core, so if
        you encounter bugs or other problems building or using Storable,
        please let us know at perl5-porters@perl.org
        Patches welcome!
      My bad for not knowing the limitation of the tool.

      But I have no control over what version of the tool my application will encounter in the wild. I expect my application to run on any reasonably current Perl installation (old versions or not). I don't expect old versions of Storable to completely dump on a version change without some kind of backward compatibility armour built in from the get go. The fact that this is addressed in newer versions of Storable does not fix the already broken world that has been created around it.

      I know I will never trust Storable to use it again. If I could, I'd vote it off the 'Perl core' island.
        I will never trust Storable to use it again. If I could, I'd vote it off the 'Perl core' island.

        Heh! You know, many of the porters feel the same way. It took a tremendous amount of effort to get it compiling and testing perfectly on all the various platforms that Perl runs on. Compiler idiosyncrasies make it particularly difficult to get right each time. Changing a preprocessor macro to work around the damage in one compiler made it break in another.

        And it continues to soak up developer effort, as time goes by. On the other hand, it's nice to have a big hairy XS module lying around so people can point to and say "oh no! not again!" whenever someone calls for another large XS module to be pulled into the core.

        • another intruder with the mooring in the heart of the Perl

        But I have no control over what version of the tool my application will encounter in the wild.
        To a certain degree, this will be a problem with any module you use. Versions change, internals change, and incompatibilities creep in even when they aren't intended. The only way to be sure your application will work is to bundle specific versions of the modules it needs. This isn't so much a perl thing as a general issue with componentized software.

        In the case of core modules, you at least have a simpler target. You can test against the specific perl versions you want to support. But even so, distributions will modify them, people will install newer versions for features they need, etc.

        By the way, Storable rocks. It's massively faster than Data::Dumper, highly cross-platform when you store in network order, and as mentioned elsewhere the newer versions try hard to work across version and format changes. It was a great day for perl when we got a fast serialization mechanism in the core and I'd be very sorry to see it go.

        You can always identify the environment your application will run in. Take a look at CPAN: part of what it does is look at what modules are installed and in what versions. Depending on the config, it will upgrade or install other Perl packages as needed. If it did not, CPAN would be little more than a fancy file copy utility.
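
        One way an application could do such a check itself at startup is sketched below. The module list and minimum versions are hypothetical; use the versions you actually tested against.

```perl
use strict;
use warnings;

# Hypothetical minimum versions, for illustration only.
my %required = ( 'Storable' => '2.15', 'Data::Dumper' => '2.0' );

my @problems;
for my $module (sort keys %required) {
    # Turn Some::Module into Some/Module.pm so require can load it by path.
    (my $file = "$module.pm") =~ s{::}{/}g;
    if (!eval { require $file; 1 }) {
        push @problems, "$module is not installed";
        next;
    }
    # VERSION($min) dies if the loaded module is older than $min.
    if (!eval { $module->VERSION($required{$module}); 1 }) {
        my $have = $module->VERSION();
        push @problems, sprintf "%s %s is older than required %s",
            $module, defined $have ? $have : 'unknown', $required{$module};
    }
}

print @problems
    ? join("\n", @problems) . "\n"
    : "all required modules present\n";
```

        This only tells you what the local machine has; it says nothing about the machine that wrote the data, which is the harder half of the problem in this thread.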

        Now if you don't want to work with Storable, that is fine, but what happens when another module that you depend on has issues because of different features from version to version? Are you going to curse that module too or are you going to do something about it?

        Take a look at Module::Build and consider how it handles installing modules and maybe you can come up with a better solution than excommunicating Storable.