http://www.perlmonks.org?node_id=883617

raybies has asked for the wisdom of the Perl Monks concerning the following question:

I've seen a number of monks deciding to convert from the Storable module to YAML. I've looked at a lot of documents now... um... Would someone help me with some pointers on which YAML module is best, and perhaps how the usage differs from the extremely simple usage of storable? (the simple Store/Retrieve is all I needed.)

Oh yeah, and thanks for all ye do. You're all amazing.

Replies are listed 'Best First'.
Re: Converting from Storable to YAML
by Your Mother (Archbishop) on Jan 21, 2011 at 21:38 UTC

    Probably you want YAML::XS (it's supposed to be the most correct), though about half the time I've been using the vanilla variety (YAML) without problems. DumpFile and LoadFile are the store and retrieve equivalents. The docs are good:

        my $yaml  = Dump [ 1..4 ];
        my $array = Load $yaml;

        # This module exports the functions "Dump", "Load",
        # "DumpFile" and "LoadFile". These functions are
        # intended to work exactly like "YAML.pm"'s
        # corresponding functions.
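    In the same vein, DumpFile and LoadFile drop in where store and retrieve were. A minimal sketch of the swap (the data and file names are just examples):

        use Storable qw(store retrieve);
        use YAML::XS qw(DumpFile LoadFile);

        my $data = { user => 'raybies', langs => [ 'perl', 'yaml' ] };

        # Storable style
        store( $data, 'data.sto' );
        my $from_storable = retrieve('data.sto');

        # YAML::XS equivalent
        DumpFile( 'data.yml', $data );
        my $from_yaml = LoadFile('data.yml');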

    Storable is a *fine* module. YAML is just easier to introspect and tweak for a human. If you don't have a compelling reason to switch, don't do it just in the name of fashion. Well, learning something new is always good too so...

Re: Converting from Storable to YAML
by thezip (Vicar) on Jan 21, 2011 at 21:41 UTC

    Better still is to use JSON::XS.

        use strict;
        use warnings;
        use JSON::XS;

        my $hash = {};
        # ... fill up the hashref with stuff ...

        my $json      = JSON::XS->new()->pretty();
        my $json_text = $json->encode($hash);

        open(my $ofh, '>', $outfilename) or die "Can't open $outfilename: $!";
        print $ofh $json_text;
        close $ofh;

    The stored format in the file is quite readable, and JSON::XS is uber-fast when compared to YAML and Storable.
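    Reading it back mirrors Storable's retrieve; a sketch of the decode side (reusing the same $outfilename):

        use JSON::XS;

        # Slurp the file and decode it back into a hashref.
        open(my $ifh, '<', $outfilename) or die "Can't open $outfilename: $!";
        my $json_text = do { local $/; <$ifh> };
        close $ifh;

        my $hash = JSON::XS->new()->decode($json_text);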


    What can be asserted without proof can be dismissed without proof. - Christopher Hitchens

      JSON and YAML fill two different needs.

      JSON is meant for data interchange between different languages, and can only handle the most basic data types (arrays, hashes, strings, numbers and undef, and only tree structures).

      YAML, on the other hand, also handles blessed references and self-referential constructs (i.e. arbitrary object graphs).

      No wonder JSON modules are faster than YAML modules: they do way less.

      When the feature set of JSON is sufficient, choosing JSON over YAML is a good idea (simpler format, easier to implement right, less chance of ambiguity). If not, not.
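      To make the blessed/self-referential point concrete, a small sketch (My::Node is made up for illustration, and the exact error text will vary):

          use YAML qw(Dump);
          use JSON::XS;

          # A blessed, self-referential structure.
          my $node = bless { name => 'root' }, 'My::Node';
          $node->{self} = $node;

          # YAML serializes it, tagging the class and using an
          # anchor/alias for the cycle.
          print Dump($node);

          # JSON::XS refuses by default: blessed refs need
          # allow_blessed/convert_blessed, and the cycle would blow
          # the nesting limit anyway.
          my $ok = eval { JSON::XS->new->encode($node); 1 };
          print "JSON::XS croaked: $@" unless $ok;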

      There's also JSYNC, which tries to combine the advantages of both formats. I don't know how mature it is today.

      JSON::XS--which I love--is harder to edit by hand, has less support for objects/etc, and is on par with Storable for speed. I think it can be faster for certain strings/sizes but I also recall Storable being better for others. Oh, and YAML::XS was not around when those benchmarks were done so it might well be on the same playing field since it's C beneath too.

      So, I recommend it too, but I personally tend to use it for machine cases and Ajax only, never for human-edited configuration, etc.

Re: Converting from Storable to YAML
by ELISHEVA (Prior) on Jan 22, 2011 at 17:46 UTC

    the simple Store/Retrieve is all I needed. -- Ah, the simple store/retrieve is not so simple.

    My personal preference between YAML, Storable, and JSON is YAML. As for which YAML module, I'm not fussy, so I use YAML::Any. That way I can use whatever YAML (XS, Old, or other) has been installed on the host system.
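    A sketch of what that looks like in practice (the %state data and file name are just examples; if I recall the API right, implementation() reports which backend was chosen):

        use YAML::Any qw(DumpFile LoadFile);

        my %state = ( tracks => 8, project => 'demo' );

        # DumpFile/LoadFile come from whichever backend YAML::Any
        # found (YAML::XS, YAML::Syck, YAML, ...).
        DumpFile( 'state.yml', \%state );
        my $state = LoadFile('state.yml');

        print YAML::Any->implementation, "\n";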

    YAML handles a much wider range of data structures than JSON, as wide as Storable, and it won't blow up on you if you just happen to have a circularity in your data.

    I prefer YAML over Storable, because YAML is human readable and machine/Perl version independent. The main issue with Storable is that the data is dumped exactly as Perl stores it in C, which means that it is machine specific and possibly Perl release specific. You can't assume that a file you dump on your machine will load properly on another machine.

    Since Storable dumps in a format efficient for the computer and not a person, its output is not at all easy to inspect, if you are trying to figure out what went wrong. Finally, it is a bit fussy to test dumped data (did I get what I expected?) in an automated fashion because small details about the way a variable is handled during the life of the program (namely whether it was used as a string, number or both) can change the way a number is dumped.

    Since JSON is also human readable, JSON is often pushed as the faster, more "hip" substitute for YAML, but there are some limitations that really make it suitable only for very simple kinds of data. Neither Storable nor YAML faces any of these limitations.

    • JSON blows up if there is a circular chain of references in the data it is trying to dump (or did as of June 2009). Both Storable and YAML handle circularities with grace.

    • There is no way to preserve the fact that two different hash keys store the same reference. Or more generally, it can't dump any sort of network graph.

      When you dump the hash you will dump the data attached to the reference twice. When you reload the hash, the keys will point to distinct objects. This can be a serious problem for any data structure or algorithm that is relying on reference equality. The algorithm will work one way before dumping and another way after dumping and reloading. Both YAML and Storable handle network graphs with grace.

    • Dumping objects is a pain. JSON can do it, but you have to have a TO_JSON method for each object you want to dump. Alternatively you can run JSON with the -convert_blessed_universally argument, define a generic object converter and name it UNIVERSAL::TO_JSON. Again, YAML and Storable handle this task with ease.

    • Loading objects back in is even worse. There is no syntax to store the class of a dumped blessed reference. All you have is the raw hash or array. If you are set on reloading your data back in as objects, your TO_JSON method will have to come up with some convention for identifying the class and you will have to write a custom loader that knows how to convert the array or hash back into a blessed object. It isn't a big deal to write, but it isn't "out of the box" and any roll-your-own convention can run into problems if the dumped data has to be shared widely at some point. Reloading objects complete with their blessings is built into both Storable and YAML.
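    A sketch of one such roll-your-own convention (the __class__ key and the My::Point package are made up for illustration):

        use strict;
        use warnings;
        use JSON;

        package My::Point;
        sub new     { my ($class, %args) = @_; bless { %args }, $class }
        # Called by JSON when convert_blessed is enabled.
        sub TO_JSON { my $self = shift; return { __class__ => ref($self), %$self } }

        package main;

        my $json = JSON->new->convert_blessed->pretty;
        my $text = $json->encode( My::Point->new( x => 1, y => 2 ) );

        # Custom loader: peel off the marker and restore the blessing.
        my $plain = $json->decode($text);
        my $class = delete $plain->{__class__};
        my $point = $class ? bless( $plain, $class ) : $plain;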

      This should be a tutorial++

      Thank you for such a clear tutorial !

      How does DBM::Deep compare to YAML::Any for saving hashes?

Re: Converting from Storable to YAML
by Khen1950fx (Canon) on Jan 22, 2011 at 14:46 UTC
    An easy way to determine which YAML module is the right one for you is to use YAML::Old. You'll find a YAML test shell, ysh, that will let you try out a particular YAML module in an interactive way. It's fast and easy to use.
Re: Converting from Storable to YAML
by gnosti (Chaplain) on Jan 23, 2011 at 07:28 UTC
    If you're emphasizing extremely simple, I'd suggest looking at YAML::Tiny, possibly via YAML::Any to allow for more capable backends in the future.

    I like YAML for serializing program state because I can read the output for debugging. Program users can send me their state files, and occasionally debug for themselves.

    YAML::Tiny, like JSON, cannot handle circular data structures, objects, or code references. That may constrain how you write your code. I ended up writing my own serializing code that converts objects to hashes with a 'class' field that YAML::Tiny can handle.
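    A rough sketch of that kind of conversion (the 'class' field name and the Audio::Track package are made up; the real code is more involved):

        use strict;
        use warnings;
        use YAML::Tiny;

        # Flatten a blessed hashref into plain data YAML::Tiny can dump.
        sub freeze_object {
            my ($obj) = @_;
            return { class => ref($obj), data => {%$obj} };
        }

        # Re-bless on the way back in.
        sub thaw_object {
            my ($plain) = @_;
            return bless( { %{ $plain->{data} } }, $plain->{class} );
        }

        my $track = bless { name => 'vocals', gain => 0.5 }, 'Audio::Track';
        YAML::Tiny->new( freeze_object($track) )->write('state.yml');

        my ($doc)    = @{ YAML::Tiny->read('state.yml') };
        my $restored = thaw_object($doc);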