Human readable/writable serialization alternatives to YAML and XML ?

by mandog (Curate)
on Mar 06, 2009 at 20:43 UTC
mandog has asked for the wisdom of the Perl Monks concerning the following question:

What good human readable/writable data serialization options with good Perl support are there?

We're completely happy with Config::Simple and .ini files for the configuration info for our application.

Now we want to re-factor our Test::WWW::Mechanize tests so the test data is easier to follow, perhaps by separating test code from test data. I'm surprised that there isn't an obvious choice.

The popular serialization modules Storable and FreezeThaw are binary.

XML doesn't seem very human readable. There are 39 pages of results for the Google query "xml sucks". There is even a web site devoted to that assertion.

YAML looks a bit better, but the documentation is a bit thin. For example, YAML::Manual::Tutorial is a single line of text: "not yet been written." Worse, there are a few apparently serious, years-old bugs. This could be an easy opportunity for us to contribute documentation...

JSON::XS doesn't look bad. There seems to be a good bit of example code, but I'm not seeing how JSON is much more readable than Perl.

I'm beginning to think that for us Perl is actually more writeable/readable than XML, YAML, JSON, etc. Also, we don't have to learn something new to use Perl. Maybe we should put the test code in a module and move each sub that returns an array of test data into its own .t file. Simplifying somewhat and perhaps introducing syntax errors, our existing code is:

    my @forms_no_password            = forms_with_no_password($host);
    my @forms_to_add_agency          = forms_to_add_agency($host);
    my @other_group_of_related_forms = other_group_of_related_forms($host);

    submit_forms_ok( $mech, @forms_no_password );

    # ....

    # Each data sub returns a list of hashrefs, one per form to submit.
    sub forms_with_no_password {
        my $host  = shift;
        my @forms = (
            {
                'test_description' => "search_phrase='' category_search_form",
                'url'              => "http://$host/",
                'expected_content' => "Sorry, we don't have any program categories",
                'number'           => '1',
                'button'           => 'Search',
                'fields'           => {},
            },
            {
                'test_description' => "search_phrase='' agency_search_form",
                'url'              => "http://$host/",
                'expected_content' => "know of any agency whose name matches",
                'number'           => '2',
                'button'           => 'Search',
                'fields'           => {},
            },
        );
        return @forms;
    }

    # Fetch each form's page, submit the form, and check the response body.
    sub submit_forms_ok {
        my ( $mech, @forms ) = @_;
        my %params;
        for my $form (@forms) {
            $params{form_number} = $$form{number};
            $params{fields}      = $$form{fields};
            $params{button}      = $$form{button};
            $mech->get( $$form{url} );
            $mech->submit_form_ok( \%params, $$form{test_description} );
            $mech->content_contains( $$form{expected_content} );
        }
    }

Re: Human readable/writable serialization alternatives to YAML and XML ?
by bellaire (Hermit) on Mar 06, 2009 at 21:03 UTC
    FWIW, I think there's absolutely nothing wrong with just using Perl data structures as meta-data. If you are comfortable with the syntax, and you're just going to be converting it into Perl data structures anyway, why take the extra step? We use this approach quite commonly, and it's really very handy for things to be defined as structured hashrefs, listrefs, and scalars, especially since Perl is kind enough to let you use barewords to the left of => even with use strict.
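
A minimal sketch of the approach described above, assuming the test data lives in its own file as a plain Perl expression and is loaded with `do`. The file contents and the variable names are hypothetical; a temporary file is used only so the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical data file: a plain Perl expression that evaluates to an
# array ref of form descriptions. In a real suite this would be a file
# kept next to the .t files rather than a temp file.
my ($fh, $data_file) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} <<'END_DATA';
[
    {
        test_description => "search_phrase='' category_search_form",
        number           => 1,
        button           => 'Search',
        fields           => {},
    },
];
END_DATA
close $fh or die "close failed: $!";

# do FILE compiles and runs the file and returns its last value.
# On a parse error it returns undef and sets $@ (with a line number);
# if the file cannot be read it returns undef and sets $!.
my $forms = do $data_file;
die "Could not load $data_file: " . ($@ || $!) unless ref $forms eq 'ARRAY';

printf "%d form(s); first button: %s\n", scalar @$forms, $forms->[0]{button};
```

Note that `do` runs whatever code is in the file, so this only suits data files written by the developers themselves.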
Re: Human readable/writable serialization alternatives to YAML and XML ?
by perrin (Chancellor) on Mar 06, 2009 at 21:17 UTC
    I'm beginning to think that for us Perl is actually more writeable/readable than XML, YAML, JSON, etc.

    Agreed. Perl is the default choice. Although I find JSON (and to a lesser degree YAML) fairly easy to read, it's also easy to break when writing, and none of the parsers I tried gave feedback as useful as Perl's. My rule of thumb is that you should try to never use any tool that doesn't tell you which line number the problem is on.
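
To illustrate the line-number point, here is a rough sketch using the core JSON::PP module. It assumes the parser's error message contains the phrase "character offset N" (that is how JSON::PP errors read, as far as I know); `line_of_offset` is a hypothetical helper, not part of any module.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: convert a 0-based character offset into a
# 1-based line number by counting the newlines before it.
sub line_of_offset {
    my ($text, $offset) = @_;
    $offset = length $text if $offset > length $text;
    my $prefix = substr($text, 0, $offset);
    return 1 + ($prefix =~ tr/\n//);
}

# Deliberately broken: the comma after "Search" is missing.
my $text = <<'END_JSON';
{
  "number": 1,
  "button": "Search"
  "fields": {}
}
END_JSON

my $data = eval { JSON::PP->new->decode($text) };
if (!defined $data) {
    my $err = $@;
    # Assumption: the message includes "character offset N".
    if ($err =~ /character offset (\d+)/) {
        printf "JSON error near line %d: %s", line_of_offset($text, $1), $err;
    }
    else {
        print "JSON error: $err";
    }
}
```

The parser notices the problem where the next token starts, so the reported line is likely the one after the missing comma rather than the line that needs fixing.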

    However, I don't have the XML-phobia some people seem to have. It seems fine to me for config data. Verbose, but not that hard to deal with.

Re: Human readable/writable serialization alternatives to YAML and XML ?
by Your Mother (Canon) on Mar 06, 2009 at 21:41 UTC

    I agree with the sentiment and general comments above but consider config as altered by someone malicious or just not too quick or accidentally left over from a paste buffer...

    # 500 lines of config omitted...
    $blah = qx( rm -rf / }; #syntax error intentional for "safety."

    Config to me implies non-trusted users. The worst you get with malicious/wrong XML/JSON is a broken config load (YAML is a little deeper, you can include code but it doesn't execute the same as `eval` or `perl ...`). If your config is code... well, it's code, not really config anymore. The separation is perhaps arbitrary, so if it's developer only config, Perl might make sense. The XML/JSON/YAML stuff is easier to generate from input though and it can be shared across languages and applications. I find YAML easy to write by hand and JSON(::XS) is great for machine/speedy stuff. I used it for marshalling data in the value slots of a DB_File just yesterday. :)
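
A small sketch of that difference, using the core JSON::PP module. The config keys and values are made up for illustration: text that looks like code stays an inert string after parsing, and malformed input simply fails to load.

```perl
use strict;
use warnings;
use JSON::PP;

# A config value that looks like Perl code. To a JSON parser it is only text.
my $suspicious = '{ "blah": "system(q{echo pwned})" }';
my $config     = JSON::PP->new->decode($suspicious);
print "blah is the plain string: $config->{blah}\n";

# Malformed input (note the mismatched closing bracket) makes decode die,
# so the worst case is a config that does not load.
my $broken = '{ "blah": "oops" ]';
my $loaded = eval { JSON::PP->new->decode($broken) };
print "broken config rejected\n" unless defined $loaded;
```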

Re: Human readable/writable serialization alternatives to YAML and XML ?
by duelafn (Priest) on Mar 06, 2009 at 22:55 UTC

    The YAML Specification is independent of the YAML perl package.

    I use YAML for all my human-readable configuration files and have even used YAML for this specific purpose. Here's a sample test from the test suite (actions map directly to Test::WWW::Mechanize methods):

    - url: BLAH.cgi
      title_is: My Spiffy Page
      actions:
        - submit_form_ok:
            form_name: magic_form
            fields:
              mstat: M
              mtype: 99
        - contains:
            - Mbr First
            - Mbr Last
        - lacks:
            - Amt Due
        - submit_form_ok:
            form_name: magic_form
            fields:
              # delcd will still be 'A'
              mstat: ""
              amdue: 50
              chk_amdue: checked
        - contains:
            - Mbr First
            - Mbr Last
            - Amt Due
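
For readers wondering how data like this might be driven, here is a hedged sketch of a dispatcher. It is not the actual driver mentioned in this post; `MockMech`, `%method_for`, and `run_tests` are hypothetical names, and the mock object only records calls so the example runs without a web server. The data is written as the Perl structure a YAML loader would be expected to return for the first few actions.

```perl
use strict;
use warnings;

# Stand-in for Test::WWW::Mechanize: it only records which methods
# were called, in order.
package MockMech {
    sub new   { return bless { calls => [] }, shift }
    sub calls { return @{ $_[0]{calls} } }
    for my $name (qw(get_ok title_is submit_form_ok content_contains content_lacks)) {
        no strict 'refs';
        *{"MockMech::$name"} = sub {
            my $self = shift;
            push @{ $self->{calls} }, $name;
            return 1;
        };
    }
}

# Map the short action names used in the data to method names.
my %method_for = (
    submit_form_ok => 'submit_form_ok',
    contains       => 'content_contains',
    lacks          => 'content_lacks',
);

# The shape a YAML loader would be expected to produce for such a file.
my @tests = (
    {
        url      => 'BLAH.cgi',
        title_is => 'My Spiffy Page',
        actions  => [
            { submit_form_ok => { form_name => 'magic_form',
                                  fields    => { mstat => 'M', mtype => 99 } } },
            { contains => [ 'Mbr First', 'Mbr Last' ] },
            { lacks    => [ 'Amt Due' ] },
        ],
    },
);

sub run_tests {
    my ($mech, @tests) = @_;
    for my $test (@tests) {
        $mech->get_ok( $test->{url} );
        $mech->title_is( $test->{title_is} ) if defined $test->{title_is};
        for my $action ( @{ $test->{actions} } ) {
            my ($name, $arg) = %$action;    # each action has exactly one key
            my $method = $method_for{$name} or die "Unknown action '$name'";
            # A list of strings means one call per string.
            $mech->$method($_) for ref($arg) eq 'ARRAY' ? @$arg : ($arg);
        }
    }
}

my $mech = MockMech->new;
run_tests( $mech, @tests );
print join( ', ', $mech->calls ), "\n";
```

Keeping the action-to-method map explicit means a typo in the data file dies with the offending action name instead of calling an arbitrary method.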

    Update: My employer (National Financial Management) has granted me permission to modify and release the test driver for these files on CPAN. Watch for Test::WWW::Mechanize::YAML on CPAN!

    Update 2: Released Test::WWW::Mechanize::Driver on CPAN!

    Good Day,
        Dean

Re: Human readable/writable serialization alternatives to YAML and XML ?
by pileofrogs (Priest) on Mar 07, 2009 at 00:31 UTC

    If it needs to be human writable, I use ini files or similar. Otherwise I use YAML.

    I like YAML/JSON because it's taint safe. Anything that evals perl to unserialize a datastructure makes me nervous. That may or may not be a concern to you though.

    --Pileofrogs

Re: Human readable/writable serialization alternatives to YAML and XML ?
by Porculus (Hermit) on Mar 07, 2009 at 09:13 UTC

    Is XML all that bad? Sure, a lot of people have an irrational hatred of it, but there are lots of things that people hate, and that doesn't mean they're dreadful. One reason so many people criticise XML is that it's so widely used -- more people are familiar with XML than have even heard of YAML. (And, dare I say it, some of the people* extolling the virtues of JSON are trend-followers who regularly claim that whatever they learned about most recently is the bestest thing ever.)

    The advantage of XML, from the point of view of a human trying to work with it directly, is that you can get validating editors that will flag up any mistakes as you type them. Typo in a tag name? A good XML editor will flag it up and may even be able to fix it automatically. If you're using YAML, JSON, etc, then the first thing you'll know about it is when your program chokes on the invalid input.

    * I'm referring to the wider Internet here, not the people commenting in this thread -- all the people posting here so far are clearly familiar with all the options and are presenting informed opinions.

      The advantage of XML, ... is that you can get validating editors that will flag up any mistakes as you type them.

      You can view that the other way around. XML is so complex and verbose, that you need special, validating editors.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

      For pure serialization, XML starts looking like a bad choice compared to YAML and JSON, but if you want to do anything more than serialization, then XML starts winning ... the closer the data you are encoding comes to looking like a document, the more YAML and JSON lose.

      XML is better as a general purpose format and I tend to choose general solutions if what I am building has a lot of volatility in its requirements. I see JSON and YAML as a natural optimisation to be considered nearer the end of development versus considering them as architectural 'pillars' of a solution ... and yes, I would rather lose in terms of speed of development upfront and manage 'near future' requirements than use YAML/JSON because it's fractionally quicker than XML to work with.

      I think the main problem is using XML everywhere when its power will never be needed ... basic config files are the main culprit here.

Re: Human readable/writable serialization alternatives to YAML and XML ?
by JavaFan (Canon) on Mar 07, 2009 at 10:48 UTC
    Worse, there are a few apparently serious, years-old bugs.
    Ingy is currently rewriting the YAML implementation (in fact, he's backporting the Python port to Perl), which should fix a lot of the bugs. It's my understanding the rewrite is close to being finished (didn't he get a grant from TPF for this?).

    As for JSON being more readable, JSON is actually a subset of YAML - anything that is valid JSON is valid YAML as well (but not the other way around).

    If I want Perl to dump a structure which I want to read, I almost always use YAML - I find its output reasonably readable. For input, I prefer ini or apache-like config files. I use Config::General to read them. But people vary in what they find readable/writable, so don't expect a definite answer in this thread.

      Ingy is currently rewriting the YAML implementation (in fact, he's backporting the Python port to Perl), which should fix a lot of the bugs. It's my understanding the rewrite is close to being finished (didn't he get a grant from TPF for this?).

      He has a grant running indeed, and today's grant update does indicate that it's nearly finished, and should be tested by interested hackers.

      As for JSON being more readable, JSON is actually a subset of YAML - anything that is valid JSON is valid YAML as well (but not the other way around).

      Just for the sake of correctness: JSON is not a 100% subset of YAML. The author of the JSON::XS module researched that topic: http://search.cpan.org/~mlehmann/JSON-XS-2.232/XS.pm#JSON_and_YAML

        And the author of YAML said at a recent conference that there was indeed a slight difference between JSON and YAML, and that the YAML specification was recently relaxed to make JSON a proper subset of YAML.
Re: Human readable/writable serialization alternatives to YAML and XML ?
by andreas1234567 (Vicar) on Mar 09, 2009 at 07:31 UTC
    I'm not seeing how JSON is much more readable than Perl
    Readability is a matter of personal preference. I use JSON::XS extensively, mainly because it's a text-only human-readable format that is also platform and programming language independent (as pointed out by others). Using JSON I can easily create interfaces that (at least in theory) can be used to inter-operate with programs written in Non-Perl.
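
As a rough illustration of the interoperability point, the sketch below writes one of the original poster's form descriptions as JSON and reads it back. It uses the core JSON::PP module rather than JSON::XS so that it runs without extra installs; the `canonical`, `pretty`, `encode`, and `decode` calls should behave the same way with JSON::XS. The variable names are made up for the example.

```perl
use strict;
use warnings;
use JSON::PP;

my $form = {
    test_description => "search_phrase='' category_search_form",
    number           => 1,
    button           => 'Search',
    fields           => {},
};

# canonical sorts hash keys so the output is stable between runs;
# pretty adds newlines and indentation for human readers.
my $json_codec = JSON::PP->new->canonical->pretty;
my $json_text  = $json_codec->encode( [$form] );
print $json_text;

# Any JSON-capable program, in any language, could read this text back.
my $round_trip = $json_codec->decode($json_text);
print "round trip ok\n" if $round_trip->[0]{button} eq $form->{button};
```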
    --
    No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]
