http://www.perlmonks.org?node_id=822448

mertserger has asked for the wisdom of the Perl Monks concerning the following question:

I have to maintain some Perl scripts used to validate data in dictionary entries written in XML. The validation scripts use XML::Twig. There is code which counts how many quotations are associated with a definition:
if ( my @qps = $elt->children('qp') ) {
    foreach my $qp (@qps) {
        $numOfQuots = $numOfQuots + scalar( $qp->children( qr/^(q|xr)$/ ) );
    }
}
However, what is really wanted is a count of quotations which are not suppressed; suppression is indicated by the q tag having an attribute supp='yes' on it. As I am fairly new to using XML::Twig, I am not sure how to rewrite the code above so that the count is of q elements without the supp attribute or where the supp attribute is not equal to 'yes'.

Replies are listed 'Best First'.
Re: XML::twig counting elements that don't have a certain attribute/value on them
by toolic (Bishop) on Feb 10, 2010 at 16:15 UTC
    Try this:
    use strict;
    use warnings;
    use XML::Twig;

    my $xmlStr = <<XML;
    <foo>
      <qp>
        <q supp="yes">hello</q>
        <q supp="no">bye</q>
        <xr supp="yes">later</xr>
      </qp>
    </foo>
    XML

    my $twig = XML::Twig->new();
    $twig->parse($xmlStr);
    my $elt = $twig->root();
    my @qps = $elt->children('qp');
    my $numOfQuotes = 0;
    for my $qp (@qps) {
        for my $q ($qp->children( qr/^(q|xr)$/ )) {
            $numOfQuotes++ if $q->att('supp') eq 'yes';
        }
    }
    print "numOfQuotes = $numOfQuotes\n";

    __END__
    numOfQuotes = 2

      Toolic

      I have modified your suggestion (I needed the ones that weren't suppressed), as this seemed to cope with the added requirement to test for other attributes and values. Thanks for the suggestion.

      My code is something like this:

      my $numOfQuots = 0;
      if ( my @qps = $elt->children('qp') ) {
          foreach my $qp (@qps) {
              foreach my $q ( $qp->children( qr/^(q|xr)$/ ) ) {
                  $numOfQuots++
                      if (   ( $q->att('supp')  ne 'yes' )
                          && ( $q->att('info')  ne 'yes' )
                          && ( $q->att('info')  ne 'info' )
                          && ( $q->att('info')  ne 'intro' )
                          && ( $q->att('style') ne 'IMPL' ) );
              }
          }
      }
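One thing to watch with the test above: att returns undef when an element lacks the attribute, so under use warnings each ne comparison on a missing attribute emits a "Use of uninitialized value" warning. The following is only a sketch of one way around that, assuming Perl 5.10 or later for the // operator. The helper name is_counted and the sample attribute hashes are made up for illustration and are not part of XML::Twig.

```perl
use strict;
use warnings;

# Hypothetical helper (not an XML::Twig method): decides whether a
# quotation with the given attributes should be included in the count.
sub is_counted {
    my ($atts) = @_;

    # Treat a missing attribute as an empty string, so the string
    # comparisons below never see undef and no warning is raised.
    my $supp  = $atts->{supp}  // '';
    my $info  = $atts->{info}  // '';
    my $style = $atts->{style} // '';

    return 0 if $supp eq 'yes';
    return 0 if $info eq 'yes' || $info eq 'info' || $info eq 'intro';
    return 0 if $style eq 'IMPL';
    return 1;
}

# Made-up attribute sets standing in for <q>/<xr> elements.
my @quotes = (
    { supp  => 'yes' },      # suppressed: skipped
    { supp  => 'no' },       # counted
    {},                      # no attributes at all: counted
    { info  => 'intro' },    # information quote: skipped
    { style => 'IMPL' },     # skipped
);

# grep in scalar context returns the number of matching items.
my $numOfQuots = grep { is_counted($_) } @quotes;
print "numOfQuots = $numOfQuots\n";
```

In the real script the hash would presumably come from the element itself, for example by building it from $q->att('supp'), $q->att('info') and $q->att('style'). Keeping the rule in one small function also makes it easier to add further attribute checks later.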

Re: XML::twig counting elements that don't have a certain attribute/value on them
by mirod (Canon) on Feb 10, 2010 at 16:13 UTC

    It's hard to check without the data, but you could do $qp->children( 'q[@supp != "yes"]'). Actually if all you want is the number of such children, you can use the children_count method instead of taking scalar ( $qp->children....

      $qp->children( 'q[@supp != "yes"]') worked great, thanks. I have just been told that as well as suppressed quotations the count should ignore information quotes, which are indicated by an attribute info="yes" on the <q> tag. How do you combine two conditions? Would $qp->children( 'q[@supp != "yes"][@info != "yes"]') work or is it done some other way?

        No, only one predicate is accepted at the moment, so you would have to write: $qp->children( 'q[@supp != "yes" and @info != "yes"]').

Re: XML::twig counting elements that don't have a certain attribute/value on them
by Jenda (Abbot) on Feb 12, 2010 at 10:02 UTC

    I know you've said you use XML::Twig, but ... this solution has the additional advantage that it doesn't keep the whole parsed document in memory:

    use strict;
    use warnings;
    no warnings 'uninitialized';
    use XML::Rules;

    my $xmlStr = <<XML;
    <foo>
      <qp name="foo">
        <q supp="yes">hello</q>
        <q supp="no">bye</q>
        <xr supp="yes">later</xr>
      </qp>
      <qp name="bar">
        <q supp="yes">bye</q>
        <xr supp="no">later</xr>
        <xr supp="no">sometime</xr>
      </qp>
    </foo>
    XML

    my $parser = XML::Rules->new(
        stripspaces => 7,
        rules => {
            'q,xr' => sub {
                return if $_[1]->{supp} eq 'yes' or $_[1]->{info} eq 'yes';
                return '+count' => 1;
            },
            qp => sub {
                printf "Definition '%s' has %d quotations.\n", $_[1]->{name}, $_[1]->{count}+0;
                return '+total' => $_[1]->{count}+0;
            },
            foo => sub { return $_[1]->{total} },
            _default => '',
        },
    );

    my $total = $parser->parse($xmlStr);
    print "The total count is $total\n";

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.