
XML:Simple read tag value with regex

by filipebean (Novice)
on May 07, 2013 at 12:54 UTC (#1032474)
filipebean has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I'm using XML::Simple to read a configuration from an XML file. The code works, but I now face a problem when it reads a regex that includes an XML tag. The XML file is below:

Config file:

<sourcetype>
  <name>G1</name>
  <desc>group 1 to decode</desc>
  <rules>
    <rule>['^\d(.)','14']</rule>
    <rule>['^(<xyz>)']</rule>
    <rule>['^(</xyz>)']</rule>
  </rules>
</sourcetype>


use XML::Simple;

# KeyAttr => [] turns off folding of nested elements into hashes keyed by name/key/id
my $xml  = XML::Simple->new( KeyAttr => [] );
my $data = $xml->XMLin( $config_file );

foreach my $sourcetype ( @{ $data->{sourcetypes}{sourcetype} } ) {
    print " " . $sourcetype->{name} . "\t\t" . $sourcetype->{desc} . "\n";
}

When I run the script it complains:

Opening and ending tag mismatch: xyz line 10 and rule
Opening and ending tag mismatch: rule line 11 and rule
 at /XML/LibXML/ line 64
 at /5.8.4/XML/ line 362

Is it possible to have a regex such as '^(<xyz>)' as a tag value and read it as a string?

Thank you in advance, Best regards.

Replies are listed 'Best First'.
Re: XML:Simple read tag value with regex
by toolic (Bishop) on May 07, 2013 at 13:04 UTC
    Your Perl code is probably fine, but that is not valid XML syntax. Use XML entity references (something like this):

    <rule>['^(&lt;xyz&gt;)']</rule>
    <rule>['^(&lt;/xyz&gt;)']</rule>

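If the rules are generated from Perl rather than typed by hand, the escaping can be automated. A minimal sketch, assuming a hypothetical helper named xml_escape (it is not part of XML::Simple) that maps the three special characters to their predefined entities:

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only.
# Maps each XML special character to its predefined entity.
my %entity = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' );

sub xml_escape {
    my ($text) = @_;
    # One pass over the text, so the "&" of a freshly inserted
    # entity is never escaped a second time.
    $text =~ s/([&<>])/$entity{$1}/g;
    return $text;
}

print "<rule>", xml_escape(q{['^(<xyz>)']}), "</rule>\n";
```

Doing the substitution in a single pass avoids double-escaping the & of entities that were just inserted.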
Re: XML:Simple read tag value with regex
by mirod (Canon) on May 07, 2013 at 13:19 UTC

    As said before, you can escape the < by using &lt; or you can use CDATA sections:

    <sourcetype>
      <name>G1</name>
      <desc>group 1 to decode</desc>
      <rules>
        <rule><![CDATA[['^\d(.)','14']]]></rule>
        <rule><![CDATA[['^(<xyz>)']]]></rule>
        <rule><![CDATA[['^(</xyz>)']]]></rule>
      </rules>
    </sourcetype>

    That's still ugly, but a little easier to read than the version with &lt;, and at least you can cut and paste code in the rule elements.

    The <![CDATA[...]]> construct prevents everything in the section from being parsed as XML. The content must still be valid text (Unicode by default) and must not include ]]>, but anything else is fine.
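The ]]> restriction only matters if a rule could ever contain that sequence. One common workaround is to end the section in the middle of the sequence and open a new one. A minimal sketch, assuming a hypothetical helper named cdata_wrap (not part of XML::Simple):

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only.
# Wraps text in a CDATA section; a literal "]]>" in the text is split
# so that "]]" ends one section and ">" starts the next.
sub cdata_wrap {
    my ($text) = @_;
    $text =~ s/]]>/]]]]><![CDATA[>/g;
    return '<![CDATA[' . $text . ']]>';
}

print "<rule>", cdata_wrap(q{['^(<xyz>)']}), "</rule>\n";
```

For the rules in this thread the text passes through unchanged, so the output matches the hand-written CDATA version above.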

Re: XML:Simple read tag value with regex
by kcott (Chancellor) on May 07, 2013 at 13:34 UTC

    G'day filipebean,

    There are a number of special characters that may need to be escaped in XML content: "<" to "&lt;"; ">" to "&gt;"; and "&" to "&amp;" (see XML Predefined Entities). So,

    <rule>['^(<xyz>)']</rule>

    would become

    <rule>['^(&lt;xyz&gt;)']</rule>

    Given the readability issues, it might be better to use CDATA blocks (see XML CDATA Sections) to escape the entire regex, e.g.

    <rule><![CDATA[['^(<xyz>)']]]></rule>

    [Note: I haven't investigated how XML::Simple interacts with these constructs.]
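A quick way to check how XML::Simple treats both constructs is to parse a small document from a string. The sketch below uses a shortened, illustrative version of the config; it assumes XML::Simple is installed from CPAN, and the ForceArray option is an addition that does not appear in the original code:

```perl
use strict;
use warnings;
use XML::Simple;    # CPAN module, not part of core Perl

# Illustrative config: one rule escaped with entities,
# one rule wrapped in a CDATA section.
my $config = <<'XML';
<sourcetype>
  <name>G1</name>
  <rules>
    <rule>['^(&lt;xyz&gt;)']</rule>
    <rule><![CDATA[['^(</xyz>)']]]></rule>
  </rules>
</sourcetype>
XML

# XMLin() parses its argument directly when it looks like XML.
# ForceArray keeps <rule> an array even when only one rule is present.
my $data = XMLin( $config, KeyAttr => [], ForceArray => ['rule'] );

# Each rule is printed as the parser returned it.
print "$_\n" for @{ $data->{rules}{rule} };
```

Both rules should come back as ordinary strings with the < and > characters restored, so the code that consumes them does not need to know which escaping style the file used.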

    -- Ken

      Hi all,

      Thanks for your answers. I tried CDATA as you suggested and it works perfectly :)

      best regards.
