http://www.perlmonks.org?node_id=182606


in reply to XML::Parser and Invalid XML

I have a bit of xml that uses numbers as elements.
No you don't. If it looks like that, it is not XML. It's a markup language of your own creation, unsupported by the world.

If you change your DTD to something that permits this:

<settings>
  <node item="1">Your Name</node>
  <node item="2">His Name</node>
  <node item="3">324324324</node>
</settings>
I think you'll be a lot happier, because then you can use the thousands of tools meant to be used with XML.
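
For what it's worth, here is a minimal sketch of producing that layout from a hash with numeric keys, using only core Perl. The helper names escape_xml and settings_to_xml are made up for illustration, and the string building is only meant to show the shape of the output; a real XML writer module would be the safer choice.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in XML
# text and in double-quoted attribute values.
sub escape_xml {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: the numeric key goes into an "item" attribute,
# so the element name itself ("node") stays a legal XML name.
sub settings_to_xml {
    my ($settings) = @_;
    my $xml = "<settings>\n";
    for my $item (sort { $a <=> $b } keys %$settings) {
        $xml .= sprintf qq{  <node item="%s">%s</node>\n},
            escape_xml($item), escape_xml($settings->{$item});
    }
    return $xml . "</settings>\n";
}

my %settings = (1 => 'Your Name', 2 => 'His Name', 3 => '324324324');
print settings_to_xml(\%settings);
```

The numbers survive as data, but they no longer have to pass as element names.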

-- Randal L. Schwartz, Perl hacker

Replies are listed 'Best First'.
Re: •Re: XML::Parser and Invalid XML
by mirod (Canon) on Jul 17, 2002 at 22:59 UTC

    The proper format might even be closer to:

    <settings>
      <your_name>Your Name</your_name>
      <his_name>His Name</his_name>
      <number>324324324</number>
    </settings>

    XML markup should be descriptive. 1, 2 and 3, whether in tag names or as the only piece of information in the node tags, are quite useless. And Perl will be just as happy to go through the keys of a hash as through the indexes of an array (BTW this comment is not directed to you merlyn ;--)

    --
    The Error Message is GOD - MJD

      I think I see part of the problem... the thing is, both the key and the value have meaning. In this case, I have a list of DID (Division ID) => Division Name, and in this file, DID => UID (Unique Member ID) and the reverse UID => DID to show what divisions a specific member is in. I'm just trying to store it. I've been using XML::Simple for a while now on other types of data, but it falls apart when the key is a number... :(



      "Weird things happen, get used to it."

      Flame ~ Lead Programmer: GMS

        As it says in the fine manual:
        Caveats

        Some care is required in creating data structures which will be passed to "XMLout()". Hash keys from the data structure will be encoded as either XML element names or attribute names. Therefore, you should use hash key names which conform to the relatively strict XML naming rules:

        Names in XML must begin with a letter. The remaining characters may be letters, digits, hyphens (-), underscores (_) or full stops (.). It is also allowable to include one colon (:) in an element name but this should only be used when working with namespaces - a facility well beyond the scope of XML::Simple.

        You can use other punctuation characters in hash values (just not in hash keys) however XML::Simple does not support dumping binary data.

        If you break these rules, the current implementation of "XMLout()" will simply emit non-compliant XML which will be rejected if you try to read it back in. (A later version of XML::Simple might take a more proactive approach).
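
As a rough illustration of those naming rules, the sketch below walks a data structure and lists the hash keys that would not make legal element or attribute names. The names is_safe_key and unsafe_keys are hypothetical, and the pattern is an ASCII-only approximation of the quoted rules (it rejects colons entirely), not the full XML Name production.

```perl
use strict;
use warnings;

# Assumption: ASCII-only approximation of the quoted rules. A key must
# start with a letter; the rest may be letters, digits, '.', '_' or '-'.
sub is_safe_key {
    my ($key) = @_;
    return $key =~ /\A[A-Za-z][A-Za-z0-9._-]*\z/ ? 1 : 0;
}

# Hypothetical helper: recursively collect the path of every hash key
# that fails is_safe_key(), descending into hashes and arrays.
sub unsafe_keys {
    my ($data, $path) = @_;
    $path = '' unless defined $path;
    my @bad;
    if (ref $data eq 'HASH') {
        for my $key (sort keys %$data) {
            push @bad, "$path/$key" unless is_safe_key($key);
            push @bad, unsafe_keys($data->{$key}, "$path/$key");
        }
    }
    elsif (ref $data eq 'ARRAY') {
        push @bad, unsafe_keys($_, $path) for @$data;
    }
    return @bad;
}

my $data = { uid => { 1 => [3, 4], 2 => [4, 3] }, did => { 6 => 1 } };
print "unsafe key: $_\n" for unsafe_keys($data);
```

Running a check like this before dumping would flag the numeric keys up front instead of producing output that cannot be read back in.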

        -- Randal L. Schwartz, Perl hacker

Re: •Re: XML::Parser and Invalid XML
by Flame (Deacon) on Jul 17, 2002 at 22:46 UTC
    As I said, I know it's breaking the rules, so it's technically not W3C XML, but it's what XML::Simple ends up outputting and calling XML, so I don't know what to do about it.

    I have a data structure like this:
    {
        'uid' => {
            '1' => ['3','4','5','6'],
            '2' => ['4','3','1','5'],
        },
        'did' => {
            '1' => '2',
            '3' => ['1','2'],
            '4' => ['1','2'],
            '5' => ['1','2'],
            '6' => '1',
        },
    }

    When I run XMLout() from XML::Simple on that structure, I get:
    <opt>
      <did 1="2" 6="1">
        <3>1</3>
        <3>2</3>
        <4>1</4>
        <4>2</4>
        <5>1</5>
        <5>2</5>
      </did>
      <uid>
        <1>3</1>
        <1>4</1>
        <1>5</1>
        <1>6</1>
        <2>4</2>
        <2>3</2>
        <2>1</2>
        <2>5</2>
      </uid>
    </opt>


    So how can I fix it? I'm open to just about anything here including alternative storage systems, but XMLin() won't read what XMLout() outputs. Is there a way to trick XML::Parser (or expat or whatever it's called) into thinking <1>234</1> is valid?



    "Weird things happen, get used to it."

    Flame ~ Lead Programmer: GMS

      I think you are using the wrong tool here. If you really want to dump this structure as XML you could use Data::DumpXML (and its friend Data::DumpXML::Parser).

      --
      The Error Message is GOD - MJD

        This looks like it'll do the trick. Thanks!



        "Weird things happen, get used to it."

        Flame ~ Lead Programmer: GMS

      This could be better, but you get the idea: just copy the hash, tacking a token onto every hash key that starts with a digit.

      use strict;
      use warnings;

      my %hash = qw(1 a 2 b 3 c);
      my $fixed_href = {};
      fix_hash(\%hash, $fixed_href);

      # Copy $old into $new, prefixing keys that start with a digit with "N".
      sub fix_hash {
          my ($old, $new) = @_;
          while (my ($key, $value) = each %$old) {
              (my $nkey = $key) =~ s/^(?=\d)/N/;
              if (ref($value) eq 'HASH') {
                  my $new_value = {};
                  $new->{$nkey} = $new_value;
                  fix_hash($value, $new_value);
              }
              else {
                  $new->{$nkey} = $value;
              }
          }
      }
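
Reading the data back needs the opposite step, so here is a minimal sketch of the round trip. The names rekey, add_prefix and strip_prefix are made up for illustration; the "N" token is the one used in the snippet above. Unlike that snippet, this version returns a new structure and also descends into arrays. One caveat: a key that genuinely looked like "N1" before prefixing would be stripped to "1" on the way back.

```perl
use strict;
use warnings;

# Hypothetical helper: return a deep copy of $data in which every hash
# key has been passed through the $rename callback.
sub rekey {
    my ($data, $rename) = @_;
    if (ref $data eq 'HASH') {
        my %copy;
        $copy{ $rename->($_) } = rekey($data->{$_}, $rename) for keys %$data;
        return \%copy;
    }
    if (ref $data eq 'ARRAY') {
        return [ map { rekey($_, $rename) } @$data ];
    }
    return $data;    # plain scalars are copied as-is
}

# Add "N" in front of keys that start with a digit, and take it off again.
sub add_prefix   { my ($k) = @_; $k =~ s/\A(?=\d)/N/;  return $k }
sub strip_prefix { my ($k) = @_; $k =~ s/\AN(?=\d)//; return $k }

my $original = { uid => { 1 => [3, 4, 5, 6] }, did => { 6 => 1 } };
my $safe     = rekey($original, \&add_prefix);    # keys are now legal names
my $restored = rekey($safe, \&strip_prefix);      # numeric keys are back

print join(',', sort keys %{ $safe->{uid} }), "\n";      # prints "N1"
print join(',', sort keys %{ $restored->{uid} }), "\n";  # prints "1"
```

The prefixed copy is what would be handed to XMLout(), and strip_prefix would be applied to whatever XMLin() returns.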