PerlMonks  

Re: Modify XML tags

by Anonymous Monk
on Nov 18, 2011 at 15:22 UTC ( #938859=note )


in reply to Modify XML tags

Hi. Is there a way to change all tags and attributes in XML to lower case?

Sure, here is a start:

#!/usr/bin/perl --
use strict;
use warnings;
use XML::Twig;

my $str = <<'EOF';
<NoTe>
  <To>
    <Person>Satan</Person>
  </To>
  <Beef><SaUsAGe>is Tasty</SaUsAGe></Beef>
</NoTe>
EOF

{
    my $t = XML::Twig->new(
        pretty_print                 => 'indented',
        force_end_tag_handlers_usage => 1,
        start_tag_handlers           => {
            _all_ => sub { $_->set_tag( lc $_->tag ); return },
        },
        end_tag_handlers => {
            _all_ => sub { $_->set_tag( lc $_->tag ); return },
        },
    );
    $t->parse($str);
    $t->flush();
}
__END__

(Faced XPath feature being case-insensitive)

Why is this a problem?

Re^2: Modify XML tags
by mirod (Canon) on Nov 19, 2011 at 11:27 UTC

    Nice! One of the few cases where it makes sense to use force_end_tag_handlers_usage, Bravo!

    I don't think you need the end_tag_handlers handler though, the start one should be enough. You could also flush at the end of the handler to save memory, if that's an issue (untested).

      Yup, start_tag_handlers is enough, but the flushing has to be done from end_tag_handler

      #!/usr/bin/perl --
      use strict;
      use warnings;
      use XML::Twig;

      my $str = <<'EOF';
      <NoTe KunG="FoO" ChOp="SuEy">
        <To KunG="FoO">
          <Person KunG="FoO">Satan</Person>
        </To>
        <Beef KunG="FoO"><SaUsAGe KunG="FoO">is Tasty</SaUsAGe></Beef>
      </NoTe>
      EOF

      {
          my $t = XML::Twig->new(
              pretty_print                 => 'indented',
              force_end_tag_handlers_usage => 1,
              start_tag_handlers           => {
                  _all_ => sub {
                      $_->set_tag( lc $_->tag );
                      if ( $_->has_atts ) {
                          my $atts = $_->atts;
                          $_->set_atts(
                              { map { lc( $_ ) => $atts->{$_} } keys %{ $atts } }
                          );
                      }
                      return;
                  },
              },
              end_tag_handlers => {
                  _all_ => sub { $_->flush; return },
              },
          );
          $t->parse($str);
          $t->flush();
      }
      __END__
      <note chop="SuEy" kung="FoO">
        <to kung="FoO">
          <person kung="FoO">Satan</person>
        </to>
        <beef kung="FoO">
          <sausage kung="FoO">is Tasty</sausage>
        </beef>
      </note>
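One thing to watch with the map over attribute names: if an element carries two attributes that differ only by case (say KunG and kung), lowercasing the keys silently drops one of them. A minimal sketch of a helper that does the same lowercasing but reports such collisions; lc_keys and its second argument are hypothetical names made up for this example, not part of XML::Twig:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not an XML::Twig method): lowercase the keys of an
# attribute hash and record any names that collide after lowercasing.
sub lc_keys {
    my ( $atts, $clashes ) = @_;
    my %out;
    for my $name ( sort keys %{$atts} ) {
        my $lc = lc $name;

        # Remember the clash if the caller passed an array ref to collect them.
        push @{$clashes}, $lc if $clashes && exists $out{$lc};
        $out{$lc} = $atts->{$name};    # the name sorted last wins
    }
    return \%out;
}

my @clashes;
my $atts = lc_keys( { KunG => 'FoO', kung => 'bar', ChOp => 'SuEy' }, \@clashes );
for my $name ( sort keys %{$atts} ) {
    print "$name=$atts->{$name}\n";
}
print "collisions: @clashes\n";
```

Inside the start tag handler this could presumably be called as $_->set_atts( lc_keys( $_->atts ) ), though that combination is untested here.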

      I don't think you need the end_tag_handlers handler though, the start one should be enough. You could also flush at the end of the handler to save memory, if that's an issue (untested).

      Flushing in the start_tag handler doubles the output (<note chop="SuEy" kung="FoO"></note>), but the end_tag_handlers => { _all_ ... } handler doesn't get called at all, so nothing gets flushed until the whole tree is parsed.

      Is this by design of end_tag_handlers?

Re^2: Modify XML tags
by elgato (Novice) on Nov 21, 2011 at 06:31 UTC
    The problem is that it's 10-50 MB per file, and there are CDATA sections too. And I need to transform the XML very fast.
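If building a tree is too slow for files of that size, one alternative (not what the replies above propose) is to skip the parser and rewrite the names with a regex pass that copies CDATA sections, comments and processing instructions through untouched. A rough sketch in core Perl only; lc_xml_names and lc_att_names are hypothetical names, the input is assumed to be well-formed, and a DOCTYPE with an internal subset is not handled. Namespace prefixes get lowercased along with the names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Lowercase attribute names inside the remainder of a start tag.
# Quoted values are matched whole, so their content is never changed.
sub lc_att_names {
    my ($rest) = @_;
    $rest =~ s{ ([^\s=<>"'/]+) (\s*=\s*) ( "[^"]*" | '[^']*' ) }{ lc($1) . $2 . $3 }gex;
    return $rest;
}

# Lowercase element and attribute names in a string of XML.
# This is a regex sketch, not a parser: it assumes well-formed input.
sub lc_xml_names {
    my ($xml) = @_;
    $xml =~ s{
        ( <!--.*?-->             # comment, copied as-is
        | <!\[CDATA\[.*?\]\]>    # CDATA section, copied as-is
        | <\?.*?\?>              # processing instruction or XML declaration
        | <![^>]*>               # simple DOCTYPE and similar declarations
        )
        |
        < (/?) ([^\s<>/]+) ( (?: [^<>"'] | "[^"]*" | '[^']*' )* ) >
    }{
        my ( $skip, $slash, $name, $rest ) = ( $1, $2, $3, $4 );
        defined $skip ? $skip : "<$slash" . lc($name) . lc_att_names($rest) . '>'
    }gsex;
    return $xml;
}

print lc_xml_names(q{<NoTe KunG="FoO"><![CDATA[<KeepMe AtT="1">]]><To/></NoTe>}), "\n";
```

For a 10-50 MB file, reading the whole file into one string and making a single pass like this should be workable memory-wise, but whether it is fast enough is something to measure on real data.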
