Using HTML::TreeBuilder to change the DOCTYPE declaration

by wfsp (Abbot)
on Nov 04, 2008 at 10:40 UTC

wfsp has asked for the wisdom of the Perl Monks concerning the following question:

The look_down method does not find the DOCTYPE declaration, although the declaration is present in Data::Dumper's output, as described in the docs. What's the best way to change it?
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
$Data::Dumper::Indent = 1;
$Data::Dumper::Sortkeys = 1;
use HTML::TreeBuilder;

my $content = do{local $/;<DATA>};
my $root = HTML::TreeBuilder->new_from_content($content);

print $root->tag, qq{\n};            # prints: html
print $root->{_decl}{_tag}, qq{\n};  # prints: ~declaration
print $root->{_decl}{text}, qq{\n};  # prints: DOCTYPE html PUBLIC "XHTML"

my $dec = $root->look_down(
  _tag => q{~declaration},
) or die qq{declaration not found\n};

$dec->splice_content(
  0, 1, q{<!DOCTYPE html PUBLIC "HTML4">},
) or die qq{splice failed\n};

print $root->as_HTML;

__DATA__
<!DOCTYPE html PUBLIC "XHTML">
<html>
<head><title>declaration</title></head>
<body><p>declaration</p></body>
</html>

Extract from the Data::Dumper output:

'_decl' => bless( {
  '_tag' => '~declaration',
  'text' => 'DOCTYPE html PUBLIC "XHTML"'
}, 'HTML::Element' ),

Replies are listed 'Best First'.
Re: Using HTML::TreeBuilder to change the DOCTYPE declaration
by ikegami (Pope) on Nov 04, 2008 at 11:20 UTC

    The doctype is stored as an attribute of the root element where the attribute name is "_decl" and the value is an HTML::Element object, so basically, you want

    my $ele = $root->look_down(
        _decl => ...was specified...,
    ) or die qq{declaration not found\n};
    my $dec = $ele->attr('_decl');

    Since look_down doesn't let us check whether an attribute was specified, we'll have to provide our own handler.

    my $ele = $root->look_down(
        sub { $_[0]->attr('_decl') }
    ) or die qq{declaration not found\n};
    my $dec = $ele->attr('_decl');

    But why use look_down at all? The only possible node it could return is the root node. The above code boils down to

    my $dec = $root->attr('_decl') or die qq{declaration not found\n};
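
A minimal sketch of the difference, assuming an HTML::TreeBuilder version that stores the declaration under `_decl` as in the Data::Dumper extract above. The sample markup and the variable names `$via_look_down` and `$via_attr` are illustrative only. look_down inspects the element and the nodes in its content list, so a declaration held as an attribute value is never visited:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

my $root = HTML::TreeBuilder->new_from_content(
    qq{<!DOCTYPE html PUBLIC "XHTML">\n}
  . qq{<html><head><title>declaration</title></head>}
  . qq{<body><p>declaration</p></body></html>}
);

# Searches the root and its descendants in the content list.
my $via_look_down = $root->look_down( _tag => q{~declaration} );

# Reads the attribute on the root directly.
my $via_attr = $root->attr('_decl');

printf qq{look_down: %s\n},
    defined $via_look_down ? q{found} : q{not found};
printf qq{attr:      %s\n},
    defined $via_attr ? $via_attr->attr('text') : q{not found};
```

If the installed version stores declarations differently, the two lookups may not behave as described here, so it is worth running the check once before relying on it.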

    Now that we have the declaration, let's move on to changing it. It makes no sense to use splice_content to modify attributes. attr is the proper method.

    $dec->attr(text => 'DOCTYPE html PUBLIC "HTML4"');

    Since the entire purpose is to replace the declaration, let's create a new declaration rather than dying if it's absent.

    $root->attr('_decl',
        HTML::Element->new('~declaration',
            text => 'DOCTYPE html PUBLIC "HTML4"',
        )
    );

    All together:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use HTML::TreeBuilder;

    my $content = do{local $/;<DATA>};
    my $root = HTML::TreeBuilder->new_from_content($content);

    $root->attr('_decl',
        HTML::Element->new('~declaration',
            text => 'DOCTYPE html PUBLIC "HTML4"',
        )
    );

    print $root->as_HTML;

    __DATA__
    <!DOCTYPE html PUBLIC "XHTML">
    <html>
    <head><title>declaration</title></head>
    <body><p>declaration</p></body>
    </html>

    Output:

    <!DOCTYPE html PUBLIC "HTML4">
    <html><head><title>declaration</title></head><body><p>declaration</body></html>
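
The two approaches in this reply (updating an existing declaration with attr, or attaching a new one) could be folded into one small helper. This is only a sketch: `set_doctype` is a hypothetical name and is not part of HTML::TreeBuilder or HTML::Element, and it assumes the declaration lives under the root's `_decl` attribute as shown in the question.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper (not a library function): set the DOCTYPE text on a
# tree, creating the declaration element if the root does not have one.
sub set_doctype {
    my ( $root, $text ) = @_;

    my $decl = $root->attr('_decl');
    if ( defined $decl ) {
        # A declaration is already stored: update its text in place.
        $decl->attr( text => $text );
    }
    else {
        # Nothing stored yet: attach a new declaration element to the root.
        $decl = HTML::Element->new( q{~declaration}, text => $text );
        $root->attr( '_decl', $decl );
    }
    return $decl;
}

my $root = HTML::TreeBuilder->new_from_content(
    qq{<!DOCTYPE html PUBLIC "XHTML">\n}
  . qq{<html><head><title>declaration</title></head>}
  . qq{<body><p>declaration</p></body></html>}
);

my $decl = set_doctype( $root, q{DOCTYPE html PUBLIC "HTML4"} );
print $decl->attr('text'), qq{\n};
print $root->as_HTML;
```

Updating in place keeps any other data already attached to the existing declaration element, while the fallback covers documents parsed without a DOCTYPE. How as_HTML renders the declaration may vary between HTML::Tree releases, so check the printed result on your own installation.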
