PerlMonks  

Re^2: read a file and insert closing tags if not present

by valavanp (Curate)
on Mar 29, 2007 at 07:32 UTC (#607176)


in reply to Re: read a file and insert closing tags if not present
in thread read a file and insert closing tags if not present

Hi GrandFather, this is the code I tried:

require HTML::TokeParser;
$p = HTML::TokeParser->new("output.xml") || die "Can't open: $!";
$p->empty_element_tags(1);
open(FH, "output.xml");
print FH $p;
close FH;

output.xml
<greeting class="simple">Hello, world!
The file above is a sample in which I tried to insert the closing tag for the greeting element. My actual file contains 500 lines of tagged text. For example, it has a tag named <to> that is never closed, and I have to insert the closing tag. Thanks for your suggestion.


Re^3: read a file and insert closing tags if not present
by f00li5h (Chaplain) on Mar 29, 2007 at 07:46 UTC

    You can sometimes guess, but in general there is no way of knowing where the closing tag belongs.

    In the example <p> foo <p> bar, you can see where the </p>s should go, because p tags can't be nested. But if you have <span style="rly">Oh, rly<span style="ya">ya, rly, there is no real way of knowing where the </span>s should go, because spans can legally be nested.

    You'll most likely have to write rules for how (and where) to end each tag, so that you don't mess up the nesting (for example, ending up with your whole document inside an <a href="foo">).
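
A minimal sketch of what such rules might look like, assuming a regex tokenizer is good enough for the input (it will not cope with a ">" inside an attribute value, with CDATA, or with other real SGML features). The sub name close_open_tags and the %no_self_nest table are made up for illustration and are not part of any module mentioned in this thread. Tags listed in the table are closed as soon as the same tag opens again; everything else is assumed to nest and is closed at the end of the input, which is only a guess, as described above.

```perl
use strict;
use warnings;

# Hypothetical rule table (an assumption, not from the thread): elements
# that cannot contain themselves, so a second opening tag implies that
# the first one has ended.
my %no_self_nest = map { $_ => 1 } qw(p to);

sub close_open_tags {
    my ($text) = @_;
    my @open;        # stack of element names that are currently open
    my $out = '';

    # The capturing parentheses make split return the tags as well as
    # the text between them.
    for my $token (split /(<[^>]+>)/, $text) {
        if ($token =~ m{^</([\w:-]+)\s*>$}) {            # closing tag
            my $name = $1;
            if (grep { $_ eq $name } @open) {
                # Close anything left open inside this element first.
                while ((my $inner = pop @open) ne $name) {
                    $out .= "</$inner>";
                }
            }
        }
        elsif ($token =~ m{^<([\w:-]+)[^>]*>$}           # opening tag
               && $token !~ m{/>$}) {                    # but not <x/>
            my $name = $1;
            if ($no_self_nest{$name} && @open && $open[-1] eq $name) {
                pop @open;
                $out .= "</$name>";
            }
            push @open, $name;
        }
        $out .= $token;
    }

    # Whatever is still open gets closed at the very end, innermost first.
    $out .= "</$_>" for reverse @open;
    return $out;
}

print close_open_tags('<greeting class="simple">Hello, world!'), "\n";
# <greeting class="simple">Hello, world!</greeting>
print close_open_tags('<p>foo<p>bar'), "\n";
# <p>foo</p><p>bar</p>
print close_open_tags('<span>a<span>b'), "\n";
# <span>a<span>b</span></span>
```

The span case shows the limitation: both closing tags end up at the end of the input because nothing says where the inner span really stops. A stray closing tag with no matching opener is passed through unchanged.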

    @_=qw; ask f00li5h to appear and remain for a moment of pretend better than a lifetime;;s;;@_[map hex,split'',B204316D8C2A4516DE];;y/05/os/&print;
Re^3: read a file and insert closing tags if not present
by GrandFather (Cardinal) on Mar 29, 2007 at 08:00 UTC

    HTML::TreeBuilder handles that simple case:

    use strict;
    use warnings;
    use HTML::TreeBuilder;

    my $sgml = <<SGML;
    <greeting class="simple">Hello, world!
    SGML

    my $root = HTML::TreeBuilder->new ();
    $root->ignore_unknown (0);
    $root->parse ($sgml);
    print $root->guts (0)->as_XML ();

    Prints:

    <greeting class="simple">Hello, world!</greeting>

    although I'd not guarantee it will accept everything a real SGML document may contain.


    DWIM is Perl's answer to Gödel
      Hi GrandFather, your solution works for that case. But when I give it input like the following, extra tags are inserted. How can I avoid this?

      use strict;
      use warnings;
      use HTML::TreeBuilder;

      my $sgml = <<SGML;
      <html>
      <greeting class="simple">Hello, world!<head>heading</head>
      </html>
      SGML

      my $root = HTML::TreeBuilder->new ();
      $root->ignore_unknown (0);
      $root->parse ($sgml);
      print $root->guts (0)->as_XML ();

      Thanks for your suggestion.
