
Re^3: Twig Mixed Content Child Text Replace Issues

by mirod (Canon)
on Feb 08, 2011 at 16:18 UTC (#887006)


in reply to Re^2: Twig Mixed Content Child Text Replace Issues
in thread Twig Mixed Content Child Text Replace Issues

The fact that you want user input for each substitution makes the problem a _lot_ trickier. Otherwise you could simply use subs_text, as in my previous answer.

In this case you should do the substitution on the text of the '#TEXT' (or '#PCDATA') children of the paragraph. But what happens if the text contains the string you want to replace twice, and you want to replace only the second occurrence? The logic becomes quite a bit more complex. Off the top of my head, I would modify the regexp to let it skip the appropriate number of occurrences of the string to replace.
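To make the "skip the appropriate number of occurrences" idea concrete, here is a minimal sketch on a plain string, outside of XML::Twig. The helper name replace_nth and its arguments are hypothetical, made up for this illustration; it is one possible way to build such a regexp, not necessarily the one you will end up with.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Twig): replace only the $n-th
# occurrence (1-based) of the whole word $word in $text.
sub replace_nth {
    my( $text, $word, $replacement, $n)= @_;
    my $skip= $n - 1;    # occurrences to step over before replacing

    # the captured prefix holds $skip earlier occurrences, so the
    # word matched after it is the $n-th one; /s lets . match newlines
    $text=~ s/\A((?:.*?\b\Q$word\E\b){$skip}.*?)\b\Q$word\E\b/$1$replacement/s;

    return $text;    # unchanged if there are fewer than $n occurrences
}

# prints "a number of a quantity of tags"
print replace_nth( 'a number of a number of tags', 'number', 'quantity', 2), "\n";
```

In an interactive version, the same call could be made on the text of each '#TEXT' child, with $n chosen from the user's answers.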

A non-interactive version that you could use as a basis would be:

#!/usr/bin/perl

use strict;
use warnings;

use Test::More tests => 1;
use XML::Twig;

my $doc = '<paragraph> Some <bold>text</bold> here which may be any <bold>length</bold> and <bold>contain</bold> a number of child tags.</paragraph>';
my $exp = '<paragraph> Some <bold>text</bold> here which may be any <bold>length</bold> and <bold>contain</bold> a quantity of child tags.</paragraph>';

# I got a little fancy here to allow several keywords to replace
# the keywords are grouped in a regexp, sorted by inverse length so the alternation works properly
my $replace = { number => 'quantity' };
my $keywords= join( '|', map { "\Q$_\E" } sort { length $b <=> length $a } keys %$replace);

my $t=XML::Twig->new( twig_roots => { paragraph => \&subs_word })->parse( $doc);

is( $t->sprint, $exp, 'one change');

exit;

# handler called for each paragraph: only its direct text children are changed
sub subs_word {
    my( $t, $para)= @_;
    foreach my $text_elt ($para->children( '#TEXT')) {
        my $text= $text_elt->text;
        if( $text=~ m{\b($keywords)\b}) {
            $text=~ s{\b($keywords)\b}{$replace->{$1}}g;
            $text_elt->set_text( $text);
        }
    }
}


Re^4: Twig Mixed Content Child Text Replace Issues
by unknown_varmit (Initiate) on Feb 08, 2011 at 22:18 UTC
    Thanks mirod, your advice is spot on.

    I have modified my code (using much less-elegant methods) to prompt the user for each #text match. It's working beautifully, but will take me a bit more work to get to a regex solution as clean as yours.

    Twig is turning out to be a very useful tool.
