PerlMonks  

Adding counters for text replacement

by bartelby (Initiate)
on Jul 08, 2012 at 20:51 UTC ( [id://980620]=perlquestion )

bartelby has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

am a beginner so what I need is way out of my league at the moment.

I have a text file with HTML markup in the format

<p>text text text</p>

and basically I need to wrap these sequentially with XLIFF tags, so that I get:

<bpt i="1" x="1">&lt;p&gt;</bpt>text text text<ept i="1">&lt;/p&gt;</ept>

So the bpt wraps the opening tag and the ept wraps the closing tag, with the next paired occurrence getting counter "2" (<bpt i="2" x="2">), and so on.

Can someone please help?

thanks in advance, bart

I know it looks strange, but it is in fact the syntax that I need.

Thanks for all the input so far.

xyzzy - I tried running the code but got errors (about spaces and barewords).

Replies are listed 'Best First'.
Re: Adding counters for text replacement
by AnomalousMonk (Archbishop) on Jul 08, 2012 at 21:24 UTC
    &lt;p&gt;text text text&lt;/p&gt;

    There may be a representation problem here. You may have meant the above string to represent  <p>text text text</p> rather than a bunch of entities. If so, this can be achieved by enclosing the HTML in code tags thusly:
       <code><p>text text text</p></code>
    (<c> ... </c> tags can be used in place of <code> ... </code> tags). Please do not use <pre> tags! Please see Markup in the Monastery and Perl Monks Approved HTML tags, among others.

Re: Adding counters for text replacement
by xyzzy (Pilgrim) on Jul 08, 2012 at 21:13 UTC

    If you want to increment the value as each substitution is made, consider using the /e modifier on your substitution, which evaluates the replacement as Perl code and uses the result as the replacement text. Something like this might work, but only if you have a matching number of opening and closing tags, and only if none of them are nested:

    my $bpt_counter = 0;
    my $ept_counter = 0;
    for ($YOUR_ENTIRE_TEXT_IN_ONE_STRING) {
        # '|' delimiters, because the replacement itself contains '/' (in </bpt>)
        s|(&lt;p&gt;)|++$bpt_counter; qq(<bpt i="$bpt_counter" x="$bpt_counter">$1</bpt>)|eg;
        s|(&lt;/p&gt;)|++$ept_counter; qq(<ept i="$ept_counter">$1</ept>)|eg;
    }
    # just to be safe
    warn "Uh-oh!\n" unless $bpt_counter == $ept_counter;
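For the case where the input holds literal <p>...</p> pairs, as in the question, the same /e idea can be sketched as a single pass that matches each pair as a whole, so one counter serves both the bpt and its ept. This is only a minimal sketch: the sub name wrap_paragraphs and the sample text are made up for illustration, and it assumes bare <p> tags with no attributes and no nesting.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap each non-nested <p>...</p> pair in numbered
# bpt/ept tags, writing the original tags out as entities.
sub wrap_paragraphs {
    my ($text) = @_;
    my $n = 0;    # pair counter, shared by a bpt and its matching ept

    # /e runs the replacement as code, /s lets . match newlines,
    # /g repeats for every pair; .*? stops at the first </p>.
    $text =~ s{<p>(.*?)</p>}{
        $n++;
        qq(<bpt i="$n" x="$n">&lt;p&gt;</bpt>$1<ept i="$n">&lt;/p&gt;</ept>);
    }gse;

    return $text;
}

my $sample = "<p>text text text</p>\n<p>more text</p>\n";
print wrap_paragraphs($sample);
```

Because the opening and closing tags are matched together, a stray <p> without its </p> is simply left untouched rather than throwing the numbering off.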

    $,=qq.\n.;print q.\/\/____\/.,q./\ \ / / \\.,q.    /_/__.,q..
    Happy, sober, smart: pick two.
Re: Adding counters for text replacement
by Anonymous Monk on Jul 08, 2012 at 20:56 UTC

    am a beginner so what I need is way out of my league at the moment.

    You acquiesce before you begin, not good. perlintro

Re: Adding counters for text replacement
by aitap (Curate) on Jul 09, 2012 at 08:12 UTC
    Are you sure this will be valid HTML? You could try writing a regexp for this (I don't know how it could be achieved using HTML::Entities or HTML::TreeBuilder), like this:
    $ cat t1.pl
    #!/usr/bin/perl -p
    BEGIN{$i=0}
    s|<p>((?<!</p>).*)</p>|'<bpt i="'.++$i.'" x="'.$i.'"><p></bpt>'.$1.'<ept i="'.$i.'"></p></ept>'|egs;
    $ seq 1 10 | while read; do echo '<p>text<xczcx>texxxxx</p>'; done | perl t1.pl
    <bpt i="1" x="1"><p></bpt>text<xczcx>texxxxx<ept i="1"></p></ept>
    <bpt i="2" x="2"><p></bpt>text<xczcx>texxxxx<ept i="2"></p></ept>
    <bpt i="3" x="3"><p></bpt>text<xczcx>texxxxx<ept i="3"></p></ept>
    <bpt i="4" x="4"><p></bpt>text<xczcx>texxxxx<ept i="4"></p></ept>
    <bpt i="5" x="5"><p></bpt>text<xczcx>texxxxx<ept i="5"></p></ept>
    <bpt i="6" x="6"><p></bpt>text<xczcx>texxxxx<ept i="6"></p></ept>
    <bpt i="7" x="7"><p></bpt>text<xczcx>texxxxx<ept i="7"></p></ept>
    <bpt i="8" x="8"><p></bpt>text<xczcx>texxxxx<ept i="8"></p></ept>
    <bpt i="9" x="9"><p></bpt>text<xczcx>texxxxx<ept i="9"></p></ept>
    <bpt i="10" x="10"><p></bpt>text<xczcx>texxxxx<ept i="10"></p></ept>
    But this is the wrong way to do it (for example, it will break on nested <p>...<p>...</p>...</p> tags). Why do you need such a strange tag combination?
    Sorry if my advice was wrong.
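If nested tags ever did turn up, one way around the breakage mentioned above might be to substitute each tag on its own and keep a stack of open ids, so that every closing tag reuses the id of the most recent unclosed opening tag. The following is a rough sketch only: wrap_nested is a made-up name, it recognises nothing but bare <p> and </p> (no attributes), and it writes the wrapped tags as entities as in the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: number <p> and </p> tags so that nested pairs
# still line up, using a stack of ids for tags that are not yet closed.
sub wrap_nested {
    my ($text) = @_;
    my $next = 0;    # last id handed out
    my @open;        # ids of <p> tags that are still open

    my $tag = sub {
        my ($closing) = @_;
        if ($closing) {
            # closing tag: take the id of the most recent open <p>
            my $id = pop @open;
            die "unmatched </p>\n" unless defined $id;
            return qq(<ept i="$id">&lt;/p&gt;</ept>);
        }
        # opening tag: hand out a fresh id and remember it
        push @open, ++$next;
        return qq(<bpt i="$next" x="$next">&lt;p&gt;</bpt>);
    };

    # $1 is "/" for a closing tag and empty for an opening tag
    $text =~ s{<(/?)p>}{ $tag->($1) }ge;
    die "unclosed <p>\n" if @open;
    return $text;
}

print wrap_nested("<p>outer <p>inner</p> outer</p>\n");
```

The sketch dies on unbalanced input instead of guessing, which seemed the safer choice for a file that will be fed to a translation tool afterwards.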

Node Type: perlquestion [id://980620]
Approved by Corion