PerlMonks  

In your TestEK.txt file, every {[%tqu ... ]} sequence sits entirely on one line. In the second, real file, these sequences span two or more lines.

You are processing the file line-by-line and matching your regex against each line separately, so if a {[%tqu ... ]} sequence spans multiple lines, no single line contains both the opening and the closing delimiter and the regex never matches it.
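A minimal sketch of that failure, assuming a hypothetical two-line sample held in a string rather than your real file (the sub name `strip_per_line` and the sample text are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample: one {[%tqu ... ]} sequence split across two lines.
my $text = "keep this {[%tqu get rid\nof this]} and keep this too\n";

# Apply the substitution to each line separately, the way a
# while (<$fh>) loop would, then rejoin the lines.
sub strip_per_line {
    my ($input) = @_;
    my @lines = split /^/m, $input;    # each element keeps its newline
    s/\{\[%tqu.*?\]\}//gi for @lines;  # no line holds both delimiters
    return join '', @lines;
}

print strip_per_line($text) eq $text
    ? "unchanged: the sequence was never matched\n"
    : "sequence removed\n";
```

The same sub does remove a sequence that fits on one line, which is why the test file appeared to work.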

If the file is small (say, less than several hundred megabytes) and never likely to grow much larger, it might be easiest to "slurp" the entire file into a scalar variable as one string, do a single s/// against that variable, and then write the string back out to the new file.
    my $string = do { local $/;  <>; };
    $string =~ s/$regex/$subst/gis;
    print $string;
(untested). Note that the s/// now needs the /s regex modifier so that . (dot) in .* also matches a newline, which lets a match extend across multiple lines. Get rid of the while-loop entirely.
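For the write-back step, a sketch with explicit filehandles might look like the following. The sub name `strip_tqu_file` and the file names are hypothetical, the literal braces are escaped as a precaution, and the pattern uses the lazy `.*?` so that separate sequences are removed one at a time:

```perl
use strict;
use warnings;

# Hypothetical helper: read $in whole, remove every {[%tqu ... ]}
# sequence, and write the result to $out.
sub strip_tqu_file {
    my ($in, $out) = @_;

    open my $ifh, '<', $in or die "Cannot open '$in': $!";
    my $string = do { local $/; <$ifh> };    # undef $/ reads the whole file
    close $ifh;

    # /s lets . match newlines; .*? stops at the nearest ]}
    $string =~ s/\{\[%tqu.*?\]\}//gis;

    open my $ofh, '>', $out or die "Cannot open '$out': $!";
    print {$ofh} $string;
    close $ofh or die "Cannot close '$out': $!";
    return;
}

# Example call (file names are placeholders):
# strip_tqu_file('TestEK.txt', 'TestEK.out');
```

Writing to a separate output file leaves the original untouched, so a bad pattern costs nothing but a re-run.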

Update: See also File::Slurp.

Update 2: Here's a test:

    c:\@Work\Perl\monks\OldChamp>perl -wMstrict -le "my $s = do { local $/; <>; }; print qq{[[$s]] \n}; ;; my $rx = '{\[%tqu.*]}'; my $su = ''; $s =~ s/$rx/$su/gis; print qq{[[$s]]}; " tqu.txt
    [[keep this {[%tqu get rid of this]} and keep this too ]]
    [[keep this and keep this too ]]

Update 3: Update 2 contains a rookie mistake: using the greedy .* instead of the lazy .*? version. The previous version would delete everything from the very first {[%tqu to the very last ]} in the file, including any text that sits between two separate sequences. Here's a version that will actually work with a single long string.

    c:\@Work\Perl\monks\OldChamp>perl -wMstrict -le "my $s = do { local $/; <>; }; print qq{[[$s]] \n}; ;; my $rx = '{\[%tqu.*?]}'; my $su = ''; $s =~ s/$rx/$su/gis; print qq{[[$s]]}; " tqu.txt
    [[keep this {[%tqu get rid of this]} and keep this too keep {[%tqu but dump also ]} it to here. ]]
    [[keep this and keep this too keep it to here. ]]

Actually, something like
    my $regex = qr{ {\[%tqu [^\]]* ]} }xms;
might even be preferable as long as there are guaranteed to be no  ] (right-square-bracket) characters in the sub-strings to be removed, but let's leave it at that for now.
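To compare the three patterns side by side, here is a small sketch. The sample string, the `%rx` hash, and the sub name `strip_with` are made up for illustration, and the literal braces are escaped as a precaution, since some perl versions complain about a bare { in a pattern:

```perl
use strict;
use warnings;

# Hypothetical sample with two sequences, the first spanning a line break.
my $s = "keep this {[%tqu get rid\nof this]} and keep {[%tqu dump]} it\n";

my %rx = (
    greedy  => qr/\{\[%tqu.*\]\}/is,      # first {[%tqu to the LAST ]}
    lazy    => qr/\{\[%tqu.*?\]\}/is,     # first {[%tqu to the NEAREST ]}
    negated => qr/\{\[%tqu[^\]]*\]\}/i,   # no ] inside; /s is not needed
);

# Return a copy of $text with every match of $re removed.
sub strip_with {
    my ($re, $text) = @_;
    (my $copy = $text) =~ s/$re//g;
    return $copy;
}

for my $name (qw(greedy lazy negated)) {
    my $out = strip_with($rx{$name}, $s);
    $out =~ s/\n/\\n/g;    # make newlines visible in the printout
    print "$name: [$out]\n";
}
```

The negated character class matches newlines on its own, so it crosses line breaks without /s; it only goes wrong if a ] can legitimately appear inside a sequence.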


Give a man a fish:  <%-{-{-{-<


In reply to Re^3: Substitution don't work by AnomalousMonk
in thread Substitution don't work by OldChamp
