Text matching repost

by prassi (Acolyte)
on Jun 18, 2012 at 08:44 UTC (node #976789)
prassi has asked for the wisdom of the Perl Monks concerning the following question:

Hi Perl Monk,

I am re-posting the same question with some changes to the code, because the solution suggested earlier did not work for this version either.

#include <stdio.h>
#include "report.h"
void main()
{
#ifdef CHECK_REPORT
    report("This is good");
    reports("This is also good")
#endif
#if defined (REPORT_ENABLE)
    report("This is not good");
#endif
    printf("The execution is completed\n");
}

Please note that a report call in the code may or may not end with a semicolon; that is intentional. The output has to look like this:

#include <stdio.h>
#include "report.h"
void main()
{
#ifdef CHECK_REPORT
#endif
#if defined (REPORT_ENABLE)
#endif
    printf("The execution is completed\n");
}

But with the Perl code I have used below,

$/ = undef;
$_ = <file1>;
s#\breport[s]?.*? \)\;|("[^"].*?")#defined $1 ? $1 :""#gsiex;
print $_

when the code encounters the line with REPORT_ENABLE, it removes the rest of the code, including the printf.

From the previous post the solution given was

s# .* report [s]? .* ; .* ##sx;
This works only if the report call is not spread over multiple lines. Can you suggest what changes I need to make to my regex to get the required output?



Replies are listed 'Best First'.
Re: Text matching repost
by Neighbour (Friar) on Jun 18, 2012 at 09:29 UTC
    How about a different approach?
    #!/usr/bin/perl
    use strict;
    use warnings;

    my ($start, $stop) = (qr '#ifdef|#if defined', qr '#endif');
    open INPUT, '<', 'monks23.dat' or die "Error opening input: " . $!;
    while (<INPUT>) {
        if (/$start/ .. /$stop/) {
            # Print the lines themselves
            if (/$start/ or /$stop/) {
                print;
            }
            # Skip the rest
            next;
        }
        print;
    }
    close INPUT;
    Glad I've finally found a useful non-numeric-iterating purpose for .. :)
    Edit: Added comments, removed unnecessary regex escaping
      I later realised that instead of using
      if (/$start/ or /$stop/) { print; }
      I could just as well use (and more efficient too):
      if ($&) { print; }
      The meaning of $& is explained in perlvar as
      The string matched by the last successful pattern match.
      You can also use $MATCH if you use English;
        I could just as well use (and more efficient too):
        if ($&) { print; }

        Ummm, no. $& is one of the "ugly three" variables (the other two are $` and $') that kill performance. From perlvar:

        The use of this variable anywhere in a program imposes a considerable performance penalty on all regular expression matches. To avoid this penalty, you can extract the same substring by using @-. Starting with Perl 5.10, you can use the /p match flag and the ${^MATCH} variable to do the same thing for particular match operations.

        Apart from that, it may work for this special problem, but it does not work if $& evaluates to false:

        perl -E '$x="a0b"; $x=~/0/; say $&; if ($&) { die "not reached!" } if ($x=~/0/) { say "matched zero" }'

        0
        matched zero
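
To make the two alternatives named in the perlvar quote concrete, here is a minimal sketch, assuming Perl 5.10 or later for the /p flag. The variable names are made up for illustration. It tests the result of the match operation itself instead of the truthiness of the matched text, so a match on "0" is still detected.

```perl
use strict;
use warnings;

my $x = "a0b";
my @report;    # hypothetical collector, only used for this illustration

# Test the match operator itself, not the matched text.
if ($x =~ /0/p) {
    # ${^MATCH} is filled in only for matches that use the /p flag.
    push @report, "matched '" . ${^MATCH} . "'";

    # The same substring recovered from the match offsets in @- and @+.
    push @report, "offsets give '" . substr($x, $-[0], $+[0] - $-[0]) . "'";
}

print "$_\n" for @report;
```

Neither form mentions $&, so the performance penalty described in the quote above should not be triggered by this code.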


        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

      What kind of regex is the one you mentioned above? Can you point me to some documentation so that I can understand it better?
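
For reference, the construct used above is not a regex feature as such. It is Perl's range operator `..` evaluated in scalar context, often called the flip-flop operator, and it is documented in perlop under "Range Operators". The sketch below reuses the two patterns from the reply but works on an in-memory list of lines; the helper name `strip_guarded_bodies` is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the preprocessor guard lines themselves and
# drop every line that sits between a start guard and its #endif.
sub strip_guarded_bodies {
    my @lines = @_;
    my @kept;
    for (@lines) {
        # The flip-flop turns true on a line matching the left pattern and
        # stays true up to and including the line matching the right pattern.
        if (/#ifdef|#if defined/ .. /#endif/) {
            push @kept, $_ if /#ifdef|#if defined|#endif/;
            next;
        }
        push @kept, $_;
    }
    return @kept;
}

print strip_guarded_bodies(
    "void main() {\n",
    "#ifdef CHECK_REPORT\n",
    "report(\"This is good\");\n",
    "#endif\n",
    "}\n",
);
```

Note that this approach drops everything between the guards, not only the report calls, and that the flip-flop keeps its state between calls, so the input is assumed to have a matching #endif for every start guard.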



Re: Text matching repost
by Cristoforo (Curate) on Jun 18, 2012 at 16:37 UTC
    Yes, your regular expression didn't work for me either. I made a regexp that I believe works (below). But before you use it on real files, try it on some samples to make sure it does what you want. Otherwise, you may lose your original C files when you try to change them.
    #!/usr/bin/perl
    use strict;
    use warnings;

    open my $fh, "<", 'o33.txt' or die $!;
    my $cfile = do {local $/; <$fh>};
    close $fh or die $!;

    $cfile =~ s/^[\t ]*reports?\((?s:.*?)\);?[\t ]*\n//gm;

    print $cfile;

    __END__
    ^[\t ]*     optional leading spaces/tabs from beginning of line, (the 'm' modifier)
    reports?    reports with the 's' optional
    \(          left paren
    (?s:.*?)    allow '.' to match newline in this scope (between parens)
    \)          closing paren
    ;?          optional ';'
    [\t ]*      optional spaces/tabs
    \n          ends with newline


    Update: added 'tabs' to spaces
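
As a way to try the substitution without touching any real file, the following sketch applies the same regex to an in-memory copy of the sample from the question. The sample text is an assumption: one call has been split over two lines here to exercise the multi-line case, and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Sample input held in memory; the single-quoted heredoc keeps \n literal.
my $cfile = <<'END_C';
#include <stdio.h>
#include "report.h"
void main()
{
#ifdef CHECK_REPORT
    report("This is good");
    reports(
        "This is also good")
#endif
#if defined (REPORT_ENABLE)
    report("This is not good");
#endif
    printf("The execution is completed\n");
}
END_C

# Work on a copy so the original text stays available for comparison.
(my $stripped = $cfile) =~ s/^[\t ]*reports?\((?s:.*?)\);?[\t ]*\n//gm;

print $stripped;
```

Because the pattern is anchored at the start of a line, the "report.h" include and the CHECK_REPORT and REPORT_ENABLE guard lines are left alone. A call whose string argument contains a closing parenthesis at the end of a line could still end the match early, so testing on samples first remains good advice.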

Re: Text matching repost
by ckj (Chaplain) on Jun 18, 2012 at 09:27 UTC
    Try using this:
    s# .* report [s]? .* ; .* ##sxlm;
    To match calls that span multiple lines and remove them, you can also use this:
    s# .* report [s]? .*\s*.* ; .* ##sx;
    UPDATE: Have you tried running this one? I am sure that it will give you exactly what you need. It is a minor change; you do not need to bring in so many new concepts and new logic. It is doable with this method itself.
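
One thing worth checking before relying on these patterns: the sketch below assumes the whole file has been slurped into a single string, as in the original question, and uses a shortened, made-up sample. Under that assumption the greedy `.*` combined with the /s flag can run from the start of the string to its end, so the single substitution removes everything.

```perl
use strict;
use warnings;

# Shortened sample, slurped into one string (an assumption for this demo).
my $code = join '',
    "#ifdef CHECK_REPORT\n",
    "report(\"This is good\");\n",
    "#endif\n",
    "printf(\"The execution is completed\\n\");\n";

# With /s, each greedy .* may cross newlines, so the match spans the
# whole string as long as "report" and a later ";" occur somewhere in it.
(my $greedy = $code) =~ s# .* report [s]? .* ; .* ##sx;

print "characters left after substitution: ", length($greedy), "\n";
```

Applied line by line instead of to a slurped string, the same pattern would only remove the single line it matches, which may be the usage the reply has in mind.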
