http://www.perlmonks.org?node_id=1148111


in reply to Regex with condition

Further to your updated OP:

Why are you substituting newlines with spaces? The first approach given in my reply above, the one based on slurping the entire file (as you now seem to be doing), depends on having all newlines preserved. If your final text needs to be without newlines, I think the best time to get rid of them would be after all changes to the text have been made, because those changes are all defined (or were defined in the original post) in terms of "lines", i.e., with reference to newlines.
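
The ordering point can be illustrated with a minimal sketch. The sample text and the helper name weiss_to_schwarz are hypothetical (they are not the poster's real data or code); the $pre and $post patterns are the ones from the earlier reply. The sketch runs the same substitution on a copy whose newlines were flattened first and on the untouched text, flattening the latter only afterwards.

```perl
use strict;
use warnings;
use 5.010;    # \K requires Perl 5.10 or later

# Hypothetical sample: OP-style trigger lines, each followed by an
# indented '{Weiss ...}' line. Not the poster's real data.
my $t = join "\n",
    '1n5P/1P4PK/1q6 b - - 2 42"]',
    ' {Weiss am Zug}',
    '5PK1/8 w - - 4 34"]',
    ' {Weiss am Zug}',
    '';

my $pre  = qr{ \S+ \s }xms;
my $post = qr{ \s [^\n]+ \n [^\{]+ \{ }xms;

# Hypothetical helper: returns the edited text and the number of changes.
sub weiss_to_schwarz {
    my ($text) = @_;
    my $n = ( $text =~ s{ ^ $pre b $post \K Weiss \b }{Schwarz}xmsg ) || 0;
    return ( $text, $n );
}

# Newlines flattened first: $post can no longer find its \n.
( my $flat_first = $t ) =~ tr/\n/ /;
my ( undef, $n_flat_first ) = weiss_to_schwarz($flat_first);

# Substitution first, newlines flattened afterwards.
my ( $fixed, $n_sub_first ) = weiss_to_schwarz($t);
( my $flat_last = $fixed ) =~ tr/\n/ /;

print "changes when flattening first: $n_flat_first\n";
print "changes when substituting first: $n_sub_first\n";
print "$flat_last\n";
```

Under these assumptions, only the second ordering changes anything, because both ^ (with /m) and the \n inside $post rely on the line structure still being present.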

However, also note that if the "random amount of other text" preceding the  'b' in a trigger line may contain any whitespace, then the
    my $pre = qr{ \S+ \s }xms;
regex will have to be changed accordingly.

Update: When updating a post, please do not destroy original content; instead, indicate defunct content as such (e.g., with  <strike> ... </strike> tags or in a brief note). Likewise, indicate added content in some way. The Golden Rule: Thou shalt not destroy the context of previous replies. Please see How do I change/delete my post?.


Give a man a fish:  <%-{-{-{-<

Re^2: Regex with condition
by OldChamp (Acolyte) on Nov 19, 2015 at 16:29 UTC

    Thank you for your help and patience, and for replying again. I must apologize for the formal mistakes in my update; I should have read "How do I change ..." first. I have replaced my script with your solution and created a small test file.

    This is my script now:

        my $t = do { local $/; <>; };
        my $pre = qr{ \S+ \s }xms;
        my $post = qr{ \s [^\n]+ \n [^\{]+ \{ }xms;
        $t =~ s{ ^ $pre b $post \K Weiss \b} {Schwarz}xmsg;
        print qq{$t};

    This was the input:

        ..........
        [FEN "1B3k2/2R1b3/4p1pp/2r1P1p1/6P1/7P/5PK1/8 w - - 4 34"]
        {Weiss am Zug} *
        ..........
        [FEN "1B6/1Q4bk/R5p1/1pp4p/8/1n5P/1P4PK/1q6 b - - 2 42"]
        {Weiss am Zug} *
        ...........
        [FEN "1Bn4r/1k3q2/1pp1bbp1/1N1p1p1r/Q2Pp3/6PN/PP2PPB1/K1R4R b - - 0 23 +"]
        {Weiss am Zug} *
        ...........
        [FEN "1K6/4kp2/1P4p1/7p/7P/1r3PP1/1p6/1R6 w - - 3 52"]
        {Weiss am Zug} *

    The output was exactly the same as the input; Weiss was not changed.

      In my reply above, I wrote

      However, also note that if the "random amount of other text" preceding the  'b' in a trigger line may contain any whitespace, then the
          my $pre = qr{ \S+ \s }xms;
      regex will have to be changed accordingly.
      In the OP, a typical  'b' line was
          '1n5P/1P4PK/1q6 b - - 2 42"]'
      In the post to which this is a reply, a typical  'b' line is
          '[FEN "1B6/1Q4bk/R5p1/1pp4p/8/1n5P/1P4PK/1q6 b - - 2 42"]'
      with more than one whitespace character before the 'b'. Accordingly, I have altered the $pre regex to
          my $pre  = qr{ \S+ \s \S+ \s }xms;
      (This is not what I would consider a "robust" regex, but you will have to provide a regex that best matches your data if you wish an improvement.)
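
      To make the robustness caveat concrete, here is a minimal sketch that tries both $pre variants on the two sample lines quoted above. The names $pre_one, $pre_two and is_trigger are made up for the illustration, and $pre_any is a hypothetical, more tolerant variant that is not from this thread; it assumes a free-standing ' b ' occurs in a trigger line only as the marker of interest, which may not hold for your real data.

      ```perl
      use strict;
      use warnings;

      my $pre_one = qr{ \S+ \s }xms;           # from the earlier reply
      my $pre_two = qr{ \S+ \s \S+ \s }xms;    # the altered version above
      my $pre_any = qr{ [^\n]*? \s }xms;       # hypothetical, not from the thread

      my @lines = (
          [ 'OP style'  => '1n5P/1P4PK/1q6 b - - 2 42"]' ],
          [ 'FEN style' => '[FEN "1B6/1Q4bk/R5p1/1pp4p/8/1n5P/1P4PK/1q6 b - - 2 42"]' ],
      );

      # Hypothetical helper: 1 if the line starts with $pre followed by a
      # free-standing 'b', otherwise 0.
      sub is_trigger {
          my ( $pre, $line ) = @_;
          return $line =~ m{ ^ $pre b \s }xms ? 1 : 0;
      }

      for my $entry (@lines) {
          my ( $name, $line ) = @$entry;
          printf "%-9s  one: %d  two: %d  any: %d\n", $name,
              map { is_trigger( $_, $line ) } $pre_one, $pre_two, $pre_any;
      }
      ```

      Each fixed-count pattern accepts only the line shape it was written for, which is why $pre has to follow the real data rather than an abbreviated sample.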

      In addition, in the OP a typical  'Weiss' line had leading whitespace; in the most recent post, it has none. Accordingly, I have altered the $post regex to
          my $post = qr{ \s [^\n]+ \n [^\{]* \{ }xms;
      (note [^\{]* vice [^\{]+).
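
      The effect of that one-character change can be checked in isolation. The following is a minimal sketch with hypothetical two-line samples and a made-up helper, count_changes; a single-token prefix is hard-coded so that only the $post variants differ.

      ```perl
      use strict;
      use warnings;
      use 5.010;    # \K requires Perl 5.10 or later

      my $post_plus = qr{ \s [^\n]+ \n [^\{]+ \{ }xms;    # one or more characters before the brace
      my $post_star = qr{ \s [^\n]+ \n [^\{]* \{ }xms;    # zero or more characters before the brace

      # Hypothetical samples, with and without leading whitespace
      # in front of the opening brace.
      my $indented = join "\n", '1q6 b - - 2 42"]', '  {Weiss am Zug}', '';
      my $flush    = join "\n", '1q6 b - - 2 42"]', '{Weiss am Zug}',   '';

      # Hypothetical helper: number of substitutions made on a copy of the text.
      sub count_changes {
          my ( $post, $text ) = @_;
          return ( $text =~ s{ ^ \S+ \s b $post \K Weiss \b }{Schwarz}xmsg ) || 0;
      }

      printf "indented line, + quantifier: %d\n", count_changes( $post_plus, $indented );
      printf "indented line, * quantifier: %d\n", count_changes( $post_star, $indented );
      printf "flush line,    + quantifier: %d\n", count_changes( $post_plus, $flush );
      printf "flush line,    * quantifier: %d\n", count_changes( $post_star, $flush );
      ```

      With + the pattern demands at least one character between the newline and the brace, so a brace at the very start of the line cannot match; with * both layouts are accepted.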

      I have downloaded your latest data to my file 1148125.dat. With the changes above and your latest data:

          c:\@Work\Perl\monks\OldChamp>perl -wMstrict -e
          "use 5.010;
           ;;
           my $t = do { local $/; <>; };
           print qq{[[$t]] \n\n};
           ;;
           my $pre = qr{ \S+ \s \S+ \s }xms;
           my $post = qr{ \s [^\n]+ \n [^\{]* \{ }xms;
           ;;
           $t =~ s{ ^ $pre b $post \K Weiss \b } {Schwarz}xmsg;
           ;;
           print qq{<<$t>> \n\n};
          " 1148125.dat
          [[..........
          [FEN "1B3k2/2R1b3/4p1pp/2r1P1p1/6P1/7P/5PK1/8 w - - 4 34"]
          {Weiss am Zug} *
          ..........
          [FEN "1B6/1Q4bk/R5p1/1pp4p/8/1n5P/1P4PK/1q6 b - - 2 42"]
          {Weiss am Zug} *
          ...........
          [FEN "1Bn4r/1k3q2/1pp1bbp1/1N1p1p1r/Q2Pp3/6PN/PP2PPB1/K1R4R b - - 0 23 +"]
          {Weiss am Zug} *
          ...........
          [FEN "1K6/4kp2/1P4p1/7p/7P/1r3PP1/1p6/1R6 w - - 3 52"]
          {Weiss am Zug} *
          ]]

          <<..........
          [FEN "1B3k2/2R1b3/4p1pp/2r1P1p1/6P1/7P/5PK1/8 w - - 4 34"]
          {Weiss am Zug} *
          ..........
          [FEN "1B6/1Q4bk/R5p1/1pp4p/8/1n5P/1P4PK/1q6 b - - 2 42"]
          {Schwarz am Zug} *
          ...........
          [FEN "1Bn4r/1k3q2/1pp1bbp1/1N1p1p1r/Q2Pp3/6PN/PP2PPB1/K1R4R b - - 0 23 +"]
          {Schwarz am Zug} *
          ...........
          [FEN "1K6/4kp2/1P4p1/7p/7P/1r3PP1/1p6/1R6 w - - 3 52"]
          {Weiss am Zug} *
          >>

      At last,  Weiss has changed.


      Give a man a fish:  <%-{-{-{-<

        Dear AnomalousMonk, one thing is for sure: you are obviously the most helpful Monk in this forum, with a great heart and a lot of patience for someone like me who makes one mistake after another. I have studied your replies, but I had difficulty understanding the real meaning of $pre and $post, partly because this way of writing a regex is new to me; the book I read about regexes did not use the x modifier (which is clearly more readable than the old form). The use of { instead of / as the delimiter was also a little confusing at first glance. As you wisely stated, I should have tried harder to learn to fish rather than only consume a fish. In my OP I abbreviated my data to keep my question short, not understanding that this changed the problem.
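
        For other readers meeting the x modifier and brace delimiters for the first time, here is a minimal sketch (with a hypothetical one-record sample) that writes the substitution from this thread twice: once in the compact slash-delimited form and once in the extended form, with $pre and $post spelled out inline. Under /x, unescaped whitespace and # comments inside the pattern are ignored, and s{...}{...} is the same operator as s/.../.../ with a different delimiter.

        ```perl
        use strict;
        use warnings;
        use 5.010;    # \K requires Perl 5.10 or later

        # Hypothetical one-record sample, not the poster's real data.
        my $text = join "\n", '[FEN "1q6 b - - 2 42"]', '{Weiss am Zug}', '';

        # Compact form: slash delimiters, no /x.
        ( my $compact = $text ) =~ s/^\S+\s\S+\sb\s[^\n]+\n[^{]*\{\KWeiss\b/Schwarz/msg;

        # Extended form: brace delimiters plus /x.
        ( my $extended = $text ) =~ s{
            ^ \S+ \s \S+ \s   # two whitespace-separated tokens at the start of a line
            b                 # the literal b of a trigger line
            \s [^\n]+ \n      # the rest of that line, through its newline
            [^\{]* \{         # anything up to and including the next opening brace
            \K                # keep everything matched so far out of the replacement
            Weiss \b          # the word to be replaced
        }{Schwarz}xmsg;

        print $compact eq $extended ? "same result\n" : "different result\n";
        print $extended;
        ```

        One practical caution with the extended form: comments inside a brace-delimited pattern should avoid unbalanced braces and sigils such as the dollar sign, since Perl looks for the closing delimiter and interpolates variables before the comments are discarded.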

        Concerning the space in $pre, I simply overlooked the space between FEN and the ". Concerning $post, my real data does have leading whitespace before the 'Weiss' line, but I deleted it in the test file for the same reason as before: I wanted to be brief.

        After making so many mistakes, I have studied your solution more thoroughly, and now I think I understand it. At last, I have learned a little bit about fishing. But when you are 72 years old it's not so easy, and I must confess that I have only studied those parts of the books that I thought were needed to solve my problems.

        Once again, many thanks for your great help.