Matching and replacing the minimum string from the tail of the regex

abitkin has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Matching and replacing the minimum string from the tail of the regex by Joost (Canon) on Aug 08, 2007 at 21:30 UTC
Well you get something: `perl test.pl s foo e f s adflkja` because you're replacing / removing until the regex doesn't match. And /s.?e p/ obviously doesn't match "s\nfoo\ne f", but other than that it's not too clear to me what you want to accomplish exactly. "What should it profit a man, if he should win a flame war, yet lose his cool?"
Re: Matching and replacing the minimum string from the tail of the regex by GrandFather (Saint) on Aug 08, 2007 at 21:58 UTC
Don't tell us what you think the code does. Tell us what you want to achieve and why. You've told us what output you expect for a given input, but not how the output is related to the input. We can't tell that from your code because your code doesn't do what you want, nor even what you describe! The actual output is: `s foo e f s adflkja` Update: It may be that you want something like: `use warnings; use strict; my $lines = ""; while (<DATA>) { $lines .= $_; } my @wanted = $lines =~ m/^(s(?:(?!e p$).)*e [^p]$)/msg; print @wanted; __DATA__ s erartt e p s foo e f s adflkja` Prints: `s foo e f` DWIM is Perl's answer to Gödel
Re^2: Matching and replacing the minimum string from the tail of the regex by abitkin (Monk) on Aug 08, 2007 at 23:16 UTC
My apologies, I missed the final line of the data space. What I'm trying to do is eliminate pass messages from a build log while keeping all the failure text. Each test has a start and an end, but some data can be shown after a failure. I will update the code to reflect this. That said, I only want to eliminate items between the start (s) and the end which passed (e p). == Kwyjibo. A big, dumb, balding North American ape. With no chin.
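Since the goal is stated in terms of whole lines, one alternative to a single multi-line regex is a line-by-line filter that buffers each test block and drops it only when the passing end marker arrives. The following is a minimal sketch of that swapped-in technique, not something proposed in the thread. The sub name `drop_passed_tests` is hypothetical, and it assumes that a test starts with a line that is exactly `s` and that a passed test ends with a line that is exactly `e p`.

```perl
use strict;
use warnings;

# Hypothetical helper: takes chomped log lines, returns the lines to keep.
sub drop_passed_tests {
    my (@lines) = @_;
    my (@kept, @pending);
    my $in_test = 0;

    for my $line (@lines) {
        if ($line eq 's') {
            # A new test starts: whatever was buffered never passed, so keep it.
            push @kept, @pending;
            @pending = ($line);
            $in_test = 1;
        }
        elsif ($in_test) {
            if ($line eq 'e p') {
                # The test passed: throw away its buffered lines.
                @pending = ();
                $in_test = 0;
            }
            else {
                push @pending, $line;
            }
        }
        else {
            # Outside any test block: always keep.
            push @kept, $line;
        }
    }

    # Keep a trailing block that never reached a passing end marker.
    push @kept, @pending;
    return @kept;
}

my @log = (
    'Random String', 's', 'erartt', 'e p',
    's', 'foo', 'e f', 'blah blah',
    's', 'adflkja', 'wibble wobble', 'e p',
    'End of file',
);
print "$_\n" for drop_passed_tests(@log);
```

Because a block is only discarded once `e p` is seen, failure text and anything printed after an `e f` line survive until the next start line flushes them to the output.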
Re: Matching and replacing the minimum string from the tail of the regex by johngg (Canon) on Aug 08, 2007 at 22:05 UTC
If your data set is small, and I guess it is as you are reading all of your lines into a single string, you could use `grep` to get just the lines that match a regex alternation of what you want, then use another `grep` with a post-incremented hash so that you only get the one 's' line rather than all three. `use strict; use warnings; my %seen = (); print grep { ! $seen{$_} ++ } grep { m{^s|foo|e f$} } <DATA>; __END__ s erartt e p s foo e f s adflkja` This produces `s foo e f` as you require. The more usual idiom for reading all lines of a file into a single string (slurping) is `my $lines = ''; { local $/; $lines = <DATA>; }` which changes the default input record separator inside the scope of the code block to `undef` so that the whole of the file is read into `$lines` in one fell swoop. I hope this is of use. Cheers, JohnGG Update: I should have placed the regex alternation in a non-capturing group. As it is, it matches lines beginning with 's', lines containing 'foo' anywhere and lines ending with 'e f'. Correct pattern is `m{^(?:s|foo|e f)$}`.
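The difference between the ungrouped and the grouped alternation can be checked on a few strings. This is a small sketch; the extra sample strings (`seafood`, `safe f`) are made up for illustration and do not come from the thread's data.

```perl
use strict;
use warnings;

# Sample lines are hypothetical; only 's', 'foo' and 'e f' appear in the thread.
my @lines = ('s', 'seafood', 'foo', 'e f', 'safe f');

# Without grouping, ^ binds only to "s" and $ binds only to "e f".
my @loose = grep { /^s|foo|e f$/ } @lines;

# With a non-capturing group, both anchors apply to every alternative.
my @exact = grep { /^(?:s|foo|e f)$/ } @lines;

print "loose: @loose\n";
print "exact: @exact\n";
```

With these strings the ungrouped pattern accepts all five lines, while the grouped one accepts only `s`, `foo` and `e f`.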
Re: Matching and replacing the minimum string from the tail of the regex by Anonymous Monk on Aug 09, 2007 at 01:08 UTC
If you know the number of lines between the starting and ending lines of the block of lines you want to elide, something like this might do the trick: `my $start_line = qr{ ^s [^\n]* \n }xsm; # starts with an 's' my $intervening_line = qr{ [^\n]* \n }xsm; # anything my $end_line = qr{ e [ ] p \n }xsm; # ends with an 'e p' my $between = 1; my $line = do { local $/; <DATA> }; # slurp all the data $line =~ s{ $start_line ${intervening_line}{$between} $end_line } {}gxsm;` This outputs: `Random String s foo e f blah blah End of file` Any closer?
Re^2: Matching and replacing the minimum string from the tail of the regex by Anonymous Monk on Aug 09, 2007 at 01:42 UTC
Alternatively, if it is known that the intervening line(s) will never begin with some pattern: `my $starter = qr{ s }xsm; # starts with this string my $never = qr{ s }xsm; # never starts with this string my $ender = qr{ e [ ] p }xsm; # ends with this string my $start_line = qr{ ^ $starter [^\n]* \n }xsm; my $intervening_line = qr{ ^ (?! $never ) [^\n]* \n }xsm; my $end_line = qr{ $ender \n }xsm; my $line = do { local $/; <DATA> }; # slurp all the data $line =~ s{ $start_line $intervening_line* $end_line } {}gxsm; print $line; __DATA__ Random String s erartt e p s foo e f blah blah s adflkja wibble wobble e p End of file` Output: `Random String s foo e f blah blah End of file`
Re^3: Matching and replacing the minimum string from the tail of the regex by abitkin (Monk) on Aug 09, 2007 at 14:13 UTC
For some reason, I had trouble with your code. Instead I turned the problem on its head. `use strict; my $lines = do{ local $/; <DATA> }; # reverse the order of the lines so that the RE matches the # last part of the region first my $reversetext = join("\n", reverse(split("\n",$lines))); $reversetext =~ s/^[^\n]*e p.*?s[^\n]*\n//msg; # put the lines in normal order again $lines = join("\n",reverse(split("\n", $reversetext))); print $lines; __DATA__ Random String s erartt e p s foo e f blah blah s adflkja wibble wobble e p End of file` == Kwyjibo. A big, dumb, balding North American ape. With no chin.
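The line-reversal idea can be sketched as a self-contained function. Once the lines are reversed, each `e p` marker comes before its own start line, so a lazy match from `e p` stops at the nearest `s`, which is the minimum block when read from the tail. The sub name `strip_passed_reversed` is hypothetical, and this version makes two assumptions that go beyond the post: markers are whole lines (exactly `s` and exactly `e p`), and every `e p` line has its own start line.

```perl
use strict;
use warnings;

# Hypothetical helper: removes passed blocks by matching on the reversed log.
sub strip_passed_reversed {
    my ($log) = @_;

    # Reverse the line order so each end marker precedes its start marker.
    my $reversed = join '', map { $_ . "\n" } reverse split /\n/, $log;

    # From an "e p" line, lazily consume up to the nearest following "s" line.
    # /s lets . cross newlines, /m lets ^ match at every line start.
    $reversed =~ s/^e p\n.*?^s\n//msg;

    # Restore the original line order.
    return join '', map { $_ . "\n" } reverse split /\n/, $reversed;
}

my $log = <<'END_LOG';
Random String
s
erartt
e p
s
foo
e f
blah blah
s
adflkja
wibble wobble
e p
End of file
END_LOG

print strip_passed_reversed($log);
```

For this sample log the two passed blocks are removed and the lines `Random String`, `s`, `foo`, `e f`, `blah blah` and `End of file` remain.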
Re: Matching and replacing the minimum string from the tail of the regex by hv (Prior) on Aug 11, 2007 at 21:34 UTC
I'm not sure if there is a better way, but to me the obvious approach is to accept 'start' followed by '(not start)' followed by 'end': `my $lines = do { local $/; <DATA> }; $lines =~ s{ ^ s \n # start line (?: ^ (?! s \n ) .* \n )* # body excluding new start line ^ e\ p $ # end line }{}xmg; print $lines;` Note that this does more work than the original failing substitution, so you can expect it to be slower. I'm assuming that the start of a test is "an 's' followed by a newline", and on that assumption I'm being a bit stricter than your original example about matching that. Hope this helps, Hugo
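The start / not-start / end approach can be sketched as a complete script. The sub name `strip_passed_blocks` is hypothetical, and the sample log is reassembled from the data quoted elsewhere in the thread on the assumption that each marker sits on its own line. Unlike the snippet above, this version also consumes the newline after `e p`, so no blank line is left behind.

```perl
use strict;
use warnings;

# Hypothetical helper: removes every block that runs from a start line "s"
# to the nearest passing end line "e p" with no other start line in between.
sub strip_passed_blocks {
    my ($log) = @_;
    $log =~ s{
        ^ s \n                       # start line: exactly "s"
        (?: ^ (?! s \n ) .* \n )*    # body lines, none of which is a new start line
        ^ e [ ] p \n                 # end line of a passed test: "e p"
    }{}xmg;
    return $log;
}

my $log = <<'END_LOG';
Random String
s
erartt
e p
s
foo
e f
blah blah
s
adflkja
wibble wobble
e p
End of file
END_LOG

print strip_passed_blocks($log);
```

For this sample log the output is `Random String`, `s`, `foo`, `e f`, `blah blah` and `End of file`, one per line. The negative lookahead is what stops a match that begins at the failed test's `s` from running on into the next test and swallowing its `e p`.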

