Re: Replace newlines only if not inside braces
by LanX (Saint) on Feb 11, 2013 at 13:25 UTC
As long as the brace groups are not nested, you could do it by separating the blocks with a split and handling them differently.
use Data::Dump;
my $data = q|
foo
bar
{{
alpha
beta
}}
baz
|;
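# Braces are escaped because newer perls complain about some unescaped literal "{".
# The capturing parens make split return the {{...}} blocks as well, at the odd indices.
my @splits = split /(\{\{.*?\}\})/s, $data;
dd \@splits;
my $result = "";
while (@splits) {                         # test the array, not the block: a block may be "" or "0"
    my $block = shift @splits;            # text outside braces
    $block =~ s/\n/<br>\n/g;
    $result .= $block;
    $result .= shift @splits if @splits;  # {{...}} block, copied unchanged
}
print $result;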
output
["\nfoo\nbar\n", "{{\nalpha\nbeta\n}}", "\nbaz\n"]
<br>
foo<br>
bar<br>
{{
alpha
beta
}}<br>
baz<br>
I refrain from trying a complicated and potentially unmaintainable one-line regex solution.
Some come to mind¹, but I don't see the necessity if there are no other restrictions (like lack of memory) involved.
UPDATE
¹) like
- looping with while (/({{.*?}})/gs) (and \G and pos)
- using /e to do substitution within substitutions
- complicated look-ahead and look-behind assertions
- using \K somehow to restrict replacement
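As a rough illustration of the /e idea from the list above (a sketch, not code from the original post): the pattern either captures a whole {{...}} block or matches a bare newline, and the replacement code decides what to put back. The helper name br_with_e is made up for this example, and non-nested braces are assumed.

```perl
use strict;
use warnings;

# Hypothetical helper; braces are escaped because newer perls
# complain about some unescaped literal "{" in patterns.
sub br_with_e {
    my ($text) = @_;
    # Alternative 1 captures a complete {{...}} block into $1.
    # Alternative 2 matches a newline outside any block ($1 is undef).
    # With /e the replacement is Perl code: blocks are put back
    # unchanged, other newlines become "<br>\n".
    $text =~ s/(\{\{.*?\}\})|\n/ defined $1 ? $1 : "<br>\n" /gse;
    return $text;
}

print br_with_e("\nfoo\nbar\n{{\nalpha\nbeta\n}}\nbaz\n");
```

For the sample data this should print the same text as the split version above. Because the block alternative is tried first at every position, newlines inside a block are consumed as part of the block and never reach the second alternative.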
Re: Replace newlines only if not inside braces
by smls (Friar) on Feb 11, 2013 at 13:54 UTC
Replacing a target pattern everywhere except inside specific chunks can be achieved with a regex of the following form:
s/((?:CHUNK_TO_BE_EXCLUDED|.)*?)TARGET/$1REPLACEMENT/gs
In your example, TARGET would be \n and REPLACEMENT would be \n<br>. The CHUNK_TO_BE_EXCLUDED pattern would have to match a whole block wrapped in double braces. You can use {{.*?}}\n, unless braces can be nested and you need to guarantee that you match properly balanced pairs, in which case perlfaq6 has a how-to for constructing the pattern you need.
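A minimal sketch of that template filled in for the double-brace case, assuming non-nested blocks. The helper name br_skip_chunks is hypothetical, and the braces are escaped because newer perls complain about some unescaped literal "{" characters.

```perl
use strict;
use warnings;

# Hypothetical helper name, for illustration only.
sub br_skip_chunks {
    my ($text) = @_;
    # CHUNK_TO_BE_EXCLUDED = \{\{.*?\}\}\n   TARGET = \n   REPLACEMENT = \n<br>
    # $1 lazily collects excluded chunks and ordinary characters up to
    # the next target newline, and is written back unchanged.
    $text =~ s/((?:\{\{.*?\}\}\n|.)*?)\n/$1\n<br>/gs;
    return $text;
}

print br_skip_chunks("foo\nbar\n{{\nalpha\nbeta\n}}\nbaz\n");
```

Since the chunk pattern includes the newline that follows }}, that newline is left alone along with the ones inside the block. The approach relies on each block being followed by a newline: if it is not, the chunk alternative fails and the `.` alternative can step into the block one character at a time.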
Re: Replace newlines only if not inside braces
by tmharish (Friar) on Feb 11, 2013 at 13:37 UTC
And if they are nested, see this thread,
and specifically this excellent post by 7stud.
Re: Replace newlines only if not inside braces
by ww (Archbishop) on Feb 11, 2013 at 14:01 UTC
As an example, not as a direct answer to your specific goal, one way would be to use a negated character class:
C:\>perl -E "use 5.016; use strict; use warnings;
my $str='abXcd{X}efXyz';
my @matches;
while ($str =~ /[^{](X)[^}]/g) {push @matches, $1;}
say @matches;"
XX
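One caveat with the character-class form is that [^{] and [^}] each consume a character, so an X at the very start or end of the string is not found. A hedged variant using zero-width look-behind and look-ahead assertions avoids that; like the example above it only checks the single character on each side of X, and the helper name unbraced_x is invented for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every X that is not directly
# preceded by "{" and not directly followed by "}".
sub unbraced_x {
    my ($text) = @_;
    # (?<!\{) and (?!\}) are zero-width, so they also succeed at the
    # string boundaries, where there is no neighbouring character.
    my @found = $text =~ /(?<!\{)(X)(?!\})/g;
    return @found;
}

my @matches = unbraced_x('XabXcd{X}efX');
print scalar(@matches), "\n";
```

On this string the look-around version finds the X at the start, the one between "ab" and "cd", and the one at the end, while the X inside the braces is skipped.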
Re: Replace newlines only if not inside braces
by trizen (Hermit) on Feb 11, 2013 at 14:01 UTC
You can match and then discard from the match whatever you don't want to replace. For example, match the group {{...}} and use \K so that only the part to its right is replaced, while everything matched to its left is kept as it is.
Code:
$data =~ s<(?:{{.*?}}\K)?\n>{<br>\n}gs;
print $data;
$data =~ s<(?:{{.*?}})?\K\n>{<br>\n}gs;
print $data;
1) see the or-clause in Re: Replace newlines only if not inside braces for a workaround
|
No, there is no difference. I just thought it would be a little more efficient: I assumed that with \K outside the group, the $& variable gets cleaned up on every substitution, which is not really necessary; it should only need cleaning when something to the left of \K has actually matched. Anyway, your version is more readable and does basically the same thing. :)
Alternatively, to work with strings that contain {{...}} groups that are not followed by a newline, this code should do it:
$data =~ s<(?:{{.*?}}|[^\n])*\K\n>{<br>\n}gs;
print $data;
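To see what this last pattern does, here is a small sketch that wraps it in a helper (the name br_outside_braces is made up) and feeds it a block followed by more text on the same line. Braces are escaped here because newer perls complain about some unescaped literal "{" characters, and \K needs Perl 5.10 or later.

```perl
use strict;
use warnings;
use 5.010;    # \K was introduced in Perl 5.10

# Hypothetical helper around the pattern shown above.
sub br_outside_braces {
    my ($text) = @_;
    # The loop consumes whole {{...}} blocks or single non-newline
    # characters, so the only way to reach \n is from outside a block.
    # \K keeps everything consumed so far out of the replaced text.
    $text =~ s/(?:\{\{.*?\}\}|[^\n])*\K\n/<br>\n/gs;
    return $text;
}

print br_outside_braces("foo\n{{\na\n}} tail\nbaz\n");
```

Here the block is followed by " tail" rather than a newline, and the newlines inside it should still come through untouched. An opening {{ with no closing }} is outside what the pattern protects.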
Did that s/// win an obfuscation contest somewhere?
use warnings;
use strict;
use 5.012;
my $data = <<'END_OF_TEXT';
foo
bar
{{
alpha
beta
}}
baz
END_OF_TEXT
$data =~ s/
    (?:            #Non-capturing group
        {{.*?}}    #Text enclosed by double braces
        \K         #Exclude what's to the left of \K from match
    )?             #Match whole group 0 or 1 time
    \n
/\n<br>/gxms;
say $data;
--output:--
foo
<br>bar
<br>{{
alpha
beta
}}
<br>baz
<br>
It would make more sense to put the newlines after the breaks if you were trying to pretty print some html.
Did that s/// win an obfuscation contest somewhere?
so what's your contribution?
It would make more sense to put the newlines after the breaks if you were trying to pretty print some html.
minor problems of minor minds...