PerlMonks  

Re^7: regex find and replace with a twist

by talexb (Chancellor)
on Jul 17, 2018 at 17:43 UTC ( [id://1218674]=note )


in reply to Re^6: regex find and replace with a twist
in thread regex find and replace with a twist

Good point -- I was only answering the follow-up, not the reply. And now I've spent an entertaining hour tinkering with Perl regular expressions, time which is never wasted. :)

Here's my improved code, which shows that I still have some distance to go in writing regular expressions:

tab@music3:~/2018-0717$ cat !$
cat pm2.pl
#!/usr/bin/perl

use strict;
use warnings;

{
    my $word = q($-.%abc&/);
    my @group1 = (
        $word =~ m/(?:            # Non-capturing group of captured
                    ([^a-zA-Z]+)  # .. non-matching
                    ([a-zA-Z]+)   # .. following by matching
                   )+             # .. for as many groups as possible
                   ([^a-zA-Z]+)   # followed by matching.
                  /x
    );
    my @group2 = map { $_ =~ /[a-zA-Z]/ ? split( //, $_ ) : $_ } @group1;
    my $new_word = join ( '', map { "[$_]" } @group2 );

    print "$word -> $new_word\n";
}
tab@music3:~/2018-0717$ perl pm2.pl
$-.%abc&/ -> [$-.%][a][b][c][&/]
tab@music3:~/2018-0717$

Alex / talexb / Toronto

Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

Re^8: regex find and replace with a twist
by AnomalousMonk (Archbishop) on Jul 17, 2018 at 18:12 UTC

    I don't understand the effort to use split and map in solutions. Even with this latest version of the assumed implied requirement, a good old s/// seems the best way:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my $word = '$-.%aBc&/d-E';
     ;;
     (my $new_word = $word) =~ s{ ([^[:alpha:]]+ | [[:alpha:]]) }{[$1]}xmsg;
     ;;
     print qq{'$word' -> '$new_word'};
    "
    '$-.%aBc&/d-E' -> '[$-.%][a][B][c][&/][d][-][E]'
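
The `(my $new_word = $word) =~ s{...}{...}` idiom above copies the string first and then substitutes on the copy. As a minimal sketch, assuming Perl 5.14 or later, the `/r` modifier returns the modified copy directly and leaves the original untouched. The sub name `bracket_tokens` is hypothetical and used only for illustration; the pattern is the same one as in the one-liner.

```perl
use strict;
use warnings;

# Hypothetical helper: bracket each run of non-letters and each single letter.
# Assumes Perl 5.14+ for the /r (non-destructive substitution) modifier.
sub bracket_tokens {
    my ($word) = @_;
    return $word =~ s{ ( [^[:alpha:]]+ | [[:alpha:]] ) }{[$1]}xgr;
}

my $word = '$-.%aBc&/d-E';
print "'$word' -> '", bracket_tokens($word), "'\n";
```

This should print the same line as the one-liner; `$word` itself is unchanged after the call.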


    Give a man a fish:  <%-{-{-{-<

      In the end it comes down to readability, and readability depends on the comprehension level of the reader.

      You have a much higher level of regex-fu than I do, so yours is clear and understandable. I just had to look at it carefully before I could clearly see what it was doing.

      PS Good job -- that's a very nice piece of code.

      Alex / talexb / Toronto

      Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

      I don't understand the effort to use split and map in solutions.
      As the one who first introduced them, I wasn't making any special effort to use split or map. As I (mis)understood the spec, it was the most natural approach to a solution. At a human-level description, if you have a string and want to put brackets around each character, do you search the string for characters and, each time you find one, change it to a three-character sequence of the same character preceded and followed by brackets ($word =~ s/(.)/[$1]/g); or do you consider the text as a list of characters (split '', $word) and put brackets around each of them (map { "[$_]" } @chars)? Personally, if I were doing it by hand, I'd take the latter approach, and I expect pretty much anyone else would, too.

      But, as already pointed out, I misunderstood the spec as wanting to bracket all characters, not only letters. With that limitation in place, the "treat it as a list of characters, not searching in a string" approach still works with a minor modification, and much more simply than the earlier attempts to fix it:

      $new_word = join "", map { $_ =~ /[a-zA-Z]/ ? "[$_]" : $_ } split "", $word;
      Again following the "how would I do it by hand?" approach, this changes the map from "put brackets around each character" to "put brackets around characters in a-zA-Z and leave any other characters as-is", thus transforming "layer123" to "[l][a][y][e][r]123".
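
The two mental models described above, searching within the string versus walking a list of characters, can be put side by side for the letters-only reading of the spec. This is only a sketch: the sub names `via_list` and `via_subst` are hypothetical, and both assume that "letter" means ASCII a-z and A-Z.

```perl
use strict;
use warnings;

# List approach: split into characters, bracket only the letters, rejoin.
sub via_list {
    my ($word) = @_;
    return join '', map { $_ =~ /[a-zA-Z]/ ? "[$_]" : $_ } split //, $word;
}

# Search approach: find each letter in place and wrap it in brackets.
sub via_subst {
    my ($word) = @_;
    (my $new_word = $word) =~ s/([a-zA-Z])/[$1]/g;
    return $new_word;
}

for my $word ('layer123', '$-.%abc&/') {
    printf "%-10s list: %-20s subst: %s\n",
        $word, via_list($word), via_subst($word);
}
```

Both subs should agree on every input, so the choice between them comes down to which reads more naturally to whoever maintains the code.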
Re^8: regex find and replace with a twist
by AnomalousMonk (Archbishop) on Jul 18, 2018 at 12:54 UTC

    BTW: The method used here fails in the following cases:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "for my $word (qw(XYZ$-.%abc&/ $-.%abc&/XYZ $-.%abc&/XYZ+=*)) {
       my @group1 = (
         $word =~ m/(?: ([^a-zA-Z]+) ([a-zA-Z]+) )+ ([^a-zA-Z]+) /x
       );
       my @group2 = map { $_ =~ /[a-zA-Z]/ ? split( //, $_ ) : $_ } @group1;
       my $new_word = join ( '', map { qq{[$_]} } @group2 );
       print qq{'$word' -> '$new_word'};
     }
    "
    'XYZ$-.%abc&/' -> '[$-.%][a][b][c][&/]'
    '$-.%abc&/XYZ' -> '[$-.%][a][b][c][&/]'
    '$-.%abc&/XYZ+=*' -> '[&/][X][Y][Z][+=*]'
    (Again, this assumes that we can, to begin with, even arrive at a common understanding of what is required. :)
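
The three failures above appear to trace back to two properties of the pattern. It is unanchored and must both start and end on non-letters, so leading or trailing letters such as XYZ are silently skipped. And a capture group inside a repeated `(?: ... )+` keeps only the text from its final iteration, so earlier runs are lost. The sketch below shows the second effect in isolation, then one possible way around both: a list-context `m//g` match that returns every token. The sub name `bracket_all` is hypothetical, and it assumes the same reading of the requirement as the `s///` solution earlier in the thread (each letter bracketed singly, each run of non-letters bracketed as a unit).

```perl
use strict;
use warnings;

# A capture group inside a quantified group keeps only its last iteration.
my @last_only = ( 'ab12cd34' =~ /(?:([a-z]+)([0-9]+))+/ );
print "@last_only\n";    # prints "cd 34"

# One possible alternative: in list context, m//g returns every token,
# so nothing is lost however many runs the word contains.
sub bracket_all {
    my ($word) = @_;
    my @tokens = $word =~ /([^a-zA-Z]+|[a-zA-Z])/g;
    return join '', map { "[$_]" } @tokens;
}

for my $word ('XYZ$-.%abc&/', '$-.%abc&/XYZ', '$-.%abc&/XYZ+=*') {
    print "'$word' -> '", bracket_all($word), "'\n";
}
```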


    Give a man a fish:  <%-{-{-{-<

      Yup -- I agree 100%. As I was writing the regex, I thought, hmm, this would break if (your example above) .. then I stopped going down that road, and just finished my post. [1]

      As often happens, we were dealing with an incompletely specified problem. But even the discussion around the possible solutions is a worthwhile endeavour. :)

      1. "Just #$%@ing Do It." --petdance|alester, undated quotation

      Alex / talexb / Toronto

      Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.
