Minimizing the amount of place holders on long identical regex

thanos1983 has asked for the wisdom of the Perl Monks concerning the following question:

I am really bad at regex, and my best working attempt is, from my point of view, really poor in syntax. I am sure it can be done in a different and shorter way.

I currently have a string that is 24 numeric characters long, and I have created a regex that splits the string character by character so that I can extract the odd place holders, which contain the actual information I need.

What I have so far is:

#!/usr/bin/env perl
use strict;
use warnings;

my $sample = "041424344454647484940414";
$sample =~ /([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])/; # 24 times the same pattern
print "$1$3$5$7$9$11$13$15$17$19$21$23\n";

__END__
$ perl test.pl
012345678901

This is the desired output and it works, but I was wondering whether there is a more elegant way that avoids replicating the same group 24 times while still letting me read the odd place holders ($1$3$5...).

I could potentially split the string character by character and store the output in an array. From there I would remove the even elements and turn the array back into a string with join. But in my case this is not possible, because the system for which I am writing the regex does not support the split or join functions; it only supports C format command syntax, so I am using Perl as a test tool before implementation.

If anyone has an idea how to make this regex shorter, feel free to drop a comment.

Thanks in advance for your time and effort, BR.

Seeking for Perl wisdom...on the process of learning...not there...yet!

Re: Minimizing the amount of place holders on long identical regex by tybalt89 (Monsignor) on Jun 20, 2018 at 16:05 UTC
#!/usr/bin/perl
# https://perlmonks.org/?node_id=1217011
use strict;
use warnings;

my $sample = "041424344454647484940414";
my @odds = $sample =~ /(\d)\d/g;
print @odds, "\n";
Re^2: Minimizing the amount of place holders on long identical regex by thanos1983 (Parson) on Jun 20, 2018 at 16:22 UTC
Hello tybalt89, Thank you for your time and effort. The solution works, but in my case I cannot use an array as output, nor join to put the array into a string. This is why I am using place holders. Is there any way to use only the odd place holders to store the characters? Thanks again for your time and effort.
Re^3: Minimizing the amount of place holders on long identical regex by jimpudar (Pilgrim) on Jun 20, 2018 at 16:40 UTC
This is starting to sound like an XY problem. Can you give a little more context on why you cannot use an array as output, or join to put the array into a string? With that additional information we might be able to help you overcome this. Best, Jim πάντων χρημάτων μέτρον έστιν άνθρωπος.
Re^4: Minimizing the amount of place holders on long identical regex by thanos1983 (Parson) on Jun 20, 2018 at 16:59 UTC
Re^5: Minimizing the amount of place holders on long identical regex by ikegami (Patriarch) on Jun 20, 2018 at 17:20 UTC
Re: Minimizing the amount of place holders on long identical regex by hippo (Bishop) on Jun 20, 2018 at 17:05 UTC
This is not much more elegant, but it does save long lines in your source.

#!/usr/bin/env perl
use strict;
use warnings;

my $sample = "041424344454647484940414";
my $re = '([0-9])' x 24;
$sample =~ /$re/;
print "$1$3$5$7$9$11$13$15$17$19$21$23\n";
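A possible variation on the repetition trick above, offered only as a sketch: if the even digits are never needed, they can be matched without being captured, so the groups are numbered $1 to $12 with no gaps. The variable name $odds and the \A / \z anchoring are assumptions of this example and are not part of hippo's reply.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $sample = "041424344454647484940414";

# Capture each odd-position digit; match the digit after it without capturing.
my $re = '([0-9])[0-9]' x 12;

# Anchoring is an addition of this sketch: it rejects input that is not
# exactly 24 digits instead of silently matching a 24-digit substring.
$sample =~ /\A$re\z/ or die "unexpected input: $sample\n";

my $odds = "$1$2$3$4$5$6$7$8$9$10$11$12";
print "$odds\n";
```

Whether this helps depends on the target system: it still needs twelve place holders, only numbered consecutively.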
Re^2: Minimizing the amount of place holders on long identical regex by thanos1983 (Parson) on Jun 20, 2018 at 17:21 UTC
Hello hippo, Thanks for the time and effort. I had no idea that it can be written like this :) BR / Thanos
Re: Minimizing the amount of place holders on long identical regex by haukex (Archbishop) on Jun 20, 2018 at 17:06 UTC
"But in my case this is not possible, because the system for which I am writing the regex does not support the split or join functions; it only supports C format command syntax, so I am using Perl as a test tool before implementation."

Does the system you're on (which one - PCRE?) support search and replace?

my $sample = "041424344454647484940414";
(my $output = $sample) =~ s/.\K.//g;
# alternative s/(.)./$1/g;
die $output unless $output eq "012345678901";
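To make the two substitutions in this reply easier to compare, here is a minimal sketch that runs both on the sample string. \K has been part of Perl since 5.10; whether the poster's target system supports it is not stated in the thread, so the capture-group form is shown as the fallback. The variable names are illustrative only.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $sample = "041424344454647484940414";

# Variant 1: match two characters, but \K keeps the first one out of the
# text that gets replaced, so only the second character is deleted.
(my $with_keep = $sample) =~ s/.\K.//g;

# Variant 2: no \K needed; capture the first character of each pair and
# replace the whole pair with it.
(my $with_capture = $sample) =~ s/(.)./$1/g;

print "$with_keep\n";
print "$with_capture\n";
```

Both variants leave $sample untouched because the substitution is applied to a copy.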
Re^2: Minimizing the amount of place holders on long identical regex by thanos1983 (Parson) on Jun 20, 2018 at 17:19 UTC
Hello haukex, Awesome thanks a lot that worked perfectly. BR / Thanos
Re: Minimizing the amount of place holders on long identical regex by Eily (Monsignor) on Jun 20, 2018 at 16:09 UTC
perl -lE "print '041424344454647484940414' =~ /(\d).?/g"
012345678901

The /g flag on a regex means global search, so the regex is applied repeatedly to the string, each time starting from the end of the previous match. In list context (e.g. when you assign the result to an array), this returns the list of captures (or, if there are none, the list of full matches). In my example above, I capture one digit, then try to match one more character. If perl manages to match that additional character (i.e. it is not the end of the string), the next attempt starts looking after that position. More info on that in perlretut and in the description of the m// operator in perlop.
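One consequence of making the second character optional, sketched here under the assumption that an odd-length input could occur (the thread only ever shows a 24-digit string): /(\d).?/g still captures a final unpaired digit, while /(\d)\d/g drops it. The input "12345" is hypothetical, and join is used only to display the result.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $odd_length = "12345";    # hypothetical input with an unpaired last digit

# Second character optional: the trailing '5' is still captured.
my $optional = join '', $odd_length =~ /(\d).?/g;

# Second digit required: the trailing '5' cannot complete a pair and is lost.
my $required = join '', $odd_length =~ /(\d)\d/g;

print "optional second character: $optional\n";
print "required second digit:     $required\n";
```

For the 24-digit sample both patterns return the same twelve digits, so the difference only matters if the input length is not guaranteed.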
Re^2: Minimizing the amount of place holders on long identical regex by thanos1983 (Parson) on Jun 20, 2018 at 16:28 UTC
Hello Eily, Thank you for your time and effort. The solution works, but in my case I cannot use an array as output, because I cannot use join to put the array into a string. I am using place holders as a workaround for this problem.

#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;

my $sample = "041424344454647484940414";
$sample =~ /([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])/; # 24 times the same pattern
print "$1$3$5$7$9$11$13$15$17$19$21$23\n";

my $new = "041424344454647484940414";
$new =~ /(\d).?/g;
print "$1$3$5$7$9$11$13$15$17$19$21$23\n";

my @array = $new =~ /(\d).?/g;
print Dumper \@array;

__END__
$ perl test.pl
012345678901
Use of uninitialized value $3 in concatenation (.) or string at test.pl line 12.
Use of uninitialized value $5 in concatenation (.) or string at test.pl line 12.
Use of uninitialized value $7 in concatenation (.) or string at test.pl line 12.
Use of uninitialized value $9 in concatenation (.) or string at test.pl line 12.
Use of uninitialized value $11 in concatenation (.) or string at test.pl line 12.
Use of uninitialized value $13 in concatenation (.) or string at test.pl line 12.
Use of uninitialized value $15 in concatenation (.) or string at test.pl line 12.
Use of uninitialized value $17 in concatenation (.) or string at test.pl line 12.
Use of uninitialized value $19 in concatenation (.) or string at test.pl line 12.
Use of uninitialized value $21 in concatenation (.) or string at test.pl line 12.
Use of uninitialized value $23 in concatenation (.) or string at test.pl line 12.
0
$VAR1 = [
          '1',
          '2',
          '3',
          '4',
          '5',
          '6',
          '7',
          '8',
          '9',
          '0',
          '1'
        ];

Thanks again for your time and effort.
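The warnings above follow from how /g behaves in scalar context: each evaluation performs a single match and sets only $1, because the pattern has one capture group, so $3 and the rest stay undefined. The later list-context match then continues from where the scalar match stopped, which is why the dumped array starts at '1' and has 11 elements. A sketch of collecting the digits without an array or join, assuming plain string concatenation is acceptable on the target system (the thread does not confirm this):

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $new    = "041424344454647484940414";
my $result = '';

# In scalar context, each pass of the loop resumes after the previous match
# and sets $1 to the digit captured on that pass.
while ($new =~ /(\d).?/g) {
    $result .= $1;
}

print "$result\n";
```

The loop ends when the pattern no longer matches, at which point the match position of $new is reset automatically.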
Re: Minimizing the amount of place holders on long identical regex by choroba (Cardinal) on Jun 20, 2018 at 22:30 UTC
Is pack/unpack a good solution for you?

#! /usr/bin/perl
use warnings;
use strict;

my $sample = "041424344454647484940414";
my $expected = '012345678901';

# '(ax)*' reads one character, skips one, and repeats to the end of the string
my $result = pack '(a)*', unpack '(ax)*', $sample;

use Test::More tests => 1;
is $result, $expected;

($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
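A short sketch of how the unpack template selects characters, assuming the repeat count * on both groups: in a template, a reads one character and x skips one byte, so swapping their order picks the even positions instead. The variable names are illustrative and not from the reply above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $sample = "041424344454647484940414";

# (ax)* : take a character, skip the next one, repeat until the string ends.
my @odd_chars = unpack '(ax)*', $sample;
my $odds      = pack '(a)*', @odd_chars;

# (xa)* : skip a character, take the next one, repeat until the string ends.
my $evens = pack '(a)*', unpack '(xa)*', $sample;

print scalar(@odd_chars), " characters selected\n";
print "odd positions:  $odds\n";
print "even positions: $evens\n";
```

Here pack '(a)*' plays the role of join, which may or may not be usable on the poster's system.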
Re: Minimizing the amount of place holders on long identical regex by bliako (Monsignor) on Jun 21, 2018 at 00:23 UTC
Just realised that you want to port that to C eventually, so the below is no good. I would just use ikegami's - don't forget to `free`.

You want to separate even/odd: here are 2 variations on tybalt89's which retain the even matches as well as the odd. Assuming your system can take it:

use strict;
use warnings;

# trivial variation on tybalt89's to retain
# both odd and even one after the other
my $sample = "041424344454647484940414";
my @allin = $sample =~ /(\d)(\d)/g;

# more ordered output using subs in regex
my @odds_vs_evens = ();
$sample =~ s/(\d)(\d)/push(@odds_vs_evens,[$1,$2])/eg;
print $_->[0].'->'.$_->[1]."\n" for @odds_vs_evens;

0->4
1->4
2->4
3->4
4->4
5->4
6->4
7->4
8->4
9->4
0->4
1->4

But this is what I was after:

my $sample = "041424344454647484940414";
my %odds_vs_evens = $sample =~ /(\d)(\d)/g;

Alas it is not ordered and the World just lost some more balance.

bw, bliako

