mldvx4 has asked for the wisdom of the Perl Monks concerning the following question:
I am looking to simplify a pattern. If I have a string my $x = "0.01 NaN 2.30 4.44"; then the following pattern finds the items present:
my $r1 = qr/([Na0-9\.\-\+]+)\s+
([Na0-9\.\-\+]+)\s+
([Na0-9\.\-\+]+)\s+
([Na0-9\.\-\+]+)/x;
Notice that the same capture group criteria are repeated. I wonder how I may write that so it is simpler, shorter, and all on one line. Here is some pseudo-code to try to show what I am aiming for: my $r1 = qr/(?=([Na0-9\.\-\+]+)\s+){4}/
However, I've tried that and some permutations without luck:
#!/usr/bin/perl
use strict;
use warnings;
my $x = "0.01 NaN 2.30 4.44";
# the following works as desired
my $r1 = qr/([Na0-9\.\-\+]+)\s+
([Na0-9\.\-\+]+)\s+
([Na0-9\.\-\+]+)\s+
([Na0-9\.\-\+]+)/x;
my ($d, $e, $f, $g) = ($x =~ m/$r1/x );
print qq($d, $e, $f, $g\n);
# the following finds the first number twice
my $r2 = qr/(?=(([Na0-9\.\-\+]+)\s*)){4}/x;
($d, $e, $f, $g) = ($x =~ m/$r2/x );
print qq($d, $e, $f, $g\n);
# the following finds a null prior to the first item
my $r3 = qr/((?=([Na0-9\.\-\+]+)\s*){4})/x;
($d, $e, $f, $g) = ($x =~ m/$r3/x );
print qq($d, $e, $f, $g\n);
exit(0);
How can I write that pattern so that the pattern it contains is repeated but not locked into the values found in the very first match? Is this a case for using recursive patterns?
Re: Repeating a capture group pattern within a pattern
by Corion (Patriarch) on Jul 15, 2024 at 07:49 UTC
This might be cheating, but when retrieving multiple repeated matches, I often use /g after validating that the line looks somewhat valid:
use Carp;   # provides croak, used below
my $re4 = qr/\b([Na0-9\.\-\+]+)\b/; # capture a floating point number
my @vals = ($x =~ m/$re4/gx );
croak "Invalid line '$x'" if @vals != 4;
($d, $e, $f, $g) = @vals;
Often, I first identify the section without capturing and then parse it in a second step (but that's not what you wanted):
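my $float = qr/\b[Na0-9\.\-\+]+\b/;   # no capture group here; captures are added where needed
croak "Invalid line '$x'" if $x !~ /((?:$float(?:\s+|$)){4})/;   # $1 is the whole section
print "Found numbers '$1'\n";
my @vals = $1 =~ /($float)/g;         # one capture per match, so one list item per number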
I did not find a way to capture the repeated values in one go.
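The two-step idea (validate the whole line first, then extract with /g) can be wrapped up roughly as follows. This is only a sketch: the helper name parse_fields and the anchoring with \A and \z are assumptions of mine, not something from the posts above, and the token class is the loose one from the question.

```perl
use strict;
use warnings;
use Carp;

# Loose token class from the question, deliberately without a capture group.
my $num = qr/[Na0-9.+-]+/;

# Hypothetical helper: check that the line holds exactly $count tokens,
# then return them as a list.
sub parse_fields {
    my ($line, $count) = @_;
    my $rest = $count - 1;
    # Step 1: validate the overall shape without capturing anything.
    $line =~ /\A\s*$num(?:\s+$num){$rest}\s*\z/
        or croak "Invalid line '$line'";
    # Step 2: in list context, /g returns every captured token.
    return $line =~ /($num)/g;
}

my ($d, $e, $f, $g) = parse_fields("0.01 NaN 2.30 4.44", 4);
print "$d, $e, $f, $g\n";
```

Keeping the capture out of $num means the same building block can be reused in both steps without the number of capture groups changing.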
Re: Repeating a capture group pattern within a pattern
by haukex (Archbishop) on Jul 15, 2024 at 10:39 UTC
#!/usr/bin/env perl
use warnings;
use strict;
my $x = "0.01 NaN 2.30 4.44";
# the following works as desired
my $r1 = qr/([Na0-9\.\-\+]+)\s+
([Na0-9\.\-\+]+)\s+
([Na0-9\.\-\+]+)\s+
([Na0-9\.\-\+]+)/x;
my ($d, $e, $f, $g) = ($x =~ m/$r1/x );
print "good: $d, $e, $f, $g\n";
# recursive subpatterns
my $rx = qr{ ([Na0-9\.\-\+]+)
\s+ ((?1)) \s+ ((?1)) \s+ ((?1)) }x;
($d, $e, $f, $g) = $x =~ $rx or die;
print "good? $d, $e, $f, $g\n";
# or match / validate first, then split
my $ry = qr{ ([Na0-9\.\-\+]+) (?: \s+ (?1)){3} }x;
$x =~ $ry or die;
($d, $e, $f, $g) = split ' ', $&;
print "good? $d, $e, $f, $g\n";
Re: Repeating a capture group pattern within a pattern
by hippo (Archbishop) on Jul 15, 2024 at 09:21 UTC
If I were doing this for real I would use /g as Corion suggests. However, for interest's sake, here is one way to construct the pattern without repetition in the code.
#!/usr/bin/env perl
use strict;
use warnings;
my $x = "0.01 NaN 2.30 4.44";
my $r1 = join '\s+', ('([Na0-9\.\-\+]+)') x 4;
print "r1 is '$r1'\n";
my ($d, $e, $f, $g) = ($x =~ m/$r1/);
print qq($d, $e, $f, $g\n);
Note that you don't need all those backslashes, so the inner character class can be shortened to just [Na0-9.+-]; this has no effect on the end result.
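A quick way to convince yourself of that: inside a bracketed character class, . and + are already literal, and - is literal when it comes last. The sample tokens below are made up for illustration.

```perl
use strict;
use warnings;

my $escaped = qr/^[Na0-9\.\-\+]+$/;   # class as written in the question
my $plain   = qr/^[Na0-9.+-]+$/;      # same class without the backslashes

my $differences = 0;
for my $token (qw(0.01 NaN -2.30 +4.44 1e5 abc)) {
    my $with    = $token =~ $escaped ? 'match' : 'no match';
    my $without = $token =~ $plain   ? 'match' : 'no match';
    $differences++ if $with ne $without;
    print "$token: escaped=$with, plain=$without\n";
}
print "differences: $differences\n";
```

Both patterns accept and reject exactly the same tokens here; 1e5 and abc fail under either because e, b and c are not in the class.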
If this is an XY Problem, then perhaps if you explain what you are actually trying to do someone could suggest a better approach.
<gml:DataBlock>
<gml:rangeParameters/>
<gml:doubleOrNilReasonTupleList>
1016.1 7.7 20.6 13.8 72.9 215.0 6.64 3.94 5.33 0.0 31.3 31.3 0.0 0.0 7
+16.7 8461717.0 NaN 7750386.0 2224256.0 38507.5 10.9 1016.4 7.7 21.1 1
+3.7 71.1 222.0 6.82 4.67 4.95 0.0 2.1 2.1 0.0 0.0 727.6 11081057.0 Na
+N 10146792.0 2372829.8 40682.2 11.2 1016.4 7.7 21.6 13.2 67.7 220.0 6
+.71 4.37 5.1 0.0 0.8 0.8 0.0 0.0 749.9 13780637.0 NaN 12614863.0 4011
+495.3 44407.4 11.3 1016.4 7.7 21.7 12.3 64.8 216.0 6.64 3.98 5.31 0.0
+ 0.0 0.0 0.0 0.0 693.8 16278537.0 NaN 14898300.0 5267137.5 47598.7 11
+.1 1016.1 7.7 22.1 9.8 56.4 224.0 6.72 4.7 4.81 0.0 0.0 0.0 0.0 0.0 6
+13.8 18488268.0 NaN 16917212.0 7048784.0 54656.4 11.3 1015.9 7.7 21.6
+ 10.3 59.0 223.0 6.52 4.48 4.74 0.0 0.0 0.0 0.0 0.0 508.5 20319020.0
+NaN 18588124.0 8487086.0 52728.9 11.1 1016.0 7.7 20.6 11.8 66.4 217.0
+ 6.22 3.81 4.93 0.0 0.2 0.0 0.0 0.2 387.0 21712000.0 NaN 19859482.0 9
+532787.0 45898.5 10.5 1016.1 7.7 19.9 12.7 71.8 219.0 5.97 3.74 4.64
+0.0 3.3 0.0 0.0 3.3 257.6 22639382.0 NaN 20706258.0 10155480.0 39930.
+0 9.9 1016.4 7.7 19.4 13.2 74.9 221.0 5.25 3.43 3.97 0.0 1.8 0.0 0.0
+1.8 140.2 23144020.0 NaN 21166678.0 10436559.0 36483.1 9.3 1016.4 7.7
+ 19.2 12.5 73.1 232.0 4.04 3.19 2.44 0.0 6.8 2.7 4.3 0.0 47.7 2331552
+6.0 NaN 21323276.0 10494672.0 38582.3 8.0 1016.3 7.7 19.2 11.2 68.6 2
+62.0 2.52 2.49 0.34 0.0 53.8 0.0 24.2 39.1 1.8 23322102.0 NaN 2132958
+4.0 10494672.0 43729.5 6.0 1016.5 7.7 19.0 11.0 68.8 247.0 2.3 2.08 0
+.98 0.0 100.0 0.0 99.5 100.0 0.1 23322252.0 NaN 21329518.0 10494672.0
+ 43367.8 3.7 1016.7 7.7 18.6 12.0 73.4 225.0 2.3 1.57 1.67 0.0 54.2 0
+.0 46.3 14.7 0.0 23322252.0 NaN 21329518.0 10494672.0 38219.0 3.6 101
+6.8 7.7 18.4 14.1 81.9 196.0 2.97 0.75 2.88 0.0 86.4 0.1 82.5 21.9 0.
+1 23322356.0 NaN 21329518.0 10494672.0 28237.8 4.4 1017.0 7.7 18.5 14
+.4 82.5 199.0 3.04 0.87 2.92 0.0 99.7 53.1 65.1 98.6 0.0 23322262.0 N
+aN 21329518.0 10494672.0 27807.3 5.0 1017.0 7.7 18.7 14.1 80.8 195.0
+2.52 0.57 2.46 0.0 100.0 71.6 23.1 99.9 0.1 23322488.0 NaN 21329518.0
+ 10494672.0 29676.3 4.6 1017.2 7.7 18.8 14.0 80.0 196.0 2.96 0.78 2.8
+5 0.18 100.0 100.0 100.0 99.7 0.4 23323732.0 NaN 21330926.0 10494672.
+0 7893.4 4.6 1017.1 7.7 18.6 14.2 81.3 155.0 2.05 -0.98 1.8 1.3 93.7
+58.4 63.3 57.6 15.0 23377624.0 NaN 21380274.0 10494672.0 29305.2 4.6
+1017.1 7.7 18.6 13.8 80.1 170.0 3.61 -0.55 3.56 1.3 92.8 89.9 30.2 1.
+7 72.1 23637418.0 NaN 21617728.0 10494761.0 31078.9 5.7 1017.3 7.7 18
+.9 14.9 83.3 178.0 2.97 -0.13 2.98 1.3 86.3 55.2 68.3 0.0 130.9 24108
+452.0 NaN 22048990.0 10494719.0 27161.4 5.4 1017.5 7.7 19.7 15.3 81.4
+ 197.0 2.29 0.48 2.24 1.3 57.2 22.3 44.9 0.0 302.4 25197118.0 NaN 230
+44054.0 10494678.0 28930.3 4.8 1017.7 7.7 20.4 14.6 76.5 198.0 1.82 0
+.28 1.78 1.3 66.0 10.1 62.2 0.0 413.4 26685568.0 NaN 24404956.0 10494
+752.0 34279.5 4.0 1017.5 7.7 20.7 13.8 72.6 161.0 2.67 -1.21 2.33 1.3
+ 84.8 0.0 84.8 0.0 429.6 28232080.0 NaN 25819190.0 10495451.0 38634.6
+ 4.6 1017.4 7.7 21.2 13.9 71.5 159.0 3.15 -1.39 2.81 1.3 97.7 0.2 97.
+7 0.0 444.8 29833140.0 NaN 27281036.0 10495527.0 39613.4 5.3 1017.6 7
+.7 22.0 13.7 68.4 161.0 2.93 -1.21 2.62 1.3 86.6 1.1 71.8 51.9 558.1
+31842300.0 NaN 29115360.0 10495682.0 42999.5 5.7 1017.4 7.7 23.6 11.8
+ 58.0 143.0 2.81 -1.89 2.04 1.3 5.5 0.0 4.9 0.6 628.6 34105344.0 NaN
+31181590.0 10496878.0 53270.7 5.5 1017.3 7.7 23.9 10.7 54.4 139.0 3.6
+8 -2.68 2.49 1.3 10.6 0.0 7.9 2.9 673.6 36530392.0 NaN 33394718.0 105
+04540.0 55330.6 6.5 1017.2 7.7 23.6 11.7 57.8 141.0 4.98 -3.24 3.81 1
+.3 37.6 0.0 5.4 34.0 671.6 38948084.0 NaN 35600932.0 10572389.0 53759
+.2 8.3 1017.3 7.7 23.0 12.4 61.2 145.0 5.0 -2.93 4.05 1.3 24.6 0.0 24
+.6 0.0 563.3 40976088.0 NaN 37452260.0 10577521.0 51063.5 8.4 1017.1
+7.7 22.9 13.0 63.6 145.0 4.66 -2.77 3.78 1.3 28.6 0.0 7.3 22.9 465.8
+42652852.0 NaN 38982672.0 10578041.0 48385.1 8.3 1017.0 7.7 22.8 12.7
+ 62.7 143.0 4.44 -2.77 3.46 1.3 38.8 0.0 34.2 6.9 350.8 43915868.0 Na
+N 40134736.0 10578632.0 49670.0 7.6 1016.7 7.7 22.4 12.8 64.2 138.0 4
+.17 -2.85 3.03 1.3 10.5 0.0 2.4 8.3 232.2 44751760.0 NaN 40897580.0 1
+0579198.0 48024.7 7.1 1016.5 7.7 21.7 14.3 71.1 128.0 3.16 -2.53 1.89
+ 1.3 0.9 0.0 0.0 0.9 131.4 45224988.0 NaN 41329512.0 10638878.0 40243
+.3 6.6 1016.6 7.7 21.2 14.9 74.9 124.0 2.97 -2.45 1.71 1.3 25.2 0.0 2
+.4 23.3 44.4 45384908.0 NaN 41475220.0 10681477.0 35910.2 4.9 1016.4
+7.7 20.9 15.2 77.1 119.0 2.62 -2.3 1.25 1.3 6.4 0.0 5.8 0.7 1.6 45390
+616.0 NaN 41480160.0 10681174.0 33633.9 4.5 1016.3 7.7 20.6 15.2 78.0
+ 77.0 3.04 -2.98 -0.49 1.3 1.5 0.0 1.0 0.4 0.1 45390552.0 NaN 4147990
+0.0 10681174.0 32898.9 4.4 1016.3 7.7 19.8 15.5 82.0 83.0 3.6 -3.54 -
+0.59 1.3 39.6 1.5 0.0 38.6 0.1 45390516.0 NaN 41479900.0 10681174.0 2
+8384.6 5.4 1016.2 7.7 19.8 15.2 80.9 80.0 3.99 -3.89 -0.88 1.3 79.8 0
+.0 0.1 79.8 0.0 45390256.0 NaN 41479900.0 10681174.0 29798.2 6.2 1016
+.1 7.7 19.9 14.5 78.0 83.0 4.55 -4.5 -0.65 1.3 72.8 1.8 5.5 70.7 0.1
+45390380.0 NaN 41479900.0 10681174.0 33125.9 6.9 1015.7 7.7 19.7 14.8
+ 79.8 79.0 4.93 -4.83 -1.06 1.3 82.5 4.1 0.9 81.6 0.1 45390640.0 NaN
+41479900.0 10681174.0 31134.5 7.6 1015.4 7.7 19.6 15.2 81.5 85.0 5.18
+ -5.15 -0.54 1.3 60.2 1.5 9.8 55.2 2.6 45399888.0 NaN 41488504.0 1068
+1174.0 29202.1 8.0 1015.0 7.7 19.7 16.0 84.4 90.0 5.29 -5.28 -0.08 1.
+3 72.2 0.4 15.8 66.8 32.6 45517296.0 NaN 41595768.0 10681174.0 25601.
+0 8.5 1014.9 7.7 19.9 16.2 84.5 91.0 5.63 -5.62 -0.07 1.3 62.2 0.1 3.
+6 60.7 99.1 45873920.0 NaN 41921652.0 10681188.0 25504.3 8.9 1014.7 7
+.7 20.5 16.2 82.2 97.0 5.27 -5.26 0.48 1.3 93.6 2.4 5.9 93.0 169.2 46
+483036.0 NaN 42477320.0 10681384.0 28098.3 8.7 1014.6 7.7 20.8 16.1 8
+0.8 100.0 5.36 -5.29 0.87 1.3 99.7 2.6 12.5 99.7 196.9 47191680.0 NaN
+ 43123740.0 10681074.0 29718.5 8.7 1014.2 7.7 21.7 16.3 78.2 101.0 5.
+17 -5.08 0.9 1.3 100.0 0.0 5.8 100.0 290.9 48239060.0 NaN 44080204.0
+10681266.0 32552.5 8.4 1014.1 7.7 22.3 16.4 76.7 102.0 5.69 -5.59 1.0
+4 1.3 100.0 0.0 0.3 100.0 344.6 49479400.0 NaN 45212932.0 10694658.0
+34180.5 9.1 1013.5 7.7 22.0 16.4 77.6 94.0 6.21 -6.2 0.29 1.3 100.0 0
+.0 14.7 100.0 258.3 50409360.0 NaN 46061948.0 10725906.0 33338.7 9.7
+1013.5 7.7 21.4 15.2 75.4 95.0 6.7 -6.69 0.48 1.3 100.0 0.0 79.6 100.
+0 115.0 50823272.0 NaN 46440372.0 10725795.0 35955.3 10.4 1013.1 7.7
+22.0 15.2 73.4 91.0 6.86 -6.86 0.07 1.3 99.5 0.0 78.3 97.8 235.6 5167
+1476.0 NaN 47214844.0 10727690.0 38242.3 10.8
</gml:doubleOrNilReasonTupleList>
</gml:DataBlock>
I've written them several times over the years and they have not fixed their data nor deigned to even reply.
#!/usr/bin/env perl
use strict;
use warnings;
my $str = '1016.1 7.7 20.6 13.8 72.9 215.0 6.64 3.94 5.33 0.0 31.3 31.
+3 0.0 0.0 716.7 8461717.0 NaN 7750386.0 2224256.0 38507.5 10.9 1016.4
+ 7.7 21.1 13.7 71.1 222.0 6.82 4.67 4.95 0.0 2.1 2.1 0.0 0.0 727.6 11
+081057.0 NaN 10146792.0 2372829.8 40682.2 11.2 1016.4 7.7 21.6 13.2 6
+7.7 220.0 6.71 4.37 5.1 0.0 0.8 0.8 0.0 0.0 749.9 13780637.0 NaN 1261
+4863.0 4011495.3 44407.4 11.3 1016.4 7.7 21.7 12.3 64.8 216.0 6.64 3.
+98 5.31 0.0 0.0 0.0 0.0 0.0 693.8 16278537.0 NaN 14898300.0 5267137.5
+ 47598.7 11.1 1016.1 7.7 22.1 9.8 56.4 224.0 6.72 4.7 4.81 0.0 0.0 0.
+0 0.0 0.0 613.8 18488268.0 NaN 16917212.0 7048784.0 54656.4 11.3 1015
+.9 7.7 21.6 10.3 59.0 223.0 6.52 4.48 4.74 0.0 0.0 0.0 0.0 0.0 508.5
+20319020.0 NaN 18588124.0 8487086.0 52728.9 11.1 1016.0 7.7 20.6 11.8
+ 66.4 217.0 6.22 3.81 4.93 0.0 0.2 0.0 0.0 0.2 387.0 21712000.0 NaN 1
+9859482.0 9532787.0 45898.5 10.5 1016.1 7.7 19.9 12.7 71.8 219.0 5.97
+ 3.74 4.64 0.0 3.3 0.0 0.0 3.3 257.6 22639382.0 NaN 20706258.0 101554
+80.0 39930.0 9.9 1016.4 7.7 19.4 13.2 74.9 221.0 5.25 3.43 3.97 0.0 1
+.8 0.0 0.0 1.8 140.2 23144020.0 NaN 21166678.0 10436559.0 36483.1 9.3
+ 1016.4 7.7 19.2 12.5 73.1 232.0 4.04 3.19 2.44 0.0 6.8 2.7 4.3 0.0 4
+7.7 23315526.0 NaN 21323276.0 10494672.0 38582.3 8.0 1016.3 7.7 19.2
+11.2 68.6 262.0 2.52 2.49 0.34 0.0 53.8 0.0 24.2 39.1 1.8 23322102.0
+NaN 21329584.0 10494672.0 43729.5 6.0 1016.5 7.7 19.0 11.0 68.8 247.0
+ 2.3 2.08 0.98 0.0 100.0 0.0 99.5 100.0 0.1 23322252.0 NaN 21329518.0
+ 10494672.0 43367.8 3.7 1016.7 7.7 18.6 12.0 73.4 225.0 2.3 1.57 1.67
+ 0.0 54.2 0.0 46.3 14.7 0.0 23322252.0 NaN 21329518.0 10494672.0 3821
+9.0 3.6 1016.8 7.7 18.4 14.1 81.9 196.0 2.97 0.75 2.88 0.0 86.4 0.1 8
+2.5 21.9 0.1 23322356.0 NaN 21329518.0 10494672.0 28237.8 4.4 1017.0
+7.7 18.5 14.4 82.5 199.0 3.04 0.87 2.92 0.0 99.7 53.1 65.1 98.6 0.0 2
+3322262.0 NaN 21329518.0 10494672.0 27807.3 5.0 1017.0 7.7 18.7 14.1
+80.8 195.0 2.52 0.57 2.46 0.0 100.0 71.6 23.1 99.9 0.1 23322488.0 NaN
+ 21329518.0 10494672.0 29676.3 4.6 1017.2 7.7 18.8 14.0 80.0 196.0 2.
+96 0.78 2.85 0.18 100.0 100.0 100.0 99.7 0.4 23323732.0 NaN 21330926.
+0 10494672.0 7893.4 4.6 1017.1 7.7 18.6 14.2 81.3 155.0 2.05 -0.98 1.
+8 1.3 93.7 58.4 63.3 57.6 15.0 23377624.0 NaN 21380274.0 10494672.0 2
+9305.2 4.6 1017.1 7.7 18.6 13.8 80.1 170.0 3.61 -0.55 3.56 1.3 92.8 8
+9.9 30.2 1.7 72.1 23637418.0 NaN 21617728.0 10494761.0 31078.9 5.7 10
+17.3 7.7 18.9 14.9 83.3 178.0 2.97 -0.13 2.98 1.3 86.3 55.2 68.3 0.0
+130.9 24108452.0 NaN 22048990.0 10494719.0 27161.4 5.4 1017.5 7.7 19.
+7 15.3 81.4 197.0 2.29 0.48 2.24 1.3 57.2 22.3 44.9 0.0 302.4 2519711
+8.0 NaN 23044054.0 10494678.0 28930.3 4.8 1017.7 7.7 20.4 14.6 76.5 1
+98.0 1.82 0.28 1.78 1.3 66.0 10.1 62.2 0.0 413.4 26685568.0 NaN 24404
+956.0 10494752.0 34279.5 4.0 1017.5 7.7 20.7 13.8 72.6 161.0 2.67 -1.
+21 2.33 1.3 84.8 0.0 84.8 0.0 429.6 28232080.0 NaN 25819190.0 1049545
+1.0 38634.6 4.6 1017.4 7.7 21.2 13.9 71.5 159.0 3.15 -1.39 2.81 1.3 9
+7.7 0.2 97.7 0.0 444.8 29833140.0 NaN 27281036.0 10495527.0 39613.4 5
+.3 1017.6 7.7 22.0 13.7 68.4 161.0 2.93 -1.21 2.62 1.3 86.6 1.1 71.8
+51.9 558.1 31842300.0 NaN 29115360.0 10495682.0 42999.5 5.7 1017.4 7.
+7 23.6 11.8 58.0 143.0 2.81 -1.89 2.04 1.3 5.5 0.0 4.9 0.6 628.6 3410
+5344.0 NaN 31181590.0 10496878.0 53270.7 5.5 1017.3 7.7 23.9 10.7 54.
+4 139.0 3.68 -2.68 2.49 1.3 10.6 0.0 7.9 2.9 673.6 36530392.0 NaN 333
+94718.0 10504540.0 55330.6 6.5 1017.2 7.7 23.6 11.7 57.8 141.0 4.98 -
+3.24 3.81 1.3 37.6 0.0 5.4 34.0 671.6 38948084.0 NaN 35600932.0 10572
+389.0 53759.2 8.3 1017.3 7.7 23.0 12.4 61.2 145.0 5.0 -2.93 4.05 1.3
+24.6 0.0 24.6 0.0 563.3 40976088.0 NaN 37452260.0 10577521.0 51063.5
+8.4 1017.1 7.7 22.9 13.0 63.6 145.0 4.66 -2.77 3.78 1.3 28.6 0.0 7.3
+22.9 465.8 42652852.0 NaN 38982672.0 10578041.0 48385.1 8.3 1017.0 7.
+7 22.8 12.7 62.7 143.0 4.44 -2.77 3.46 1.3 38.8 0.0 34.2 6.9 350.8 43
+915868.0 NaN 40134736.0 10578632.0 49670.0 7.6 1016.7 7.7 22.4 12.8 6
+4.2 138.0 4.17 -2.85 3.03 1.3 10.5 0.0 2.4 8.3 232.2 44751760.0 NaN 4
+0897580.0 10579198.0 48024.7 7.1 1016.5 7.7 21.7 14.3 71.1 128.0 3.16
+ -2.53 1.89 1.3 0.9 0.0 0.0 0.9 131.4 45224988.0 NaN 41329512.0 10638
+878.0 40243.3 6.6 1016.6 7.7 21.2 14.9 74.9 124.0 2.97 -2.45 1.71 1.3
+ 25.2 0.0 2.4 23.3 44.4 45384908.0 NaN 41475220.0 10681477.0 35910.2
+4.9 1016.4 7.7 20.9 15.2 77.1 119.0 2.62 -2.3 1.25 1.3 6.4 0.0 5.8 0.
+7 1.6 45390616.0 NaN 41480160.0 10681174.0 33633.9 4.5 1016.3 7.7 20.
+6 15.2 78.0 77.0 3.04 -2.98 -0.49 1.3 1.5 0.0 1.0 0.4 0.1 45390552.0
+NaN 41479900.0 10681174.0 32898.9 4.4 1016.3 7.7 19.8 15.5 82.0 83.0
+3.6 -3.54 -0.59 1.3 39.6 1.5 0.0 38.6 0.1 45390516.0 NaN 41479900.0 1
+0681174.0 28384.6 5.4 1016.2 7.7 19.8 15.2 80.9 80.0 3.99 -3.89 -0.88
+ 1.3 79.8 0.0 0.1 79.8 0.0 45390256.0 NaN 41479900.0 10681174.0 29798
+.2 6.2 1016.1 7.7 19.9 14.5 78.0 83.0 4.55 -4.5 -0.65 1.3 72.8 1.8 5.
+5 70.7 0.1 45390380.0 NaN 41479900.0 10681174.0 33125.9 6.9 1015.7 7.
+7 19.7 14.8 79.8 79.0 4.93 -4.83 -1.06 1.3 82.5 4.1 0.9 81.6 0.1 4539
+0640.0 NaN 41479900.0 10681174.0 31134.5 7.6 1015.4 7.7 19.6 15.2 81.
+5 85.0 5.18 -5.15 -0.54 1.3 60.2 1.5 9.8 55.2 2.6 45399888.0 NaN 4148
+8504.0 10681174.0 29202.1 8.0 1015.0 7.7 19.7 16.0 84.4 90.0 5.29 -5.
+28 -0.08 1.3 72.2 0.4 15.8 66.8 32.6 45517296.0 NaN 41595768.0 106811
+74.0 25601.0 8.5 1014.9 7.7 19.9 16.2 84.5 91.0 5.63 -5.62 -0.07 1.3
+62.2 0.1 3.6 60.7 99.1 45873920.0 NaN 41921652.0 10681188.0 25504.3 8
+.9 1014.7 7.7 20.5 16.2 82.2 97.0 5.27 -5.26 0.48 1.3 93.6 2.4 5.9 93
+.0 169.2 46483036.0 NaN 42477320.0 10681384.0 28098.3 8.7 1014.6 7.7
+20.8 16.1 80.8 100.0 5.36 -5.29 0.87 1.3 99.7 2.6 12.5 99.7 196.9 471
+91680.0 NaN 43123740.0 10681074.0 29718.5 8.7 1014.2 7.7 21.7 16.3 78
+.2 101.0 5.17 -5.08 0.9 1.3 100.0 0.0 5.8 100.0 290.9 48239060.0 NaN
+44080204.0 10681266.0 32552.5 8.4 1014.1 7.7 22.3 16.4 76.7 102.0 5.6
+9 -5.59 1.04 1.3 100.0 0.0 0.3 100.0 344.6 49479400.0 NaN 45212932.0
+10694658.0 34180.5 9.1 1013.5 7.7 22.0 16.4 77.6 94.0 6.21 -6.2 0.29
+1.3 100.0 0.0 14.7 100.0 258.3 50409360.0 NaN 46061948.0 10725906.0 3
+3338.7 9.7 1013.5 7.7 21.4 15.2 75.4 95.0 6.7 -6.69 0.48 1.3 100.0 0.
+0 79.6 100.0 115.0 50823272.0 NaN 46440372.0 10725795.0 35955.3 10.4
+1013.1 7.7 22.0 15.2 73.4 91.0 6.86 -6.86 0.07 1.3 99.5 0.0 78.3 97.8
+ 235.6 51671476.0 NaN 47214844.0 10727690.0 38242.3 10.8';
my @fields = split / /, $str;
my $NaNcount = grep { $_ eq 'NaN' } @fields;
print "There are " . scalar @fields .
" fields in the line of which $NaNcount are NaN.\n";
If you really only want the first four, then split / /, $str, 5 will bundle all the stuff you don't want into the unused 5th list item.
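For instance, with a shortened sample of the data (the variable name $rest is just for illustration):

```perl
use strict;
use warnings;

# Shortened sample of the line above.
my $str = '1016.1 7.7 20.6 13.8 72.9 215.0 6.64';

# A limit of 5 means at most five pieces: four fields plus the unsplit remainder.
my ($d, $e, $f, $g, $rest) = split / /, $str, 5;

print "first four: $d, $e, $f, $g\n";
print "remainder:  '$rest'\n";
```

If the fields might be separated by runs of whitespace or by newlines, split ' ', $str, 5 is the more forgiving variant, since the special ' ' pattern splits on any whitespace and skips leading whitespace.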
HTH.
Re: Repeating a capture group pattern within a pattern
by LanX (Saint) on Jul 15, 2024 at 11:14 UTC
"There is more than one way to do it" ™ depending on your use case.
The "problem" is not that you can't repeat a pattern in Perl, but that when an explicit (...) group is repeated by a quantifier, only the capture from its last iteration is kept.
One way around that is an embedded code block (?{...}) that stores each capture as it is made.
Another is to generate the explicit capture groups programmatically.
DB<25> $_='1016.1 7.7 NaN -20.6 3.8 72.9 215.0'
DB<26> $pat = qr(NaN|-?\d+\.\d)
DB<27> x m/($pat)/g
0 1016.1
1 7.7
2 'NaN'
3 '-20.6'
4 3.8
5 72.9
6 215.0
DB<28> x m/($pat)(?:\s|$)/g
0 1016.1
1 7.7
2 'NaN'
3 '-20.6'
4 3.8
5 72.9
6 215.0
DB<29> x (m/($pat)(?:\s|$)/g)[0..3]
0 1016.1
1 7.7
2 'NaN'
3 '-20.6'
...
DB<33> x m/(?:($pat)(?:\s|$)){4}/
0 '-20.6'
DB<34> x m/(?:($pat)(?:\s|$)(?{push @a,$1})){4}/
0 '-20.6'
DB<35> x @a
0 1016.1
1 7.7
2 'NaN'
3 '-20.6'
DB<36>
...
DB<47> $delim = '(?:\s|$)'
DB<48> p $explicit= "($pat)$delim" x 4
((?^u:NaN|-?\d+\.\d))(?:\s|$)((?^u:NaN|-?\d+\.\d))(?:\s|$)((?^u:NaN|-?\d+\.\d))(?:\s|$)((?^u:NaN|-?\d+\.\d))(?:\s|$)
DB<49> x m/$explicit/g
0 1016.1
1 7.7
2 'NaN'
3 '-20.6'
DB<50>
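The debugger session above can be condensed into a standalone script. This is a sketch that uses the same sample line and pattern; the variable names @last and @all are mine.

```perl
use strict;
use warnings;

my $line  = '1016.1 7.7 NaN -20.6 3.8 72.9 215.0';
my $pat   = qr/NaN|-?\d+\.\d/;
my $delim = '(?:\s|$)';

# A quantified capture group: four iterations, but only the last one survives.
my @last = $line =~ /(?:($pat)$delim){4}/;
print "quantified group: @last\n";

# Generated pattern with four explicit groups: every field gets its own capture.
my $explicit = "($pat)$delim" x 4;
my @all = $line =~ /$explicit/;
print "explicit groups:  @all\n";
```

The string repetition operator x builds the same pattern one would otherwise type out by hand, so the capture groups stay numbered 1 to 4.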
Re: Repeating a capture group pattern within a pattern
by talexb (Chancellor) on Jul 16, 2024 at 00:25 UTC
To me, the simpler solution would just be to split on a space, then use the regex on each of the four elements.
It's also possible that I'm missing something.
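Something along these lines, as a rough sketch. The per-field pattern is my assumption about what counts as a valid value (NaN or an optionally signed decimal); adjust it to the real data.

```perl
use strict;
use warnings;

my $x = "0.01 NaN 2.30 4.44";

# Assumed field format: NaN, or an optionally signed number with optional decimals.
my $field_re = qr/\A(?:NaN|[+-]?\d+(?:\.\d+)?)\z/;

# split ' ' splits on any run of whitespace and ignores leading whitespace.
my @fields = split ' ', $x;
die "expected 4 fields, got " . @fields . "\n" unless @fields == 4;

# Check each element on its own; the regex no longer needs any repetition.
for my $field (@fields) {
    die "bad field '$field'\n" unless $field =~ $field_re;
}

my ($d, $e, $f, $g) = @fields;
print "$d, $e, $f, $g\n";
```

The count check and the per-field check are separate, so an error message can say exactly which field was malformed.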
Alex / talexb / Toronto
Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.
Re: Repeating a capture group pattern within a pattern
by Anonymous Monk on Jul 18, 2024 at 05:46 UTC
my $x = "0.01 NaN 2.30 4.44";
Match the form rather than the content of the data:
my $r1 = qr/(\S+)\s+(\S+)\s+(\S+)\s+(\S+)/;
Or ditch the pattern and split, as others suggest:
my ($d, $e, $f, $g) = split /\s+/, $x;
Re: Repeating a capture group pattern within a pattern
by WithABeard (Beadle) on Jul 25, 2024 at 11:08 UTC
Maybe I'm missing something, but this doesn't seem too difficult:
> perl -e 'my $x = "0.01 NaN 2.30 4.44";
my ($d, $e, $f, $g) = ($x =~ /([Na0-9\.\-\+]+\b)/g);
print "d: $d, e: $e, f: $f, g: $g";'
output:
d: 0.01, e: NaN, f: 2.30, g: 4.44
the /g flag makes it return a list of all matches.
I changed \s+ to \b (word boundary) since the last piece doesn't have a space after it.