Should be a simple spaces/digits regex....but I'm turning grey!

Re: Should be a simple spaces/digits regex....but I'm turning grey! (?=)
by tye (Sage) on Aug 05, 2014 at 02:14 UTC

#           n-1 spaces or digits: vv vvvvv: last digit
(?:(?=\s[\s0-9]|[0-9]{2})[\s0-9]){15}[0-9]
#     (^^^^^^^^^^^^^^^^) ^^^^^^^: leading chars

and only allow \s\s, \s\d, or \d\d at each point along the way.
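A quick way to try the pattern above, as a rough sketch: the \A and \z anchors, the helper name is_valid_field, and the 16-character sample strings are all assumptions added for illustration, not part of the original post.

```perl
use strict;
use warnings;

# The pattern from the post, with \A and \z anchors added (an assumption here):
# 15 characters that are each whitespace or a digit, where a digit may never be
# followed by whitespace, and then one final digit.
my $field = qr/\A(?:(?=\s[\s0-9]|[0-9]{2})[\s0-9]){15}[0-9]\z/;

# Hypothetical helper name, used only for this illustration.
sub is_valid_field {
    my ($s) = @_;
    return $s =~ $field ? 1 : 0;
}

# Made-up 16-character samples, built with "x" so the widths are easy to check.
my @samples = (
    (' ' x 13) . '123',                  # leading spaces, then digits
    '1234567890123456',                  # all digits
    (' ' x 9) . '12' . '  ' . '345',     # whitespace after a digit
    '12345' . (' ' x 11),                # trailing spaces
);

printf "|%s| %s\n", $_, is_valid_field($_) ? 'matches' : 'does not match'
    for @samples;
```

The first two samples should match and the last two should not, because the lookahead rejects any position where a digit is followed by whitespace.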

Note that I always use [0-9] and never \d as, these days, \d includes tons of characters besides '0'..'9'.

- tye

Re^2: Should be a simple spaces/digits regex....but I'm turning grey! (?=)

by Bethany (Scribe) on Aug 05, 2014 at 02:16 UTC

Note that I always use [0-9] and never \d as, these days, \d includes tons of characters besides '0'..'9'.

Re^2: Should be a simple spaces/digits regex....but I'm turning grey! (?=)

by Anonymous Monk on Aug 05, 2014 at 02:25 UTC

Note that I always use [0-9] and never \d as, these days, \d includes tons of characters besides '0'..'9'.

Only if you let it :) Use (?a) or /a; it's the ASCII-restrict (or ASCII-safe) modifier.
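A small sketch of the difference, assuming Perl 5.14 or later (where /a is available); the test character, ARABIC-INDIC DIGIT THREE, is just one example of a non-ASCII digit picked for illustration.

```perl
use 5.014;      # the /a modifier needs Perl 5.14 or later
use warnings;

my $arabic_three = "\x{0663}";   # ARABIC-INDIC DIGIT THREE

# Default Unicode rules: \d accepts any Unicode decimal digit.
my $plain = $arabic_three =~ /^\d$/    ? 1 : 0;   # 1
# With /a, \d is restricted to the ASCII digits 0-9.
my $ascii = $arabic_three =~ /^\d$/a   ? 1 : 0;   # 0
# An explicit character class behaves the same way on any Perl version.
my $class = $arabic_three =~ /^[0-9]$/ ? 1 : 0;   # 0

print "plain \\d: $plain, \\d with /a: $ascii, [0-9]: $class\n";
```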

Re^3: Should be a simple spaces/digits regex....but I'm turning grey! (/a)

by tye (Sage) on Aug 05, 2014 at 02:36 UTC

Yeah, if you've got a version of Perl that supports such. Way too many versions of Perl after \d began including Klingon^* digits yet before /a was implemented.

Plus, /a messes with more than just \d. I have yet to run into a single project I was involved in where a string of Klingon^* digits would be correctly parsed as a numeric value. But I've touched plenty of projects where \w including more letters than a-z was quite useful. Perl itself is that way, after all. Sure, you could write (?a:\d) but that's just longer and less clear (and less portable).
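To illustrate the trade-off described here, a hedged sketch (again assuming Perl 5.14 or later; the Greek and Arabic-Indic sample characters are arbitrary choices made for this example): a whole-pattern /a also narrows \w, while (?a:\d) and [0-9] restrict only the digit.

```perl
use 5.014;      # (?a:...) and /a need Perl 5.14 or later
use warnings;

my $alpha = "\x{03B1}";   # GREEK SMALL LETTER ALPHA
my $three = "\x{0663}";   # ARABIC-INDIC DIGIT THREE

# Whole-pattern /a: \w stops matching non-ASCII letters as well.
my $w_unicode = $alpha =~ /^\w$/  ? 1 : 0;   # 1
my $w_ascii   = $alpha =~ /^\w$/a ? 1 : 0;   # 0

# Scoped (?a:\d): \w keeps Unicode rules, only \d is restricted.
my $scoped_ok  = "${alpha}7"        =~ /^\w(?a:\d)$/ ? 1 : 0;   # 1
my $scoped_bad = "${alpha}${three}" =~ /^\w(?a:\d)$/ ? 1 : 0;   # 0

# [0-9] gives the same result as the scoped form, with less typing.
my $class_ok  = "${alpha}7"        =~ /^\w[0-9]$/ ? 1 : 0;      # 1
my $class_bad = "${alpha}${three}" =~ /^\w[0-9]$/ ? 1 : 0;      # 0

print "$w_unicode $w_ascii $scoped_ok $scoped_bad $class_ok $class_bad\n";
```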

So I suspect I'll be sticking with [0-9] for quite a while still.

^* No, Unicode doesn't actually include Klingon (yet, anyway).

- tye

Re: Should be a simple spaces/digits regex....but I'm turning grey!
by Athanasius (Archbishop) on Aug 05, 2014 at 02:28 UTC

Hello viffer,

As others have noted, your question is somewhat under-specified. Here is my take on what you may be looking for:

#! perl
use strict;
use warnings;

my $len = 7;

for my $s ('    999', '   9999', '9999999', '      9', '  99  9', '   9')
{
    if (length $s == $len && $s =~ /^\s*\d+$/)
    {
        printf "%-*s matches\n",        $len + 2, "|$s|";
    }
    else
    {
        printf "%-*s does not match\n", $len + 2, "|$s|";
    }
}

Output:

12:33 >perl 960_SoPW.pl
|    999| matches
|   9999| matches
|9999999| matches
|      9| matches
|  99  9| does not match
|   9|    does not match

12:33 >

When the size of the field changes, no need for a new regex — just change the value of $len.
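The same idea could be wrapped in a small subroutine, sketched here under some assumptions: the name is_right_justified_number is hypothetical, and \z is used instead of $ so that a trailing newline cannot count toward the field length (a choice made for this example, not something the post specifies).

```perl
use strict;
use warnings;

# Hypothetical helper: true when $s is exactly $len characters long and
# consists of optional leading whitespace followed only by digits.
sub is_right_justified_number {
    my ($s, $len) = @_;
    return (length($s) == $len && $s =~ /^\s*\d+\z/) ? 1 : 0;
}

# The field width is just an argument, so one pattern serves every field size.
for my $case ([(' ' x 4) . '999', 7], [(' ' x 13) . '123', 16], ['  99  9', 7]) {
    my ($s, $len) = @$case;
    printf "|%s| (width %d) %s\n", $s, $len,
        is_right_justified_number($s, $len) ? 'matches' : 'does not match';
}
```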

Hope that helps,

Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,

Re: Should be a simple spaces/digits regex....but I'm turning grey!
by Anonymous Monk on Aug 05, 2014 at 02:15 UTC

'     999' or
'   9999' or
'9999999'
'      9'.

Is "field" length 7 or 8 chars? What is a "field" (part of a larger string)?

but when you're checking a 16 byte field ZZZZZZZZZZZZ9V99

Um, there is no description of what you want for that one :) one thing at a time?

whilst doing the job, is getting ludicrously large and unreadable

Instead of one regex, write ~~twelve~~ thirteen??

Stop writing unreadable regex :) Write beautiful regex, not ugly, so you can read :) How can I hope to use regular expressions without creating illegible and unmaintainable code?

Someone at work suggested using an sprintf in its stead within the regex, but I must be honest and say that suggestion has left me clueless on how to do it.

Are you trying to format a string? If you are, go ahead and use sprintf, otherwise ...

write a function?
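If formatting really is the goal, here is a minimal sketch of what sprintf would do, wrapped in a function. The name format_field and the overflow check are assumptions for illustration; the thread does not say how over-long values should be handled.

```perl
use strict;
use warnings;

# Hypothetical helper: right-justify an integer in a space-padded field
# of $len characters, refusing values that would overflow the field.
sub format_field {
    my ($num, $len) = @_;
    my $s = sprintf '%*d', $len, $num;   # '*' takes the width from the argument list
    die "value $num does not fit in $len characters\n" if length($s) > $len;
    return $s;
}

print '|', format_field(999, 7), "|\n";   # |    999|
print '|', format_field(9, 7),   "|\n";   # |      9|
```

Note that this produces such strings; it does not validate existing ones.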

Hope this helps

Re^2: Should be a simple spaces/digits regex....but I'm turning grey!

by viffer (Beadle) on Aug 05, 2014 at 07:26 UTC

Re: Should be a simple spaces/digits regex....but I'm turning grey!
by Laurent_R (Canon) on Aug 05, 2014 at 06:40 UTC

$num = $1 if length($string) == 7 and $string =~ /^\s*(\d+)$/;

Re^2: Should be a simple spaces/digits regex....but I'm turning grey!

by viffer (Beadle) on Aug 05, 2014 at 07:30 UTC

/^\s*\d+$/

I'm officially an idiot :)

Re^3: Should be a simple spaces/digits regex....but I'm turning grey!

by Bethany (Scribe) on Aug 05, 2014 at 15:00 UTC

I'm officially an idiot :)

Heavens, no! An idiot is a person who doesn't ask for help. Glad it works for you.

Re^4: Should be a simple spaces/digits regex....but I'm turning grey!

by viffer (Beadle) on Aug 05, 2014 at 23:15 UTC

Re: Should be a simple spaces/digits regex....but I'm turning grey!
by BillKSmith (Monsignor) on Aug 05, 2014 at 13:26 UTC

use strict;
use warnings;
use Test::Simple qw(no_plan);
my %test_cases = (
    '   999' => 'valid',
    '  9999' => 'valid',
    '999999' => 'valid',
    '     9' => 'valid',
    ' 99 9'  => 'invalid',
);

foreach my $case (keys %test_cases) {
    my $does_match
        = $case =~ /
            ^\s*        # any number of leading spaces
            \d+         # followed by a number of digits
            (?:
            [^\s]*      # but can't have any spaces after a digit has been found
            \d+         # It must contain at least one digit at the end
            )?$
        /x ;
    ok( !($does_match xor $test_cases{$case} eq 'valid'),
        "'$case': $test_cases{$case}" );
}

ok 1 - '999999': valid
ok 2 - '   999': valid
ok 3 - '     9': valid
ok 4 - '  9999': valid
ok 5 - ' 99 9': invalid
1..5

Bill

Re^2: Should be a simple spaces/digits regex....but I'm turning grey!

by Anonymous Monk on Aug 05, 2014 at 14:25 UTC

But that matches " 999foo9", which isn't valid. Also viffer already gave the solution.
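This is easy to confirm with a short sketch. The parent's pattern is reproduced here without its comments and with a non-capturing group; the variable names are made up for this example.

```perl
use strict;
use warnings;

# The parent post's pattern (comments removed): [^\s]* accepts any
# non-whitespace characters, letters included, between the digit runs.
my $permissive = qr/^\s*\d+(?:[^\s]*\d+)?$/;

# The simpler pattern already posted in this thread.
my $strict_re = qr/^\s*\d+$/;

for my $s (' 999foo9', '   999', ' 99 9') {
    printf "|%s| permissive: %s, strict: %s\n", $s,
        ($s =~ $permissive ? 'match' : 'no match'),
        ($s =~ $strict_re  ? 'match' : 'no match');
}
```

Only the first sample should differ between the two: the permissive pattern accepts ' 999foo9' while the strict one rejects it.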

Re^3: Should be a simple spaces/digits regex....but I'm turning grey!

by BillKSmith (Monsignor) on Aug 05, 2014 at 16:12 UTC

Bill

Re^4: Should be a simple spaces/digits regex....but I'm turning grey!

by Anonymous Monk on Aug 05, 2014 at 21:20 UTC

Re: Should be a simple spaces/digits regex....but I'm turning grey!
by Anonymous Monk on Aug 05, 2014 at 12:57 UTC

Tasks like this can also be flummoxed by greed. The greed of a regular expression, that is, not the Deadly Sin. Left to its own devices, a regex quantifier will normally consume the longest substring that qualifies. It must be told not to be so greedy; to take the shortest one instead. Probably not applicable to this case but worth keeping in mind.
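A small sketch of the difference, using a made-up sample string: adding ? after a quantifier makes it non-greedy.

```perl
use strict;
use warnings;

my $text = '<b>one</b> and <b>two</b>';

# Greedy: .* runs to the last </b> it can still match against.
my ($greedy) = $text =~ m{<b>(.*)</b>};
# Non-greedy: .*? stops at the first </b>.
my ($lazy)   = $text =~ m{<b>(.*?)</b>};

print "greedy: $greedy\n";   # greedy: one</b> and <b>two
print "lazy:   $lazy\n";     # lazy:   one
```

For a fully anchored pattern such as /^\s*\d+$/ the choice makes no difference to whether the string matches, which is why it does not bite in this thread.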