Regular expression for hexadecimal number

by isha (Sexton)
on Jul 30, 2007 at 12:46 UTC (#629540)
isha has asked for the wisdom of the Perl Monks concerning the following question:

I want to check whether a value is between 0x0 and 0xffff. How can I do this with a regular expression?
Please tell me the regular expression to check a hexadecimal number.

Replies are listed 'Best First'.
Re: Regular expression for hexadecimal number
by FunkyMonk (Chancellor) on Jul 30, 2007 at 12:55 UTC
    /^            # start of string
     0x           # 0x prefix
     [0-9A-F]     # a hex digit
     {1,4}        # 1 to 4 of them
    $             # end of string
    /xi           # x = allow spaces & comments in regexp
                  # i = ignore case

    or just /^0x[0-9A-F]{1,4}$/i

    update: slight rewording to keep ww happy;)
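
    A quick way to see what the short form accepts is to run it over a few sample strings. This is only a sketch: the sub name is_hex16 and the sample inputs are assumptions made for illustration, and \A ... \z is used in place of ^ ... $ so that a trailing newline is rejected as well. Note that /i also lets an uppercase 0X prefix through.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $str is "0x" followed by 1 to 4 hex digits.
# \z is used instead of $ so that a trailing newline is rejected too.
sub is_hex16 {
    my ($str) = @_;
    return $str =~ /\A0x[0-9A-F]{1,4}\z/i ? 1 : 0;
}

for my $s (qw(0x0 0xffff 0XBEEF 0x12345 ffff 0x)) {
    printf "%-8s %s\n", $s, is_hex16($s) ? 'ok' : 'rejected';
}
```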

Re: Regular expression for hexadecimal number
by Anno (Deacon) on Jul 30, 2007 at 13:40 UTC
    Checking whether a string is the hexadecimal representation of some number (i.e. consists only of hexadecimal digits) is easily done with a regex. Assuming your string is in $_,
    does that. I'm using the named character class [[:xdigit:]] in preference to the equivalent [0-9A-Fa-f].
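
    A sketch of what such a digits-only check might look like, assuming the string carries no 0x prefix. The anchors, the sub name all_xdigits and the sample strings are assumptions added here for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the argument consists only of hex digits.
sub all_xdigits {
    my ($str) = @_;
    return $str =~ /\A[[:xdigit:]]+\z/ ? 1 : 0;
}

for (qw(cafe CAFE 12ab 0x12 xyz)) {
    print "$_: ", ( all_xdigits($_) ? 'hex digits only' : 'not hex' ), "\n";
}
```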

    Checking whether a number is in a specific range is usually quite hard to do with a regex. While your specific case is an exception (the range is all numbers with up to four digits, which can easily be checked with a regex) I wouldn't make use of that. Instead I'd use a numeric comparison which also works for other ranges:

    /^[[:xdigit:]]+$/ and hex( $_) <= 0xFFFF;
    There is no need to check for >= 0 because Perl treats hexadecimal as unsigned, so every valid hex string will be non-negative.
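
    The combined check can be wrapped in a small sub so the upper bound becomes a parameter. This is a minimal sketch under stated assumptions: the name hex_in_range is hypothetical, the input has no 0x prefix, the bound is assumed to be at most 0xFFFFFFFF, and the length guard is an addition here to keep hex() away from very long digit strings.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $str is a run of hex digits whose value
# does not exceed $max (default 0xFFFF; assumed to be <= 0xFFFFFFFF).
sub hex_in_range {
    my ($str, $max) = @_;
    $max = 0xFFFF unless defined $max;
    return 0 unless $str =~ /\A[[:xdigit:]]+\z/;

    # Drop leading zeros (keeping at least one digit), then refuse anything
    # longer than 8 digits so hex() never sees a value beyond 32 bits.
    (my $digits = $str) =~ s/\A0+(?=.)//;
    return 0 if length($digits) > 8;

    return hex($digits) <= $max ? 1 : 0;
}

for my $s (qw(0 FFFF 0000ffff 10000 deadbeefcafe xyz)) {
    print "$s: ", ( hex_in_range($s) ? 'in range' : 'rejected' ), "\n";
}
```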


    Update: Trivial (=silly) mistake corrected.

Re: Regular expression for hexadecimal number
by grinder (Bishop) on Jul 30, 2007 at 13:24 UTC

    Something like....


    ... should do the trick. Curlies are pretty slow, relatively speaking, though. You will achieve approximately equivalent results with


    except that this will also match "0xdeadbeefcafe" and the like (which might be considered a feature).

    Note that you do not want to use, as suggested elsewhere in this thread:

        /0x[\da-f]{1,4}/i # or /0x[\da-f]+/i

    ...since that would allow 0X2a (note the uppercase X) which is not a legal hexadecimal definition.
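
    The difference shows up when a fully case-insensitive pattern is run next to one that keeps the prefix case-sensitive. This is an illustration only: (?-i:...) is a standard Perl construct, but the exact patterns, the anchors and the variable names below are assumptions, not necessarily the ones used in this reply.

```perl
use strict;
use warnings;

# /i applies to the whole pattern, so the prefix matches "0X" as well.
my $loose = qr/\A0x[\da-f]{1,4}\z/i;

# (?-i:...) switches case-insensitivity off for the prefix only.
my $prefix_sensitive = qr/\A(?-i:0x)[\da-f]{1,4}\z/i;

for my $s (qw(0x2a 0X2a 0x2A 0X2A)) {
    printf "%-5s loose=%d prefix_sensitive=%d\n",
        $s,
        ( $s =~ $loose            ? 1 : 0 ),
        ( $s =~ $prefix_sensitive ? 1 : 0 );
}
```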

    • another intruder with the mooring in the heart of the Perl

      Hmm, this answers a question that's been in the back of my mind for a while, "Can you turn on/off modifiers in the middle of a regex."

      Thanks for prompting me to pop over to Perldoc to learn about it.

      A question, though: The goal is to keep the 0x case sensitive (to disallow 0X, as you said), but let the hex that follows match in a case insensitive way (i.e. match both 0xFF and 0xff)? If so, doesn't (?-i:PATTERN) actually turn off the insensitivity modifier (which wasn't on?). Based on my (probably wrong) interpretation of what I just learned at Perldoc, it seems like you'd want (?i:PATTERN) to turn it on.

        What you describe is accomplished not by using a modifier, but by being explicit in the definition of the char class:


        There's no reason to have or use modifiers on only part of a regex, when one can simply use the mechanisms already available to accomplish the same goal.
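
        As a sketch of the two options discussed in this exchange (the patterns below are assumptions written for illustration), an explicit character class and a scoped (?i:...) group accept the same strings:

```perl
use strict;
use warnings;

# No modifiers at all: both cases are spelled out in the character class.
my $explicit = qr/\A0x[0-9a-fA-F]{1,4}\z/;

# Same intent with a scoped modifier: (?i:...) turns case-insensitivity on
# for the digits only, leaving the "0x" prefix case-sensitive.
my $scoped = qr/\A0x(?i:[0-9a-f]{1,4})\z/;

for my $s (qw(0xff 0xFF 0xFf 0XFF)) {
    printf "%-5s explicit=%d scoped=%d\n",
        $s,
        ( $s =~ $explicit ? 1 : 0 ),
        ( $s =~ $scoped   ? 1 : 0 );
}
```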

        Ramblings and references
        The Code that can be seen is not the true Code
        I haven't found a problem yet that can't be solved by a well-placed trebuchet
Re: Regular expression for hexadecimal number
by radiantmatrix (Parson) on Jul 30, 2007 at 15:18 UTC

    I'm curious about your actual requirement, because I'm not sure a regex is the way you want to go. If you have a hexadecimal value already, working with it as a number is pretty easy: using a regex is working with it as a string.


    my $hex_val = '0xCAFE';
    if ( hex($hex_val) >= 0 && hex($hex_val) <= 0xFFFF ) {
        print "$hex_val is in range\n";
    }
    else {
        print "$hex_val is out of range\n";
    }

    All this requires is use of the hex function.
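
    A small sketch of the same idea with the conversion done once. The sub name in_16bit_range is hypothetical, and the input is assumed to be a well-formed hex string already; hex() warns about illegal digits and converts only the valid leading part, so untrusted input would need validating first.

```perl
use strict;
use warnings;

# Hypothetical helper: convert once, then compare numerically.
sub in_16bit_range {
    my ($hex_val) = @_;
    my $n = hex($hex_val);    # hex() accepts the string with or without "0x"
    return ( $n >= 0 && $n <= 0xFFFF ) ? 1 : 0;
}

for my $v ( '0xCAFE', 'CAFE', '0x10000' ) {
    printf "%-8s = %6d  %s\n",
        $v, hex($v), in_16bit_range($v) ? 'in range' : 'out of range';
}
```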

Re: Regular expression for hexadecimal number
by andreas1234567 (Vicar) on Jul 30, 2007 at 13:36 UTC
    The Data::Validate module has an is_hex function, for which the source looks like this:
    sub is_hex {
        my $self = shift if ref($_[0]);
        my $value = shift;

        return unless defined $value;
        return if $value =~ /[^0-9a-f]/i;
        $value = lc($value);

        my $int = hex($value);
        return unless (defined $int);
        my $hex = sprintf "%x", $int;
        return $hex if ($hex eq $value);

        # handle zero stripping
        if (my ($z) = $value =~ /^(0+)/) {
            return "$z$hex" if ("$z$hex" eq $value);
        }

        return;
    }
    Note that this module does not support the '0x' hexadecimal syntax, so you would have to strip that off yourself.
    use warnings;
    use strict;
    use Data::Validate qw(:math);

    my @data = qw(1234 ABCD 0x1234 foo 0xABCD bar);
    foreach my $val (@data) {
        if (defined(is_hex($val))) {
            print "$val\tis hex";
        }
        else {
            print "$val\tis not hex";
        }
    }
    __END__

    $ perl -l
    1234    is hex
    ABCD    is hex
    0x1234  is not hex
    foo     is not hex
    0xABCD  is not hex
    bar     is not hex
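
    Stripping the prefix before validating could look roughly like this. To keep the sketch self-contained it uses a plain digits-only regex as a stand-in for Data::Validate's is_hex; the sub name is_prefixed_hex is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical wrapper: remove one leading "0x"/"0X", then apply a
# digits-only check (a stand-in for Data::Validate's is_hex).
sub is_prefixed_hex {
    my ($value) = @_;
    return 0 unless defined $value;
    (my $digits = $value) =~ s/\A0x//i;
    return $digits =~ /\A[0-9a-f]+\z/i ? 1 : 0;
}

for my $val (qw(1234 ABCD 0x1234 foo 0xABCD bar)) {
    print "$val\t", ( is_prefixed_hex($val) ? 'is hex' : 'is not hex' ), "\n";
}
```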
Re: Regular expression for hexadecimal number
by mjscott2702 (Pilgrim) on Jul 30, 2007 at 13:39 UTC
    I have used regexes in the past to check a minimum and maximum number of digits, usually using base-10 integers - check between 0 and 99 for example.
    One thing I'd like to be able to do is check that a value is between two limits, without capturing and subsequent processing (for reasons beyond the scope of this discussion). Is there any regex construct that will check that the value is between two other values? For example, check that it is between 3010 and 4123, inclusively?

      Sure you could; the more important question is should you. You're probably better off implementing a "lower level" check (is this a 4 digit number) at the regex level and then using comparisons in code to check range membership.

      What you want to be worried about is when your requirements shift and now you've got to check whether it's between 3021 and 5123. For the comparisons-in-code version it's just a matter of changing two numbers (or redefining FOO_RANGE_MIN and FOO_RANGE_MAX constants, since one would never use magic numbers inline, of course :) rather than coming up with a new clever regex to match the new range.

      Update: Just to show it can be done (but again, probably shouldn't): /^ (?: 3 (?: 0 [1-9] \d | [1-9] \d \d ) | 4 (?: 0 \d \d | 1 (?: [01] \d | 2 [0123] ) ) ) $/x
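
      One way to gain confidence in a hand-built range regex like this is to compare it against a plain numeric check over every candidate value. In this sketch the whole alternation is wrapped in a group so that both anchors apply to every branch; the variable names are assumptions for illustration.

```perl
use strict;
use warnings;

# Range regex for 3010..4123, with the alternation grouped so that both
# anchors apply to every branch.
my $in_range = qr/^ (?: 3 (?: 0 [1-9] \d | [1-9] \d \d )
                      | 4 (?: 0 \d \d | 1 (?: [01] \d | 2 [0123] ) ) ) $/x;

# Compare the regex against a numeric comparison for every value 0..9999.
my @mismatches = grep {
    my $by_regex   = /$in_range/ ? 1 : 0;
    my $by_compare = ( $_ >= 3010 && $_ <= 4123 ) ? 1 : 0;
    $by_regex != $by_compare;
} 0 .. 9999;

print @mismatches
    ? "mismatches: @mismatches\n"
    : "regex agrees with numeric check\n";
```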

        As I pointed out, the reason for doing this is off-topic, let's just say I need a regex that ensures a value is between A and B, where A and B are defined elsewhere (in this case, it's an installer that supports Perl regexes, and I want to check that the port number entered by the user is within an allowable range. Hence the reason I can't post-process the value, I need a regex that does it in one step).

        Ideally, I wanted to be able to pass A and B to the regex, in a way similar to defining {min, max} for the number of characters to match.

        I want to avoid the breaking down of the min and max values into a sequence of ranges, for exactly the reason you point out.

        I still think it is a reasonable question ....
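
        One blunt way to "pass A and B" is to generate the pattern from them instead of hand-deriving digit ranges. The sketch below simply lists every value as an alternative; the sub name range_regex is hypothetical, and whether a given installer's regex engine copes with a pattern this long is an open assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: build an anchored pattern that matches exactly the
# decimal integers $min..$max by listing them as alternatives.
sub range_regex {
    my ($min, $max) = @_;
    my $alternatives = join '|', $min .. $max;
    return qr/\A(?:$alternatives)\z/;
}

my $port_re = range_regex(3010, 4123);
for my $port (qw(3009 3010 4123 4124 03010)) {
    print "$port: ", ( $port =~ $port_re ? 'allowed' : 'rejected' ), "\n";
}
```

        Changing the limits then means changing only the two arguments, at the cost of a much longer generated pattern.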
Re: Regular expression for hexadecimal number
by mickeyn (Priest) on Jul 30, 2007 at 12:58 UTC
    Do you mean something like m/0x[0-9a-f]+/ ?




      No downvote, but your regex misses the limit of 1 to 4 "f"s in OP's statement of the problem.

      See FunkyMonk's prior reply (which <span class="pedantic picky)"> errs (albeit, in its original form, and only marginally) in the comment at line 4:

      {1,4}  #   between 1 & 4 of them

      only insomuch as the set of integers between "1 & 4" includes only "2" and "3."</span>

      Hmm... :-) maybe the "picky pedantic" tag should enclose this whole note....

        Depends on whether one reads "between 1 & 4" as [1,4] (inclusive; which is what the regex notation used implements) or (1,4) (exclusive; which would be {2,3}). Given that the inclusive sense was implied by the code itself one could probably overlook it (of course there's a huge potential for debate on whether an unqualified between should be read as one or the other . . . :)

        (And if you want pedantic nits to pick, one could also have knocked everyone's use of [0-9a-f] rather than [[:xdigit:]] . . . :)

        Update: Bah, it's [[:xdigit:]] not [[:hexdigit:]]. Never try and pedant before caffeine . . .
