PerlMonks  

UK postcode regex

by Anonymous Monk
on Apr 21, 2005 at 13:34 UTC ( #449971=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

A quickie regarding UK postcode (zip) validation... Is it possible to do the following 3 transformations in one regex, or am I asking too much?
Typical UK postcode: LE12 3GX

1. Insert a space before the last 3 characters if there isn't one there already.

2. If the first character of the second part (the 3 characters above) is a letter O, swap it for a zero (this is a very common error).

3. uc() the whole thing.

The final result would pass the validation: /^[A-Z]{1,2}\d{1,2}([A-Z])?\s\d[A-Z]{2}$/

TIA

Replies are listed 'Best First'.
Re: UK postcode regex
by Mutant (Priest) on Apr 21, 2005 at 13:40 UTC
    Have a look at this.

    Also beware of the Geo::Postcode module on CPAN. I've had it reject valid postcodes before.
      I trust you have submitted bug reports?
Re: UK postcode regex
by dragonchild (Archbishop) on Apr 21, 2005 at 13:38 UTC
    It sounds like you have three transformations which translate from your excellent specifications to three steps, plus one for validation.

    I'm serious about the specs - those are probably the best written specs I've ever seen in a Perlmonks question.
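A minimal sketch of that "three steps plus validation" reading. The sub name `normalise_postcode` is hypothetical, and stripping all whitespace before re-inserting the space is an assumption that goes slightly beyond step 1; the validation regex is the one from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the cleaned postcode, or undef if it
# still fails the validation regex quoted in the question.
sub normalise_postcode {
    my ($pc) = @_;
    $pc = uc $pc;               # step 3: upper-case everything
    $pc =~ s/\s+//g;            # assumption: discard whatever whitespace was typed
    $pc =~ s/(.{3})$/ $1/;      # step 1: one space before the last 3 characters
    $pc =~ s/ O/ 0/;            # step 2: letter O opening the second part -> zero
    return $pc =~ /^[A-Z]{1,2}\d{1,2}([A-Z])?\s\d[A-Z]{2}$/ ? $pc : undef;
}

for my $raw ('le12ogx', 'LE12 3GX', 'not a postcode') {
    my $clean = normalise_postcode($raw);
    print defined $clean ? "$raw -> $clean\n" : "$raw -> invalid\n";
}
```

Keeping each step on its own line makes it easy to drop or reorder one of them later.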

Re: UK postcode regex
by Roy Johnson (Monsignor) on Apr 21, 2005 at 13:51 UTC
    Two transformations work well:
    s/(\w{3,5})\s*(\w{3})/\U$1 $2/; s/ O/ 0/;
    Update: conforms to specs cited by Mutant.

    Caution: Contents may have been coded under pressure.
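A hedged variation on the two substitutions above, wrapped in a hypothetical `tidy` sub so it can be exercised. Two things here are assumptions, not part of the original post: the pattern is anchored, and the first quantifier is `{2,4}` instead of `{3,5}`. As written, `\w{3,5}` needs at least three characters before the final three, so a five-character code such as M1 1AA is left untouched, whereas the validation regex in the question allows a first part of two to four characters.

```perl
use strict;
use warnings;

# Hypothetical wrapper; {2,4} and the anchors are assumptions (see note above).
sub tidy {
    my ($pc) = @_;
    $pc =~ s/^\s*(\w{2,4})\s*(\w{3})\s*$/\U$1 $2/;   # upper-case and insert the space
    $pc =~ s/ O/ 0/;                                  # letter O after the space -> zero
    return $pc;
}

print "$_ -> ", tidy($_), "\n" for ('le12ogx', 'LE12 3GX', 'm1 1aa');
```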
Re: UK postcode regex
by mlh2003 (Scribe) on Apr 21, 2005 at 13:57 UTC
    This is the regex I use for checking UK postcodes. It is similar, but uses alternation instead...
    # Check UK postcode format
    # Format is LD DLL, LLD DLL, LDD DLL, or LLDD DLL where L = letter, D = digit
    $postcode = uc($postcode); # Could be removed if you use the /i regex switch
    if ($postcode !~ /^(([A-Z]\d|[A-Z][A-Z]\d|[A-Z]\d\d|[A-Z][A-Z]\d\d)\s(\d|o)[A-Z][A-Z])$/i) {
        $error_msg .= "Invalid postcode for UK.<br /><br />\n";
    }
    The (\d|o) has the letter o to cater for an incorrect o instead of a zero. To actually change it to a zero would require a second line (using the substitution operator). I don't know of any way to make that change in the regex line while it is being parsed... UPDATE: Looks like there are a couple of ways to do it in one line, including changing an incorrect letter o to a zero. I stand corrected :)
    _______
    Code is untested unless explicitly stated
    mlh2003
      An additional format not in your list is LLDL (WC1N).

      The two parts form the outcode (which post office) and the incode (street address). The incode is always DLL. The outcode can be LD, LDD, LLD, LLDD or LLDL.

      You can use ZZ99 9ZZ for an official "no postcode" and I think ZZ98 9ZZ also can be used for special purposes.
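Folding the extra format into the alternation could look like the sketch below. It assumes only the first-part shapes named in this subthread (LD, LDD, LLD, LLDD, LLDL); real codes such as W1A 1AA use an LDL first part, which this list does not cover. The variable and sub names are illustrative only.

```perl
use strict;
use warnings;

# First-part shapes named in this subthread; the second part is always DLL.
my $first  = qr/[A-Z]\d|[A-Z]\d\d|[A-Z][A-Z]\d|[A-Z][A-Z]\d\d|[A-Z][A-Z]\d[A-Z]/;
my $second = qr/\d[A-Z][A-Z]/;

# Hypothetical checker: 1 if the upper-cased input has one of the listed shapes.
sub looks_like_uk_postcode {
    my ($pc) = @_;
    return uc($pc) =~ /^(?:$first)\s$second\z/ ? 1 : 0;
}

print "$_: ", looks_like_uk_postcode($_), "\n" for ('WC1N 3XX', 'le12 3gx', 'LE12 3G');
```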
Re: UK postcode regex
by Gilimanjaro (Hermit) on Apr 21, 2005 at 13:55 UTC
    #!/usr/bin/perl -l

    $zipadidoda = 'le12 ogx';

    print "In: $zipadidoda";

    if( $zipadidoda =~ s/^([A-Z]{1,2}\d{1,2}[A-Z]?)\s?(O|\d)([A-Z]{2})$/uc("$1 ".(0+$2).$3)/ie ) {
        print "Out: $zipadidoda";
    } else {
        print "Invalid zipcode";
    }

    You wanted a single re, so you got one!

    Or is the 'e' modifier considered cheating?

    :)

    Update: removed redundant quotes

    Update: and fixed the space after a tip from ww

      A cheat that lets you avoid the /e cheat:
      s/^([A-Z]{1,2}\d{1,2}[A-Z]?)\s?(O|\d)([A-Z]{2})$/\U$1 ${\(0+$2)}$3/i

      Caution: Contents may have been coded under pressure.
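For readers unfamiliar with the trick: `${\( EXPR )}` takes a reference to the value of an expression and immediately dereferences it, which lets ordinary string interpolation evaluate `0+$2` without the /e modifier. A small sketch follows; the `no warnings 'numeric'` line is an addition here, because adding zero to the letter O would otherwise warn under `use warnings`.

```perl
use strict;
use warnings;

my $pc = 'le12ogx';
{
    no warnings 'numeric';   # 0 + 'o' is deliberately non-numeric
    # \U upper-cases the rest of the replacement; ${\(0+$2)} turns O into 0
    # and leaves a digit unchanged.
    $pc =~ s/^([A-Z]{1,2}\d{1,2}[A-Z]?)\s?(O|\d)([A-Z]{2})$/\U$1 ${\(0+$2)}$3/i;
}
print "$pc\n";
```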
      Gilimanjaro: your regex strips the space between segment_one and the last_three, if one exists, whereas OP sought to insert one, if there was no space before \d[A-Za-z]{2}
        You're right! Fixed in an update...
Re: UK postcode regex
by Molt (Chaplain) on Apr 21, 2005 at 15:05 UTC
    Unless you're golfing why not do this as three readable and maintainable regexes rather than one monolithic and messy thing?
      That's exactly what I was thinking. But perhaps this is one of those "I feel like I could do this, even though I probably won't want to" things.
Re: UK postcode regex
by tlm (Prior) on Apr 21, 2005 at 14:15 UTC

    Become One with Regexp::Common::zip.

    Update: R::C::z is a handy module (as is the entire Regexp::Common suite), but in this case irrelevant. Mea culpa. The authors do request that regexps for missing countries be sent to them.

    the lowliest monk

      It doesn't appear to have a UK option. It also wouldn't make the required corrections.

      Several countries, but not for the UK... Can the module be modified or extended to allow for UK postcodes?
      _______
      Code is untested unless explicitly stated
      mlh2003
Re: UK postcode regex
by Anonymous Monk on Apr 21, 2005 at 20:42 UTC
    Thanks for the responses.
    Actual validation was never an issue here.
    The original question related to something I have had to use for the best part of my career (ever tried to register a domain with Nominet with user-provided data?) ... and, overly relying on the Laziness aspect of the perfect programmer, I tend to opt for the cut-and-paste method a bit too often.

    Roy Johnson's response I found very legible, and of course pasting one line is a tad quicker than three ;)

    cheers
Re: UK postcode regex
by DrHyde (Prior) on Apr 22, 2005 at 09:47 UTC
    Your algorithm would also match something bogus like:

    OO00O 0OO
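A quick check of that example, assuming the comment refers to the validation regex quoted in the question:

```perl
use strict;
use warnings;

# Validation regex copied from the question.
my $valid = qr/^[A-Z]{1,2}\d{1,2}([A-Z])?\s\d[A-Z]{2}$/;

my $bogus = 'OO00O 0OO';
print $bogus =~ $valid ? "accepted\n" : "rejected\n";
```

The pattern only checks the shape of the string, so any letters in the right positions get through.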

Re: UK postcode regex
by TedPride (Priest) on Apr 22, 2005 at 10:23 UTC
    You don't need to use regex for this either:
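    $code = uc($code);                                          # upper-case first so a lower-case 'o' is caught below
    substr($code, -3, 0) = ' ' if substr($code, -4, 1) ne ' ';  # insert the space if it is missing
    substr($code, -3, 1) = '0' if substr($code, -3, 1) eq 'O';  # letter O -> zero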

Node Type: perlquestion [id://449971]
Approved by dragonchild
Front-paged by dragonchild