Re: UK postcode regex
by Mutant (Priest) on Apr 21, 2005 at 13:40 UTC
|
Have a look at this.
Also beware of the Geo::Postcode module on CPAN. I've had it reject valid postcodes before. | [reply] |
|
I trust you have submitted bug reports?
| [reply] |
Re: UK postcode regex
by dragonchild (Archbishop) on Apr 21, 2005 at 13:38 UTC
|
It sounds like your excellent specifications describe three transformations, which translate into three steps, plus one for validation.
I'm serious about the specs - those are probably the best written specs I've ever seen in a Perlmonks question.
| [reply] |
Re: UK postcode regex
by Roy Johnson (Monsignor) on Apr 21, 2005 at 13:51 UTC
|
Two transformations work well:
s/(\w{2,4})\s*(\w{3})/\U$1 $2/;
s/ O/ 0/;
Update: conforms to specs cited by Mutant.
Caution: Contents may have been coded under pressure.
| [reply] [d/l] |
Re: UK postcode regex
by mlh2003 (Scribe) on Apr 21, 2005 at 13:57 UTC
|
This is the regex I use for checking UK postcodes. It is similar, but uses alternation instead...
# Check UK postcode format
# Format is LD DLL, LLD DLL, LDD DLL, or LLDD DLL where L = letter, D = digit
$postcode = uc($postcode); # Could be removed if you use the /i regex switch
if ($postcode !~ /^(([A-Z]\d|[A-Z][A-Z]\d|[A-Z]\d\d|[A-Z][A-Z]\d\d)\s(\d|o)[A-Z][A-Z])$/i) {
    $error_msg .= "Invalid postcode for UK.<br /><br />\n";
}
The (\d|o) has the letter o to cater for an incorrect o instead of a zero. To actually change it to a zero would require a second line (using the substitution operator). I don't know of any way to make that change in the regex line while it is being parsed...
UPDATE: Looks like there are a couple of ways to do it in one line, including changing an incorrect letter o to a zero. I stand corrected :)
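For reference, the two-line approach described above (validate first, then substitute) might look like the sketch below. The sub name check_and_fix is hypothetical, and the pattern is a condensed form of the alternation above, so it shares the same assumptions about which formats are valid.

```perl
use strict;
use warnings;

# Hypothetical helper: validate, then turn a mistyped letter O into a zero.
# Returns the corrected postcode, or nothing if the format check fails.
sub check_and_fix {
    my ($postcode) = @_;
    $postcode = uc $postcode;

    # Line one: format check (LD, LLD, LDD or LLDD, a space, then DLL,
    # where the D of the last part may have been typed as the letter O).
    return unless $postcode =~ /^[A-Z]{1,2}\d{1,2}\s(?:\d|O)[A-Z]{2}\z/;

    # Line two: replace a letter O that directly follows the separator.
    $postcode =~ s/\sO(?=[A-Z]{2}\z)/ 0/;

    return $postcode;
}

print check_and_fix('le12 ogx'), "\n";
```

Keeping the check and the substitution on separate lines means the validation pattern can be tightened later without touching the correction step.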
_______
Code is untested unless explicitly stated
mlh2003
| [reply] [d/l] |
|
An additional format not in your list is LLDL (WC1N).
The two parts form the outcode (postal area and district) and the incode (sector and delivery unit). The incode is always DLL. The outcode can be LD, LDD, LLD, LLDD or LLDL.
You can use ZZ99 9ZZ for an official "no postcode" and I think ZZ98 9ZZ also can be used for special purposes.
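A minimal sketch of a format check built only from the part formats listed here. The names $POSTCODE_RE and looks_like_postcode are made up for illustration, and the input is assumed to be already uppercased with a single space as separator.

```perl
use strict;
use warnings;

# Hypothetical pattern covering LD, LDD, LLD, LLDD and LLDL, then " DLL".
my $POSTCODE_RE = qr{
    ^
    (?: [A-Z]\d{1,2}       # LD, LDD
      | [A-Z]{2}\d{1,2}    # LLD, LLDD
      | [A-Z]{2}\d[A-Z]    # LLDL, e.g. WC1N
    )
    [ ]                    # single separating space
    \d[A-Z]{2}             # DLL
    \z
}x;

# Returns 1 if the string has one of the listed shapes, 0 otherwise.
sub looks_like_postcode {
    my ($postcode) = @_;
    return $postcode =~ $POSTCODE_RE ? 1 : 0;
}

print looks_like_postcode($_) ? "ok      " : "rejected", " $_\n"
    for 'WC1N 3AX', 'M1 1AA', 'ZZ99 9ZZ', 'OO00O 0OO';
```

This only checks the shape of the string. It says nothing about whether the postcode actually exists.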
| [reply] |
Re: UK postcode regex
by Gilimanjaro (Hermit) on Apr 21, 2005 at 13:55 UTC
|
#!/usr/bin/perl -l
$zipadidoda = 'le12 ogx';
print "In: $zipadidoda";
if( $zipadidoda =~ s/^([A-Z]{1,2}\d{1,2}[A-Z]?)\s?(O|\d)([A-Z]{2})$/uc("$1 ".(0+$2).$3)/ie ) {
print "Out: $zipadidoda";
} else {
print "Invalid zipcode";
}
You wanted a single re, so you got one!
Or is the 'e' modifier considered cheating?
:)
Update: removed redundant quotes
Update: and fixed the space after a tip from ww | [reply] [d/l] |
|
A cheat that lets you avoid the /e cheat:
s/^([A-Z]{1,2}\d{1,2}[A-Z]?)\s?(O|\d)([A-Z]{2})$/\U$1 ${\(0+$2)}$3/i
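For anyone unfamiliar with the trick: ${\( EXPR )} takes a reference to the value of EXPR and dereferences it straight away, which lets an expression be interpolated into a double-quoted string, and so into the replacement part of s///, without /e. A small sketch follows; the sub name fix_postcode is hypothetical, and numeric warnings are silenced because 0 + 'O' is deliberately non-numeric.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution above.
# Returns the fixed postcode, or nothing if the pattern does not match.
sub fix_postcode {
    my ($postcode) = @_;
    no warnings 'numeric';    # 0 + 'O' evaluates to 0, but would warn
    $postcode =~ s/^([A-Z]{1,2}\d{1,2}[A-Z]?)\s?(O|\d)([A-Z]{2})$/\U$1 ${\(0+$2)}$3/i
        or return;
    return $postcode;
}

# The same interpolation trick in an ordinary string:
my $count = 3;
print "double: ${\($count * 2)}\n";

print fix_postcode('le12ogx'), "\n";
```

The \U in the replacement uppercases everything that follows it, so the interpolated digit passes through unchanged.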
Caution: Contents may have been coded under pressure.
| [reply] [d/l] |
|
Gilimanjaro:
your regex strips the space between segment_one and the last_three, if one exists, whereas OP sought to insert one, if there was no space before \d[A-Za-z]{2}
| [reply] |
|
You're right! Fixed in an update...
| [reply] |
Re: UK postcode regex
by Molt (Chaplain) on Apr 21, 2005 at 15:05 UTC
|
Unless you're golfing why not do this as three readable and maintainable regexes rather than one monolithic and messy thing? | [reply] |
|
That's exactly what I was thinking. But perhaps this is one of those, "I feel like I could do this, even though I probably won't want to" things.
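A rough sketch of what the step-by-step version could look like. It assumes the three transformations are the ones discussed elsewhere in this thread (uppercase, a single space before the last three characters, letter O to zero at the start of the last part). The sub name normalize_postcode is made up, and nothing here validates the result.

```perl
use strict;
use warnings;

# Hypothetical helper: one transformation per line, no validation.
sub normalize_postcode {
    my ($postcode) = @_;

    # 1. Uppercase everything.
    $postcode = uc $postcode;

    # 2. Make sure exactly one space precedes the last three characters.
    $postcode =~ s/\s*(\S{3})\s*\z/ $1/;

    # 3. A letter O right after the space was meant to be a zero.
    $postcode =~ s/ O(?=[A-Z]{2}\z)/ 0/;

    return $postcode;
}

print normalize_postcode($_), "\n" for 'le12ogx', 'le12 ogx', 'wc1n3ax';
```

Each line can be read, tested and changed on its own, which is the maintainability argument made above.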
| [reply] |
Re: UK postcode regex
by tlm (Prior) on Apr 21, 2005 at 14:15 UTC
|
Become One with Regexp::Common::zip.
Update: R::C::z is a handy module (as is the entire Regexp::Common suite), but in this case irrelevant. Mea culpa. The authors do request that regexps for missing countries be sent to them.
| [reply] |
Re: UK postcode regex
by Anonymous Monk on Apr 21, 2005 at 20:42 UTC
|
Thanks for the responses. Actual validation was never an issue here.
The original question related to something I have had to use for the best part of my career (ever tried to register a domain with Nominet with user-provided data?) ... and, relying too heavily on the Laziness aspect of the perfect programmer, I tend to opt for the cut-and-paste method a bit too often.
I found Roy Johnson's response very legible, and of course pasting one line is a tad quicker than three ;)
cheers
| [reply] |
Re: UK postcode regex
by DrHyde (Prior) on Apr 22, 2005 at 09:47 UTC
|
Your algorithm would also match something bogus like:
OO00O 0OO | [reply] |
Re: UK postcode regex
by TedPride (Priest) on Apr 22, 2005 at 10:23 UTC
|
You don't need to use regex for this either:
$code = uc($code);
substr($code, -3, 0) = ' ' if substr($code, -4, 1) ne ' ';
substr($code, -3, 1) = '0' if substr($code, -3, 1) eq 'O';
| [reply] [d/l] |