Re: Golf challange: match U.S. State names
by dws (Chancellor) on Jun 06, 2001 at 09:39 UTC
|
A hybrid approach, for 110 109 (including the honorary state KT).
print
/^\w\w\z/&("MNCAKSCOHINMOKTNVALARIDCTX SDE GAZ ORIL"=~/$_/|
/(FL|IA|KY|M[ADEIST]|N[DEHNJY]|PA|[UV]T|W[AIVY])/)
for qw(AK AL AR AZ CA CO CT DC DE FL GA HI IA ID IL IN KS
KT KY LA MA MD ME MI MN MO MS MT NC ND NE NH NJ NM
NV NY OH OK OR PA RI SC SD TN TX UT VA VT WA WI WV
WY)
Getting here involved finding the longest unbroken string that could hold overlapping state names, adding a few more with breaks (spaces), then handling the strays with a regexp. A close inspection will show that Rhode Island is handled twice. That actually saved strokes.
I started down the path of looking for the shortest string that contains all of the state names (with breaks). That script is still running... So far, the solution above is shorter.
| [reply] [Watch: Dir/Any] [d/l] |
|
/^\w\w\z/&("MNCAKSCOHINMOKTNVALARIDCTX SDE GAZ ORIL IAKY"=~/$_/|
/(FL|M[ADEIST]|N[DEHNJY]|PA|[UV]T|W[AIVY])/)
Another stoke off by adding a redundant check for Alaska, which allowed me to move Iowa and Kentucky out of the regexp and into the (more compact) string.
This suggests that there's a lesson in here about how a bit of redundancy in data can lead to more compact representations. I think we often hear problems like "match every state and nothing else" as "match every state exactly once, and nothing else."
| [reply] [Watch: Dir/Any] [d/l] |
|
print
/^..\z/&("MNCAKSCOHINMOKTNVALARIDCTX
SDE
GAZ
ORIL"=~/$_/|
/FL|IA|KY|M[ADEIST]|N[DEHJY]|PA|[UV]T|W[AIVY]/)
for qw(AK AL AR AZ CA CO CT DC DE FL GA HI IA ID IL IN KS
KT KY LA MA MD ME MI MN MO MS MT NC ND NE NH NJ NM
NV NY OH OK OR PA RI SC SD TN TX UT VA VT WA WI WV
WY MY MNC .* ), " SD", "AZ\n"
(Dropping the inner parens around the or-chain and disregarding the state of NN which is not being honored today.)
OOPSDATE: Yep, matches if $_='.*' (see runriq immediately below and slightly to the right). 
It needs /\Q$_/, as so often happens, which defeats the character savings.  I was looking at the \w as avoiding matching the spaces (which it also does).    (Oh, and thanks runriq and I'm sorry too!)
  p
| [reply] [Watch: Dir/Any] [d/l] [select] |
|
| [reply] [Watch: Dir/Any] |
Re: Golf challange: match U.S. State names
by runrig (Abbot) on Jun 06, 2001 at 03:21 UTC
|
Not just a golf solution, its a golf solution
generator:
Update: 'Borrowed' tilly's list of
states. (my result is a 113 character
string - though I too am now wondering where 'KT' is ...
I'm told its a brand of whiskey, so I'll keep it in - besides,
it is now the 'official' list for this challenge :-)
Another update: Added dws's fix. #!/usr/local/bin/perl -l -w
use strict;
my @states=qw(
AK AL AR AZ CA CO CT DC DE FL GA HI IA ID IL IN KS
KT KY LA MA MD ME MI MN MO MS MT NC ND NE NH NJ NM
NV NY OH OK OR PA RI SC SD TN TX UT VA VT WA WI WV
WY);
my %states;
@states{@states} = ();
my @results;
my $is_done;
while (%states) {
my ($letter, $is_first);
($letter, $is_first, $is_done) = get_next_letter(\%states);
push(@results, keys %states), last if $is_done;
my $pos = $is_first ? 0 : 1;
my @states = grep { substr($_,$pos,1) eq $letter } keys %states;
my $chr_class =
'[' . join('', map { substr($_, 1-$pos, 1) } @states) . ']';
push @results, $is_first ? $letter.$chr_class : $chr_class.$letter;
delete @states{@states};
}
print '/^(',join("|", @results),')$/';
sub get_next_letter {
my $states = shift;
my (%first, %second, $is_done);
for (keys %$states) {
my ($first, $second) = split '';
$first{$first}++;
$second{$second}++;
}
my ($first) = sort { $first{$b} <=> $first{$a} } keys %first;
my ($second) = sort { $second{$b} <=> $second{$a} } keys %second;
$is_done = 1 if $first{$first} == 1 and $second{$second} == 1;
($first{$first} > $second{$second}) ?
($first, 1, $is_done) : ($second, 0, $is_done);
}
# 113 characters!
# Add 5 characters ('pop=~' to the beginning) if you want
# it to work on @_ instead of $_
/^([WLPCVGIM]A|N[CDEHJMVY]|M[DEINOST]|[VKUC]T|A[ZKLR]|[WHR]I|O[HKR]|I[
+LND]|[SD]C|[KW]Y|T[XN]|KS|SD|DE|FL|WV|CO)$/
| [reply] [Watch: Dir/Any] [d/l] |
Re: Golf challange: match U.S. State names
by merlyn (Sage) on Jun 06, 2001 at 06:57 UTC
|
sub state {
1&index" AKALCAARAZCOCTDCDEFLGAHIIAIDILINKSKTKYLAMAMDMEMIMNNCMOMSMTOHN
+DNENHNJNMPANVNYOKORRISCSDTNTXUTVAVTWAWIWVWY",pop
}
And it's 117 chars if you replace pop with $_, as the original rules ask.
The character string was constructed to ensure that a valid state would
never match a non-valid boundary. It was easier than I thought:
#!/usr/bin/perl
use strict;
$|++;
my @states = qw(
AK AL AR AZ CA CO CT DC DE FL GA HI IA ID IL IN KS
KT KY LA MA MD ME MI MN MO MS MT NC ND NE NH NJ NM
NV NY OH OK OR PA RI SC SD TN TX UT VA VT WA WI WV
WY
);
my %states = map { $_ => 1 } @states;
try("", \@states);
sub try {
print "try: $_[0] @{$_[1]}\n";
my $prefix = shift;
my @later = @{+shift};
unless (@later) {
print "WINNER: $prefix\n";
exit;
}
for my $i (0..$#later) {
my $prefix2 = "$prefix$later[$i]";
if (length $prefix2 < 3 or not $states{substr($prefix2, -3, 2)}) {
## worth recursing
try($prefix2, [@later[0..$i-1, $i+1..$#later]]);
}
}
}
-- Randal L. Schwartz, Perl hacker | [reply] [Watch: Dir/Any] [d/l] [select] |
|
This has two problems: on a non-match, index will return -1, which, ANDed with 1, returns 1, so virtually any string will match. This sub will also match strings like "ALC" or just "A".
MeowChow
s aamecha.s a..a\u$&owag.print | [reply] [Watch: Dir/Any] |
Re: Golf challange: match U.S. State names
by no_slogan (Deacon) on Jun 06, 2001 at 21:53 UTC
|
106 charcters, making use of the fact that no abbreviation starts with x, y, or z. gack.
/^[A-W]\w\z/&&"OHILAKTXWVTNVARINYKSDE MIAZWIDCOKYFL MOR MNCAL WYMSCTNM
+TNE MD UT ME"=~/$_/|/[WGPM]A|N[DHJ]/
| [reply] [Watch: Dir/Any] [d/l] |
|
/^[A-W]\w\z/&&"OHILAKTXWVTNVARINYKSDE MIAZWIDCOKYFL MOR MNCAL WYMSCTNM
+TNJ NH UT"=~/$_/|/[WGPM]A|[MN][DE]/
| [reply] [Watch: Dir/Any] [d/l] |
Re: Golf challange: match U.S. State names
by Brovnik (Hermit) on Jun 06, 2001 at 02:48 UTC
|
How many states are you including ?
E.g. Are you going to count dependent territories as per the USPS
codes, or just 52 states.
-- Brovnik | [reply] [Watch: Dir/Any] |
|
I think it is the principle of the thing that counts. How about:
AK AL AR AZ CA CO CT DC DE FL GA HI IA ID IL IN KS
KT KY LA MA MD ME MI MN MO MS MT NC ND NE NH NJ NM
NV NY OH OK OR PA RI SC SD TN TX UT VA VT WA WI WV
WY
And the obvious answer is, of course,
sub state {
grep$_[0]eq$_,'AKALARAZCACOCTDCDEFLGAHIIAIDILINKSKTKYLAMAMDMEMIMNMOMSM
+TNCNDNENHNJNMNVNYOHOKORPARISCSDTNTXUTVAVTWAWIWVWY'=~/../g
}
at 127 characters. I don't see an obviously better approach... | [reply] [Watch: Dir/Any] [d/l] [select] |
|
124 chars, using that set :)
sub state {
pop=~/^(A[KLRZ]|C[AOT]|D[CE]|FL|GA|HI|I[ADLN]|K[STY]|LA|M[ADEINOST]|
+N[CDEHJMVY]|O[HKR]|PA|RI|S[CD]|T[NX]|UT|V[AT]|W[AIVY])$/
}
MeowChow
s aamecha.s a..a\u$&owag.print | [reply] [Watch: Dir/Any] [d/l] |
|
|
|
|
But the match isn't just on the names, but also on the abutment of names. I think you will get some false positives Eg. 'ZC' and 'OC' etc...
Have fun,
Carl Forde
| [reply] [Watch: Dir/Any] |
|
|
|
Hmm, that's indeed 52 names. 50 states plus DC plus what?
| [reply] [Watch: Dir/Any] |
|
|
or just 52 states
uh.. fifty-howmany? last i heard, the U.S. only had 50 states and one district.
update: re: tilly's list:
uh... 'KT'? since when is Kentucky Tavern whiskey its own state?
.
| [reply] [Watch: Dir/Any] |
|
| [reply] [Watch: Dir/Any] |