Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

[perlre/perlgolf] Golf this: return -1 or return +1 on regexp subroutine please

by princepawn (Parson)
on Oct 31, 2002 at 20:46 UTC ( [id://209517]=perlquestion: print w/replies, xml ) Need Help??

princepawn has asked for the wisdom of the Perl Monks concerning the following question:

sub by_priority { return -1 if $a =~ /^Organization/ ; return 1 if $b =~ /^Organization/ ; return -1 if $a =~ /^Service/ ; return 1 if $b =~ /^Service/ ; return -1 if $a =~ /^FAQ/ ; return 1 if $b =~ /^FAQ/ ; return 0 ; } @htmlfiles = sort by_priority @htmlfiles;
  • Comment on [perlre/perlgolf] Golf this: return -1 or return +1 on regexp subroutine please
  • Download Code

Replies are listed 'Best First'.
Re: [perlre/perlgolf] Golf this: return -1 or return +1 on regexp subroutine please
by Corion (Patriarch) on Oct 31, 2002 at 21:09 UTC

    I'm not sure what algorithm sort uses, but in general, it's a bad idea not to implement identity whenever you write a comparision function.

    In your example, the function will always return -1 for the three sets of identical pairs - something that will result in interesting sort orders, depending on the algorithm used.

    You should at least rewrite the routine like this :

    sub by_priority { # Return 0 on string identity return 0 if $a eq $b; return -1 if $a =~ /^Organization/ ; return 1 if $b =~ /^Organization/ ; return -1 if $a =~ /^Service/ ; return 1 if $b =~ /^Service/ ; return -1 if $a =~ /^FAQ/ ; return 1 if $b =~ /^FAQ/ ; return 0 ; }

    My personal favourite when rewriting that code would be via grep :

    my @priorities = qw(Organization FAQ Service); my $prio_top = join "|", map { "^$_" } @priorities; sub by_priority { # Return 0 on string identity return 0 if $a eq $b; return -1 if $a =~ /$prio_top/o; return 1 if $b =~ /$prio_top/o; # Alphabetical sort for the rest return $a cmp $b; }

    If you're concerned with raw sort speed (for many values), a Guttman-Rosler-Transform would be the best thing - you simply encode the priorities together with each item in a string, let sort rip through it and unpack it afterwards.

    If you're not tied to your current algorithm, it might be a good idea to first partition your dataset into the four categories, indivitually sort them and afterwards simply concatenate them together.

    perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider ($c = $d->accept())->get_request(); $c->send_response( new #in the HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web

      Corion,
      the algorithm in your sort sub doesn't do what princepawn wants : put 'Organization' before 'Services' before 'FAQ' (and not care about anything else...) .
      In fact, if you're sorting a list based only on these, the function will always return 1, because the first match will always be true.
      I think the best you could reduce this to (excluding the transform you mentioned, which I'm unfamiliar with) would be a foreach loop, like so (shows pp's implementation, yours and mine) :

      #! perl -l my @foo =( "Service", "Organization", "FAQ"); my @priorities = qw(Organization FAQ Service); my $prio_top = join "|", map { "^$_" } @priorities; print join " ", sort by_priority @foo; print join " ", sort boo @foo; print join " ", sort corion @foo; sub by_priority { return -1 if $a =~ /^Organization/ ; return 1 if $b =~ /^Organization/ ; return -1 if $a =~ /^Service/ ; return 1 if $b =~ /^Service/ ; return -1 if $a =~ /^FAQ/ ; return 1 if $b =~ /^FAQ/ ; return 0 ; } sub boo { foreach ("Organization", "Service", "FAQ") { return 1 if $b=~/^\Q$_\E/; return -1 if $a=~/^\Q$_\E/; } return 0; } sub corion { # Return 0 on string identity return 0 if $a eq $b; return -1 if $a =~ /$prio_top/o; return 1 if $b =~ /$prio_top/o; # Alphabetical sort for the rest return $a cmp $b; }

        Ooops - yes, my code does an unstable sort on the categories, because the order of the bail-out is different from what princepawn wanted. I wanted to introduce independence from the order of the input values for the normal strings, and lost it on the categories ...

        perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider ($c = $d->accept())->get_request(); $c->send_response( new #in the HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web
Re: [perlre/perlgolf] Golf this: return -1 or return +1 on regexp subroutine please
by fglock (Vicar) on Oct 31, 2002 at 21:25 UTC
    sub by_priority { for ( qw(Organization Service FAQ) ) { return -1 if $a =~ /^$_/; return 1 if $b =~ /^$_/; } return $a cmp $b; }
Re: [perlre/perlgolf] Golf this: return -1 or return +1 on regexp subroutine please
by Enlil (Parson) on Oct 31, 2002 at 21:39 UTC
    sub by_priority { for (qw(Organization Service FAQ)){ $a=~/^$_/?return -1: $b=~/^$_/?return 1:1 } return $a cmp $b; }

    fglock beat me to it. But same, idea different way to do it, and I think the same amount of strokes.

    -enlil

Re: [perlre/perlgolf] Golf: Sunset strip!
by BrowserUk (Patriarch) on Oct 31, 2002 at 21:58 UTC

    77.

    Update: As anon. below points out, I hadn't read the spec properly, so... for a less desireable 89 85 88. Instead of Sunset strip, thats two fat ladies.

    # 1 2 3 4 5 6 +7 8 #234567890123456789012345678901234567890123456789012345678901234567890 +123456789012345678 sub by_priority{my$r=qr/^(?:Service|Organisation|FAQ)/;($a=~$r&&-1)+($ +b=~$r)||$a cmp$b;}

    Test code and results (Hopefully I didn't try a step to far this time.)


    Nah! Your thinking of Simon Templar, originally played by Roger Moore and later by Ian Ogilvy
      Still broken!

      sub by_priority{my$r=qr/^(?:Service|Organisation|FAQ)/;$a cmp$b|($a=~$ +r&-1)+($b=~$r)} my @foo =( "Service", "FAQ", "BrowserUK","Organization1", "FAQ2", "Org +anization2"); print join " ", sort by_priority @foo; __END__ BrowserUK FAQ FAQ2 Organization1 Organization2 Service

      This turns into an alphabetical sort!

        Update: Fixed the data and the code above.

        Okay. All my non-prioritised items started with lower case, so they got sorted last. D'oh!

        Evidence for the defence.


        Nah! Your thinking of Simon Templar, originally played by Roger Moore and later by Ian Ogilvy
      This will return 0 every time that both $a and $b contain Service|Organiz(s)ation|FAQ1. For example, a list like qw/Organization1 FAQ Organization2 Service Organization3/ will be unaltered by the sort!

      1I guess you really are from the UK, huh? Next you'll be saying there's a 'u' in 'color'!

Re: [perlre/perlgolf] Golf this: return -1 or return +1 on regexp subroutine please
by Aristotle (Chancellor) on Nov 02, 2002 at 07:08 UTC
    This looks like production code, so I'll assume you want not golfed code, but better abstraction. And it is sort of expensive, so you might want to use a Schwartzian Transform.
    sub by_priority { my @prio = (-1 => $a, 1 => $b); my @prfx = map { (qr/^$_/)x2 } qw(Organization Service FAQ); for(@prfx) { return $prio[0] if $prio[1] =~ /$_/; push @prio, splice @prio, 0, 2; # swap (-1 => $a) and (1 => $b +) } return 0; }

    Capture regularities in code, irregularities in data.

    Update: oops, thanks BrowserUk. I had used @match rather than @prio at first and forgot to update all locations. Hope it's clear now.

    Makeshifts last the longest.

      Looks more like an obfu than the requested golf.

      Where does the @match array come from?


      Nah! Your thinking of Simon Templar, originally played by Roger Moore and later by Ian Ogilvy
Re: [perlre/perlgolf] Golf this: return -1 or return +1 on regexp subroutine please
by petral (Curate) on Nov 07, 2002 at 18:11 UTC
    sub by_priority { (grep$_,map-($a=~/$_/)||$b=~/$_/,qw(ORGANIZATION SERVICE FAQ))[0] or$a cmp$b }

      p
Re: [perlre/perlgolf] Golf this: return -1 or return +1 on regexp subroutine please
by petral (Curate) on Nov 13, 2002 at 21:10 UTC
    Wierdly enough, this works (67)
    sub by_priority { ($a cmp$b,grep$_,map$b=~/$_/-$a=~/$_/,FAQ,SERVICE,ORGANIZATION)[-1] }
    Thanks to hints from a golfer who plays the pro circuit.

      p

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://209517]
Approved by zigdon
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others avoiding work at the Monastery: (7)
As of 2024-04-24 11:24 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found