Code Smarter

by japhy (Canon)
on Nov 19, 2000 at 02:18 UTC (#42374, perlmeditation)

I don't know where else this goes, so I'm putting it here. Add to this list as you see fit. Make it a good place for people to look. While there are only a few things here now, I'd like to see this get quite large.

Do as little as possible

Compare these two lines of code:
    @small = grep { $_->size < 100 } sort $query->all_records;
    # vs.
    @small = sort grep { $_->size < 100 } $query->all_records;
Notice a difference? As the number of records increases, the first method will get slower. Why? Because you weren't smart enough to sort only the data you need. The two produce identical lists (unless you have severe magic going on, in which case, I don't wanna hear about it), but the second will produce it in better time, since you're only sorting the matches, not all the records.
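To make the difference visible, here is a minimal sketch that counts comparisons instead of timing anything. The @records list of plain numbers is a made-up stand-in for $query->all_records, and the "size" of a record is just the number itself; neither comes from a real API.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $query->all_records: a shuffled-looking
# permutation of 0..999, where each number plays the role of a size.
my @records = map { ($_ * 7919) % 1000 } 1 .. 1000;

my ($cmp_sort_first, $cmp_grep_first) = (0, 0);

# Sort everything, then throw most of it away.
my @slow = grep { $_ < 100 }
           sort { $cmp_sort_first++; $a <=> $b } @records;

# Throw most of it away, then sort what is left.
my @fast = sort { $cmp_grep_first++; $a <=> $b }
           grep { $_ < 100 } @records;

print "same result: ", ("@slow" eq "@fast" ? "yes" : "no"), "\n";
print "comparisons, sort first: $cmp_sort_first\n";
print "comparisons, grep first: $cmp_grep_first\n";
```

The exact counts depend on your perl's sort implementation, but the second form only ever has to order the 100 survivors.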

Use the proper tool

Don't use m// where substr() will do. Don't use substr() where unpack() will do. Don't use m// where index() (or even better, rindex()) will do. Don't use s/// where tr/// will do.

Knowing when to use a hammer and when to use a jackhammer is a valuable skill in programming, and even more so in Perl, where TMTOWTDYOG (there's more than one way to dig your own grave). Here are examples of the above:
  1. m// vs. substr()
    ($short) = $desc =~ /^(.{0,100})/;
    # breaks on embedded newlines, and is better as
    $short = substr($desc, 0, 100);
  2. substr() vs. unpack()
    $name = substr($rec, 0, 20);
    $age  = substr($rec, 20, 5);
    $job  = substr($rec, 25, 25);
    # repeated calls to substr() better as unpack()
    ($name, $age, $job) = unpack 'A20 A5 A25', $rec;
  3. m// vs. index() or rindex()
    if ($str =~ /jeff/) { ... }
    # why not
    if (index($str, 'jeff') > -1) { ... }
    # or if you know that if it's there, it's toward the end:
    if (rindex($str, 'jeff') > -1) { ... }
  4. s/// vs. tr///
    # (on average)
    s/[abc]*//g;
    # is FAR slower than
    s/[abc]//g;
    # is slower than
    s/[abc]+//g;
    # which is slower than
    tr/abc//d;
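The four swaps above are not always drop-in identical, so here is a small sketch that runs each pair side by side. The sample strings and the fixed-width record are invented for illustration. Two things it shows: the regex version of (1) really does stop at a newline, and the 'A' format in unpack() strips trailing spaces where substr() keeps them.

```perl
use strict;
use warnings;

# 1. "." does not match "\n", so the regex stops at the first newline;
#    substr() does not care.
my $desc = "first line\nsecond line";
my ($short_re)   = $desc =~ /^(.{0,100})/;
my $short_substr = substr($desc, 0, 100);

# 2. A made-up fixed-width record: 20 chars name, 5 chars age, 25 chars job.
my $rec = sprintf '%-20s%-5s%-25s', 'Jeff', '42', 'programmer';
my ($name, $age, $job) = unpack 'A20 A5 A25', $rec;   # padding stripped
my $padded_name        = substr($rec, 0, 20);         # padding kept

# 3. index() returns -1 when the substring is absent.
my $found   = index('hello jeff',  'jeff') > -1;
my $missing = index('hello world', 'jeff') > -1;

# 4. For deleting single characters, tr///d and s/[...]//g agree.
(my $via_s  = 'abracadabra') =~ s/[abc]//g;
(my $via_tr = 'abracadabra') =~ tr/abc//d;

print "regex: [$short_re]\n";
print "name: [$name] vs [$padded_name]\n";
print "s///: $via_s, tr///: $via_tr\n";
```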

Love your string functions

Regexes are not the answer. They're an answer. I prefer lc($A) eq lc($B) to some convoluted, anchored regex like $A =~ /\A\Q$B\E\z/i (for reasons I hope you can see). And check this crazy trick -- to find the first occurrence of a letter $c in a string regardless of its case, there's no need to use a regex, just use: $pos = index(lc($str), lc($c)).
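Wrapped up as subs, the two tricks might look like the sketch below. The names same_ignoring_case and index_ignoring_case are my own, not anything built in.

```perl
use strict;
use warnings;

# Case-insensitive equality without a regex.
sub same_ignoring_case {
    my ($x, $y) = @_;
    return lc($x) eq lc($y);
}

# Position of the first occurrence of $c in $str, ignoring case;
# -1 if it does not occur at all.
sub index_ignoring_case {
    my ($str, $c) = @_;
    return index(lc($str), lc($c));
}

print index_ignoring_case('Just Another Perl Hacker', 'p'), "\n";   # 13
print same_ignoring_case('JAPH', 'japh') ? "same\n" : "different\n";
```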
Add more. That's more than a suggestion.

japhy -- Perl and Regex Hacker

Replies are listed 'Best First'.
Re (tilly) 1: Code Smarter
by tilly (Archbishop) on Nov 19, 2000 at 03:57 UTC
    What do I say? The right advice depends on the listener's level. However, here are some general principles to add to that list.

    The act of programming naturally raises dust and creates entropy. The fundamental limit to programming is in the human ability to keep track of what is going on. If possible the goal should be to maintain acceptable performance and development speed while laying the ground to lessen how much has to be kept track of to redesign the code or add new functionality to it. There is a trade-off here since the internal code quality is not visible. But doing this consistently over the long term will result in faster overall development, fewer bugs, and better performance.

    What does this mean in practice?

    It means that you avoid having to synchronize code. Two similar functions should both be wrappers around a single one if possible. Use hashes to avoid maintaining logic where multiple arrays have to be kept in parallel. So on and so forth.
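As a rough illustration of both points, here is a sketch with invented data and invented function names (format_person, as_csv, as_tsv). It contrasts two parallel arrays with one hash, then shows two near-identical functions reduced to thin wrappers around a single worker.

```perl
use strict;
use warnings;

# Parallel arrays: removing "bob" takes two splices that must agree.
my @names = ('alice', 'bob');
my @ages  = (31, 27);
splice @names, 1, 1;
splice @ages,  1, 1;

# One hash: a name and its age cannot drift apart.
my %age_of = (alice => 31, bob => 27);
delete $age_of{bob};

# Two similar functions as thin wrappers around one worker, so a fix
# to the formatting logic lands in exactly one place.
sub format_person {
    my ($ages, $name, $sep) = @_;
    return join $sep, $name, $ages->{$name};
}
sub as_csv { my ($ages, $name) = @_; return format_person($ages, $name, ',') }
sub as_tsv { my ($ages, $name) = @_; return format_person($ages, $name, "\t") }

print as_csv(\%age_of, 'alice'), "\n";
```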

    It means that you write in terms of lots of small functions that can easily be moved around. It means that you design code in chunks with simple external interfaces so you can overhaul each chunk. It means that you choose meaningful variable names. It means that you don't try to comment too verbosely - rather make the code stand as a comment. (If you comment too much you now have two documents, one in the code and one in the comments. Both start out equally likely to be buggy. Which gets maintained?)

    What else? Well, choose the right tool for the job. Note that tools are constantly changing. Budget regular effort for keeping up because things change. OTOH don't throw away lessons learned. For instance there are a ton of Perlish constructs. Many can be carried over to other languages if you try. Do so. The principles of good programming don't change just because the language did!

    There is a lot more, but I am going to just start repeating classics like Code Complete and The Pragmatic Programmer. So let me just summarize that by saying that there are classic books out there on how to program well. They are classic for a reason. Go try to learn from them...

Re: Code Smarter
by metaperl (Curate) on Nov 20, 2000 at 04:34 UTC
    This is excellent advice. I think it belongs in a revised edition of the book Effective Perl Programming, right along with Michael Schwern's YAPC talk "Ineffective Perl Programming".

    I hope you do write a book or put all of your bright ideas in one place, for some of us will certainly benefit.

      ... Michael Schwern's YAPC talk "Ineffective Perl Programming"

      Is this available online anywhere? It sounds like an interesting talk.

      And japhy, I also think that these kinds of suggestions are really cool things to know about. It's the kind of advice that you generally have to sweat through years of doing things the not-so-good way before you realise how you should have been doing it all this time. TIMTOWTDI[0], but SWABTO[1] (of course, WAHOF[2], and with any solution, YMMV[3]).

      [0] - "There's More Than One Way to Do it"
      [1] - "Some Ways Are Better Than Others"
      [2] - "We All Have Our Favourites"
      [3] - "Your Mileage May Vary"

Re: Code Smarter
by salvadors (Pilgrim) on Dec 31, 2000 at 22:22 UTC
    So, what's the best way to find if a string ends with a given substring?

    On another thread I posted example code for finding all your .mp3 files, and I used rindex, but it dawned on me later that that would also find false positives like: "foo.mp3.gif"

    So what's the best replacement for m/.mp3$/?


      m/\.mp3$/ or substr($_,-4)eq".mp3"
        Anyone know which of these is the faster expression (intuitively speaking, not looking for a benchmark)? Doesn't the RE have to do a lot of work to get to the end of each string passed, whereas substr simply clips? The only thing to remember is to either test .MP3 as well or do lc() on the substr().
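For what it's worth, here is a sketch of the substr() route with the lc() folded in. The name ends_with is my own invention. One behavioural difference to keep in mind: m/\.mp3$/ also matches "song.mp3\n", because $ can match just before a trailing newline, while the substr() comparison does not.

```perl
use strict;
use warnings;

# True if $str ends with $suffix, compared case-insensitively.
sub ends_with {
    my ($str, $suffix) = @_;
    return 1 if $suffix eq '';                      # everything ends with ''
    return 0 if length($suffix) > length($str);     # too short to match
    return lc(substr($str, -length($suffix))) eq lc($suffix) ? 1 : 0;
}

for my $file ('song.mp3', 'SONG.MP3', 'foo.mp3.gif') {
    printf "%-12s %s\n", $file, ends_with($file, '.mp3') ? 'yes' : 'no';
}
```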
Re: Code Smarter
by tphyahoo (Vicar) on Nov 08, 2005 at 09:41 UTC
    Yep, and here are the benchmarks to prove it. (I just felt like learning the Benchmark module.)
    use strict;
    use warnings;
    use Benchmark qw(:all);

    #banging on the keyboard
    my $target_string = 'al;nsdfj;oasmfdio;asdfoasdjfm;mioasfdsjdo;fijso;adfjmio;sadjfmos;adfmjosia;mdfjiosad';

    Benchmark::cmpthese( 1000000, {
        'substituteStar'   => sub { $_ = $target_string; s/[abc]*//g; },
        'substituteNoStar' => sub { $_ = $target_string; s/[abc]//g; },
        'substitutePlus'   => sub { $_ = $target_string; s/[abc]+//g; },
        'trAbc'            => sub { $_ = $target_string; tr/abc//d; },
    } );
                         Rate substituteStar substituteNoStar substitutePlus trAbc
    substituteStar    29935/s             --             -85%           -85%  -95%
    substituteNoStar 195733/s           554%               --            -1%  -69%
    substitutePlus   196928/s           558%               1%             --  -68%
    trAbc            621504/s          1976%             218%           216%    --
