

by snax (Hermit)
on Dec 14, 2000 at 16:28 UTC ( #46605=perlmeditation )

Hanging out here at PM is an enjoyable thing, as I'm sure the regulars would agree. Fun, friendly, informative -- what more could you want?

What's better is that it has actually markedly improved my "craft" in employing the oddly-shaped tool we know (and love?) as Perl.

It's also helped me to think more "perlishly" about things. Before hanging out here, I would use perl to solve problems because it was a quick and accessible tool -- but my designs were, to echo merlyn's comments elsewhere, too BASIC-like. Now I begin to think about lists and contexts as I prepare algorithms. I also believe I am more prone to consider some of the OW in TIMTOWTDI.

Example: I was browsing through comp.lang.perl.misc recently and saw a cry for how to remove unwanted whitespace. Now, this is a FAQ, but I've come to enjoy using the construction

$_ = join ' ', split;
which came to me a while back (after becoming a monk) because I thought it was cool how the list propagated, and how I had pushed the whitespace worry into perl (i.e. I made split worry about it). So I posted this solution just to share the different perspective. (Note: you need to use $x = join ' ', split ' ', $x; if your string isn't in $_.)
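For anyone who wants to try the idiom, here is a minimal sketch. The sample strings and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input with leading, trailing, and repeated whitespace.
my $x = "  the   quick\tbrown \n fox  ";

# split ' ' (a one-space string, not a regex) skips leading whitespace and
# splits on runs of whitespace; join then puts one space between the words.
$x = join ' ', split ' ', $x;

print "[$x]\n";    # prints "[the quick brown fox]"

# The short form relies on split defaulting to split ' ', $_.
my @lines = ("  a  b ", "c\t\td");
for (@lines) {
    $_ = join ' ', split;
}

print "[$_]\n" for @lines;    # prints "[a b]" then "[c d]"
```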

Turns out this is pretty efficient, too -- japhy (in his real world persona) ran and posted some benchmarks that show it is at least as efficient as the FAQ solution and faster than some naive regex solutions. That's somewhat beside the point, though -- I had shared my solution because (1) it was a different perspective from the usual approach and (2) there was something "perlish" about it that appealed to me. Perl, and the craft thereof, has been getting under my skin. I lay this at the feet of the Monks here :)

Another example: I recently had to write a checksum checking routine, where the last digit is a checksum digit based on an algorithm applied to the other digits (no surprises there). I used this:

my $check = chop $digits;
to grab the check digit and prepare the remaining string ...a solution that would never have occurred to me a couple of months ago (Monks have me reading the docs, and merlyn's stance on void contexts has had me thinking about return values of late). Back then I would likely have used a couple of less than appealing (to me) substr calls. It's not a huge thing, but aesthetically I find it more appealing. Wait a minute -- aesthetically? It's over for me now, I'm afraid :) No longer is it enough to get code working; it has to appeal to me as well.
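A small sketch of the difference, using a made-up digit string (the checksum algorithm itself is not shown in this post, so none is assumed here):

```perl
use strict;
use warnings;

# Hypothetical input: payload digits followed by one check digit.
my $digits = "79927398713";

# chop removes the last character in place and returns it.
my $check = chop $digits;

print "payload: $digits\n";    # prints "payload: 7992739871"
print "check:   $check\n";     # prints "check:   3"

# One possible two-call substr version, for comparison.
my $digits2 = "79927398713";
my $check2  = substr $digits2, -1;       # read the last character
$digits2    = substr $digits2, 0, -1;    # keep everything before it
```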

As a final example, also from my checksum routine just mentioned (which was a very cool problem since you had to think about the whole number both as a string and a number, a very natural thing to do in Perl), a basic part of the checksum routine involved substituting numbers (10..35) for the letters A-Z before doing anything else. I solved this with

my %x;
@x{0..9,'A'..'Z'} = (0..35);
my $converted = join '', map {$x{$_}} split //, $digits;
...and what's better is that this was the first solution I thought of. I wrote it, in that haze of concentration you can only get while coding, and afterwards looked at the solution and said -- hey, that's pretty cool!
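Here is the same hash-slice lookup as a runnable sketch; the input value is invented for illustration.

```perl
use strict;
use warnings;

# Build the lookup table with a hash slice: 0..9 map to themselves,
# 'A'..'Z' map to 10..35.
my %x;
@x{ 0 .. 9, 'A' .. 'Z' } = ( 0 .. 35 );

# Hypothetical input mixing digits and upper-case letters.
my $digits = "A1Z9";

# Split into single characters, translate each one, and glue them back.
my $converted = join '', map { $x{$_} } split //, $digits;

print "$converted\n";    # prints "101359"
```

Note that the result can be longer than the input, since each letter becomes two digits.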

Well, that was pretty long winded when all I really wanted to convey was that I like it here, and my experience here has broadened my enjoyment of programming.

Thanks, Monks!

Replies are listed 'Best First'.
Re: Craftier
by Caillte (Friar) on Dec 14, 2000 at 22:35 UTC

    One of the many things I have found PM useful for is the breadth of experience you find here. Too many times in the past I have laboured for days to produce a solution to a problem only to find that there is a module on CPAN that fills my requirements exactly.

    The problem with the number (if problem is really the word) of modules on the site is that it is easy to get lost in the list, or search on the wrong keyword because you and the author don't quite think the same way. That's just human nature ;)

    However, spending an hour or so a day listening, and contributing, to discussions here will expand your knowledge. Other monks face the same challenges and think about things in different ways. Many are the times I have had to write a script that is related to a question or meditation I have seen a few days before, and working from that makes the task easier, even if you only use what you have read for inspiration and go off and do your own thing.

    Having an additional area to search for answers only increases your chance of finding something that will answer your question and, this being PM, answer it in ten different ways :)

    I like it here too, and am proud to be a monk amongst my fellow monks. Let's give ourselves a pat on the back and bask in the knowledge that, in a small way, we are making the world a better place ;)

Re: Craftier
by Elgon (Curate) on Dec 14, 2000 at 18:33 UTC
    Yep, code a bit BASIC-ish? Guilty *sticks hand in air*

    What I try to do now is write my code in the step-by-step way I do to get the sequence of operations planned out in my head, take a break and then come back and look at it with a fresh eye to see where I could tighten it up and make it more efficient and/or easier to read/maintain.

    What I'd really like to do is be able to write the fluent, lyrical language I see on PM straight off from fresh, but I guess that will come with time and experience. I'm already making headway, particularly in deciding which flow-control structures to use: Goodbye...

    for ($i=0; $i<$foo; ++$i){ }
    and, Hello...
    foreach $i(@foo){ }
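To make the contrast concrete, here is a small sketch of both loop styles doing the same job. The data and the upper-casing task are made up for illustration.

```perl
use strict;
use warnings;

my @foo = qw(alpha beta gamma);    # hypothetical data

# C-style: manage the index by hand.
my @by_index;
for (my $i = 0; $i < @foo; ++$i) {
    push @by_index, uc $foo[$i];
}

# Perl-style: iterate over the elements directly.
my @by_element;
foreach my $item (@foo) {
    push @by_element, uc $item;
}

print "@by_element\n";    # prints "ALPHA BETA GAMMA"
```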

    I think one good way to 'play creatively' and improve fluency is probably to try writing obfuscated code and poetry. Reading and trying to disentangle obfuscated code is probably helpful too.


      Learning to read obfuscated C/C++/Perl code is invaluable when it comes to trying to support code written by others who may not have had your standards of programming. Sometimes I think that's the only way that I could have unravelled some of the code segments I've been given to review...
Re: Craftier
by royalanjr (Chaplain) on Dec 14, 2000 at 23:08 UTC
    Answers will come in the most unexpected places.

    I was reading this article, which in no way had anything to do with some scripts I have been toying with as of late, when suddenly the solution to a minor issue in the scripts came to me. It revolved around unwanted whitespace, and I was not even aware of that until I read this.

    BANG! like that... problem solved, and who would have known this article would do it.

    So I guess the moral of the story is simply Read your perlmonks!!! *grin*

    Roy Alan

Re: Craftier
by BatGnat (Scribe) on Dec 15, 2000 at 03:53 UTC
    Just on a side note, would it not be better to use s/\s//g; or tr/\s//; instead?
    I just did a benchmark on the two and the regex is approx 3 times quicker than the $_ = join ' ', split;
    This is the benchmark that I ran.
    use Benchmark;
    my $junk = 'The quick brown fox Jumped over the lazy dog';
    timethese(5000000, {
        'split'  => '$junk = join \' \',split $junk;',
        'regex1' => '$junk =~ tr/\s//;',
        'regex2' => '$junk =~ s/\s//g;',
    });
    and the results are
    Benchmark: timing 5000000 iterations of regex1, regex2, split...
        regex1:  3 wallclock secs ( 4.38 usr +  0.00 sys =  4.38 CPU) @ 1142334.93/s (n=5000000)
        regex2:  3 wallclock secs ( 3.50 usr +  0.00 sys =  3.50 CPU) @ 1426940.64/s (n=5000000)
         split: 15 wallclock secs (15.07 usr +  0.00 sys = 15.07 CPU) @ 331762.99/s (n=5000000)

      I'm afraid there are serious problems with the Benchmark code that you posted. It is important to make sure all your code snippets do the right thing before you benchmark them, and to make sure the benchmark itself is doing the right thing. I hope it will be instructive if I detail the issues.

      There are two problems with the tr/// solution; \s is not special inside tr///, and /d is required for tr/// to delete characters. There are also two problems with the split solution; you are splitting $_ using $junk as the delimiter, and you are joining with a space instead of a null string.
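The pitfalls just described can be seen in a short sketch. The sample string and variable names are invented for illustration, and the block only shows the behaviours, it does not time anything.

```perl
use strict;
use warnings;

# Hypothetical sample string; note that it contains the letter "s".
my $sample = "less  is\tmore";

# Pitfall: inside tr///, \s is not a whitespace class. With no /d and an
# empty replacement list, tr only counts matches and changes nothing.
my $unchanged = $sample;
{
    no warnings;    # Perl may warn about the unrecognized \s escape here
    $unchanged =~ tr/\s//;
}

# Deleting whitespace with tr needs the characters spelled out, plus /d.
(my $by_tr = $sample) =~ tr/ \t\r\n//d;

# Pitfall: "split $junk" would split $_ using $junk as the pattern.
# The string to split goes second, and joining on '' removes the spaces.
my $by_split = join '', split ' ', $sample;

print "$by_tr\n";       # prints "lessismore"
print "$by_split\n";    # prints "lessismore"
```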

      There are also problems with the benchmark itself. $junk is a lexical, so it is not accessible from the Benchmark module. Since you passed quoted strings, your snippets were compiled in the Benchmark module and were operating on an empty $junk. Once that problem is fixed, since each code snippet modifies $junk in place, only the first execution of the first snippet would have any work to do; all the remaining iterations would be processing a string that had already been stripped of whitespace.

      Here is an improved benchmark:

      #!perl
      use Benchmark;
      my $junk = 'The quick brown fox Jumped over the lazy dog';
      timethese(-10, {
          'split' => sub { $x = join '', split ' ', $junk; },
          'trans' => sub { ($x = $junk) =~ tr/ \t\r\n//d; },
          'subst' => sub { ($x = $junk) =~ s/\s+//g; },
      });
      and the new results are:
      Benchmark: running split, subst, trans, each for at least 10 CPU seconds...
          split:  43912.46/s
          subst:  66211.19/s
          trans: 197755.00/s
      As you can see, the translation solution is actually the big winner, and the substitution is only 1.5 times as fast as split/join.
        Translation is always the fastest if you can use it, at least in my experience.

        One thing to consider is that the alternatives suggested remove all whitespace. The join/split (provided you join on ' ' and not '') just drops leading and trailing whitespace and squishes any "extra" in between down to one space. You need more than one regex or translation to accomplish the same thing.

        The simplest (to read) equivalent would be

        s/^\s*//; s/\s*$//; s/\s+/ /g;
        The quickest alternative would likely be to use translation to squish all the white space first and then do regexes to strip the (perhaps) remaining single spaces at the beginning and end of the string:
        tr/ \t/ /s; s/^ //; s/ $//;
        (add extra whitespace equivalents in the translation if you want to lose carriage returns and such).
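As a quick check that the three approaches above agree, here is a sketch that runs each one over a few made-up strings. The inputs contain only spaces and tabs, since the translation variant as written handles nothing else.

```perl
use strict;
use warnings;

# Hypothetical inputs (spaces and tabs only).
my @samples = ("  foo \t bar  ", "baz", "\tone  two\t\tthree ");
my @results;

for my $s (@samples) {
    # 1. join/split
    my $by_split = join ' ', split ' ', $s;

    # 2. three substitutions
    (my $by_regex = $s) =~ s/^\s*//;
    $by_regex =~ s/\s*$//;
    $by_regex =~ s/\s+/ /g;

    # 3. squeeze with tr, then trim at most one space from each end
    (my $by_tr = $s) =~ tr/ \t/ /s;
    $by_tr =~ s/^ //;
    $by_tr =~ s/ $//;

    push @results, [ $by_split, $by_regex, $by_tr ];
    print "[$by_split] [$by_regex] [$by_tr]\n";
}
```

Each line of output should show the same string three times, for example "[foo bar] [foo bar] [foo bar]" for the first sample.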

        In any event, the point was that the join/split is an interesting alternative, and not all that inefficient. Lots of times when I want to strip extra whitespace speed isn't that big a deal, anyway :) Some people like to use s/^\s*|\s*$//g to strip leading and trailing whitespace, but that's less efficient than doing two substitutions, so it's not always about speed.

        Perhaps I can persuade japhy to present his benchmarks?

        Sorry for the incorrect posting, but either way, you proved my point. As for the space in the split, I copied that directly from his code, $_ = join ' ', split; and modified it. I didn't even look to see if his code was wrong, I should have checked.
        Thanks for the help, I have only started using Benchmark recently.

        Micro$ofts new corporate motto: RESISTENCE IS FUTILE
