Common Perl Pitfalls

by Joe_ (Beadle)
on Apr 09, 2012 at 21:52 UTC ( #964216=perlmeditation )

This node in no way means that I claim to be an expert on Perl. I hardly consider myself at an intermediate level (I'm still making my way through the Alpaca!). These are just some of the most common ways I've managed to shoot myself in the foot. I thought I would share them here in the hope that they would benefit others and in the hope that I may receive enlightenment from other, more experienced monks on how to better handle these issues. Most of them have to do with regex (go figure!). Here they are in no particular order:

EDIT: Made some changes to the proposed solutions above according to some keen insights from JavaFan and Jenda. Thank you for your constructive criticism.

Replies are listed 'Best First'.
Re: Common Perl Pitfalls
by JavaFan (Canon) on Apr 09, 2012 at 23:11 UTC
    The solution: redefine $/ right after your slurp:
    No. That's just another potential problem. The solution is:
    my $slurp; {local $/; $slurp = <INPUTFILE>};
    my $slurp = do {local(@ARGV, $/) = "inputfile"; <>};
    Or even:
    my $slurp = `cat inputfile`;

      Great stuff. Thanks for the feedback.
      I really like the second solution.
      The third solution isn't as portable, though.
      Why do you think redefining $/ as I did is a potential problem?

        The third solution isn't as portable, though.
        cat is probably available on more platforms than perl is. Of course, Windows rules the world, and both cat and perl are ported to Windows -- and, AFAIK, neither comes standard with the OS. Unlike cat, perl is not included in the POSIX standard for shell utilities.

        Why do you think redefining $/ as I did is a potential problem?
        Well, you consider someone modifying the code to be a potential problem. What if someone modifies your code and adds a return after the first assignment to $/, but before the second? What if someone wraps the code in an eval, and the read triggers an exception?
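The eval scenario can be reproduced in a few lines. This is only a sketch: the sub names and the in-memory filehandle are made up for illustration, and the die stands in for whatever exception a real read might trigger.

```perl
use strict;
use warnings;

my $data = "line 1\nline 2\n";   # stand-in for the file contents

# Approach 1: undef $/ by hand and assign it back afterwards.
sub slurp_manual_reset {
    open my $fh, '<', \$data or die "open: $!";
    my $saved = $/;
    undef $/;                    # global change
    my $all = <$fh>;
    die "simulated failure after the read\n";
    $/ = $saved;                 # never reached
}

# Approach 2: local restores $/ however the sub is left.
sub slurp_with_local {
    open my $fh, '<', \$data or die "open: $!";
    local $/;                    # undef only until this scope exits
    my $all = <$fh>;
    die "simulated failure after the read\n";
}

eval { slurp_with_local() };
my $after_local = defined $/ ? 'restored' : 'still undef';

eval { slurp_manual_reset() };
my $after_manual = defined $/ ? 'restored' : 'still undef';

$/ = "\n";   # repair the damage left behind by the manual version

print "local: $after_local, manual: $after_manual\n";
```

With local, the old value comes back as soon as the scope is left, whether by return, by die, or by falling off the end; the hand-written reset only runs if control actually reaches it.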

      Rather than mess around with $INPUT_RECORD_SEPARATOR (AKA: $/), and restore it afterwards, a better method would be to use the File::Slurp module.

      It is another common Perl pitfall to write new code for a common problem when you should have looked on CPAN. There is a very good chance that you will find a fully debugged implementation that considers all the edge cases. It never ceases to amaze me that people would prefer to spend half a day writing and debugging code, instead of 15 minutes finding and installing a module from CPAN.

        It always amazes me people prefer downloading a CPAN module, and using it, over writing a one-liner. I'm even more amazed that people think that just because there's a module on CPAN, it automatically is fully debugged and covers all the edge cases.

        I do wonder though, if it takes half a day to write:

        my $slurp = do {local $/; <HANDLE>};
        how long do you need to type in:
        use Some::Module::From::CPAN; my $slurp = Some::Module::From::CPAN->some_API(some_argument);
        Twice the number of lines, so, a full work day?
      The first solution is not equivalent to the last two, as it assumes that INPUTFILE is already open. I would say that the correct idiom looks like this:
      my $slurp = do { open my $fh, '<', "inputfile"; local $/; <$fh> };

      P.S. I really like the second one, thanks. Not for production use, of course. :)


        Considering that Joe_'s example uses the handle INPUTFILE, I don't have any problems with the implication, and I really don't see the need to come up with the snobby term correct idiom. (You consider a piece of code without error handling to be correct idiom? You're fired).
        I really like the second one, thanks. Not for production use, of course. :)
        Why not? It's not significantly different from your correct idiom. It lacks error handling (but then, so does your correct idiom), but that's easily fixed: just add a // die "slurp: $!";.
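For reference, here is one way the open-based idiom might look with the error handling both posters mention. It is a sketch under assumptions: the sub name slurp, the error messages, and the temporary file are illustrative and not taken from the thread. The // operator needs perl 5.10 or later.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a whole file into one scalar, dying with a message on failure.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path
        or die "slurp: cannot open '$path': $!\n";
    local $/;                                  # slurp mode, this scope only
    my $content = <$fh> // die "slurp: cannot read '$path': $!\n";
    return $content;
}

# Create a small file to read back.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "hello\nworld\n";
close $tmp_fh or die "close: $!";

my $content = slurp($tmp_path);

# A missing file now produces an exception instead of a silent undef.
my $ok    = eval { slurp("$tmp_path.does-not-exist"); 1 };
my $error = $@;

print "read ", length($content), " characters\n";
print "error was: $error" unless $ok;
```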
Re: Common Perl Pitfalls
by Jenda (Abbot) on Apr 09, 2012 at 23:55 UTC

    Re: Regex in a loop: The first while statement is perfectly fine ... if you intend to modify the variable. The first loop reads "while the variable still matches the regular expression do something with the variable" while the second reads "while there's still something more to match in the variable do something with the match". The rule is that in the first case you SHOULD modify the variable within the loop, while in the second case you SHOULD NOT modify it.
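The two readings can be put side by side. The strings and patterns below are invented for illustration; only the shape of the loops follows the description above.

```perl
use strict;
use warnings;

# Style 1: no /g. The match restarts from the beginning every time, so the
# loop only terminates because the body changes $text.
my $text   = "a--b--c";
my $passes = 0;
while ($text =~ /--/) {
    $text =~ s/--/-/;        # modify the variable, or loop forever
    $passes++;
}

# Style 2: /g in scalar context. Each test resumes where the previous match
# ended, so the body works with the match and leaves $log untouched.
my $log = "x=1 y=22 z=333";
my @numbers;
while ($log =~ /(\d+)/g) {
    push @numbers, $1;
}

print "style 1: $text after $passes passes\n";
print "style 2: @numbers\n";
```

Assigning to $log inside the second loop would reset the match position, which is why the two styles should not be mixed.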

    Re: Deleting some array elements: You should use grep():  @filtered = grep {whatever('test', $you, want_with($_, 'aliased to an array element'))} @all;
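A small runnable version of the grep() suggestion, with a made-up predicate (should_keep is hypothetical; any test on $_ works the same way):

```perl
use strict;
use warnings;

# Hypothetical test: keep even numbers only.
sub should_keep {
    my ($n) = @_;
    return $n % 2 == 0;
}

my @all = (3, 8, 15, 4, 23, 42);

# Inside the block, $_ is aliased to each element of @all in turn.
# grep returns the elements for which the block is true; @all is unchanged.
my @filtered = grep { should_keep($_) } @all;

print "kept: @filtered\n";
```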

    Re: Slurping gone wrong: my $data = do {local $/; <INPUTFILE>}; or use File::Slurp

    Re: True is 1, false is...? Don't print just the value, print some more info to make sure you are looking at the result of the print statement you think you are and always put some kind of quotes around the variable: print "Computed the number of angels: >$angel_count<\n";
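To see what the delimiters buy you, here is a short sketch. The variable names and sample values are illustrative only.

```perl
use strict;
use warnings;

my $angel_count = (1 < 0);     # result of a false comparison
my $answer      = "42\n";      # e.g. input that was never chomp()ed

# Printed bare, the false value is invisible.
my $bare = "$angel_count";

# With a label and delimiters, the empty value and the stray newline show up.
my $delimited     = "Computed the number of angels: >$angel_count<";
my $newline_shown = "Read the answer: >$answer<";

print "$delimited\n";
print "$newline_shown\n";
```

The first line prints `><` right after the label, and the second shows the closing `<` pushed onto the next line, both of which are easy to miss without the markers.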

    Enoch was right!
    Enjoy the last years of Rome.

      I agree on almost all counts.
      I remember having used the option you talked about (modifying the regex and matching in the loop without 'g') before. I just don't like it, though. I feel that it's quite unstable and will turn into an infinite loop the second you're not looking...

Re: Common Perl Pitfalls
by JavaFan (Canon) on Apr 09, 2012 at 23:16 UTC
    Deleting some array elements
    I'd write that as:
    @array = @array[grep {!should_delete($_)} 0..$#array];
    If only because your splice solution can be quadratic worst case, while the above is linear (assuming should_delete has a running time bounded by a constant).

      That's a really great one, too.
      I've only recently started coming to grips with grep and map. I've always felt that this problem can be tackled by a one-liner but I just couldn't put my hands on it. Thanks for finally providing it :)

      Care to elaborate on that "quadratic" comment? How do you figure? I'm not that good with complexity theory, I'm afraid...

        Care to elaborate on that "quadratic" comment?
        Say you want to delete all elements in the second half of the array. For the first N/2 iterations of your loop, no splicing happens. But on iteration N/2 + 1, the splicing takes at least N/2 - 1 steps, as that many array elements need to be moved. On iteration N/2 + 2, the splicing takes at least N/2 - 2 steps. In total, you will be moving

        (N/2 - 1) + (N/2 - 2) + ... + 1 + 0

        array elements. If I've done my math correctly, the above sum equals (N^2 - 2N)/8, which means your algorithm runs in Ω(N^2) time.
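The argument can be checked with a small counting model. This sketch does not measure perl's internals: it simply counts, for each splice, how many elements sit after the removed position, which is the cost model used in the explanation above. The sub name and the "delete everything in the second half" rule are assumptions made for the experiment. In this model the total for even N comes out to (N/2)(N/2 - 1)/2.

```perl
use strict;
use warnings;

# Delete every element in the second half of a 1..N array with splice in a
# loop, counting how many trailing elements each splice has to shift down.
sub count_splice_moves {
    my ($n) = @_;
    my @array = (1 .. $n);
    my $moves = 0;
    my $i     = 0;
    while ($i < @array) {
        if ($array[$i] > $n / 2) {
            $moves += $#array - $i;   # elements after position $i
            splice @array, $i, 1;     # same index now holds the next element
        }
        else {
            $i++;
        }
    }
    return $moves;
}

for my $n (4, 8, 1000) {
    my $moves    = count_splice_moves($n);
    my $expected = ($n / 2) * ($n / 2 - 1) / 2;
    print "N=$n moved=$moves expected=$expected\n";
}
```

Doubling N roughly quadruples the count, whereas a single grep pass over the indexes touches each element once.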

Re: Common Perl Pitfalls
by JavaFan (Canon) on Apr 09, 2012 at 23:28 UTC
    The thing is, Perl treats the result of a false logical test as the empty string (in scalar context)
    It's actually a dual (triple) var:
    $ perl -MDevel::Peek -wE '$x = 1 < 0; Dump $x'
    SV = PVNV(0x97f45a8) at 0x9805ad0
      REFCNT = 1
      FLAGS = (IOK,NOK,POK,pIOK,pNOK,pPOK)
      IV = 0
      NV = 0
      PV = 0x9801438 ""\0
      CUR = 0
      LEN = 4
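The practical consequence of those flags can be seen without Devel::Peek. In this sketch (variable names are illustrative), the false value behaves as the empty string in string context and as 0 in numeric context, and unlike an ordinary empty string it does so without a warning.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be counted.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $false = (1 < 0);   # the special false value
my $empty = '';        # an ordinary empty string

my $false_as_string = "$false";
my $false_as_number = $false + 0;   # numeric slot is already valid: no warning
my $empty_as_number = $empty + 0;   # warns: Argument "" isn't numeric

printf "string: >%s< number: >%s< warnings: %d\n",
    $false_as_string, $false_as_number, scalar @warnings;
```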

      I will have to RTFM on that one :)

Re: Common Perl Pitfalls
by Anonymous Monk on Apr 09, 2012 at 23:56 UTC

      Thanks for the links. They seem really interesting.

Re: Common Perl Pitfalls
by sundialsvc4 (Abbot) on Apr 12, 2012 at 13:33 UTC

    I wish that this thread had not become so heavily “threaded” so quickly. The content is now diluted: someone has to wade through a lot of back-and-forth conversation to glean the “final” meaning from it, and some of those conversations seem fairly nit-picking anyhow. Threads, particularly in the Meditations section, are going to be referred to for many years to come. I’d therefore suggest that they be built up with that in mind when possible: you are building a sort of reference article when you write in this particular section. If you want to debate fine points, do it over in Seekers, then put a summary or edit over here, with appropriate hyperlinks to the relevant discussions. If you want to do major edits, come back up to the first level of reply-nesting. I think that would make for better final content, IMHO. The <strike> tag is great for explicitly showing edits. Someone ought to be able to read just the top article and perhaps the first-level replies and come away with an accurate answer (incorporating all of the various back-and-forths that they didn’t have to wade through) with a minimal amount of reading. I just think that would be better. It reflects how I use this resource (constantly), anyhow, and what I personally would prefer as a reader.

Re: Common Perl Pitfalls
by JavaFan (Canon) on Apr 10, 2012 at 06:49 UTC
    Of course if you had meant the string $to_replace is an actual regex to match against, you're better off using the qr operator:
    I don't get this point. You started off that section with:
    $to_replace='some_string'; $my_string=~ s/$to_replace/$better_data/;
    and deemed this catastrophically unsafe, because $to_replace may actually contain characters that have a special meaning.

    But if $to_replace is actually a regexp, the premise is gone -- any special characters are intentional. In fact, it's quite fine in that case to use the above.
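The distinction between "a string that happens to contain metacharacters" and "a pattern whose metacharacters are intentional" can be shown in a few lines. The sample text and variable names are invented for this sketch.

```perl
use strict;
use warnings;

my $text    = 'abc a.c';
my $literal = 'a.c';         # meant as plain text; '.' is accidental here

# Interpolated as-is, the '.' matches any character, so 'abc' is hit first.
(my $unquoted = $text) =~ s/$literal/X/;

# \Q...\E escapes the metacharacters, so only the literal 'a.c' matches.
(my $quoted = $text) =~ s/\Q$literal\E/X/;

# When the wildcard is intended, interpolating the pattern is fine.
# qr// compiles it once and makes the intent explicit.
my $pattern = qr/a.c/;
(my $via_qr = $text) =~ s/$pattern/X/;

print "unquoted: $unquoted\n";
print "quoted:   $quoted\n";
print "via qr:   $via_qr\n";
```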

      I actually meant to say that one shouldn't use a scalar as a regex anyway. I meant to say that, even if your correct semantics didn't require the use of \Q and \E (i.e. you actually needed the metacharacters) then you're better off using the qr// operator instead of building your regex as a literal string.

        Well, that's nice you want to say that, but can you back up your statement with an argument?
Re: Common Perl Pitfalls
by girarde (Hermit) on Apr 11, 2012 at 18:06 UTC
    This post would have been improved by the use of <continue> tags.

      Thanks for the advice. I do agree. The <continue> tags didn't work so I used spoilers instead.

Re: Common Perl Pitfalls
by muba (Priest) on Apr 29, 2012 at 19:32 UTC

    while($string=~ m/reg(ex)/) { $string=~ s/$1/ister/; }

    That frightens me a bit. Your substitution replaces the first occurrence of "ex" with "ister" in the string, no matter whether that "ex" is part of "regex" or of something else:

    my $string = "Sometimes there are extra effects you didn't foresee when using a regex.";
    while($string=~ m/reg(ex)/) { $string=~ s/$1/ister/; }
    print $string;


    Sometimes there are istertra effects you didn't foresee when using a register.
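One possible way to avoid the stray replacement is to tie the substitution to its context in a single pattern, so that "ex" is only touched when it directly follows "reg". This is a sketch of that idea, not the fix the original poster ended up with.

```perl
use strict;
use warnings;

my $string = "Sometimes there are extra effects you didn't foresee when using a regex.";

# The lookbehind (?<=reg) requires "reg" immediately before "ex" without
# making it part of the text that gets replaced.
(my $fixed = $string) =~ s/(?<=reg)ex/ister/g;

print "$fixed\n";
```

Because the match and the replacement happen in one substitution, there is no separate loop that could run forever or hit the wrong "ex".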


      You are quite right. It's a terrible mistake on my part. I will edit it.

Node Type: perlmeditation [id://964216]
Approved by ww
Front-paged by ww