The value of declarations

We are all familiar with the standard advice. Use strict.pm. It catches typos.

But have you noticed that most other scripting languages don't agree with us? If you look at JavaScript, PHP, Ruby and Python you will look in vain for any corresponding piece of standard advice that makes variable declarations required. Why is this?

My theory is that it is an over-reaction to statically typed languages. When you come from a world where you are used to typing:

FooBar fooBar = new FooBar();

over and over again it is easy to overreact to the realization that constantly declaring types gave you no real benefit. It is more convenient to not declare the types at all, so why not go all of the way and remove the declaration entirely?

As we all know, the answer is that the apparent convenience is misleading. Required declarations automatically catch a class of real bugs for you. Not requiring declarations forces eternal vigilance. As I noted in Re: Avoiding silly programming mistakes, eternal vigilance is a bad thing.

However, Perl programmers have less cause to be smug than might appear. It seems to me that as a culture we have internalized the practice, but not the principle.

Let me give a simple example that many of us have experienced. Have you ever written any logfile processing? It is pretty straightforward. You parse lines into hash refs, then work with the hash refs. Not exactly hard, and you've likely done it.

Did you think of using Hash::Util's lock_keys function to catch typos in your hash access? I will confess that for years I did not. For me the moment of revelation was when I read Damian Conway's PBP and he argued against using Hash::Util because it wasn't secure. And I argued back at the page, "Of course it isn't secure! It isn't supposed to be! It is a strict declaration for hashes!" Then I decided to use it the next time the opportunity came up to see if it was helpful.

When that opportunity arose I found that there is a serious performance penalty, but while developing the log processing I caught 2 bugs that would otherwise have taken me much longer to catch. I also found that once the code was developed I could avoid the speed penalty by commenting out one line. I've been using the technique ever since.
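
The hash-locking idea might look something like the sketch below. The log format, the field names and parse_line are made up for illustration; lock_keys is the real Hash::Util function, and the single line to comment out later is the lock_keys call.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys);

# Hypothetical field names for a simple access-log line.
my @FIELDS = qw(host time request status);

sub parse_line {
    my ($line) = @_;
    my %rec;
    # Fill the record from the captures; bail out if the line does not match.
    @rec{@FIELDS} = $line =~ /^(\S+) \[([^\]]+)\] "([^"]*)" (\d{3})$/
        or return;
    # The "declaration": only the keys present now may be used from here on.
    # Comment this line out to drop the run-time cost once the code is stable.
    lock_keys(%rec);
    return \%rec;
}

my $rec = parse_line('127.0.0.1 [06/Apr/2009:18:11:00] "GET / HTTP/1.1" 200');
print "status: $rec->{status}\n";

# A typo in the key name now dies instead of quietly yielding undef.
my $ok = eval { my $s = $rec->{staus}; 1 };
print "caught: $@" unless $ok;
```

Without the lock, the misspelled key would simply return undef and the bug would surface somewhere far from its cause.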

Let's move on to argument processing. Most of us know that using named arguments to functions is a good thing. It is common to see people pass hashes into functions, and then process them for exactly that reason. You've probably done it. (If not, then consider it.) But how many of you have a check to find when named arguments are passed in that are not on the list of allowed arguments? I do that, and frequently catch errors where I pass in baz and should have passed in bar. Those errors would otherwise be much harder to catch because to see it you have to carefully compare the function call and definition which are in different pieces of code.
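
A check of that kind can be very small. The following is only a sketch: check_named_args and connect_db are hypothetical names invented for this example, not functions from any module.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: die if %$args has a key that is not in @allowed.
sub check_named_args {
    my ($args, @allowed) = @_;
    my %is_allowed = map { $_ => 1 } @allowed;
    my @unknown = sort grep { !$is_allowed{$_} } keys %$args;
    croak "Unknown argument(s): @unknown" if @unknown;
    return;
}

# Hypothetical function taking named arguments.
sub connect_db {
    my %args = @_;
    check_named_args(\%args, qw(host port user));
    my $port = defined $args{port} ? $args{port} : 5432;
    return "$args{user}\@$args{host}:$port";
}

print connect_db(host => 'db1', user => 'tilly'), "\n";

# 'usr' is a typo for 'user'; the call now fails loudly.
my $ok = eval { connect_db(host => 'db1', usr => 'tilly'); 1 };
print "caught: $@" unless $ok;
```

The list passed to check_named_args plays the role of the declaration: it sits next to the function definition, so the caller's mistake is reported without anyone having to compare the two pieces of code by eye.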

Now let me give a somewhat more bothersome example, which is what prompted this meditation.

Not long ago I had to prepare a small Catalyst site as a code sample. In the process of doing that I was quite surprised to find that if a template variable called a method that wasn't there, you had missing data but no error message to help you track it down. Luckily Template::Stash::Context doesn't suffer from that bug.

That stash, however, does not catch the use of template variables that were not passed in. I already had a piece of code for that, though it took more work to produce when I needed it than it really should have. (Sorry, highly non-reusable code for reasons that are mostly out of my control. For Catalyst I also added a few variables that were OK to not pass in.)

And there we have it. The most popular web framework in Perl, using the most popular templating solution, defaults to silently swallowing the most egregious possible signs of error. And unless you really know what you're doing you're not going to find the option of making it helpfully give you a hint of what you did wrong.

So the next time you sit down to write some Perl, pause and give thought to this issue. Think through in different contexts what a "declaration" looks like, and how you might automatically catch things that were not properly declared. The first bug that you automatically catch will likely be your own.


Replies are listed 'Best First'.
Re: The value of declarations by Porculus (Hermit) on Apr 06, 2009 at 20:44 UTC
I think the reason many scripting languages lack declarations is the same reason Perl lacked them until version 5. These languages have tended to evolve, rather than being designed by a committee before the first line of code was written. Perl has evolved a form of safety, and also appears to be the only scripting language that has sensible and consistent variable scoping rules. Python hasn't got that, but it has evolved other cool stuff, like generators and comprehensions -- I often wish Perl had a simple equivalent. PHP... okay, I can't think of anything nice to say about PHP. (And please, people, stop slandering static typing by associating it with explicit, verbose typing. Perl hackers should all be aware enough of Pugs to have heard of Haskell, and that means we should all know that it's possible for static typing to be as concise and expressive as Perl. Java isn't, but Java is mediocre by design; it's aimed at companies that want to hire a horde of replaceable cogs, instead of a handful of smart hackers working in a language that provides real expressive power but takes more skill to use.)
Re^2: The value of declarations by brennen (Novice) on Apr 06, 2009 at 22:18 UTC
PHP... okay, I can't think of anything nice to say about PHP. I have a couple: The array syntax is easy, you don't generally have to think about references, and it kind of does named function parameters with default values. Arguably these are all places where it's evolved to be friendlier than Perl, especially for its userbase. There are probably a few others. Just don't get me started on saying things about PHP which aren't nice.
Re^3: The value of declarations by Anonymous Monk on Apr 08, 2009 at 09:14 UTC
lol, syntax
Re: The value of declarations by samtregar (Abbot) on Apr 06, 2009 at 18:11 UTC
Preach it. I happen to be learning Python at the moment and I did find it interesting that Python lacks my-style declarations but does check hash key access and named parameters. I haven't actually done much with Python but I do expect those to both be big wins. Less sure about what problems not having declarations is likely to cause. -sam
Re^2: The value of declarations by MonkOfAnotherSect (Sexton) on Apr 09, 2009 at 05:08 UTC
A lot of it comes down to default DWIMmery not being a core Python ideal. It's not just that hash key access and named parameters must normally exist (unless you explicitly indicate differently using defaultdict or a **kwargs parameter). There's also no autovivification, numbers cannot be concatenated to strings unless explicitly converted, if there's an error it will throw an exception by default rather than continuing to run without complaint, and lots of other little differences that spring from the differing philosophies of the languages. There are costs, of course, to Doing What I Say, but on balance it avoids whole classes of errors that Perl programmers may hit by default unless they explicitly use strict, etc. Agreed re Python's scoping (Porculus) although it's gotten much better over the years, with the nonlocal keyword being the most recent addition. You can run pychecker or pylint to find errors in source code. The places I can think of offhand where you could get caught with assignments are a/ to non-existent local variables, b/ to non-existent object member variables, c/ to member variables of the wrong object, and d/ to non-existent hash keys. Problems a/ and b/ are lintable. Problems of types c/ and d/ require mindreading to fix (although you could use a frozendict for type d/). Cheers, -T. (that's more than enough trespass for the moment ;-)
Re: The value of declarations by autarch (Hermit) on Apr 07, 2009 at 07:19 UTC
I wrote MooseX::StrictConstructor to check for unknown arguments to the constructor. Very handy. Params::Validate has long supported a similar feature, and by default it blows up on unknown names in a set of named arguments.
Re: The value of declarations by moritz (Cardinal) on Apr 07, 2009 at 14:35 UTC
I share your love for strict and declarations, and I'm pleased that Perl 6 has strict mode enabled by default (except for one-liners). It also comes with very expressive signatures, different declarations for subs and methods (which also improves introspection) and all sorts of other things you can expect from a modern programming language. Heck, even sub names will be checked at compile time (actually at CHECK time, iirc) and you'll get very friendly error messages.
Re^2: The value of declarations by TimToady (Parson) on Apr 07, 2009 at 16:45 UTC
The one place where the design of Perl 6 doesn't quite jibe with this meditation is that, while Perl 6 can do compile-time checking of named arguments on subroutines, it refuses to do so for method calls. (It does this by supplying a default slurpy `%_` parameter to methods.) The reasoning for this is that base methods should be allowed to ignore named parameters that are intended only for derived methods, and vice versa. Perhaps there is some way to check named arguments against all the candidates at run time, but it seems as though this could get expensive, and we're at least trying to maintain the hope that Perl 6 could be very fast in theory, even if 6.0.0 doesn't actually achieve that. (And we don't expect it to, since we're optimizing for correct over fast for now.) Anyway, we're still open to ideas in this area, maybe an optional check that only runs first time when dispatch order is established (or reestablished).
Re^3: The value of declarations by moritz (Cardinal) on Apr 07, 2009 at 17:38 UTC
I wonder, can you do any reliable compile time checks on methods at all? I think this is valid: `my $x = eval 'class A { method foo($bar, $baz) { } }; A.new'; $x.foo();` Any compile time check would complain about a missing method.
Re^4: The value of declarations by TGI (Parson) on Apr 08, 2009 at 22:00 UTC
Re: The value of declarations by ww (Archbishop) on Apr 06, 2009 at 20:43 UTC
We are all familiar with the standard advice. Use strict.pm. It catches typos. But as you're well aware there's more to it than catching typos... at least for those whose Perl_fu is less than "expert." And then there's the addendum, which you didn't quote, of which this is one variant: "...unless you know what you're doing and why you're doing it." Now, I see that you didn't ignore this, but given the high regard in which you're held, I fear some less-knowledgeable coders (especially those who merely scan the first few lines) might wind up telling themselves "but tilly said I shouldn't `use strict`." Of course, you did no such thing, which is what makes your meditation worth CAREFUL and thoughtful reading, even by those to whom I have imputed "lesser fu" or impatience (and, BTW, to /me). ++ !
Re: The value of declarations by ambrus (Abbot) on Apr 07, 2009 at 14:22 UTC
While ruby does not have a strict mode, I'd actually like such an option. In ruby, you can't declare local variables or instance variables. When developing ruby code, I typo variables very often, and ruby is very unhelpful in finding these bugs. (I sometimes also use the same local variable twice in the same scope; a strict mode would help warn me about this too.) I think it might theoretically be possible to add a strict mode, but of course you'd first have to invent syntax for declaring variables.
Re: The value of declarations by JavaFan (Canon) on Apr 06, 2009 at 18:53 UTC
But how many of you have a check to find when named arguments are passed in that are not on the list of allowed arguments? I do that, and frequently catch errors where I pass in baz and should have passed in bar.

Generally, I don't. But then I often write code that looks like:

    sub foo {
        bar(@_);   # Needs parameters key1, key2, key3
        baz(@_);   # Needs parameters key3, key4
        ... use key2, key4 and key5 ...
    }

Now, if bar() and baz() would balk at extra parameters, I'd have to write more code. And change all calls to baz() if baz() is modified to do something with key1 as well. Now, I understand your concerns, and I share the opinion that when you're using hash keys, you basically throw away all the goodness strict gives you. Therefore, I often put all key names into variables (constants) and use the variable name as key (although I'm lazy more than once and don't always follow this good practice).
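
The practice of keeping key names in variables might be sketched as follows. The variable names and the sample record are invented for illustration, and this is only one way to do it.

```perl
use strict;
use warnings;

# Key names live in lexical variables, so strict checks their spelling.
my $KEY_HOST   = 'host';
my $KEY_STATUS = 'status';

my %rec = ($KEY_HOST => '127.0.0.1', $KEY_STATUS => 200);
print "$rec{$KEY_HOST} returned $rec{$KEY_STATUS}\n";

# A misspelled variable name is a compile-time error under strict, whereas
# a misspelled literal key such as $rec{staus} would just yield undef.
# The string eval is used here only so the failure can be shown and caught.
my $compiled = eval q{ my %h; my $v = $h{$KEY_STAUS}; 1 };
print "caught: $@" unless $compiled;
```

The cost is a little extra typing per key; the gain is that strict's variable checking now covers hash access as well.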