A question of style - declaring lexicals

The result of this node will most likely be "it depends on the programmer; it is a personal preference", but I would still like to see how others tend to organize their code. The issue I am talking about is where to put the declarations for lexical variables (declared with my()), whether we are talking about file-scoped lexicals (not contained within any block), or lexicals of any block (subroutine, bare block, loop construct, conditional block, etc). As the simple example for this meditation, I will use an example which uses file-scoped lexicals.

The first style is to declare all your lexicals at the beginning of the scope in which they are found. The advantage to this style is that you can see at first glance which variables are meant to be used within the scope you've just entered. The disadvantage is that when you finally stuff the variable with data, you can't tell at first glance whether this is the first time data is being assigned to the variable, or if we are redefining it (replacing the current contents with new data). Our example demonstrating this first style:

#!perl -w

use strict;
use IO::Socket;
use IO::Select;

# advantage: we know all the variables that are used
# within this particular scope.
my ( $server, $handles, $sessions );

$server = IO::Socket::INET->new(
    Listen => 1, LocalAddr => 'localhost:8000'
) or die "socket failed: $!";

$handles = IO::Select->new( $server );

# disadvantage: at first glance, we can't tell if this is
# the first time we are assigning to $sessions. You'd have
# to scroll up from this point to see if we've already
# put data into $sessions.
$sessions = {};

{
    my $socket;

    while ( $socket = $server->accept() ) {
        $sessions->{$socket} = {
            peerhost => $socket->peerhost()
        };
    }
}

The second style is to not declare all lexicals at the beginning of the scope in which they are found, but to declare each one where it first comes into existence. The advantage is that you can immediately see where the variable first comes into play in the flow of the program, since the place where it is declared is also the first time it is given contents. The disadvantage is that it is not as easy to determine which variables are used within a particular scope, since the declarations are scattered throughout the code. The example code rewritten with the second style:

#!perl -w

use strict;
use IO::Socket;
use IO::Select;

my $server = IO::Socket::INET->new(
    Listen => 1, LocalAddr => 'localhost:8000'
) or die "socket failed: $!";

my $handles = IO::Select->new( $server );

# advantage: we know this is the first time $sessions
# is being assigned to. Any '$sessions = ...' later in the
# program is obviously replacing the contents of $sessions.
my $sessions = {};

while ( my $socket = $server->accept() ) {
    $sessions->{$socket} = {
        peerhost => $socket->peerhost()
    };
}

My personal preference tends to lean towards #2 as it is generally easier to follow the program flow. There are times that #1 seems more convenient, such as when you'd like to label each variable with its function. But then there is always POD to do that with if so desired. So I ask of you, the fellow monk, to share your personal preference, as well as any advantages (pros) and disadvantages (cons) you see pertaining to each style.

updates: Added a disadvantage to style #2. Modified code for style #1 by making it more equivalent to style #2 -- I moved the $socket declaration to a separate scope containing the while loop. Updated code to reflect tilly's note. I didn't realize I'd declared a hashref and then tried to access a hash. Oops :)


Replies are listed 'Best First'.
Re: A question of style - declaring lexicals by demerphq (Chancellor) on Jun 03, 2004 at 08:08 UTC
There are no hard and fast rules for me. Generally, IMO, lexicals should be declared at the beginning of the closest scope to where they are used. And unless a variable is deliberately intended to be a static for a closure, all sub declarations should precede any lexical declarations at the file scope, and the latter (lexicals at file scope) should be avoided outside of snippets. So for instance your code in version two would get wrapped in a subroutine almost instantly.

But the reason these are general is that I think there are two issues in choosing this placement: what the variable is used for, and conveying the author's intentions. If I see:

    my $sum=0;
    while (<>) {
        ...
        if ($blah) {
            my $foo=somefunc;
            something($foo);
        }
    }

then I know $foo is a throwaway, and $sum is a "working" var of some sort (poor thing should get off the streets :-). Other arrangements are to declare a set of vars in such a way that it's clear that they are fields in a record or the like. All in all, whatever is readable and self documenting is good in my book.

---
demerphq

First they ignore you, then they laugh at you, then they fight you, then you win. -- Gandhi
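A minimal sketch of the two ideas in this reply, not demerphq's own code: a lexical kept as a "static" for a closure, and a throwaway variable next to a "working" one. The names next_id, sum_of_even_squares and $squared are made up for illustration.

```perl
use strict;
use warnings;

# The bare block gives $count a private scope: only next_id() can see it,
# which makes it a "static" variable for that sub.
{
    my $count = 0;
    sub next_id { return ++$count }
}

# $sum is the "working" variable for the whole loop; $squared is a
# throwaway that exists only inside the conditional block.
sub sum_of_even_squares {
    my @numbers = @_;
    my $sum = 0;
    for my $n (@numbers) {
        if ( $n % 2 == 0 ) {
            my $squared = $n * $n;
            $sum += $squared;
        }
    }
    return $sum;
}
```

Because $count lives in the bare block rather than at file scope, nothing else in the file can change it by accident, which seems to be the case the reply carves out as an exception.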
Re: A question of style - declaring lexicals by Ryszard (Priest) on Jun 03, 2004 at 08:04 UTC
I tend to declare my variables according to these guidelines. If the variable is used throughout an entire sub, I declare it at the top. If the variable is used exclusively within a loop, I declare it inside the loop. If a variable will be used only part way through the method/sub, I declare it on the line just before it is used. I tend to keep away from globals unless I'm being particularly lazy, but in that case the global is always declared at the top of the script. :-)
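The three guidelines above could look roughly like this in one sub. This is only a sketch under assumptions: count_long_words and its variables are hypothetical and do not come from Ryszard's code.

```perl
use strict;
use warnings;

# Hypothetical helper: count the words longer than $min_length.
sub count_long_words {
    my ( $min_length, @lines ) = @_;

    # Used throughout the sub, so declared at the top.
    my $total = 0;

    for my $line (@lines) {
        # Used only inside the loop, so declared inside it.
        my @words = split ' ', $line;

        # grep in scalar context returns the number of matches.
        $total += grep { length($_) > $min_length } @words;
    }

    # Used only from here on, so declared just before its first use.
    my $report = "long words: $total";
    return ( $total, $report );
}
```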
Re: A question of style - declaring lexicals by Abigail-II (Bishop) on Jun 03, 2004 at 13:38 UTC
When do you consider a variable to be used "thru out an entire sub", and "used part way thru' the method/sub"? Is a variable that is used for the first time on the second line used throughout, or part way?

Abigail
Re: A question of style - declaring lexicals by fireartist (Chaplain) on Jun 03, 2004 at 10:09 UTC
I've just spent some time making changes to an app that I wrote a year ago (always fun!). I discovered that within subroutines that spanned up to a couple of screens' depth, I had declared all variables used within the subroutine right at the top of it. I didn't like it at all, as I had to keep scrolling up to see if something had already been assigned to. I think now I tend to declare a var when it's first used, with the following exceptions: loop vars are declared at the start of the loop, and package vars in modules are declared right at the top of the file.
Re: A question of style - declaring lexicals by dragonchild (Archbishop) on Jun 03, 2004 at 11:48 UTC
Here's another disadvantage of style #1 - what happens if you have name reuse? For example, you have a @foo that's used throughout the file. Only, in one place, it means one thing and somewhere else it means something else. By declaring variables as close to first use as possible, you minimize a lot of risks.

And, frankly, I don't care what variables are available in a given scope. I only care what variables are being used in the snippet I'm looking at. If I have to look around without a good reason, I get pissed off.

------
We are the carpenters and bricklayers of the Information Age.

Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

I shouldn't have to say this, but any code, unless otherwise stated, is untested
Re: A question of style - declaring lexicals by Abigail-II (Bishop) on Jun 03, 2004 at 13:44 UTC
"Here's another disadvantage of style #1 - what happens if you have name reuse? For example, you have a @foo that's used throughout the file. Only, in one place, it means one thing and somewhere else it means something else."

Actually, I see that as an advantage of style 1. Say you have a @foo, and it's used at different places, but in the same lexical scope. If you declare it close to where it's first used, you'll get annoying warnings that you redeclare a variable (I hated that warning when it was first introduced - it was introduced before, or at the same time that `foreach my $var (LIST)` was allowed). If you declare all your variables at the top, you see you have a conflict the second time you try to declare @foo, and can pick another name.

Abigail
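The redeclaration warning mentioned here can be reproduced with a small sketch. The helper warnings_from is hypothetical; it compiles a snippet with eval and collects whatever warnings perl emits, assuming warnings are enabled as below.

```perl
use strict;
use warnings;

# Hypothetical helper: compile a snippet and return the warnings it produced.
sub warnings_from {
    my ($code) = @_;
    my @caught;
    local $SIG{__WARN__} = sub { push @caught, @_ };
    eval $code;
    die $@ if $@;
    return @caught;
}

my @same_scope = warnings_from(q{
    my @foo = (1, 2, 3);
    my @foo = ('a', 'b');    # second declaration in the same scope
    1;
});

my @nested_scope = warnings_from(q{
    my @foo = (1, 2, 3);
    { my @foo = ('a', 'b'); }    # new scope, so no redeclaration
    1;
});

print "same scope:   $_" for @same_scope;
print "nested scope: ", scalar(@nested_scope), " warning(s)\n";
```

The first snippet should trigger perl's '"my" variable @foo masks earlier declaration in same scope' warning, while the second, which puts the reused name in its own bare block, should not.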
Re^2: A question of style - declaring lexicals by Fletch (Bishop) on Jun 03, 2004 at 17:46 UTC
"Here's another disadvantage of style #1 - what happens if you have name reuse? For example, you have a @foo that's used throughout the file. Only, in one place, it means one thing and somewhere else it means something else."

Some would say that it means you should have had two variables.
Re: A question of style - declaring lexicals by tbone1 (Monsignor) on Jun 03, 2004 at 13:15 UTC
I've noticed recently that my code, in whatever language, is written such that a relatively inexperienced programmer can pick it up and have some idea of what's going on. I tend towards Readability in my code, probably because of my own experiences. In fact, if I ever start writing songs again, the first one will be a blues called "Other People's Software".

In my first 'real' job, I was working at a well-known American space agency which shall go nameless. I was handed some C code which had been written by someone long gone and which was untouched for months. There were no comments. All variables were 'a', 'b', 'c', and so on. All functions, subroutines, macros, etc, were 'a', 'b', 'c', and so on. All I knew about the code was:

1) It converted files from one data format to another.
2) There was no documentation available other than the code.
3) There was a problem under VMS (though not any of the Unices) when the file was over, say, 15MB.
4) There was no "#ifdef" anywhere in the code.

Given that experience, I tend to make my code obvious, unless I need to adjust for performance purposes. If someone else can't support it, I'll never get rid of it.

--
tbone1, YAPS (Yet Another Perl Schlub)
And remember, if he succeeds, so what. - Chick McGee
Re: A question of style - declaring lexicals by jdporter (Paladin) on Jun 03, 2004 at 14:16 UTC
I always declare lexicals at the point of first assignment, if possible. The issue almost becomes irrelevant if one keeps lexical scopes short. Of course, this is often not practical. If I do find myself with a block several pages long, I'll revert to style #1 for variables that are in fact used "throughout" that scope, even if first assignment isn't at the top of the scope. But if, in a pages-long block, there is a valid need for name re-use but not value longevity, I will often introduce bare blocks just for the purpose of constraining the scope of distinct variables, rather than recycle a single variable. E.g.

    { # long block begin
        .
        .
        {
            my $name = 'foo';
            .
            .
        }
        .
        .
        {
            my $name = 'bar';
            .
            .
        }
        .
        .
    }

Oftentimes, if the net effect of a section of code is to produce a value, a `do` block is useful for introducing an "artificial" scope. E.g. instead of a plain bare block like this,

    my @lines;
    {
        my $fh = new IO::File "< $infile";
        @lines = <$fh>;
    }
    for ( @lines ) {
        ...

I might use a `do` block, like this, which obviates an auxiliary variable:

    for ( do { my $fh = new IO::File "< $infile"; <$fh> } ) {
        ...

(Note that lexical scopes are also useful for constraining the effects of `local` operators.)
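A runnable sketch of the `do` block idea and of the closing note about `local`. It is not jdporter's code: it swaps IO::File for the built-in three-argument open and writes its own throwaway file with File::Temp so it can run anywhere. $content and the file contents are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small throwaway input file so the sketch is self-contained.
my ( $tmp_fh, $infile ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "alpha\nbeta\ngamma\n";
close $tmp_fh or die "close failed: $!";

# The do block yields the list of lines. $fh never leaks into the
# enclosing scope and is closed when the block exits.
my @lines = do {
    open my $fh, '<', $infile or die "open failed: $!";
    <$fh>;
};

# local $/ is undone at the end of the block, so only this one read
# slurps the whole file into a single string.
my $content = do {
    local $/;
    open my $fh, '<', $infile or die "open failed: $!";
    <$fh>;
};

printf "%d lines, %d characters\n", scalar @lines, length $content;
```

Both reads reuse the name $fh without conflict because each do block is its own lexical scope, and the input record separator is back to its previous value once the second block ends.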
Re: A question of style - declaring lexicals by hardburn (Abbot) on Jun 03, 2004 at 12:41 UTC
"you can see at first glance which variables are meant to be used within the scope you've just entered"

Why do you need that information? If that variable isn't going to be used for another two screen lengths worth of code, it seems like a silly piece of information to know. It may even be a hindrance, because it adds noise at the beginning of the sub.

Lexicals should be declared within the most narrow scope possible. That's why we have them in the first place.

----
send money to your kernel via the boot loader.. This and more wisdom available from Markov Hardburn.
Re: A question of style - declaring lexicals by jordanh (Chaplain) on Jun 03, 2004 at 20:18 UTC
I'm not sure that this is the most important consideration, but I've found that declaring variables as close to the scope as is required, what you call style #1, allows you to pull things out into subs more easily. This is not only for refactoring, but also for simplifying code by reducing the indentation level. The use of good subroutine names and comments can make things more clear. I don't know, but I always feel that subroutine level comments are clearer. I tend to use fairly descriptive variable names, making accidental reuse rare. It is a danger though, I agree.
Re^2: A question of style - declaring lexicals by jordanh (Chaplain) on Jun 07, 2004 at 15:00 UTC
"I'm not sure that this is the most important consideration, but I've found that declaring variables as close to the scope as is required, what you call style #1, allows you to pull things out into subs more easily. This is not only for refactoring, but also for simplifying code by reducing the indentation level. The use of good subroutine names and comments can make things more clear. I don't know, but I always feel that subroutine level comments are clearer."

Update: After reading what tilly had to say, I realized that I'm actually advocating style #2, not #1. I hadn't read what you had to say closely enough!

Update II: How did this happen? I thought I was updating the parent, but I created a new node... oh well.
Re: A question of style - declaring lexicals by baruch (Beadle) on Jun 04, 2004 at 01:38 UTC
Mostly I declare them at the top, a habit I developed from C. If I'm only going to be using them in a relatively small place, then I'll declare them close to the code where I use them. For loops I declare them right at the loop unless it's a really short bit of code. I don't have enough Perl experience yet to know whether my habits really work for Perl.

בּרוּך
Re: A question of style - declaring lexicals by Theo (Priest) on Jun 03, 2004 at 14:34 UTC
One could (sort of) combine the styles by making a list of all the vars as comments at the top, but actually declare them when used.

-Theo-
(so many nodes and so little time ... )
Re^2: A question of style - declaring lexicals by EdwardG (Vicar) on Jun 03, 2004 at 14:55 UTC
By doing this you would introduce an undesirable maintenance overhead, similar to the problem with flower box comments.
Re: A question of style - declaring lexicals by tilly (Archbishop) on Jun 04, 2004 at 23:33 UTC
I strongly lean towards what you call style 2 for the reasons that I explain at RE (tilly) 3: redeclaring variables with 'my'.

Incidentally, you may wish to modify your code so that it runs. You declare a variable named $sessions which is initialized with a hash ref, and then you access the hash %sessions. Since you used strict.pm, it won't let you run that without fixing the bug.
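The bug tilly describes can be shown with a short sketch. The original buggy lines are no longer in the post, so the snippet inside eval is a reconstruction under that assumption, and the key 'example' and the address are made up.

```perl
use strict;
use warnings;

# The buggy form: $sessions is a hash ref, but $sessions{...} indexes
# the undeclared hash %sessions, so strict refuses to compile it.
my $ok = eval q{
    my $sessions = {};
    $sessions{example} = { peerhost => '127.0.0.1' };
    1;
};
my $error = $@;
print "strict says: $error" unless $ok;

# The fixed form dereferences the hash ref with ->.
my $sessions = {};
$sessions->{example} = { peerhost => '127.0.0.1' };
print "peerhost: $sessions->{example}{peerhost}\n";
```

The eval is only there so the compile error can be captured and printed; in a normal script the same mistake stops the program before it runs.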