Coding style: truth of variable name

perlancar has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Coding style: truth of variable name by GrandFather (Saint) on Apr 19, 2020 at 00:44 UTC
In general the cost in any sense of introducing a new variable is trivial or nothing, so go wild. Using appropriate variable names is a large part of good coding technique. Having a variable change its stripes easily leads to hard-to-understand code. The issue is added "cognitive load": the reader needs to remember more stuff to understand the code.

Another thing to think about is how changing the meaning of a variable affects debugging. If you introduce a new variable, you have both versions available for inspection in a debugger at the same time, so it can be much easier to see where unexpected results were introduced and why. For this reason I often break down complex expressions into multiple statements with appropriately named variables holding intermediate results. It makes writing, debugging and maintaining the code easier.

Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
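A minimal sketch of the point about intermediate results, using made-up data. The order lines, the tax rate and every variable name here are assumptions for illustration, not anything from the original question. The dense form and the broken-down form compute the same total, but only the second leaves `$subtotal` and `$tax` around to inspect in a debugger.

```perl
use strict;
use warnings;
use List::Util qw(sum0);

# Hypothetical data: each order line is [quantity, unit price].
my @order_lines = ([2, 9.50], [1, 20.00], [3, 1.25]);
my $tax_rate    = 0.20;    # assumed rate, for illustration only

# Dense version: one expression, no intermediate values to look at.
my $total_dense = sum0(map { $_->[0] * $_->[1] } @order_lines) * (1 + $tax_rate);

# Broken-down version: each intermediate result has its own name.
my @line_totals = map { $_->[0] * $_->[1] } @order_lines;
my $subtotal    = sum0(@line_totals);
my $tax         = $subtotal * $tax_rate;
my $total       = $subtotal + $tax;

printf "subtotal=%.2f tax=%.2f total=%.2f\n", $subtotal, $tax, $total;
```

If the total ever looks wrong, a breakpoint after the last assignment shows which stage went astray.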
Re^2: Coding style: truth of variable name by BillKSmith (Monsignor) on Apr 19, 2020 at 03:20 UTC
I was about to post exactly the opposite advice, but your post changed my mind, at least conditionally. The key concept is "appropriate". Extra variable names only reduce the "cognitive load" if both variables have appropriate names and scopes. I once had a customer who insisted that we always use the extra names. Any advantage was lost because he also insisted that all variables be named 'parmxxx', where xxx was a serial number he assigned.

I often use your technique of breaking up complex statements on "Schwartzian Transforms" (refer to How do I sort an array by (anything)?). I almost never code them right the first time. I usually rewrite them in the idiomatic form after I am satisfied that they are correct.

Bill
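For readers who have not met the technique, here is a rough sketch of a Schwartzian Transform written first in named steps and then in the idiomatic chained form. The word list and the sort key (string length, ties broken alphabetically) are assumptions chosen only to keep the example small.

```perl
use strict;
use warnings;

my @words = qw(pear fig banana kiwi);    # hypothetical input

# Step by step: every stage lands in a named variable that can be dumped.
my @decorated    = map { [ $_, length $_ ] } @words;         # [word, key]
my @sorted_pairs = sort { $a->[1] <=> $b->[1] or $a->[0] cmp $b->[0] } @decorated;
my @by_length    = map { $_->[0] } @sorted_pairs;            # strip the key

# Idiomatic form: the same three stages, read from the bottom up.
my @by_length_idiomatic =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] or $a->[0] cmp $b->[0] }
    map  { [ $_, length $_ ] }
    @words;

print "@by_length\n";
print "@by_length_idiomatic\n";
```

Once the stepwise version gives the right answer, collapsing it into the chained form is a mechanical rewrite.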
Re^2: Coding style: truth of variable name by perlancar (Hermit) on Apr 19, 2020 at 01:19 UTC
Thanks for the debugging perspective; I hadn't considered it.
Re: Coding style: truth of variable name by choroba (Cardinal) on Apr 19, 2020 at 01:28 UTC
Sometimes, you can solve the problem by avoiding the situation completely:

    for my $dir (glob '*/') {

I'm not sure how it works on MSWin, so maybe more portably:

    for my $dir (grep -d, glob '*') {

The second case is different. I'd probably declare two subroutines:

    sub load_from_path {
        my ($module_path) = @_;
        # etc.
    }

    sub load_from_name {
        my ($module_name) = @_;
        my $module_path = ...;
        load_from_path($module_path);
    }

I mean, it's not the responsibility of the "load" subroutine to convert the module name to its path. It's the caller's responsibility to know what they have and what they want to do with it.

`map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]`
Re^2: Coding style: truth of variable name by perlancar (Hermit) on Apr 19, 2020 at 02:00 UTC
True in the above cases. Bad examples then :) How about the common case where a function needs to validate its input? Do you assign the pre-validated content to the variable first, or do you assign it to something else first and then to the final variable after it's validated?
Re^3: Coding style: truth of variable name by roboticus (Chancellor) on Apr 19, 2020 at 12:16 UTC
perlancar: In the case of input validation, I usually let the variable name express the intent, then validate the value into submission before doing the work:

    sub frob_file {
        my $frobbable_filename = shift;
        die "Error" if !-e $frobbable_filename or -d $frobbable_filename;
        die "Nope!" if not_frobbable($frobbable_filename);
        # ... frob file ...
    }

In other cases, when the variable isn't so clear but it will be clear shortly, I'll often use $t or $tmp for the placeholder. Then I'll give it a name or pass the data off to a better-named thing:

    sub zap_the_thing {
        my $t = shift;
        my @files_to_zap;
        if (-d $t) {
            zap_dir($t);
        }
        else {
            push @files_to_zap, $t;
        }
        # ... yadda ...
        zap_files(@files_to_zap);
    }

I don't think $t or $tmp is a great name, but finding good names is hard. I use it so that I can look at it and dispose of it ASAP.

Frequently I find I can't name something well the first time I encounter or use it. So I come up with my best guess of the name and use it. Then, when it feels like the name is wrong and I find it doesn't fit, I do one of two things. If I have a better name in mind, I'll rename it. Sometimes, though, I can't think of a better name, so I instead give it a prefix of 'z' to "call it out". That way, when I revisit the code, I know I need a better name.

Not perfect, not even good, but it usually gets me by. Yet I still wind up with stuff like:

    # ?NEED GOOD NAME?
    # If a group (Row, Col, Blk) has only one slot for a particular value,
    # solve that cell.
    sub solve_v_in_only_one_cell_in_R_C_B {
        my ($self, $GEN) = @_;

an atrocity which came directly off my screen from last night's session.

...roboticus

When your only tool is a hammer, all problems look like your thumb.
Re^3: Coding style: truth of variable name by davies (Prior) on Apr 19, 2020 at 18:03 UTC
I read unvalidated input into one variable and then put it into a validated variable when I call the validation routine. The variables differ only in their prefix. I learned this from https://www.joelonsoftware.com/2005/05/11/making-wrong-code-look-wrong/, which is still worth reading. Here's a pseudocode example:

    my $inv_data = get_input();
    my $val_data = val_from_inv($inv_data);

Regards,
John Davies
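A runnable sketch of the same prefix convention, under stated assumptions: the validation rule (a positive integer after trimming whitespace) is invented for the example, `val_from_inv` is only a guess at what such a routine might look like, and a literal string stands in for `get_input()`.

```perl
use strict;
use warnings;

# Hypothetical validator: returns the cleaned value or dies.
# The "positive integer" rule is an assumption for this sketch.
sub val_from_inv {
    my ($inv_data) = @_;
    die "Input is undefined\n" unless defined $inv_data;
    (my $val_data = $inv_data) =~ s/^\s+|\s+$//g;    # trim whitespace
    die "Input must be a positive integer\n"
        unless $val_data =~ /\A[1-9][0-9]*\z/;
    return $val_data;
}

my $inv_count = " 42 ";                        # stands in for get_input()
my $val_count = val_from_inv($inv_count);      # only val_ names are trusted

print "validated: $val_count\n";
```

The payoff is that any line using an `$inv_` variable outside the validator looks wrong at a glance.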
Re: Coding style: truth of variable name by dsheroh (Monsignor) on Apr 20, 2020 at 08:01 UTC
1A/2A for me, pretty much every time. Name the variable what it's supposed to be, then immediately check and skip/abort/throw an exception/halt and catch fire as appropriate if it's actually something else (1A), or reformat it if it's the right thing but not expressed in quite the way you want (2A).

1B and 1C feel like meaningless expansion of code to me. I prefer that subs are short enough to look at the whole thing at once, and "meaningless-but-not-blank" lines of code that don't do anything more than "copy data from an unvalidated-data-name variable into a validated-data-name variable" reduce the number of meaningful lines of code that can be in view, and they don't even give you the visual structure that blank whitespace lines provide.

2B just gives me the heebie-jeebies. Variable names should be meaningful and `$arg` is the opposite of meaningful. Yes, yes, it is an argument to the sub, but that's the only information the name `$arg` tells you. I want a name that tells me what the arg is (or at least what it should be). If the only information you want to convey about the value is that it's an argument, you may as well just skip the `shift` and refer to it as `$_[0]`, or use a bare `shift` and access it as `$_`. (Yes, IMO `$arg` really is that utterly meaningless as a name.)
Re: Coding style: truth of variable name by ikegami (Patriarch) on Apr 19, 2020 at 22:28 UTC
If you have to simply copy the value from one variable to another just because the value has changed, you're probably going too far. (Your first example.) If, however, some transformation was applied, you might as well use an accurate variable name for the transformed value. (Your second example.)

One must often deal with file names, file paths and absolute paths in the same piece of code. I use the following convention to distinguish them:

`$fn`, `$dir_fn` or `$foo_fn`: A file name (no path)
`$qfn`, `$dir_qfn` or `$foo_qfn`: A qualified file name (a relative or absolute path)
`$fqfn`, `$dir_fqfn` or `$foo_fqfn`: A fully-qualified file name (an absolute path)

As such, you'll find me doing

    while (defined( my $fn = readdir($dh) )) {
        my $qfn = "$dir_qfn/$fn";
        ...
    }

No point in using the same variable. The same goes for your second example.

    sub load {
        my ($pkg) = @_;
        my $qfn = $pkg =~ s{::}{/}gr . '.pm';
        ...
    }

Why would you use the same var? To save memory? Perl might not be the best choice of language if you think that's important.

In your first example, you call the variable "dir", but you would already have a variable by that name in practice.

    for my $qfn (glob("\Q$dir_qfn\E/*")) {
        stat($qfn)
            or do {
                warn("Skipping \"$qfn\": Can't stat: $!\n");
                next;
            };
        next if !-d _;
        ...
    }

You could also use the following:

    for my $subdir_qfn (glob("\Q$dir_qfn\E/*")) {
        stat($subdir_qfn)
            or do {
                warn("Skipping \"$subdir_qfn\": Can't stat: $!\n");
                next;
            };
        next if !-d _;
        ...
    }

Sure, it might not be a subdir, but you could think of it as a subdir candidate. Adding another variable to the mix wouldn't help.

If someone wanted to insist on not putting the value in `$subdir_qfn` unless the name matches perfectly, one could use the following:

    for my $subdir_qfn (
        grep {
            if (stat($_)) {
                -d _
            } else {
                warn("Skipping \"$_\": Can't stat: $!\n");
                0
            }
        }
        glob("\Q$dir_qfn\E/*")
    ) {
        ...
    }
Re: Coding style: truth of variable name by jcb (Parson) on Apr 19, 2020 at 04:01 UTC
My general rule is to reuse a variable when the old value is no longer needed and the variable name also describes the new value, so I would prefer `2A`, but the comment explaining that `$module` is to be a canonicalized module name is very important. Creating additional lexicals is cheap, but not free in Perl. (Additional locals are essentially free in most cases in C since modern compilers allocate the entire stack frame at once.)

My fellow monk choroba made a good point about filtering input when you can, but I would also prefer `1A` because that type of filtering at the beginning of a loop's block is idiomatic in Perl. Concision in this case is also useful in that the more concise code requires fewer VM steps because it avoids an extra lexical. Filtering the input is the best option, since grep iterates in C and reduces the number of iterations perl's VM must execute. This is a trivial concern in most cases, but can be serious in an inner loop.

Lastly, I think you meant "`next unless -d $dir`" in `1A`, `1B`, and `1C`: "`next if -d $dir`" skips the iteration if `$dir` does name a directory and would be very confusing in all three cases.

Edited by jcb: Add missing caveat; thanks to GrandFather for pointing out my mistake.
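To make the two filtering styles concrete, here is a small sketch that builds a throwaway directory and walks it both ways. The directory layout and file names are invented for the example. It calls `bsd_glob` from the core File::Glob module instead of plain `glob` only so that a temporary path containing spaces would not be split; that substitution is an assumption of this sketch, not something from the thread.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Glob qw(bsd_glob);

# Hypothetical layout: two directories and one plain file.
my $base = tempdir(CLEANUP => 1);
mkdir "$base/photos" or die "mkdir photos: $!";
mkdir "$base/thumbs" or die "mkdir thumbs: $!";
open my $fh, '>', "$base/notes.txt" or die "open notes.txt: $!";
close $fh;

# Style 1A: filter at the top of the loop body.
my @seen_in_loop;
for my $dir (bsd_glob("$base/*")) {
    next unless -d $dir;          # skip anything that is not a directory
    push @seen_in_loop, $dir;
}

# Filtered input: $dir only ever holds a directory name.
my @seen_filtered;
for my $dir (grep { -d } bsd_glob("$base/*")) {
    push @seen_filtered, $dir;
}

print scalar(@seen_in_loop), " directories either way\n";
```

Both loops visit the same entries; the difference is whether the loop variable can briefly hold something that is not a directory.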
Re^2: Coding style: truth of variable name by GrandFather (Saint) on Apr 19, 2020 at 05:37 UTC
"My general rule is to reuse a variable when the old value is no longer needed"

Taken at face value that is terrible advice. A large part of understanding code is understanding the role of variables at any particular point. That is why choosing good variable names is important. If the role of the variable changes through the code, then understanding the code becomes much harder. So maybe that wasn't what you meant by that statement?

Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
Re^3: Coding style: truth of variable name by jcb (Parson) on Apr 20, 2020 at 03:11 UTC
I had forgotten an important detail. That should be "when the old value is no longer needed and the variable name also describes the new value". This balancing act is to avoid proliferating variable names like `$file`, `$file2`, `$realfile`, and similar problems that I have seen in existing code, including the questioner's example `1B`.

If the role of a variable can change, then (in my view, in Perl) the variable is defined in a scope that is too wide for the code as written. I often reuse the same name for another (similar) purpose later in a `sub` or script. For example, if iterating over two different sets of files, both `foreach` loops are likely to use `foreach my $filename ...`, but the variables are separate lexicals and `$filename` does not exist outside of those loops.

Thanks for catching that. The idea that a variable name must describe its contents is something that I tend to assume goes without saying and that the questioner here seems to also tacitly understand, but it is an important detail that a new programmer might not yet know.
Re^2: Coding style: truth of variable name by perlancar (Hermit) on Apr 19, 2020 at 07:22 UTC
Ah yes, thanks for the correction about `next unless`.
Re: Coding style: truth of variable name by leszekdubiel (Scribe) on Apr 19, 2020 at 22:31 UTC
    for my $dir (glob "*") {
        # for a brief moment, $dir might not hold a directory's name
        # ^^^^ that's OK -- you just check if $dir is OK for you:
        next unless -d $dir;
        ...
    }

    for my $dir (glob "*") {
        -d $dir or next;    # put "-d" first, because it is more important than "next"
        $dir =~ /photos|thumbs/ or next;
        ...
        ... bla bla $dir ...
        ... bla bla
        for (...) {
            ... bla bla $_ and $_ ...
            ... and $dir ...
        }
        ...
        ... $dir ...
        ...    # long loop body -- it is important to use "$dir" variable
        ...
    }

Short processing:

    for (glob "*") {
        -d or next;    # "or next" -- the fallback is less important than "-d"
        /photos|thumbs/ or next;
        do_something($_);
    }

Better written like this, with data flowing from bottom to top:

    do_something($_) for                  # finally feed good dirs to "do_something"
        sort                              # third step
        grep { -d && /photos|thumbs/ }    # second step
        glob "*";                         # first step
Re: Coding style: truth of variable name by Anonymous Monk on Apr 19, 2020 at 13:46 UTC
Try to write short routines with locally-scoped variables that are named to clearly illustrate their meaning, not their data type. The very worst thing that can ever happen is that I am trying to understand the meaning of your code ... you got mashed by a bread truck so I can't ask you ... and I get it wrong. I overlook something. I misinterpret it. Or even, your variable names suggest something that is not or is no longer true. "K. I. S. S." I can only read your code; I cannot read your mind. But you can clearly suggest to me what you were thinking at the time.
Re: Coding style: truth of variable name (subroutine length) by Anonymous Monk on Apr 19, 2020 at 02:54 UTC
How long are your subroutines? Variable names are much less important than subroutine names.

file folder

    for my $file ... grep glob whatever
    for my $path ... grep glob whatever
    for my $anal ... grep glob whatever
Re^2: Coding style: truth of variable name (subroutine length) by perlancar (Hermit) on Apr 19, 2020 at 07:27 UTC
My subroutines can range from just a few lines to over several hundred lines long. Labelled blocks sometimes help in making long subroutines clearer, as well as creating lexical scope to isolate the effect of variables.

    sub do_some_task {
        my ($arg1, $arg2, $arg3) = @_;

        SUBTASK1: {
            my $some_var = ...
            ...
        }
        ...
        SUBTASK2: {
            ...
        }
        ...
    }

I do have to question your claim. What makes variable names much less important than subroutine names? Variables are referred to much more often.
Re^3: Coding style: truth of variable name (subroutine length) by Anonymous Monk on Apr 19, 2020 at 10:01 UTC
"My subroutines can range from just a few lines to over several hundred lines long. I do have to question your claim. What makes variable names much less important than subroutine names? Variables are referred to much more often."

Isn't it obvious? Subroutine length, obviously :)

These are all identical for a screen full (40 to 100 lines):

dir dir0 entry arg module modulepm file path ana

entry arg anal dir0 are the shit versions, from least to most. But only because I don't actually think that way. Even they don't increase cognitive load... Var names only need to be close enough.

"hundred lines long ... arg1 arg2 arg3 subtask1 subtask2"

never chop it down to skimmable code @args Ebony() Ivory() Harmony() age of peter, sum of bob


	PerlMonks