Just a short "thought for the day".
Yesterday, I found that I had a subroutine that I couldn't figure out how to name and I was forced to add comments to explain it. For the most part, I believe that well chosen names can eliminate much of the need comments, thus leaving comments only for those "you'll never figure it out on your own" moments. For example, with the following line, you can probably guess what is going on.
if ( tokens_are_the_same($token, $CONFIG{stack}[0]) ) {
If I had made the mistake of trying to do the entire comparison on the spot, it would have been terribly clumsy and confusing. To avoid such things, I often take confusing code and move it into its own function, even if it's not reused.
# if we have an expense account and the amount is too large
if (4223 == $acct and $amount >= 100) {
Even the above simple example can do without a comment, when properly handled. We move the test into its own function and dispense with documentation that frequently wouldn't be updated anyway.
if ( excessive_expense_account($acct,$amount) ) {
One of the side benefits is that maintenance programmers are considerably less likely to change the sense of the test without changing the function name. If we just left the explanation as a comment, it may not have been changed.
This all leads to something interesting. If you can't figure out a good name for your function, what do you do? I sometimes struggle with this and I'll sit back in my chair and I'll restate what the function does. Usually the name of the function will just magically occur at that point. Yesterday, it didn't. I had a function that, not matter how many times I tried, I just couldn't figure out what to call it. It shouldn't be too hard from the following snippet to figure out which function I am talking about.
while ( my $token = $parser->get_token ) {
if ( tokens_are_the_same($token, $CONFIG{stack}[0]) ) {
$parser->unget_token( $token );
my ($html_chunk,$matched) = stack_match_or_current( $parse
+r );
$html .= $html_chunk;
$REPLACEMENTS{$file} += $matched;
}
else {
$html .= $token->as_is;
}
}
What the heck does &stack_match_or_current do? To figure it out, you need to read the comments that I was adding to the top of that function. Further, this function was returning two values: some HTML and a value for the number of matches. It's rather clumsy and the poor maintenance programmer isn't going to figure it out very quickly. But here's what I have noticed, time and time again: if you can't figure out a good function name, look for a design problem. It's akin to a school teacher telling a student "if you can't write it down, you don't know what it means". In this case, I didn't know what my design problem was, but I knew it was probably there since I couldn't figure out the name of the function. My first two runs of the code exposed it.
By the time I had fixed the problem, the &stack_match_or_current function was merely returning a boolean value: does my stack match my current parser position or not? Once you get into the code, it makes perfect sense and it fixed my design issue. With the correct design, a good name of the function was obvious, &stack_matched, and the code was much cleaner.
while ( my $token = $parser->get_token ) {
if( stack_matched($parser) ) {
$html .= $CONFIG{replace};
$REPLACEMENTS{$file}++;
}
else {
$html .= $token->as_is;
}
}
Cheers,
Ovid
New address of my CGI Course.
Silence is Evil (feel free to copy and distribute widely - note copyright text)
Re: $bad_names eq $bad_design
by dpuu (Chaplain) on Dec 20, 2002 at 20:33 UTC
|
I agree with the premise: but sometimes the problem isn't actually the design: its that you're trying to abstract something too far.
Consider your example:
# if we have an expense account and the amount is too large
if (4223 == $acct and $amount >= 100) {
which became
if ( excessive_expense_account($acct,$amount) ) {
If I don't understand the system, then the second form is probably no better than the first. Although the named function is explicit, I still need to look at the definition because I don't know what an excessive_expense_account is. A better name might help (e.g. "expense_account_is_over_limit(...)"): but, as a maintenance programmer, I'd prefer to see:
if ( $acct->is_expense_account
&& $amount > EXPENSE_LIMIT) {
This contains meaningful names, which helps me to read the code: plus it has detail, which is needed for me to truely grok it. When we abstract out the detail, even to a meaningful name, then we circumvent the human brain's chunking mechanism. We only have a meaningful name: not the detail that enables the brain to see the patterns.
The summary: don't abstract too far. If you have difficulty finding a name, then by all means use this as a flag to re-examine the design. But if the design look good, then consider leaving it alone: don't force details into function names. Polysynthetic identifier names are all very well, but grammar exists for a reason: to aid comprehension.
--Dave | [reply] [d/l] [select] |
|
I agree with Ovid's premise. Though it is not always true.
Sometimes the writer's energy for naming just failed or the
naming was just boggled somehow. Additionally Ovid's
premise is just not true, if I have well designed code
and randomly change the identifiers I would have obfuscated,
well-designed code. I agree because The physical medium in
which we construct programs is so facile that it behooves us
to construct programs whose models reconstruct in our and other minds easily.
Your point is also sublime. Thanks.
All of this side steps the issue of domain knowledge.
What is an EXPENSE_LIMIT? Is this terminology
which is standard to accounting, to accounting programs, an
unknown or forgotten
computer science snippet or
a construct named idiocyncraticly to this program/system?
<p>
This is also an issue to consider in the pursuit of clarity.
Update: Struck out text, respelled computor.
| [reply] [d/l] [select] |
|
Yes! It side-steps domain knowledge. Lets expand on this point.
Consider this code:
if ($a->is_xxx_yyy && $b < PPP_QQQ) { ... }
An experienced Perl coder would instantly, subconciously, grok the control-flow within it. Part of this understanding relies on convention, so could be subverted, but just look at the amount of information/expectation we have:
- $a is an object (the -> operator)
- is_xxx_yyy is a predicate (starts with word "is")
- is_xxx_yyy has no side effects (because its a predicate)
- If is_xxx_yyy is false, then $b is irrelevant (&& short-circuits)
- PPP_QQQ is a constant (upper-case)
- PPP_QQQ is numeric (the numeric comparison)
- $b is numeric (ditto)
- The rhs has a single boundary (the inequality).
So we'd know that there are 3 paths through the control logic of the if statement, and be able to see how they interact. Bad coders could subvert my expectations (e.g. they could add side-effects to the predicate): but "hard cases make bad law".
Compare this with an abstracted function call:
if (xxx_yyy_ppp_qqq($a,$b))
The perl-brain gleams no information from this. Understanding it relies entirely on the quality of the name of the identifier. And we all know how hard it is to create a really good identifier name.
--Dave | [reply] [d/l] [select] |
Re: $bad_names eq $bad_design
by Aristotle (Chancellor) on Dec 20, 2002 at 20:09 UTC
|
| [reply] |
Re: $bad_names eq $bad_design
by sauoq (Abbot) on Dec 21, 2002 at 09:55 UTC
|
if you can't figure out a good function name, look for a design problem.
That seems like sound advice. (Heck, any reason to look for a design problem is a good one.) Even though it makes a good rule of thumb, like most advice, it could be hazardous if followed in every instance. Sometimes an awkward function is the right one. Code is not natural language afterall. Consider that, if you allow your spoken language to have too great an influence on how you code, you may be artificially limiting your design choices right from the start.
I think it is best when code approximates a written description. It needn't actually read like one. Frankly, your use of "_are_the_" in a subroutine name makes me cringe. Why? Because words of that nature could crop up in lots of subroutine names and when that happens your eye has to work harder to differentiate them.
Instead of tokens_are_the_same(), I would be inclined to use something like tokens_same(). I might call it tokens_equal() or tokens_equivalent() depending on what it actually did. (I think 'same' indicates that two references refer to the same memory, 'equal' indicates that the values are the same, and 'equivalent' indicates that two values have the same type. For instance, two STRING tokens might be equivalent but not necessarily equal or the same.) In general, I think good subroutine names are as short and dense with meaning as possible without being cryptic.
-sauoq
"My two cents aren't worth a dime.";
| [reply] [d/l] [select] |
|
Here, here!
Whilst I want my code to be self-documenting, extraneous "little" words in sub and variable names do "little" to enhance this IMO too.
That's probably the strongest argument for properly executed overloading.
if ( $current_token == $token ) {...} where $current_token is an object of a class Token that overloads the == operator and does the right thing would be so much more intuative I think.
To my way of thinking, overloading is more useful than inheritance for giving clarity, especially when the latter is taken too far. Java's propensity to use inheritance rather than attributes combined with the need for (which this practice in part generates) long, unweildy class and method names leading to stuff like
Reader in = BufferedReader( UnbufferedReader( Stream( filename ).getStreamHandle()) );
drives me nuts:^) (That's from memory, inaccurate and an exaggeration, but it makes my point).
It's also part of the reasoning behind my desire for lvalue subs (with appropriate mechanisms for preinspection and rejection of assignment attempts), for use as assessor/mutator methods. Having to write
my $point = new Point();
$point->set_X( 3 );
$point->set_Y( 4 );
....
my $start_point = new Point;
$start_point = $point.copy();
$point->set_X( $point->get_X() + 1 )
while $image->get_color_at($point) == RED;
my $end_point = new Point();
$end_point = $point->copy();
$image->draw_line_from_to( $start_point, $end_point, BLACK );
When it could be
my $point = new Point(3, 4);
...
my $start_point = $point;
$point->X++ while $image->color_at($point) == RED; # or even $point->
+X = point->X + 1
my $end_point = $point;
$image->line( $start_point, $end_point, BLACK );
seems unnecessarially unweildy to me.
Examine what is said, not who speaks. | [reply] [d/l] [select] |
|
I'm not sure I'd go with the operator overloading here. It does mean you have to keep a firm grasp on all the interfaces and remember what a variable is supposed to contain. It also makes spotting mistakes harder - if I have the wrong object in $current_token, $current_token->equals($token) may well have Perl complaining it can't find that method, while $current_token == $token will (more or less) quietly do the wrong thing.
I like overloading, but I believe it is very, very dangerous if used too readily. It's great when you're creating a class that shares characteristics with the basic data types, like writing a Math::Complex or something, where $a * $b ends up meaning exactly what I expect. Also, overloading stringification is very handy. I would be really hesitant to use it to represent more arbitrary semantics.
As for your example, I'd probably want something like this:
my $point = new Point(42, 42);
my $start_point = $point->copy;
$point->add_x(1) while $image->element_at($point)->color == RED;
my $end_point = $point->copy;
$image->draw_line($start_point, $end_point, BLACK);
Makeshifts last the longest. | [reply] [d/l] |
|
|
Re: $bad_names eq $bad_design
by pdcawley (Hermit) on Dec 21, 2002 at 06:39 UTC
|
Me? I worry about those global variables like $file, $html, %CONFIG and %REPLACEMENTS and find myself wondering if it wouldn't be better to make them attributes of the parser object:
sub ParserClass::do_global_replace {
my $self = shift;
while (my $token = $self->get_token) {
$self->maybe_do_replacement($token);
}
return $self
}
sub ParserClass::maybe_do_replacement {
my $self = shift;
my $token = shift;
if ($self->stack_matched) {
$self->append_replacement_html;
}
else {
$self->append_html($token);
}
return $self
}
sub ParserClass::append_replacement_html {
my $self = shift;
$self->append_html($self->replacement_html);
$self->increment_replacement_count;
}
(Admittedly, some of the utility functions are left as an exercise to the interested reader, I'm assuming that you have a parser per file and I worry that 'stack_matched' is still a bad name for the function, but I can't think of a better one at present. With these changes in place, your while loop becomes:
$html = $parser->do_global_replace->html;
I still worry mind. Now I'm worried that the parser is doing too much... | [reply] [d/l] [select] |
Re: $bad_names eq $bad_design
by seattlejohn (Deacon) on Dec 21, 2002 at 04:24 UTC
|
| [reply] |
Re: $bad_names eq $bad_design
by adrianh (Chancellor) on Dec 21, 2002 at 14:59 UTC
|
If you can't figure out a good name for your function, what do you do? I sometimes struggle with this and I'll sit back in my chair and I'll restate what the function does. Usually the name of the function will just magically occur at that
I agree completely. Having trouble naming a class, function, method, variable, etc. is a classic pointer to a design problem.
Have a look at Using Good Naming To Detect Bad Code and Vague Identifier for some related issues to do with method naming and class structure.
| [reply] |
|
|