Re (tilly) 1: When do you function?
by tilly (Archbishop) on Dec 27, 2000 at 09:16 UTC
Hopefully most of the time. :-)
Seriously, I find that my average function is about 10 lines.
Some are shorter - a lot shorter. A few are much longer.
But that seems to be the average for me.
Here is a list of reasons from chapter 5 of Code Complete
to think about. I won't copy explanations, just the
reasons:
- Reducing complexity
- Avoiding duplicate code
- Limiting effects of changes
- Hiding sequences
- Improving performance (optimize later)
- Making central points of control
- Hiding data structures
- Hiding global data
- Hiding pointer operations
- Promoting code reuse
- Planning for a family of programs
- Making a section of code reusable
- Improving portability
- Isolating complex operations
- Isolating use of non-standard language functions
- Simplifying complicated boolean tests
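As a quick illustration of the last point, a complicated boolean test reads much better once it has a name (a minimal sketch; the condition and names are invented for the example):

```perl
#!/usr/bin/perl
use strict;

# Before: the intent of the test is buried in the details.
# if ($age >= 18 && $age <= 65 && !$retired && $status eq 'active') { ... }

# After: the condition gets a name that states the intent.
sub is_eligible_worker {
    my ($age, $retired, $status) = @_;
    return $age >= 18 && $age <= 65 && !$retired && $status eq 'active';
}

if (is_eligible_worker(42, 0, 'active')) {
    print "eligible\n";
}
```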
I have found that all of these benefits still hold in
Perl. Well, performance usually takes a bit of a hit, but you
are left in a position to optimize where it counts later.
And you shouldn't have non-standard language functions.
But in practice I have noticed portability issues from time
to time.
So while that list doesn't hold perfectly for Perl, it is
still generally on target.
Note that in particular comments explaining what you
intended at one point are not a good substitute for clear
code. Should you change the code later, the comments will
often remain to confuse you. Also deeply nested loops
may not take many lines, but they make it much harder to
separate the forest from the trees.
When I was mid-BS (ahem) in CS, I used to write all my
programs "top down": just create a bunch of function/procedure
names that'd handle the problem and pencil in the loops
(on the backs of output pages, with a couple of sheets
for 'global' vars and a page for function prototypes, details
to be filled in later). As a mid-level programmer now,
performance is so rarely a concern (what I write runs
less often (e.g. daily) than it's worth squeezing (or 'bumming')
any extra speed out of it) that the value of a clean loop
calling meaningfully named functions outweighs any loss to the
internal context switches etc. Keeping the flow clean and
localizing the gritty details makes life much easier, and
makes the final program that much more maintainable: it's
way easier to rewrite &Get_Image_Path to handle the
addition of a separate image server box than to go back
and find and handle all the spots that were calling
"Get_Image_Path($case_number, $document_number)".
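For instance, once the lookup lives behind one sub, adding a separate image server touches exactly one spot (a hypothetical sketch; the server name and path layout are made up):

```perl
sub Get_Image_Path {
    my ($case_number, $document_number) = @_;
    # Originally this just built a local filesystem path:
    #   return "/images/$case_number/$document_number.tif";
    # After the separate image server box was added, only this
    # one sub needed to change - every caller stayed the same:
    return "http://imgserver.example.com/images/$case_number/$document_number.tif";
}
```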
a
As in a's comment, I also tend to write top down using
wishful
programming. (Where wishful programming is using names
of functions I haven't written yet.) This has the benefit
of not having to think about everything at once, and
if you create stub functions it will actually run and you
can slowly build it up. Ah ha, starting to use some XP
techniques! (Except I learnt to do this before XP was
around...)
When it comes to breaking functions up,
if I have a function which is trying to do several different
things, it is usually time to break it up. Likewise, a
function which is more than a couple of screens long is
about due for a break up. Although the screens one is very
relative... Especially when I'm writing CGI scripts which
are generating forms, they can get rather lengthy.
Over the last couple of weeks I've been writing an interface
for managing Spong,
using Perl OO. In an object, if I realise that I need to
use
some code again (for example: sucking details out of
the database about a host), it gets broken out to a new
method in the object.
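That kind of break-out might look like this (a sketch only; the table and column names are invented, and the object is assumed to carry a DBI handle in $self->{dbh}):

```perl
sub get_host_details {
    my ($self, $hostname) = @_;
    # Pull the row for this host in one place, instead of
    # repeating the query wherever the details are needed.
    my $sth = $self->{dbh}->prepare(
        'SELECT ip, os, contact FROM hosts WHERE name = ?'
    );
    $sth->execute($hostname);
    return $sth->fetchrow_hashref;
}
```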
So, those are some of my approaches. Of course, if I sat
down and did some decent design beforehand I'd only need
to use the first one - wishful programming - as the functions
would already be broken up logically.
Sigh...
Updated: Right, that link to Spong now works as it
should. My bad. (Thanks a for pointing it out!)
Oh, you are so on the nail with this... I've gotten into the habit of doing this instead of writing pseudo-code:
# authenticate user
AuthenticateUser($foo, $bar)
    or die('Could not authenticate user');

# a while later...
# do some stuff
SomeStuff($more, @params)
    or die('Could not do more stuff because reason');

# subs from here down
########################
sub AuthenticateUser {
    return(1);
}

sub SomeStuff {
    return(1);
}
I tend to code this way especially when I'm discussing the program with colleagues. Then we all get a copy and each one goes off to fill in his/her respective blanks. In the end, I believe the result is usually very clear and quite maintainable!
#!/home/bbq/bin/perl
# Trust no1!
Re: When do you function?
by Albannach (Monsignor) on Dec 27, 2000 at 09:50 UTC
Just some random comments here... my basic principle (and this
is just a layman's opinion ;-) is
to isolate any repeated sequence of operations by placing them
in a function/subroutine with a descriptive name. This can certainly
be carried too far, so in practice I make functions of things
that are becoming irritating to type or cut/paste from elsewhere.
I certainly agree with your first three points, and I'd like
to add that the reduction of repetition also reduces coding
errors. Even (especially?) when I cut and paste, I can
introduce variations or subtle errors (often involving scoping)
which are entirely avoided when I take the time to make
a subroutine.
I find that building a sub forces me to think more fully about
what exactly I'm trying to express as I try to make it as much
a black-box as possible. Subs also give me ample room to
add better error-checking and handling that may (gasp!) get left out if I were
to strip down the operation and leave it in-line. And after
all this extra effort, I get something that I can re-use
elsewhere more easily than some sequence of lines from the
middle of a big loop.
On your second set of points:
1 - On looping, I just don't like deep indentation, and
especially if chunks of the loops are nicely isolated, I
will put them in subs just to unclutter the structure,
which leads me to commenting:
2- I tend to think the fewer comments the
better, and that's not to make things harder for others. I
mean that whenever I find myself making any comment at all
(apart from header blocks which should be quite detailed)
I ask myself just what is so confusing here, why isn't
the code obvious, and can I make it obvious and avoid the
comment altogether? Jumping from sub to sub shouldn't be
confusing if they each do something that makes sense on
its own. For a trivial example, in $a = sin($b) * cos($c)
the functions each have clear and obvious purposes of their own, and
the thought of calculating them in-line would be a great
starter for the "fattest obfuscation" category...
Finally there is the consideration of performance especially
if you are passing a lot of data to a function (in which case
you should probably pass a reference anyways but there are
always issues...). I'd like
to think that the compiler (speaking generally here) should
optimize what I write and not really care if it's a subroutine
or in-line, but again in practice this isn't the case (yet anyway)
so it may well be that using a sub call can slow down an
operation that I will perform millions of times to the
point that I shouldn't make the call. When I wrote a lot of
C I enjoyed making elaborate preprocessor definitions to get
the best of both worlds, and to some extent I miss that in Perl.
I look forward to the more professional opinions of the
learned monks on this question!
--
I'd like to be able to assign to an luser
I just have one comment on... comments:
I agree that comments on code are often redundant, and so should
be used with caution.
But comments on data should be all over the place,
especially with Perl's complex types (yes, I mean hashes!).
A simple comment like
my %nodes; # node_id => ref_to_node
can go a _long_ way to help you, or anybody else who has to maintain the code three months later.
In my experience, 90% of comments are unnecessary. But, on
the other hand, 90% of the comments that should exist are missing.
Yes, code should be 'self-documenting', but comments should explain why you're doing what you're doing.
It's no use being able to understand your regex or complex data structures etc if I can't work out why on earth you're doing things this way in the first place.
Of course, to bring this back to the initial point, if you break everything down into simple, short, self-contained functions that do only one thing, and are well
named, then it's going to be quite evident what they do,
and POD will mostly be sufficient :)
Tony
Re: When do you function?
by Falkkin (Chaplain) on Dec 27, 2000 at 10:42 UTC
My rule of thumb: if a function doesn't make code any easier to read, it's not worth writing.
At one point in my life (not very long ago ;)), I had the temptation to create lots of little 2-line and 3-line functions (probably coming from an OO background in school, where I was taught to make even simple variable accesses into object methods...)
But, consider the following code:
# test1.pl (uses no user-defined functions):
for ($i = 0; $i < 100000; $i++) {
    print $i;
}

# test2.pl (uses a simple function):
for ($i = 0; $i < 100000; $i++) {
    print func($i);
}

sub func {
    return $_[0];
}
# OUTPUT
[falkkin@shadow ~/perl] time perl test1.pl > /dev/null
real 0m4.826s
user 0m4.240s
sys 0m0.010s
[falkkin@shadow ~/perl] time perl test2.pl > /dev/null
real 0m14.227s
user 0m14.100s
sys 0m0.050s
It's clear in this case that the overhead involved in calling a function (mostly involving pushing variables to perl's stack and popping them back off) makes the code run roughly 3 times more slowly.
Generally, I try to avoid calls to "small" functions in inner loops whenever performance is anything of an issue; if I'm only reusing those few lines of code once or twice in a program, it's just not worth it to create a function for it, in my opinion.
To mangle Mark Twain: "There are four kinds of lies: Lies, Damn Lies, Statistics, and Benchmarks".
This is a good case for learning to use the Benchmark module. I found that creating an anonymous sub ref was slightly faster than a full sub, and that a bare block's efficiency gains (although still outstanding for this simple function) shrank as the sample size grew. If you are sensitive to milliseconds of difference, or have extremely complex algorithms, you should measure them in situ to determine whether a bare block is better than a subroutine.
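A minimal Benchmark comparison along those lines might look like this (a sketch; cmpthese needs a reasonably recent Benchmark module, and the workloads here are trivial on purpose):

```perl
use strict;
use Benchmark qw(cmpthese);

sub func { return $_[0] }
my $anon = sub { return $_[0] };

# Run each variant for at least 3 CPU seconds and print
# a table comparing their relative speeds.
cmpthese(-3, {
    named => sub { my $x = func(42) },
    anon  => sub { my $x = $anon->(42) },
    bare  => sub { my $x = 42 },
});
```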
To me, the main routine should be not unlike an outline. Some works, like the standard five paragraph essay, don't really need this. For reference books or novels, on the other hand, it can be an essential aid to the writer.
You can always paste the text of a subroutine into other parts of the script for production code where milliseconds count. This will assist the compiler in streamlining the code (although repeated functions will make the whole executable larger and perhaps increase compile time if you are compiling for each script load). But when you are designing the code, it would make sense to abstract most of your larger blocks, just like you would abstract chapters, sections, and paragraph sets when writing.
I'm relatively new to Perl (OK, a complete newbie), but after looking at the same program written in C, I got much different results:
330ms and 350ms user time. Is there a way in Perl to force inlining of functions? Or maybe this just points out a place for improvements in the compiler...
(which, I understand, due to the interpreted nature of Perl, is necessarily minimal, but would this be a quick-fix job?)
Did you test the performance of the perl programs (vs. equivalent C code) on your machine? Chances are relatively decent that your machine is better than my clunky 133 MHz Pentium.
Fact correction: Perl is a compiled language, actually (well, as compiled as Java, anyhow). The perl program works by taking in your source file and compiling it in several stages.
Stage 1: the compile phase. In this phase, Perl converts your program into a data structure called a "parse tree". If the compiler sees a BEGIN block or any "use" or "no" declarations, it quickly hands those off for evaluation by the interpreter.
Stage 2: Code generation (optional). If you're running one of the compiler backend modules (such as B::Bytecode, B::C, or B::CC), the compiler then outputs Perl bytecodes (much like a Java .class file) or a standalone chunk of (very odd-looking) C code. These code-generators are all highly experimental at the present.
Stage 3: Parse-tree reconstruction. If you did stage 2, stage 3 remakes the parse tree out of the Perl bytecodes or C opcodes. This speeds up execution, because running the bytecodes as-is would be a performance hit.
Stage 4: The execution phase. The interpreter takes the parse tree and executes it.
This is Perl compilation in a nutshell... read Chapter 18 of the 3rd edition of Programming Perl for a more in-depth analysis.
For many tasks (especially simple ones such as these) Perl will be slower than C, because C is basically a more-portable form of assembly language, and assembly language (once actually assembled) works with raw hardware, and is hence as about fast as you can get.
Another difference between Perl and a native C app that may affect performance is the fact that Perl has its own stack (actually, it has several stacks) as opposed to a C program, which is likely to just use the system stack.
I add my voice to this: there should be a way to inline
functions and method calls.
Granted, you can use the pre-processor (perl -P)
to inline functions, but this does not work for method calls.
This is really bad when designing OO Perl, where I find
myself using straight hash access ($o->{field}) instead
of accessors ($o->field) for some often-called methods
(or writing painful and risky kludges),
which makes maintenance much harder.
I am actually very surprised this is not even a Perl 6
RFC; I would
think that this is a simple (and, I suspect, quite easy to
implement) way to enhance the speed and maintainability of OO Perl programs.
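The trade-off being described, sketched concretely (a toy class; the field name is invented):

```perl
#!/usr/bin/perl
use strict;

package Host;
sub new   { bless { field => $_[1] }, $_[0] }
# The clean, maintainable accessor:
sub field { $_[0]->{field} }

package main;
my $o = Host->new('www1');

my $v1 = $o->field;     # method call: one sub call per access,
                        # but the internals stay encapsulated
my $v2 = $o->{field};   # direct hash access: faster, but the
                        # object's internal layout is now
                        # hard-coded at every call site
```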