snafu has asked for the wisdom of the Perl Monks concerning the following question:
Is it poor practice to try and crunch too much onto one line in a script? For instance, in a script I am writing right now I have the following block of code:
opendir(LOGS,".");
sort @val = map { -M $_ } grep(/.*\.log/,readdir(LOGS));
closedir(LOGS);
print join("\n",@val);
And the aforementioned block could probably be condensed even more by someone who knows what they are doing. The previous block could also easily be written like so (and indeed this would be easier for the lay Perl coder to understand):
opendir(LOGS,".");
@FILES = grep(/.*\.log/,readdir(LOGS));
closedir(LOGS);
for ( $counter = 0 ; $counter <= $#FILES ; $counter++ ) {
$val[$counter] = -M ($FILES[$counter]);
}
sort @val;
for ( @val ) {
print $_."\n";
}
I am still very new to Perl (imo) and I have a ton of learning to do. The very thing in Perl that confuses me most is the fact that there are 25 ways to write something. Is shorter better? Where is the line drawn where too little is too much??
----------
- Jim
Re: Is too little too much? Coding under the microscope...
by chromatic (Archbishop) on Jun 28, 2001 at 09:35 UTC
|
Can you understand it when it's shortened? If you add a comment, can the other people who'll work on the program understand it?
In theory, the fewer lines of code you have, the fewer bugs. You might want to move sort to the right side of the assignment in the short version, though. :)
opendir(LOGS, '.') or die "Can't open directory: $!";
print join("\n", sort map { -M $_ } grep { /\.log/ } readdir(LOGS));
closedir(LOGS);
I'd probably throw a \z anchor at the end of the grep regex, too. | [reply] [d/l] |
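A quick illustration of why that anchor matters; the filenames here are invented, and \z pins the match to the very end of the name:

```perl
use strict;
use warnings;

my @names = qw(app.log app.logger error.log notes.txt);

# Unanchored: matches any name merely *containing* ".log"
my @loose    = grep { /\.log/   } @names;

# Anchored: only names that actually *end* in ".log"
my @anchored = grep { /\.log\z/ } @names;

print "loose:    @loose\n";     # app.log app.logger error.log
print "anchored: @anchored\n";  # app.log error.log
```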
|
Yeah, I have much more to learn about map{} and sort{}. I tried the sort on the right side of the assignment but I was trying a more complicated version of sort because of what I was/am trying to achieve with it but nothing seemed to work. I am still working on this part as well. This script should be pretty nifty when I am done with it though.
----------
- Jim
| [reply] |
Re: Is too little too much? Coding under the microscope...
by bikeNomad (Priest) on Jun 28, 2001 at 09:37 UTC
|
There's clearly a balance between readability and efficiency. It's probably best (especially if you're working with others) to err on the side of readability until you have to optimize. After all, no one cares how few characters you use.
This isn't to say that more is necessarily better; it can take longer to recognize several lines of code than one idiomatic one, if you know the idiom.
BTW, Just out of curiosity, what's the sort doing? Its return value (the sorted list) isn't being used for anything... | [reply] |
|
Hehe, funny you should ask. That is exactly the part I am having problems with. Here is the purpose of the block. I am trying to get the newest logfile in the directory. So, the idea is to read the directory, sort (newest to oldest) the array based on the age of the logs (hence the sort()) without losing my filenames (since those are what I need in the end) and then pop()'ing off the newest filename based on its age (which should have been sorted now and at the top of the array).
Am I on the right track? :)
----------
- Jim
| [reply] |
|
You are on the right track, but missing a couple of critical steps, which leaves you throwing out data that you don't want to throw out.
I'll walk through the first version you proposed, because I like it better (I personally find that style easier to read than the other one you mentioned, and it's also closer to being right):
opendir(LOGS,".");
sort @val = map { -M $_ } grep(/.*\.log/,readdir(LOGS));
Problem one: sort returns a (sorted) list, which you're discarding. So *if* everything else were right (we'll get there), you would want
@val = sort map {-M $_} grep {...
To see the bigger problem, we break it down a bit more, working from right to left down that line:
@tmp1 = readdir LOGS;
# @tmp1 now contains the files listed in "."
@tmp2 = grep {/.*\.log/} @tmp1;
# @tmp2 now contains the filenames that contain ".log"
@tmp3 = map {-M $_} @tmp2;
#@tmp3 now contains what?
@val = sort @tmp3;
If you re-read map, you'll discover that the list it generates is simply the result of the block {-M $_} applied to each value in @tmp2: since -M returns the age of the file, you now have a list of the ages of the log files in the directory, but *not* their names. Sort that list, and you have a sorted list of ages, but with no names associated with them. This is not what you wanted. :-)
What you *do* want is to associate age information with each of the items in @tmp1, sort on that, and get back the sorted version of @tmp1, which can be done thus:
use strict; #you are using strict, aren't you? ;)
my @val = map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [$_, -M $_] }
grep { /\.log$/ }
readdir LOGS;
(assuming I didn't drop any critical concepts in there). You may note that I implemented chromatic's suggestion (well, almost) and tagged a $ into the regex to match the end of the file name (so "foo.logger" doesn't match).
This is what's known as a Schwartzian Transform, a term you may or may not have encountered here before. It is very useful. It is also slightly strange to the naked eye--if it makes instant sense to you, congratulations! And if not, take some time to work it out (there are a couple good explanations attached to that link) , because it's a cool technique.
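To make the shape of the transform easier to poke at, here is a self-contained sketch that swaps the -M key for length (so it runs without needing a directory of logs); the word list is invented:

```perl
use strict;
use warnings;

my @words = qw(perl monks wisdom a);

my @sorted = map  { $_->[0] }               # 3. undecorate: keep the value
             sort { $a->[1] <=> $b->[1] }   # 2. sort on the cached key
             map  { [ $_, length $_ ] }     # 1. decorate: pair value with key
             @words;

print "@sorted\n";  # a perl monks wisdom
```

The payoff is that the key (here length, in the real code -M) is computed exactly once per element, instead of once per comparison inside the sort block.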
If God had meant us to fly, he would *never* have given us the railroads.
--Michael Flanders | [reply] [d/l] [select] |
|
Err, yes, the logic is correct. But why not just write it as simply as this?
$newest = (sort { -M $a <=> -M $b } <*.log>)[0];
or if you have a large number of files, you'd better cache the result of
-M:
$newest = (sort { ($m{$a} ||= -M $a) <=> ($m{$b} ||= -M $b) } <*.log>)[0];
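The ||= there acts as a memo: %m caches each file's age so -M runs once per file rather than once per comparison. A self-contained sketch of the same caching idiom, with length standing in for -M and a counter showing how often the "expensive" key is actually computed (file names invented):

```perl
use strict;
use warnings;

my @files = qw(a.log bbbb.log cc.log);
my %m;          # key cache, one entry per file
my $calls = 0;  # how many times the "expensive" key is computed

my ($shortest) = sort {
    ( $m{$a} ||= do { $calls++; length $a } )
        <=>
    ( $m{$b} ||= do { $calls++; length $b } )
} @files;

print "$shortest, $calls key computations\n";  # a.log, 3 key computations
```

Without the cache, a sort over many files would call -M roughly once per comparison per side, i.e. O(n log n) stat calls instead of n.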
| [reply] [d/l] [select] |
|
Re: Is too little too much? Coding under the microscope...
by Abigail (Deacon) on Jun 28, 2001 at 20:40 UTC
|
Shorter isn't better, nor is longer better. There are often
many ways of doing things in Perl, without one being "better"
than the other.
Programming means making trade offs. Ideally, code should
be correct (well, that usually isn't a trade off), fast,
resource friendly, small memory footprint, small amount of
code, easy to maintain, easy to understand. Usually, optimizing
one thing means pessimizing the others.
What is "best" depends on what your goals are. Code can be
quite different whether you have the need for speed, or
whether you just want clean, extensible code. And when
it comes to "easy to understand", different people prefer
different idioms. That's why Perl has many ways of doing things.
If you prefer having a language where someone else will decide
how things should be done, use Python, or to a lesser extent,
Java. But even what one person prefers differs over time.
For instance, because they start knowing Perl better, or are
getting more experienced in programming in general. Or they
are like me, I use more yellows in the morning, and gradually
use less yellows and more blues in the course of the day,
finishing with deep purples past midnight.
-- Abigail | [reply] |
|
- "Programming means making trade offs. Ideally, code should be correct ... , fast, resource friendly, [with a] small memory footprint, [using a] small amount of code, easy to maintain, [and] easy to understand"
All of this is good and logical. I'm not new to programming, just to programming in Perl. I admit, however, that I ignore a lot of things, or rather take many things for granted, when coding, simply because I <admittance type=shameful>assume</admittance> that shorter code is quicker and less resource-munching. Perhaps there is a quick way to push a script through some kind of Perl module or built-in that can monitor the resources used during runtime and report on them afterwards. I know we have the Benchmark module, but admittedly I know next to nothing about what it can do and how it can be used to monitor whole scripts (if indeed it can do so); what I do know about it I have learned here on PerlMonks. It seems best suited to finite benchmarking of code within a script, not the script itself. I will look at the Benchmark documentation more closely this evening when I have time. It seems a little unreasonable and very time consuming, though not necessarily ridiculous, to benchmark every questionable block of code in a script.
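For what it's worth, the Benchmark module's cmpthese can compare two stylings of the same computation directly. A sketch under invented data, with length standing in for the file tests so it runs anywhere:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @files = map { "file$_.log" } 1 .. 500;

cmpthese( -1, {    # -1: run each candidate for about one CPU second
    map_grep => sub {
        my @val = map { length $_ } grep { /\.log\z/ } @files;
    },
    c_style => sub {
        my @val;
        for ( my $i = 0 ; $i <= $#files ; $i++ ) {
            push @val, length $files[$i] if $files[$i] =~ /\.log\z/;
        }
    },
} );
```

cmpthese prints a rate table comparing the candidates; it benchmarks blocks rather than whole scripts, which matches the limitation described above.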
Imo, understandability is perhaps the single most important thing in programming, even if that means extensive documentation in the code to walk someone ** through it. Is it just me, though? Perhaps I am just catching on to this, but is short complex code -eq obfuscated code? This is exactly my question: why would the Perl developers make it possible to write tight liners (heh, that's what I call 'em) if it weren't something "good"? I understand it is most likely not always a good thing, as not everything can always be good, but two things come to mind regarding this kind of code:
- 1. A tight liner is easier to write (less to type) and seems like it would be more efficient.
- 2. A tight liner doesn't seem like it would operate much differently from the same logic written out in blocks (see my first examples).
I am kind of playing the devil's advocate here. In doing so I am just trying to see if I can relate to both sides of the spectrum of this kind of code being good and/or bad.
** including yourself...yeah, we've all been down that road before where we look at something we wrote a year ago and wonder..."What on Earth was I thinking when I wrote this?"
----------
- Jim
| [reply] |
|
Why spend a lot of time worrying about efficiency as you're writing the code? Do you know that your program is slow? Is it too slow? Why worry about small gains in efficiency before it runs correctly? For that matter, how could you know that it needs speeding up until it works?
Once you have it running right, and you've decided that it needs speeding up, then you can apply a tool like Devel::DProf or Devel::SmallProf to find and fix the "hot spots". But it's unlikely that shortening two lines that get called once into one line will make enough of a difference to be worth agonizing over up front.
| [reply] |
Re: Is too little too much? Coding under the microscope...
by bluto (Curate) on Jun 28, 2001 at 20:23 UTC
|
One thing you should consider before writing short code
is "Am I missing possible error codes?". For example,
if you get an error from your OS during the readdir()
call above and
it decides to return nothing (even though something is
really there in the directory), can you live with it?
After too many years of writing too much code I'm always a
slave to checking for errors on system calls -- even
something as basic as "readdir(LOGS) or die". (Obviously
this goes for opendir, closedir, etc.)
-- bluto
| [reply] |
|
There is a problem with using readdir(LOGS) or die,
and that is that it's normal for readdir
to return a false value. If you have read all the files in
a directory, readdir will return undef
(in scalar context, an empty list in list context), signalling
you have finished reading. dieing in this situation
would really be bad. Also, if you have a file named 0
in your directory, readdir returns a false value.
-- Abigail
| [reply] [d/l] [select] |
|
You are right, of course. In fact, it looks like you'd
have to do something like...
my @a = readdir(FH);
die $! if !@a and $!;
... to catch an error for readdir (in list context).
Now that I think about it, readdir() either fails when
you fail to open the filehandle, in which case you
should have checked opendir(), or when something really
strange happens (e.g. reading from an NFS-ish directory
when the server goes away and the filesystem is soft mounted).
So it was a very bad example.
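Putting the two checks together, a defensive read of a directory might look like the following sketch (the directory name is just an example):

```perl
use strict;
use warnings;

my $dir = '.';    # illustrative; any readable directory

opendir( my $dh, $dir ) or die "Can't opendir $dir: $!";

$! = 0;                        # clear errno so a stale value can't fool us
my @entries = readdir $dh;
die "readdir failed on $dir: $!" if !@entries && $!;

closedir $dh or die "Can't closedir $dir: $!";

print scalar(@entries), " entries in $dir\n";
```

The "!@entries && $!" test is the key: an empty list alone just means end-of-directory, so only an empty list *with* errno set signals a real error.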
-- bluto
| [reply] [d/l] |