in reply to Re: Re: Optimizing existing Perl code (in practise) in thread Optimizing existing Perl code (in practise)
A peave of mine is when I see people who write Perl scripts and all
the work in them is done by using system() calls. What is the point in
writing a Perl script if you're not going to use the Perl functions?
You might as well write the thing in shell.
And a "peave" of me is people who see everything black-and-white.
I've written Perl programs where the majority of the work was done
doing "system". What's the point of using a glue language, and not
glueing? You might as well write the thing in C.
Your point of view is quite opposite of the viewpoint of "code reuse".
Unix comes with a handy toolkit. There's nothing wrong with using it.
You might as well write the thing in shell.
Not always. Perl gives you more control flow syntax than a shell.
Spawning system calls does take more resources and thus it behooves the
Perl programmer to try and code the functionality they want using Perl
built-ins and modules.
Bull. Programming means making trade-offs between developer time and
run-time. The fact that you have chosen Perl instead of, say, C, means
that you strongly favour developer time over run time. Your arguments make
sense if you are a C coder - but for a Perl coder they are just silly.
Really, what's the point of writing:
my $text = do {
open my $fh => $file or die "open: $!\n";
local $/;
<$fh>;
};
If you can just write:
my $text = `cat $file`;
Most programs won't read in gazillions of files in a single program, so
the extra overhead is minute. Far less than the sacrifice you already
made by using Perl instead of C. I also prefer
system mkdir => -p => $dir;
over the Perl equivalent. It takes too long to figure out which module
implemented it, and to download and install it.
Of course, making use of external programs makes you less portable,
but so does making use of modules not coming with the core. And many
programs dealing with file names aren't portable anyway. Do you always
use File::Spec when dealing with file names? I certainly don't.
I'm not claiming everything should be done with system.
Not at all. But I don't think that everything that can be done in Perl
should be, nor that system should therefore be avoided.
Abigail
Re: Re: Optimizing existing Perl code (in practise)
by snafu (Chaplain) on Aug 19, 2002 at 18:13 UTC
This could quickly turn into a flame war... :) With that, I will reply once, then I suppose we can see where it goes from there. But, this is probably all going to be summed up as to each his own and we will have to agree to disagree.
And a "[peeve]" of me is people who see everything black-and-white. I've written Perl programs where the majority of the work was done doing "system". What's the point of using a glue language, and not glueing? You might as well write the thing in C.
Perl is most certainly a glue language, but it most certainly does not have to be strictly considered or used only as a glue language. I believe that the fact that Perl has evolved so much makes it more than just a glue language. I prefer to use it as a full-blown programming language with the ability to do whatever any other language can do natively. C has the ubiquitous system() function, too. That doesn't mean that because it's there you should use it unless it's absolutely necessary.
I've seen too many programs written in both C and Perl that have nonsensical uses of the system() command, where in all reality the only reason the function was used was [sheer] laziness. What ends up happening is that the command being called by system() from either language somehow ends up getting removed from the system, moved, changed, manipulated, whatever, and now the program that used the system() call is broken because it relied on an outside source for its ability to complete its task. If the coder had taken the time to write the extra 3 lines that would have allowed his/her program to do the same job s/he was trying to complete using system(), s/he could have kept from making somebody else maintain his/her code *and* saved a few CPU cycles at the same time. Thus, portability is a big issue to me when it comes to this matter.
I don't buy it. The use of system() should be kept at a minimum. I'm not saying never ever use it. Sometimes, you just have to. I usually use Shell so I can at least keep my code more Perl-esque when I need to make OS level system calls.
In fact, I would consider using system() for your example of the mkdir -p command. But I would still use Shell which is a core Perl module so you shouldn't have to worry about installing it.
Of course, making use of external programs makes you less portable, but so does making use of modules not coming with the core. And many programs dealing with file names aren't portable anyway. Do you always use File::Spec when dealing with file names? I certainly don't.
I agree that the use of non-standard modules can be troublesome. In this case, you have to pick and choose your battle. I'd rather install a Perl module than try and install a program on the box that my script needs to call to get a job done. Let's face it, that whole ordeal is pretty much six of one, half a dozen of the other.
Not always. Perl gives you more control flow syntax than a shell.
Yes, while I will agree that you definitely have more flow control syntax with Perl, you definitely have enough flow control in shell to do the task.
Bull. Programming means making trade-offs between developer time and run-time.
Don't forget maintainability. In my opinion, this should always be considered high on any coder's list of things to be observant of, as I am sure you agree. In the professional IT environment, you cannot afford to have your program break on a production box because a system level command comes up missing on the host ($PATH?, OS upgrades, etc.) that your program needs to use. In my opinion, this alone makes for an excellent reason NOT to use system() often.
The fact that you have chosen Perl instead of, say, C, means that you strongly favour developer time over run time. Your arguments make sense if you are a C coder - but for a Perl coder they are just silly.
The fact that you favor writing something in ksh vs Perl is the same as saying you favor developing something in Perl vs C. You have to choose your poison. My point is that if you create a Perl script using all system() calls you might as well write in ksh. You're basically doing the same thing. You're just gluing the shell commands together with Perl instead of sh. I mean, really, what's the difference? I know this is more of an instance by instance situation because it could go both ways. It is ultimately up to the developer to figure out what s/he wants to develop in. Thus, if it were me, I'd not write a Perl script that was nothing but system() calls. I am using extreme examples, though, which is what I was considering the original poster's (to this system() thing) statement to mean.
I think we *might* be on the same page here. You seem to be more liberal toward your use of calling system(). I, on the other hand, am more conservative. I'm not sure arguing over this is worth the time. :)
update: Fixed misspellings pointed out by dmmiller2k (shear -> sheer and peave -> peeve).
- Jim
Insert clever comment here...
system "mkdir -p /foo/bar/baz blurf/quux" and do {
... it failed ...
};
is a lot easier to understand (and hence maintain) than
eval {mkpath (['/foo/bar/baz', 'blurf/quux'])};
if ($@) {
... it failed ...
}
if only for the fact that man mkdir gives you the syntax of
mkdir, while the mkpath syntax can only be
found by doing man File::Path - but you just have to know that
mkpath comes from File::Path.
I don't buy it. The use of system() should be kept at a minimum. I'm
not saying never ever use it. Sometimes, you just have to. I usually use
Shell so I can at least keep my code more Perl-esque when I need to make
OS level system calls.
In fact, I would consider using system() for your example of the mkdir
-p command. But I would still use Shell which is a core Perl module so
you shouldn't have to worry about installing it.
You do know, I hope, that all Shell is doing for you is calling
system in a fancy way, don't you? It just adds overhead -
more overhead than a plain system - and you were already
arguing against that.
I agree that the use of non-standard modules can be troublesome. In this
case, you have to pick and choose your battle. I'd rather install a Perl
module than try and install a program on the box that my script needs
to call to get a job done. Lets face it, that whole ordeal is pretty
much 6 in 1, half-dozen the other.
Note that none of my examples uses any fancy program called by
system. I used mkdir and cat,
programs that will be available on any POSIX compliant system.
And certainly on any Unix system.
Yes, while I will agree that you definitely have more flow control
syntax with Perl, you definitely have enough flow control in shell to
do the task.
That's a silly argument. if and goto give
you enough flow control in Perl to do any task as well, but that
shouldn't be an argument to not use else, for
or next.
Don't forget maintainability. In my opinion, this should always be
considered high on any coder's list of things to be observant of as
I am sure you agree. In the professional IT environment, you cannot
afford to have your program break on a production box because a system
level command comes up missing on the host ($PATH?, OS upgrades, etc)
that your program needs to use. In my opinion, this alone makes for an
excellent reason NOT to use system() often.
If a production system suddenly loses tools like mkdir and
cat you will have lots of problems. In fact, I suspect a
shell script using standard tools to be more robust when it comes to OS
upgrades than a Perl program. The standard tools are quite stable when
it comes to their syntax and their output - unlike Perl programs, which
can suddenly break or produce different results if the version changes.
And then there's the case of binary compatibility - the non-standard
XS-modules you relied on no longer work because someone recompiled Perl
using a different compiler or different configuration options, or because
the version changed.
As for specialized tools, let me give an example: I recently wrote a Perl
program to log and graph usage statistics of some application server.
To get current information, a connection to the server has to be made
(using CORBA), a query needs to be done, and an XML response is sent back.
I could do two things, either use system to call a Java
program that makes the connection and writes the response to standard
output, or reimplement the thing in Perl. I chose the former. And not
just because it saved a lot of time. Also because it's more robust. If
the client and server decide to use a different protocol, I don't have
to reimplement my program - the client will be updated as well. Of course,
if the format of the response changes, I need to make modifications, but
I had to do that anyway.
The fact that you favor writing something in ksh vs Perl is the same as
saying you favor developing something in Perl vs C. You have to choose
your poison. My point is that if you create a Perl script using all
system() calls you might as well write in ksh. You're basically doing
the same thing. You're just gluing the shell commands together with
Perl instead of sh. I mean, really, what's the difference? I know this
is more of an instance by instance situation because it could go both
ways. It is ultimately up to the developer to figure out what s/he wants
to develop in. Thus, if it were me, I'd not write a Perl script that
was nothing but system() calls. I am using extreme examples, though,
which is what I was considering the original poster's (to this system()
thing) statement to mean.
Now you are assuming that if you use system, almost everything
in the program is using system. That is of course not true.
Even in a 10 line program, if half of them are using system,
that doesn't mean the other five lines could be done as easily in ksh
as in Perl.
Abigail
system "mkdir -p /foo/bar/baz blurf/quux" and do {
... it failed ...
};
is a lot easier to understand (and hence maintain) than
Of course the p5 to p6 converter will have to change that "and" to an "or" :^)
"...peave..."
"...shear..."
Forgive me, but the former should be "peeve" and the latter, "sheer", in context.
For all that these are just silly typos, amidst an otherwise interesting and provocative exchange, the subtle effect of those misspellings only serves to mitigate the impact of your argument.
Sorry, I just had to comment ... :)
dmm
Re: Re: Optimizing existing Perl code (in practise)
by sauoq (Abbot) on Aug 19, 2002 at 22:34 UTC
I agree entirely with the message you were trying to get across. Still, this bothered me:
I also prefer
system mkdir => -p => $dir;
over the Perl equivalent.
It takes too long to figure out which module implemented it, and to download and install it.
That would be File::Path and it comes with perl. I just thought your statement might be misleading to some.
-sauoq
"My two cents aren't worth a dime.";
Re^2: Optimizing existing Perl code (in practise)
by Aristotle (Chancellor) on Aug 20, 2002 at 12:25 UTC
I think we've gone far off track and read much too much into the initial point here: one should avoid the kind of pretzel logic that results in superfluous system calls - a class of stupid coding practices, one of which merlyn dubbed a useless use of cat.
For example, I recently saw this in a script:
my @data = `some_tool foo bar params 2> /dev/null | sort`;
Tell me, what's the point of piping over to sort(1) here? You can just do it in Perl:
my @data = sort `some_tool foo bar params 2> /dev/null`;
I have seen quite a bunch of cases where people shell out to awk or grep(1) from inside a Perl script, when what they were intending to do was perfectly doable in Perl and would have taken no more effort (if not less).
I cringe every time, and I do call that a peeve of mine - I think you would too.
Really, what's the point of writing:
my $text = do {
open my $fh => $file or die "open: $!\n";
local $/;
<$fh>;
};
If you can just write:
my $text = `cat $file`;
This example is just silly, IMHO. If I need `cat $file` more than once, I'll write a
sub slurp_file($) {
open my $fh, (my $file = shift) or die "open $file: $!\n";
local $/;
<$fh>;
}
and next time it's slurp_file $file - no extra process, no extra typing, no loss of clarity. That is, if I do need to slurp a file more than once; if not, of course, the effort is silly. But then, it's very unlikely that I will need to do that at all, given that Perl offers the lovely @ARGV / diamond operator combo. In the remaining 0.5% of cases, sure, I concede that cat(1) would be the tool of choice.
Makeshifts last the longest.
Re: Re: Optimizing existing Perl code (in practise)
by smalhotra (Scribe) on Aug 19, 2002 at 16:16 UTC
my $text = `cat $file`;
If you're gonna do that, you might as well do
my $text = join '', <FileHandle->new($file)>
and still do it in perl. Frankly, I don't know which is faster, but I would choose to do it the second way because I think it would be faster.
I'd never choose the second method. With the first, it's
immediately clear what it does. With the second, it isn't.
And somehow I doubt it's faster - it certainly doesn't look
like it, as you first make a list of all the lines, then
join them together, meaning you get more than double the
memory usage. I'd rather do
my $text = do {local (@ARGV, $/) = $file; <>};
"Still do it in Perl" isn't a goal, IMO.
Abigail
For simple file read/write operation like in this example File::Slurp is the winner:
use File::Slurp;
my $text = read_file($filename);
--
Ilya Martynov
(http://martynov.org/)