http://www.perlmonks.org?node_id=54033

"My approach to language design has always been that people should learn just enough of the languages to get their jobs done. They shouldn't have to learn the whole language to begin with. - Larry Wall "

I am one of those people who would much rather hire a programmer to write scripts to my specifications, but am not in a position to do so. Instead, I set out to teach myself the basics of a programming language so that I could perform permutation tests on data. It seemed to me that something as simple as the random permutation of an array could not be that difficult, and that basic input and output of data was so common, in any language, that even a beginner should be able to accomplish that correctly.

I shopped around for a while and even tried to do it in Nisus Writer (go figure!), until I saw a small and elegant script that included both AppleScript and Perl. Using AppleScript, it asked for an input file, and then the data was read into a Perl array, concatenated and written to disk. I am very grateful to the author of that script, because at that point I was completely stuck with my data analysis. I had run into the limits of an Excel sheet and this script solved that specific problem. With this I already had input and output, so I just needed a small subroutine to do the permutations and presto: a working permutation script (feel free to laugh at my naïveté).

I thought I was ready when I saw the Fisher-Yates shuffle (page 121 of the Cookbook) and with a big smile sat down to write my first script. It turned out not to be that easy, but not too difficult either. Perl baby talk may not be nice to look at, but if it is fault free and gets the job done, then it is the answer semi-computer-savvy people like me have been looking for. I can even envision a Lego style of programming in Perl, where students are offered pre-built subroutines to toy around with. This would allow, for example, its use in a course-like environment, where the course is not about programming but something less arcane, like statistics.
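For readers who have not seen it, the shuffle mentioned above can be sketched roughly as follows. This is a minimal illustration of the general Fisher-Yates technique, not the Cookbook's exact listing; the subroutine name and the sample data are assumptions made for the example.

```perl
use strict;
use warnings;

# In-place Fisher-Yates shuffle. Each ordering of the array is equally
# likely, to the extent that rand() is uniform.
# The name fisher_yates_shuffle is chosen for this sketch.
sub fisher_yates_shuffle {
    my ($array) = @_;                  # reference to the array to shuffle
    for (my $i = $#{$array}; $i > 0; $i--) {
        my $j = int rand($i + 1);      # random index in 0 .. $i
        @{$array}[$i, $j] = @{$array}[$j, $i];   # swap the two elements
    }
    return;
}

my @data = (1 .. 10);                  # made-up sample data
fisher_yates_shuffle(\@data);
print "@data\n";
```

The core module List::Util also exports a shuffle function, which may be the safer choice when the goal is simply to get a random permutation.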

But there is one lurking danger. At some point I will (have to) publish my results, in the knowledge that most of my scripts, save some components that I considered worthy enough to post, will not have been properly reviewed by knowledgeable people. As the responses to this post pointed out, I was lucky to have put it on public display here. But I would make myself very unpopular if I started to put all my individual scripts in 'craft' just to be sure if they are correct.

This leads me to the following questions:
1. Is it just a matter of programming self-confidence to be able to say "this is finished and correct", without an uneasy sense that there may still be a mistake somewhere?
2. What would you advise an "I just want to apply the basics correctly" type of monk like me?
3. What are the inherent dangers of cut and paste programming?

(jeffa) Re: Some of use just want to know the basics
by jeffa (Bishop) on Jan 24, 2001 at 21:44 UTC
    1. I often find that a program is never finished. As long as there are end-users using your program, it will never be finished. I think that programming self-confidence can be a bad thing: if you get too confident, your vanity will get in the way of clearly seeing your solution. Another set of eyes, if available, will always see past the filters that cloud your vision.
    2. Keep programming. Keep reading. Keep learning. And most important, keep communicating with knowledgeable people.
    3. About the same as cut and paste bomb construction, except you won't lose your hands. :) Cut and paste is not the same as module re-use. There are docs for modules, not for pasted code.
    Oh, I almost forgot to mention - study software design.

    Jeff

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    F--F--F--F--F--F--F--F--
    (the triplet paradiddle)
    

      Thank you all for the response. If you allow me, I will try to get the context in which I posed the questions into a sharper focus, and propose a conclusion.

      Since I signed up for a password here at PM, I have learned a lot by reading, and reading more and trying to follow your discussions. Most of this in my spare time, since I am not supposed to be programming, I’m supposed to be analyzing data. I enjoy learning Perl to a great extent, and there are many pitfalls that I am now able to avoid, thanks to PM.

      However, I started this thread with a quote from Larry Wall, because to be perfectly honest, in a sense I was fooled by the Perl evangelists into believing that this would be a language that would really allow me to define a subset of instructions that I needed for the (relatively) simple job at hand, and think no more of it. I work at a university, so any new tool or technique I develop, I’m supposed to be able to explain to someone else (be it a member of staff or a student). If I quote Larry, some of them will be as enthusiastic as I am, but if I tell them that ‘IT=learning perpetually’ it is unlikely that they will rush out and buy a copy of the Camel. Science is learning perpetually too, and most of us hardly have the time to keep up with the primary literature. If there is no ‘copy and pasteable’ set of instructions I can come up with, or a small and well defined subset of the language, then the applicability of this in a course on statistics (for example) is rather limited.

      Permutation analysis appears to be a promising (or at least a very useful) tool in the biological sciences 1,2. It has the great disadvantage that one has to be computer savvy and know some basic programming to be able to apply it effectively. From your response so far, instead of talking about the marvels of this language (as I have done in public), I should advise staff or students who want to apply permutation methods or need to massage large datasets to either hire me (which is not that bad an option, since I will reach the end of my contract soon :) or seek other, professional, advice.

      Refs:
      1 Manly, B. F. J. (1991). Randomization and Monte Carlo Methods in Biology. New York, Chapman and Hall.
      2 Good, P. (1994). Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses. New York, Springer-Verlag.

        Ah! Now you're asking about effective advocacy.

        I'm going to disagree with an analogy you've made here:

        If I quote Larry, some of them will be as enthusiastic as I am, but if I tell them that ‘IT=learning perpetually’ it is unlikely that they will rush out and buy a copy of the Camel. Science is learning perpetually too, and most of us hardly have the time to keep up with the primary literature. If there is no ‘copy and pasteable’ set of instructions I can come up with, or a small and well defined subset of the language, then the applicability of this in a course on statistics (for example) is rather limited.

        Consider this: In one course on statistics--let's say, a sophomore-level terminal course designed for people in the social sciences--one doesn't need calculus. All that's going to be needed is high school algebra (if you went to a good high school)--a reasonable expectation for any college student.

        In another course--let's say, a junior-level course, not terminal but without a required follow-up designed for people in engineering--one might well require some simple first-semester calculus--a reasonable expectation for an engineering student.

        In another course--let's say, a senior-level course in stat theory designed for statistics and math majors--you'd darned well better be able to integrate over multiple variables, and you'd better have a fairly broad overview of math--a little linear algebra wouldn't hurt--a reasonable expectation of an upper-class stat/math student.

        Now, can we really say that the first or the second course have "a small and well-defined subset" of mathematics required for statistics? Not really, in my opinion.

        Much of what is talked about here for really good style is about writing safe and robust code that'll work in an uncertainly secured, multi-user, production environment. What you describe isn't that--it's someone working at a desktop or a workstation on data they've captured, which can reasonably be expected not to have any booby-traps in it--for instance, I don't think you have to worry about tainting.

        The analogy to make when talking Perl to these folks is language--no one (even ex-English majors like me) expects to know everything there is to know about the language, but everyone expects to know what they need to know to use it. Perl is quite similar--there is a fairly basic set (darned if I can define it right now) of functions and operators that'll do simple data munging, and most of them operate in a fairly intuitive way.

        I do know the attitude you're describing--it's prevalent in business (the number of highly-paid, otherwise competent business people in the IT industry that I've had to show things like how to defrag their drives (and why), or how to make Excel add up a column of values (I'm not making this up) is astounding), but I think you have a chance to advocate Perl effectively, if you can rein in your enthusiasm just a bit.

        Tell people that Perl is this very neat programming language that's a lot like English (or whatever language they favor)--you can do very easy things in it very easily, and you can (if you want) shoot for becoming Shakespeare.

        The key to doing this is to define for yourself the very basic tools in Perl that are sufficient for the task you have at hand, then use, and demonstrate the use of, those tools tirelessly.

        As far as course presentation goes, consider two approaches:

        1) Write a module (simple modules aren't hard to write--you can (don't hurt me!) cut and paste the basic template from perlmod and begin experimenting) that does what you need done, then present that.

        2) Suggest that Perl is such a valuable yet simple tool for data analysis that it would be worth teaching as a part of the course--only a week or two. It is (in my opinion) a reasonable expectation of a student studying statistics in the sciences that said student has had some exposure to the basic concepts of programming. Check the catalog where you work and see whether or not a basic computing course is a core course--it's likely that it is.

        (If it's not, then a larger, more difficult, and longer-term project is to convince the powers that be that a course in programming should be required, and that Perl is an ideal language for such a course.)
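As a rough idea of what approach 1 might look like for a permutation test, here is a minimal sketch. The package name Stats::Permute, its subroutine names, and the sample numbers are all invented for illustration; a real module would live in its own file (Stats/Permute.pm) and end with a true value.

```perl
use strict;
use warnings;

{
    package Stats::Permute;            # hypothetical module name
    use List::Util qw(shuffle sum);

    # Arithmetic mean of a non-empty list of numbers.
    sub mean { return sum(@_) / scalar(@_) }

    # One-sided permutation test for a difference in means.
    # Returns the fraction of random relabelings whose mean difference
    # is at least as large as the observed one. With non-integer data,
    # exact ties can be affected by floating-point rounding.
    sub p_value {
        my ($group_a, $group_b, $rounds) = @_;
        my $observed = mean(@$group_a) - mean(@$group_b);
        my $n_a      = scalar @$group_a;
        my $hits     = 0;
        for (1 .. $rounds) {
            my @pool  = shuffle(@$group_a, @$group_b);   # pooled, relabeled data
            my @new_a = splice(@pool, 0, $n_a);          # first group; rest stays in @pool
            $hits++ if mean(@new_a) - mean(@pool) >= $observed;
        }
        return $hits / $rounds;
    }
}

# Usage with made-up measurements.
my @treated = (10, 11, 12);
my @control = (1, 2, 3);
printf "p = %.3f\n", Stats::Permute::p_value(\@treated, \@control, 1000);
```

A student would then only need to learn how to fill two arrays and call one subroutine, which is about as small a subset of the language as one could ask for.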

        It's very easy to get caught up in the fancy stuff Perl can do and forget that the basics of Perl are still in there, and that one doesn't have to use all the bells and whistles just because they make pretty noises.

        Good luck--and thanks for starting an interesting (and, I hope, useful) thread.

        They laughed at Joan of Arc, but she went right ahead and built it. --Gracie Allen

        I find it ironic that Larry would invent a very large language that very few could be reasonably expected to master and then say that you don't have to master it... :-)

        But I think that Larry is right. You can get things done while knowing remarkably little Perl. But there is a lot of Perl to learn, and learning it brings benefits. There is a lot to learn about programming in general, which will likewise bring even more benefits. You don't need that to find it useful. But without it you will repeatedly shoot yourself in the foot, and you won't even know you are doing so! That is life.

        Neither Larry Wall nor anyone else can make learning more a useless or irrelevant thing to do. (Though some - in particular Microsoft - try to market products that they claim have done so.) But you can certainly define a useful subset of the language which you stick to and can readily bring someone else up to speed on. There is sometimes great value in doing so.

      I strongly agree with jeffa.
      The only thing I would add is that even though a "program is never finished" and I am confident in my work, I still occasionally get nervous regarding the response of the end user. Their response determines how much longer it will take me until I can move on or accept another project. Nervous is good. It reminds you to go through the tests of your work (for the millionth time) :)
Re: Some of use just want to know the basics
by Adam (Vicar) on Jan 25, 2001 at 00:49 UTC
    1. Is it just a matter of programming self-confidence to be able to say "this is finished and correct", without an uneasy sense that there may still be a mistake somewhere?
      No. I have found that a program is only finished when it is no longer being used, and that it is never truly "correct" in the sense of being the best way to do it. TIMTOWTDI is more than just a slogan, it's a reality. The result is that as you learn more techniques your view on what is "correct" will change. As for "correct" in terms of being bug free, self-confidence is not the same as sufficient testing. And neither is ever enough to avoid all bugs. A solid design and a deep understanding of all the potential problems will put you on the right path, but there will always be bugs. The trick is to keep the bug count low. (Unless you work for NASA, in which case the trick is to keep the bug count == 0, but they devote more time to design than some companies spend doing an entire life cycle.)
    2. What would you advise an "I just want to apply the basics correctly" type of monk like me?
      Time for the Nike slogan: "Just do it." The best way to learn to use Perl, is to use Perl. Define your problems and write code to solve those problems.
    3. What are the inherent dangers of cut and paste programming?
      Many. The worst danger is that you inherit someone else's bugs. Another is that code you do not understand is impossible to maintain. Never use code that you don't understand. This is, as others have said, different from using third party solutions or modules.
Re (tilly) 1: Some of use just want to know the basics
by tilly (Archbishop) on Jan 25, 2001 at 08:31 UTC
    What is wrong with cut-and-paste? Well our resident wizard said it very well in On reinventing the wheel.

    As for general advice, well that is a huge topic. I would suggest trying to learn to think beyond the next line of code. Books like Code Complete, The Pragmatic Programmer, some more reviewed ones and many others are good. There are a lot of good design discussions here.

    And contrary to what people here have said, there is at least one fairly large piece of software that is actively used and complete. Donald Knuth declared on Oct 3, 1990 that TeX and Metafont were feature complete and would only get bug fixes. At the time TeX was at version 3.1. There have been 2 releases since then. From the summer of 1993 onwards the version of TeX has stood at 3.1415.

    It is today used by perhaps a million people, and in several specialty areas (eg publishing math) is the standard. OTOH Knuth is, well, Knuth... :-)

Re: Some of use just want to know the basics
by Beatnik (Parson) on Jan 24, 2001 at 22:04 UTC
    1. Sometimes it takes self-confidence to keep you going on the code, since the aim might seem a bit too high, but as Jeff says, too much will kill you uhm spoil it :) The difference between commercial projects and personal projects (or non-commercial if you will) has a great deal to do with expectations. Altho the users will always want more, on personal projects you have complete freedom (well, uhm usually :) )
    2. IT = learn ad infinitum. There are always new things to learn, always unknown territory. Take small steps. Don't aim TOO high. Learning the hard way is the best way. Don't be afraid to roll your own code (I might get --'ed for saying this :) ), but look at alternatives as well. Don't take things for granted. Read perlstyle.
    3. You won't learn a thing if you don't look into the code you copy/paste. You might learn what something does, but not why it does so. Look up the functions you use in perlfunc, look at how other people use the functions but don't follow it blindly.


    Greetz
    Beatnik
    ... Quidquid perl dictum sit, altum viditur.
Re: Some of use just want to know the basics
by clemburg (Curate) on Jan 25, 2001 at 15:55 UTC

    1. Is it just a matter of programming self-confidence to be able to say "this is finished and correct", without an uneasy sense that there may still be a mistake somewhere?

    No. You can and should write tests that verify your claims about your software. If your software beats the tests, you can say "this is finished and correct, as far as my tests go", which is quite an achievement over the state where you just have beliefs about the software. Single-stepping your code in a debugger (small pieces!) also helps in assuring yourself that your software does what you think it does. Code reviews by others are another good thing to increase your confidence. Giving your users a chance to give you feedback, and then listening to them, is helpful, too.
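To make the idea of tests that verify your claims a little more concrete, here is a small sketch using the core Test::More module. The mean helper and the three claims being checked are invented for this example; the point is only that each belief about the code becomes a line that can pass or fail.

```perl
use strict;
use warnings;
use Test::More tests => 3;
use List::Util qw(shuffle sum);

# Hypothetical helper under test: arithmetic mean of a non-empty list.
sub mean { return sum(@_) / scalar(@_) }

my @original = (1 .. 50);
my @shuffled = shuffle(@original);

# Claim 1: shuffling neither drops nor adds elements.
is(scalar(@shuffled), scalar(@original), 'same number of elements');

# Claim 2: the shuffled list contains exactly the original values.
is_deeply([sort { $a <=> $b } @shuffled], \@original, 'same elements');

# Claim 3: the mean does not depend on the order of the data.
is(mean(@shuffled), 25.5, 'mean unchanged by shuffling');
```

Run as an ordinary script, this reports one "ok" or "not ok" line per claim, which is a much firmer basis than a feeling that the code is probably right.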

    2. What would you advise an "I just want to apply the basics correctly" type of monk like me?

    To gain confidence? Read a lot of code, write a lot of code. Try things, don't be afraid. Write tests. Debug your programs, even if you think there are no mistakes. Have others review your code. Have others use your code. Reuse your own code.

    3. What are the inherent dangers of cut and paste programming?

    Putting things in your program that you probably do not understand to any degree. Nasty side-effects. Reinventing the wheel (cut-and-paste-and-cut-and-paste-and-cut-and-paste-and- ... you get the picture). Not learning to write modules. Not learning to think in terms of modular building blocks. Not learning what programming is all about. Aside from that, of course nearly everybody cuts and pastes. The only difference is that the really good programmers do it only once or twice per item.

    Christian Lemburg
    Brainbench MVP for Perl
    http://www.brainbench.com

Re: Some of use just want to know the basics
by mothra (Hermit) on Jan 25, 2001 at 05:52 UTC
    I'll offer my thoughts on copying and pasting, if only because I feel pretty strongly about them. :)

    Never copy and paste code.

    There's no magical reasoning for this. It's quite simple actually. When you innocently think "oh! I'll just copy and paste this code here, make a few changes and I'll be finished!", it rarely works out that you actually REMEMBER to make all of the changes that were necessary.

    Of course, I'm not talking about the copying and pasting associated with pieces of code that should be put into a module, but with pieces of code (perhaps even one if statement that you want to make into four) that "almost" do what you want, but not quite.

    The one situation where I'll use copying and pasting is when writing stored procedures in SQL. It's much faster to copy and paste table names and columns (one at a time usually) than it is to keep retyping them.

      For a real world example of cut-n-paste wrongdoing: This idiot had made multiple modules with this error in them. The code was Info-Basic, but I'll present it "Perl" style:

      for ($x=1; $x < 999; $x++) { ProcessAccountNumber($x); }
      He did this for every single number based item. ("Gee, we'll never see numbers that big.") Yes, 999. For transactions he used 999999.

      Ow. Remembering that hurt.
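One way to avoid that kind of hard-coded ceiling is to loop over the account numbers that actually exist rather than over a guessed range (note that the loop as written above stops at 998, not 999). A minimal sketch, assuming the numbers are available as a list; ProcessAccountNumber here is a dummy stand-in and the account list is made up:

```perl
use strict;
use warnings;

# Dummy stand-in for the routine named in the post above.
sub ProcessAccountNumber {
    my ($number) = @_;
    return "processed $number";
}

# Made-up account list; in practice it would be read from a file or database.
my @account_numbers = (1, 2, 998, 999, 1000, 54033);

# Iterate over the real data, so no upper bound has to be guessed.
my @results = map { ProcessAccountNumber($_) } @account_numbers;
print "$_\n" for @results;
```

The loop now handles however many accounts there are, and the same idea can be written once and reused instead of being pasted into every module with a different magic number.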