http://www.perlmonks.org?node_id=407735

Is there a point beyond which layers of abstraction are a bad thing for new Perl programmers? Take, for example, Class::DBI. Is it a good idea to teach new programmers the use of this module rather than DBI + SQL, considering they might not learn how Perl connects with the database if abstraction intervenes too soon? I can see the point of going straight in with CGI.pm since it takes care of some very gory details, but I'm not sure about others. As abstraction becomes more sophisticated it certainly aids productivity, but shouldn't we also consider what is lost, i.e. a root connection with core Perl code? With some modules, such as Maypole for example, aren't we in danger of becoming Maypole programmers rather than Perl programmers?

Re: Appropriate amount of abstraction
by BrowserUk (Patriarch) on Nov 15, 2004 at 00:03 UTC

    It depends on your reasons for programming in Perl.

    If your aim is to get a job done quickly and efficiently, and have working code ASAP, then the higher the level you can code at, the more likely you are to meet your goals. In this respect being a Maypole, ColdFusion, FrontPage or even an Excel programmer is about achieving the end goal--the website or 3D-bar chart etc.--with as little effort as possible on 'extraneous' tasks, like dealing with cookies or making sure that your floating point math can handle the numbers you are manipulating with the required accuracy. You want to concentrate on the business logic of the task, not Perl's esoterica.

    However, if your goal is to be able to write a better Maypole or a command/data-driven charting application, then you need to work at the lower level.

    There is also the school of thought that says that knowing how stuff works at the lower levels allows you to use the higher level abstractions more effectively. It can stop you from building slow, memory-hungry houses of cards by combining too many incompatible high-level abstractions into a single app.

    I am of this latter school of thought, but tempered with the reality that getting the job done often takes priority over getting the job done perfectly.


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail        "Time is a poor substitute for thought"--theorbtwo
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
Re: Appropriate amount of abstraction
by FoxtrotUniform (Prior) on Nov 14, 2004 at 23:41 UTC

    It Depends(tm).

    If your primary concern is "getting it done", using a high level of abstraction (for instance, Class::DBI rather than plain DBI and hand-written utility functions) will probably result in cleaner, more easily maintained code... as long as the abstraction you pick is the right one. On the other hand, if you're more interested in learning how something works, thick layers of abstraction can get in your way.

    From a pedagogical standpoint, I don't know whether it's better to start by teaching a high level of abstraction and moving to lower and lower levels, or to start with the low-level basics and build on those.

    --
    Yours in pedantry,
    F o x t r o t U n i f o r m

    "Anything you put in comments is not tested and easily goes out of date." -- tye

Re: Appropriate amount of abstraction
by etcshadow (Priest) on Nov 15, 2004 at 03:57 UTC
    It's a ridiculous oversimplification, but a good rule of thumb is: whatever level of abstraction you could be working at, to get the job done, you will be much more successful if you actually work (or at the very least understand and maybe fiddle around a little) one level of abstraction lower. You won't just write more efficient code that way, you'll have a better understanding of the landscape that your particular layer of abstraction exists in.

    For example, if your job is to build business logic at an object layer, over an object relational mapping abstraction (like Class::DBI, for example), then you're going to be able to do a significantly better job by going one level deeper, that is: understanding and possibly working with SQL. If you try to do some really hardcore business logic at the object-relational mapping layer without understanding the relational layer, you will basically write crappy code. It will be slow and it will do ugly things.

    Likewise, if you are trying to write a web application server, then you should really understand at the HTTP layer what is going on. Don't just rely on the fact that apache (or whatever your HTTP tool of choice is) will do it for you. Read the raw socket traffic, look at the headers, that sort of thing.
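    As a rough illustration of "looking at the headers", here is a minimal sketch in core Perl that pulls apart the raw text of an HTTP response. The sub name parse_response and the sample response are made up for this example; real traffic has more cases (repeated headers, chunked bodies) that this deliberately ignores.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only: split a raw HTTP/1.x
# response into status line, headers and body. It ignores repeated
# headers and other details a real client must handle.
sub parse_response {
    my ($raw) = @_;

    # Headers end at the first blank line (CRLF CRLF).
    my ($head, $body) = split /\r\n\r\n/, $raw, 2;
    my ($status_line, @header_lines) = split /\r\n/, $head;

    # Status line looks like: HTTP/1.1 200 OK
    my ($version, $code, $reason) = split / /, $status_line, 3;

    my %headers;
    for my $line (@header_lines) {
        my ($name, $value) = split /:\s*/, $line, 2;
        $headers{ lc $name } = $value;    # header names are case-insensitive
    }

    return {
        version => $version,
        code    => $code,
        reason  => $reason,
        headers => \%headers,
        body    => defined $body ? $body : '',
    };
}

# Made-up sample of what a server might put on the wire.
my $raw = join "\r\n",
    'HTTP/1.1 200 OK',
    'Content-Type: text/html',
    'Set-Cookie: session=abc123',
    '',
    '<html>hi</html>';

my $response = parse_response($raw);
print "status: $response->{code} $response->{reason}\n";
print "$_: $response->{headers}{$_}\n" for sort keys %{ $response->{headers} };
```

    Seeing the cookie as one plain header line among the others is the sort of detail that gets hidden once a framework hands you a session object.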

    Obviously, the better the layer of abstraction, the less meat this argument carries. For example, TCP is a good enough layer of abstraction, that I wouldn't tell you to fiddle around really deeply in the TCP stack if you wanted to write some new layer 5 protocol. That kind of extremely well built layer of abstraction, however, is more the exception than the rule.

    I think that Paul Graham (or maybe it was Joel Spolsky) wrote something about this... But his point was more along the lines of: along whatever axis you want to differentiate your product, punch one layer of abstraction lower than you would otherwise have to... I think his specific example was that if you are writing a graphically driven game, don't use the available 3D graphics packages: write your own instead. That sort of thing... but it's still a similar point.

    ------------ :Wq Not an editor command: Wq
Re: Appropriate amount of abstraction
by chromatic (Archbishop) on Nov 15, 2004 at 01:46 UTC

    Why stop with a "root connection with core Perl code"? Perl is an abstraction. SQL is an abstraction. Files and sockets and environment variables are all abstractions.

    At some level you have to talk about something more than which direction electrons flow. Me (not "we"), I like to explain just enough to beginners to help them solve the kinds of problems they face in ways that help them create other solutions on their own.

      Why stop with a "root connection with core Perl code"? Perl is an abstraction. SQL is an abstraction.

      Because Perl and SQL are my starting-points as a programmer. I'm not interested in the details below my point of entry but neither do I wish to do most of my work in an abstraction layer which is too far removed from the core language. I know TIMTOWTDI at the end of the day but I just wonder how far the abstracting goes before you get layers built with, say, Maypole as one of the dependencies and all you have to write is:

      #!/usr/bin/perl
      use Application;
      use strict;
      my $app = new Application;
      $app->build($data);

        Perl and SQL are my starting-points as a programmer. I'm not interested in the details below my point of entry but neither do I wish to do most of my work in an abstraction layer which is too far removed from the core language.

        Your meditation asks, "what is an appropriate level of abstraction?" It sounds to me like you've already decided this. For you, the highest appropriate level of abstraction is Perl and SQL. That's a fine choice to make, but it makes your question seem a bit misleading.

        I think chromatic was simply trying to make the point that choosing the level that's "abstract enough" is somewhat arbitrary. I'm sure you have reasons for choosing where to draw the line, but, at the end of the day, the most important thing is whether the application does its intended job.

Re: Appropriate amount of abstraction
by pg (Canon) on Nov 15, 2004 at 00:03 UTC

    Even with the CGI module, I doubt there are many users of that module, if any, who have no idea of, for example, the basic HTML tags. It is hard for me to believe that there are people who are capable of coding in Perl and yet have absolutely no interest in knowing those tags.

    Also there is a subtle difference between Class::DBI and CGI:

    For CGI, all you care about is that it delivers the visual effect you want, simple and straight. Class::DBI is certainly a good tool for generating SQL statements for you, but the SQL it generates is not always optimized.

    Some database implementations join tables in the sequence you state in your SQL statement, without optimization or thinking. In some cases, the sequence of table joins is the key to your query performance (you want the joins to happen in a sequence that lets you quickly narrow down the result set). With Class::DBI, you lose this kind of control.

      I doubt there are too many users of that module, if there is any, actually have no idea of, for example the basic html tags

      I'm thinking more along the lines of the magic behind CGI.pm's param() function.
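      To make that "magic" a little less magical, here is a minimal sketch of the kind of work involved in turning a query string into parameters. This is not CGI.pm's actual implementation; the sub names url_decode and parse_query are invented for the example, and it skips POST bodies, file uploads and character encodings entirely.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT CGI.pm's code: decode one
# application/x-www-form-urlencoded component.
sub url_decode {
    my ($s) = @_;
    $s =~ tr/+/ /;                                # '+' stands for a space
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # %XX stands for one byte
    return $s;
}

# Parse a query string into a hash of array refs, because the
# same key may legitimately appear more than once.
sub parse_query {
    my ($query) = @_;
    my %params;
    for my $pair (split /[&;]/, $query) {
        next unless length $pair;
        my ($key, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;        # "flag" with no "=value"
        push @{ $params{ url_decode($key) } }, url_decode($value);
    }
    return \%params;
}

# Made-up query string, as it might arrive in $ENV{QUERY_STRING}.
my $params = parse_query('name=Larry+Wall&lang=Perl&lang=SQL&note=100%25');
print "name: $params->{name}[0]\n";
print "lang: @{ $params->{lang} }\n";
print "note: $params->{note}[0]\n";
```

      Even this toy version has to make decisions (repeated keys, missing values) that param() quietly makes for you.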

Re: Appropriate amount of abstraction
by itub (Priest) on Nov 15, 2004 at 05:05 UTC

      <voice timbre="with feeling">Yes!</voice>

      If you follow a link or two from your link you come across this gem (amongst others):

      These are all things that require you to think about bytes, and they affect the big top-level decisions we make in all kinds of architecture and strategy. This is why my view of teaching is that first year CS students need to start at the basics, using C and building their way up from the CPU. I am actually physically disgusted that so many computer science programs think that Java is a good introductory language, because it's "easy" and you don't get confused with all that boring string/malloc stuff but you can learn cool OOP stuff which will make your big programs ever so modular. This is a pedagogical disaster waiting to happen. Generations of graduates are descending on us and creating Shlemiel The Painter algorithms right and left and they don't even realize it, since they fundamentally have no idea that strings are, at a very deep level, difficult, even if you can't quite see that in your perl script. If you want to teach somebody something well, you have to start at the very lowest level. It's like Karate Kid. Wax On, Wax Off. Wax On, Wax Off. Do that for three weeks. Then Knocking The Other Kid's Head off is easy.

      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "Think for yourself!" - Abigail        "Time is a poor substitute for thought"--theorbtwo
      "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
Re: Appropriate amount of abstraction
by bwelch (Curate) on Nov 15, 2004 at 06:35 UTC
    I agree, it depends.

    Consider the case where a new Perl programmer has a good understanding of everything the abstraction is replacing: things are likely fine, aren't they? For example, could they code it without the abstraction so that it works, albeit not as cleanly and with more effort? Perhaps even that coding requirement isn't needed. If the programmer clearly understands all the things that are happening when using the abstraction, maybe things are fine without learning the details.

    As I learned recently, sometimes a cost of using the abstraction isn't noticed initially. A C program called by a script ended up needing considerable memory.

    So, many considerations exist.

    • Do they understand exactly what is happening?
    • Are there any costs (i.e. time, memory) that should be avoided?
    • Is the task being overly simplified or wrongly performed due to use of the abstraction?

    It's very late, so I'd best leave continuing this list to my fellow monks.

Re: Appropriate amount of abstraction
by Mutant (Priest) on Nov 15, 2004 at 10:30 UTC

    As everyone has already said, it depends on whether you're interested in doing the job, or learning how something works. There's a good chance I'll never use assembler level code in my professional career, but learning how this lower level works gives me a deeper understanding of the higher levels.

    The only case where you really need to use lower levels is for efficiency reasons. The higher the level of abstraction, the worse the efficiency. However, on the occasions when you do require maximum efficiency (and these cases are far more rare than a lot of people think), then it's good to be able to drop into the lower level.

    Another point, though, is that the more mature a level of abstraction is, the smarter it is about being efficient. Early 3GL languages were probably amazingly inefficient, which is why heavy use of inline assembler was necessary. This is now the case with things like Class::DBI. But maybe in 10 years time, we'll very rarely need to write SQL statements, and the DB abstraction layers will be smart enough to optimise even complex queries for us automatically.

    The ultimate goal of Information Technology is to reach the highest level of abstraction possible. If this goal is ever achieved, then it won't just be SQL we won't need to worry about, it'll also be Perl.

      The ultimate goal of Information Technology is to reach the highest level of abstraction possible. If this goal is ever achieved, then it won't just be SQL we won't need to worry about, it'll also be Perl.

      Then no-one will be able to understand how anything works. Layer upon layer of abstraction usually decreases efficiency and increases the number of different places you have to look to understand how something works. There is a law of diminishing returns at work and beyond a certain point you end-up with Java lego code, which is how many new students now enter the software industry. I got into Perl after rejecting this facet of Java but Perl can end-up the same if you keep adding more and more layers of abstraction.

        There are good and bad abstractions. Speaking purely in terms of computational efficiency, a good abstraction gives an optimizer more information about what you're trying to do. Optimization is basically about recognizing that there are multiple ways of compiling a given bit of code, and the programmer shouldn't have to worry about what the best way is.

        Speaking in terms of programmer efficiency, a good abstraction will allow the most work done with the least amount of code. A good language will aim for the easiest way to program something to also be the most computationally efficient. For instance, a Perl programmer who has a large amount of data indexed by a string will often automatically reach for a hash. Which is great, because a hash is likely the most computationally efficient way, too.
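        A small sketch of that point, using made-up records: the hash version is both the shortest thing to write and, for repeated lookups, cheaper than scanning a list each time. The sub names find_by_scan and index_by_name are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical sample data: records identified by a string key.
my @records = (
    { name => 'larry', language => 'Perl'   },
    { name => 'guido', language => 'Python' },
    { name => 'matz',  language => 'Ruby'   },
);

# Without a hash: walk the list until the name matches.
# Every lookup costs time proportional to the list length.
sub find_by_scan {
    my ($records, $name) = @_;
    for my $record (@$records) {
        return $record if $record->{name} eq $name;
    }
    return;
}

# With a hash: build the index once, then each lookup is a
# single hash access regardless of how many records there are.
sub index_by_name {
    my ($records) = @_;
    my %by_name;
    $by_name{ $_->{name} } = $_ for @$records;
    return \%by_name;
}

my $by_name = index_by_name(\@records);
print find_by_scan(\@records, 'matz')->{language}, "\n";
print $by_name->{matz}{language}, "\n";
```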

        Now, Class::DBI gets tripped up because it can't get enough information to create the SQL. Consider this Law-of-Demeter-breaking snippet for getting through a few joined tables:

        my $table3 = $table1->table2->table3;

        If we were writing raw SQL, we could get this in a single SQL statement with a three-table join (which is hairy, but possible). But because Class::DBI has to fetch each table individually, a lot more SQL would be needed (I'd have to dig through the source to know just how much, but it's definitely more than a hand-optimized version).
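        To put rough numbers on that, here is a toy model with no real database and no Class::DBI involved. The in-memory tables, the column names table2_id and table3_id, and the subs are all assumptions made for the sketch; each call that stands in for a SELECT bumps a counter. How many statements Class::DBI really issues is not verified here.

```perl
use strict;
use warnings;

# In-memory stand-in for three related tables (hypothetical schema).
my %db = (
    table1 => { 1   => { id => 1,   table2_id => 10 } },
    table2 => { 10  => { id => 10,  table3_id => 100 } },
    table3 => { 100 => { id => 100, label => 'target row' } },
);
my $statements = 0;    # how many "SQL statements" we have pretended to run

# Stands in for: SELECT * FROM <table> WHERE id = ?
sub fetch_row {
    my ($table, $id) = @_;
    $statements++;
    return $db{$table}{$id};
}

# Stands in for one hand-written statement such as:
#   SELECT t3.* FROM table1 t1
#     JOIN table2 t2 ON t2.id = t1.table2_id
#     JOIN table3 t3 ON t3.id = t2.table3_id
#    WHERE t1.id = ?
sub fetch_joined {
    my ($id) = @_;
    $statements++;
    my $t1 = $db{table1}{$id};
    my $t2 = $db{table2}{ $t1->{table2_id} };
    return $db{table3}{ $t2->{table3_id} };
}

# Accessor-style traversal: one fetch per hop.
sub via_accessors {
    my ($id) = @_;
    $statements = 0;
    my $t1 = fetch_row(table1 => $id);
    my $t2 = fetch_row(table2 => $t1->{table2_id});
    my $t3 = fetch_row(table3 => $t2->{table3_id});
    return ($t3, $statements);
}

# Join-style traversal: one statement for the whole path.
sub via_join {
    my ($id) = @_;
    $statements = 0;
    my $t3 = fetch_joined($id);
    return ($t3, $statements);
}

my ($row_a, $count_a) = via_accessors(1);
my ($row_b, $count_b) = via_join(1);
printf "accessors: %d statement(s), join: %d statement(s)\n", $count_a, $count_b;
```

        Both routes end at the same row; the difference is only in how many round trips the database sees, which is exactly what the object layer hides.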

        So this is a case where the abstraction makes it easy on the programmer, but hard on the computer. Maybe that's OK, and maybe it isn't. It all depends.

        "There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

        There's a difference between needing to understand how something works in order to use it, and needing to understand it in order to build and maintain it. I'm sure so long as the lower levels use Perl, then people will need to know Perl (besides just for personal fulfillment). But that number could be a lot smaller than it currently is. (I'm not trying to predict the demise of Perl here, I'm just saying that if the goal of a much higher level of abstraction was achieved, this is how it could potentially be.)

        There are also good and bad implementations of higher level abstraction. Someone might try to implement a Natural English programming language that compiled to Perl code, but unless they made their abstraction layer extremely smart, it probably wouldn't work very well. But that's no reason to cast out the idea of having a Natural English programming language. If it could be implemented well, it would be something that would probably be very useful. The difference between a good and bad abstraction layer is how smart the layer is. If it's good at understanding what the programmer (or user) is trying to do, and optimising it, then it's much more likely to be successful.

        Sometimes, levels of abstraction are attempted too early. That is, we needed one or more intermediate levels before the higher one was possible. For example, if someone had tried to implement Class::DBI without DBI, it might've been a huge failure. Again, this isn't a reason to say Class::DBI is a level of abstraction too high, just that we need to wait until the intermediate levels are completed (something which can take a long time - they have to be mature) before the next level is possible.

        I'm sure there are plenty of attempts to create a level of abstraction that are implemented poorly, or are just before their time. Perl tends to be good at not rushing towards higher levels.

Re: Appropriate amount of abstraction
by johnnywang (Priest) on Nov 15, 2004 at 07:55 UTC
    I assume most of us taught ourselves perl, rather than taking some classes, at least in the beginning. I'd say we can only afford the abstraction that can get the current job done as quickly as possible. CGI.pm is a great example that a novice can start using in the first day. But a good programmer will never stop at the initial level of abstraction: he/she will always go lower, and higher. Both directions provide much better understanding of the module at hand. So teach them whatever that can get the job done now, and give them time to figure out the rest.
Re: Appropriate amount of abstraction
by wfsp (Abbot) on Nov 15, 2004 at 18:22 UTC
    I would have found it very difficult to understand recursion if in the past I hadn't dabbled with Zilog Z80 assembler. Knowing what call and ret pushed on and popped off the stack has helped me get to grips with the concept of a sub calling itself. When I've tried to explain it to other people I always get blank looks.
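    For anyone who hasn't met the stack directly, a minimal sketch of the same idea in Perl: a recursive factorial next to a version that manages the stack by hand. The sub names are invented for the example, and the explicit version only mimics what call and ret do; it is not how perl itself implements sub calls.

```perl
use strict;
use warnings;

# The usual recursive form: each call waits for the one below it.
sub factorial_recursive {
    my ($n) = @_;
    return 1 if $n <= 1;
    return $n * factorial_recursive($n - 1);
}

# The same computation with the stack made visible.
sub factorial_explicit_stack {
    my ($n) = @_;
    my @stack;

    # "Call" phase: each pending multiplication is pushed,
    # much as a call pushes a frame before descending.
    while ($n > 1) {
        push @stack, $n;
        $n--;
    }

    # "Return" phase: pop the frames in reverse order and
    # finish the work each one was waiting to do.
    my $result = 1;
    while (@stack) {
        $result *= pop @stack;
    }
    return $result;
}

print factorial_recursive(5), "\n";
print factorial_explicit_stack(5), "\n";
```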

    In engineering the difference between skilled and unskilled is not only knowing what to do but why it is done that way. I remember a tutor explaining it was the difference between training and education.

    Which reminds me, if you tell an engineer something is broken they will inevitably strip it, clean it, oil it where appropriate, rebuild it and then see. With electricians in particular (they have magic cloths!), this is often all that is required.

    Having some idea of what's going on under the bonnet is _good_.

Re: Appropriate amount of abstraction
by TomDLux (Vicar) on Nov 15, 2004 at 16:43 UTC

    Seventy years ago, people were taught how a telephone works. Who cares? Only people whose job involves telephone internals, of course. My mom's boyfriend's understanding of a packet of carbon dust being compressed by sound waves does not make him less or more capable of picking up a phone and dialing a number. Everything he learned about how a phone works is obsolete, but everything he learned about how to use a phone is still valid, on a land line, a wireless, a cell phone... other than asking the telephone operator for a date.

    Twenty-five years ago I was interested in digital logic, microprocessors, RAM, ROM, and so on. The bulk of what I learned is irrelevant when you look at a 4GHz P4; the little that remains is no better an explanation than "it happens by magic".

    How, precisely, does MySQL process a query involving three tables and a sub-select? What happens, detail by detail, from the moment you type a URL into your web browser, to the point where the display is complete?

    When I took the dreaded trains course at UWaterloo, the final exam had one question where we had to describe the first time one process sent a message to another process in the real-time operating system we had written. It took me 1 1/2 hours and 5 pages to discuss everything that took place.

    Teach them at a high level, and point out where additional info is available. You often wind up looking at details during debugging, so the person will pick up information during their work. If they don't ever need it, they'll have been spared learning things that aren't important to them.

    --
    TTTATCGGTCGTTATATAGATGTTTGCA

      I say that in any case where SQL is being used, knowing the queries and being able to optimize them by hand is crucial to the core performance of the application. It should absolutely be required knowledge. There are some guys in my company using Java who work at a higher SQL abstraction level, and their apps are slower than molasses. A little hand-tuning of the SQL and they were in tip-top shape.
Re: Appropriate amount of abstraction
by hardburn (Abbot) on Nov 15, 2004 at 20:36 UTC

    Something else that occurred to me while reading the responses here: where do you stop moving down the abstraction levels?

    The old PostgreSQL HOWTO on tldp.org started with an introductory physics lesson (and a bad one at that). (This HOWTO has since been removed, thankfully). The author's justification was that relational databases work on computers, and computers are based in a physical world, so you can't understand PostgreSQL until you understand physics.

    Now there's a reason for a LART if I ever saw one.

    Why stop at explaining electrons and atoms, as the author had? Those are based on more fundamental particles. And those are based on yet more fundamental particles. And so on, down the abstraction chain, until you hit truly fundamental particles (if there are any, that is). So in our quest for fully understanding PostgreSQL, we may find ourselves traversing an abstraction chain of infinite depth.

    Making the assumption that I am not immortal (though I'll be quite happy if I turn out to be wrong here), I'll just install PostgreSQL and find some documentation elsewhere. If I ever want to dig further, I could learn about B-trees and how relational databases optimize their data layout for spinning magnetic media (like hard drives) and such. I would not jump down to electrons and work my way back up.

    Update: Found an old copy of the HOWTO: http://www.yolinux.com/HOWTO/PostgreSQL-HOWTO.html.

    "There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

Re: Appropriate amount of abstraction
by nothingmuch (Priest) on Nov 17, 2004 at 09:32 UTC
    I'd just like to add that in my experience, maintaining overly abstract code is easier than maintaining overly linear code.

    Recently I've had to fudge Maypole, Class::DBI and friends around quite extensively to get my job done.

    Some of the stuff I did to Class::DBI includes objects which bless themselves at retrieve/create time based on an implicit class field, and objects which pretend they've been deleted using a lifespan table, which contains (has_many) the times in which the row existed. The classes override delete, search, retrieve and so forth, and allow you to create "view handles" into the past, to query the database as it was at ... The list goes on.

    All of that has been very easy to do because Class::DBI is very heavily abstracted, both by design and by implementation. Sure, the learning curve is a bit steeper, and you have to juggle more thoughts at once till you're intimate with a bit of code, but time you would otherwise spend changing the abstracting core is traded for time spent extending it, which is a much easier task, IMHO.
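    For readers who haven't seen the reblessing trick, here is a minimal sketch in plain Perl, with no Class::DBI involved. The package names Shape and Circle, the constructor new_from_row and the field name class are all made up for the example; it is not the code described above, just the general idea of an object choosing its class from its own data.

```perl
use strict;
use warnings;

package Shape;

# Hypothetical constructor: build the object in the base class,
# then rebless it into the subclass named in its 'class' field.
sub new_from_row {
    my ($class, %row) = @_;
    my $self = bless { %row }, $class;

    my $target = $row{class};
    # Only rebless into something that really is a subclass;
    # the eval guards against names that are not loaded packages.
    if (defined $target && length $target && eval { $target->isa($class) }) {
        bless $self, $target;
    }
    return $self;
}

sub describe {
    my ($self) = @_;
    return "generic shape $self->{id}";
}

package Circle;
our @ISA = ('Shape');

sub describe {
    my ($self) = @_;
    return "circle $self->{id}";
}

package main;

# Made-up rows, as they might come back from a single table.
my @rows = ( { id => 1, class => 'Circle' }, { id => 2 } );
for my $row (@rows) {
    my $object = Shape->new_from_row(%$row);
    print ref($object), ': ', $object->describe, "\n";
}
```

    The caller always asks the base class for an object and gets the most specific class the row names, which is the kind of extension point a heavily abstracted module leaves open.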

    Good post, ++!

    -nuffin
    zz zZ Z Z #!perl
Re: Appropriate amount of abstraction
by DentArthurDent (Monk) on Nov 16, 2004 at 12:43 UTC
    One thing that has been left out of the discussion, in my mind. The original poster seems to attach a negative connotation to just using the highest level of abstraction to get the job done. Myself, I look at that as standing on the shoulders of giants. Programming is in one way a science, and the way we advance is to do the things we know how to do with the tools we have at hand. So there is no shame in using the high levels of abstraction to do something new. Eventually that something new could become something that someone else uses to do a new thing.
    ----
    My mission: To boldy split infinitives that have never been split before!