Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Re^2: Learning Fundamentals: Data Structures and Algorithm Analysis in Perl

by tilly (Archbishop)
on May 01, 2009 at 18:54 UTC ( #761366=note: print w/ replies, xml ) Need Help??


in reply to Re: Learning Fundamentals: Data Structures and Algorithm Analysis in Perl
in thread Learning Fundamentals: Data Structures and Algorithm Analysis in Perl

Very mild disagreement here with For the most part, the monks I see using big O do in fact know what it means and are aware of all or most of the issues involved in it.

I would say that most of the monks I see using it are aware of how it is used informally by hackers, but a very large fraction are unaware of the formal definition, and/or are unaware of several subtleties that come up fairly often. (I rather strongly suspect that most who are aware that the formal definition probably couldn't come up with a reasonable version of it on demand...)

For instance if someone says, "That's O(1), not O(n)" they are really thinking of big-Omega, not big-O. Because by the formal definition, anything that is O(1) is also O(n). However that's not how hackers informally use the term. And lots of them don't know that their informal usage is technically incorrect.

Furthermore hackers usually have an implicit "amortized average time, if things work like they are supposed to" thrown in that they are not even conscious of. Worse yet, if they were asked to really explain what they meant, most probably couldn't. For instance most of us accept that, "hash lookup is O(1)." Well actually that's the average case if your hashing algorithm is working properly. The worst case with most hash implementations is O(n). If your hashing algorithm isn't a good fit, you'll hit that worst case fairly often. And what about that "amortized" bit? Well if you accept that pushing an element is O(1), you've implicitly accepted the amortized bit. If you're doing Perl programming that's OK. If you're doing real-time programming however...

Then of course there is the glaring issue that scalability is often not what matters. For instance which is faster? Scanning a sorted list for an element, or doing a binary search? Despite O(n) vs O(log(n)), in a low-level language on modern CPUs, scanning is probably faster. Very few of us have a good sense for this kind of detail.

As you see there are a lot of issues and complications that come up in a more formal discussion. My suspicion is that a large fraction (and possibility the majority) of Perl hackers who throw around the term aren't really aware of those. They could learn it pretty easily, but don't know it off of the top of their heads. However I'd absolutely agree that their understanding is more than sufficient for a practical analysis of the kinds of problems that we programmers are usually likely to encounter.


Comment on Re^2: Learning Fundamentals: Data Structures and Algorithm Analysis in Perl
Re^3: Learning Fundamentals: Data Structures and Algorithm Analysis in Perl
by BrowserUk (Pope) on May 01, 2009 at 21:33 UTC
    but a very large fraction are unaware of the formal definition

    I think far too much is made about "formal definitions". Back when I did my formal CS education, I chose a multi-part long questionon big-O as my optional question. I passed the exam with distinction. Of course they don't tell you anything more than the overall scores, but I think I would probably have had to have scored perfectly on the four compulsuries to achieve that if I'd totally flunked the optional. Back then I could have, and probably did, give a formal definition of big-O and big Omega.

    But over a 25 year career, I never once needed to repeat it. And nor has it ever been useful to me. But the education means that I do have a way of informally discussing algortihmic complexity in terms that many other programmers recognise and understand. So in that sense the knowledge of it is useful.

    But in the real-world, when, as you point out, an O(n^2) algorithm in one language can out perform an O(n log n) in another; when memory allocation/reallocation costs can render a formally faster algorithm slower than a formally slower algorithm, because the latter is more cache friendly (eg. Mergesort versus QuickSort); and given multi-tasking OSs and mutli-core hardware renders static formal analysis iffy at best and an outright lie at worst; the usefulness of big-O is pretty much limited to discussion purposes only. A benchmark run many times under real-world or realistic conditions renders it mostly redundant.

    A starting point, and a means of informal discussion, but little more.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      When I speak of a significant portion of hackers, I am not speaking of any particular one. I am not surprised that you'd have learned the formal definition.

      Although I have to disagree on how extreme your dismissal is. Absolutely big-O is useful as a starting point and means of informal discussion. However it also provides a framework for how to think about performance, and that can go a lot farther. Being aware of big-O has helped me predict how well real systems will scale, understand bottlenecks, and multiple times has given me a sense as to whether a given implementation's performance is reasonable, or whether a performance mistake needs to be tracked down.

        My point wasn't that I know it. More that I knew it once, but probably could not have recalled it accurately for most of of my career and never missed the absence of the recall. I've never had the need to establish the formal, algorithmic complexity of any given algorithm in the real-world, since I learnt the formal methods involved. But the informal appreciation of what the notation means has allowed me to discuss algorithms without worrying too much about the fine distinctions.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://761366]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (15)
As of 2014-09-18 08:47 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    How do you remember the number of days in each month?











    Results (109 votes), past polls