But what the most experienced people say is, "Don't prematurely optimize!".

I wasn't addressing those who understand Tony Hoare's assertion, just those who don't.

Also improvements are often platform specific.

Of course. Should anyone with access to another OS and a couple of hours to spare care to run my tests on their OS, I'd be interested to see how much the results differ.

For instance you got huge improvements by using sysread rather than read. However in past discussions...

Going by the dates of the past discussions, I can well imagine that they took place before the addition of the extra IO layers that caused the majority of the slowdown after 5.6.x.
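For anyone who wants to check the read versus sysread difference on their own platform, here is a minimal sketch using the core Benchmark and File::Temp modules. It is not the test code from the original post: the 16 MB scratch file, the 64 KB chunk size, the iteration count, and the helper name slurp_with are all assumptions made for illustration.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);
use File::Temp qw(tempfile);

# Hypothetical test data: a 16 MB scratch file (the size is an arbitrary choice).
my $chunk = 64 * 1024;                      # bytes requested per call
my ($tmp, $path) = tempfile(UNLINK => 1);   # removed again at program exit
binmode $tmp;
print {$tmp} 'x' x $chunk for 1 .. 256;
close $tmp or die "close failed: $!";

# Read the whole file in fixed-size chunks using either read() or
# sysread(), and return the total number of bytes seen.
sub slurp_with {
    my ($reader) = @_;
    open my $fh, '<', $path or die "open failed: $!";
    binmode $fh;
    my ($total, $buf) = (0, '');
    while (1) {
        my $got = $reader eq 'sysread'
            ? sysread($fh, $buf, $chunk)    # unbuffered, bypasses PerlIO buffering
            : read($fh, $buf, $chunk);      # goes through the PerlIO layers
        die "$reader failed: $!" unless defined $got;
        last if $got == 0;                  # end of file
        $total += $got;
    }
    close $fh;
    return $total;
}

# Time 20 full passes over the file with each reader.
timethese(20, {
    read    => sub { slurp_with('read') },
    sysread => sub { slurp_with('sysread') },
});
```

The timings will depend on the OS, the perl version, and which IO layers are active, so treat any difference as a measurement of one machine rather than a general rule.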

Someone who reads your post and begins a habit of always using sysread has taken away the wrong lesson.

Anyone who does that hasn't read my post--at least not properly. But I have to say that I have more faith in people than you seem to have. I'm just an average programmer, and if I can work these things out, most others can too.

In the worst case, that someone will use what I describe wrongly and their program will run more slowly. If this happens, they will:

  1. Either notice and correct the problem, and in future be more careful about applying random optimisations that they read about.

    I'd consider that a long term win.

  2. Or not notice.

    In which case their program probably didn't need to be optimised in the first place, and there is no harm done.

Now some corrections.

You only make one correction, and it appears to be a matter of terminology rather than substance: whether or not you class perl's reference counting/variable destruction as garbage collection. I have demonstrated in the past that, under Win32 at least, when large volumes of lexical variables are created and destroyed, some memory is frequently returned to the OS.

Also your slams against databases are unfair to databases.

I didn't make any "slams" against databases. Only against advice that they be used for the wrong purpose, and in the wrong ways.

But even on the performance front you're unfair. Sure, databases would not help with this problem.

I was only discussing this problem. By extension, I guess I was also discussing other problems of this nature, which, by definition, is the class of large volume data processing problems that would not benefit from the use of a database.

But databases are often a performance win when they don't reduce how much data you need to fetch because they move processing into the database's query engine.

So what you're saying is that instead of moving the data to the program, you move the program to the data. I wholeheartedly agree that is the ideal way to utilise databases--for that class of programs that can be expressed in terms of SQL. But this is just an extreme form of subsetting the data, potentially down to just the results of the processing.

I see no conflict between that and what I said.

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail

In reply to Re: Re: Optimising processing for large data files. by BrowserUk
in thread Optimising processing for large data files. by BrowserUk
