PerlMonks  

Re: "my" slowing down programs?

by sundialsvc4 (Abbot)
on Aug 17, 2015 at 12:08 UTC ( [id://1138875]=note )


in reply to "my" slowing down programs?

Your shop has created such an edge case ... such an enormous volume of both variables and data that a single program is expected to process ... that I think your only realistic alternative (as a shop ...) is to: (a) in the short run, “throw silicon at it.” Then, (b) in some way, start fundamentally redefining the problem and therefore its approach ... splitting or sharing the file such that multiple blade processors (not cores, not threads, not processes...) can tackle it in parallel, reducing the number of variables from 30-god-thousand to only what is required, and so on.

The trouble with “removing my” is that it really is a fundamental logic change to the program. No matter how many years this piece of source code has been buying your groceries, errors will be introduced, and when they are, how the hell will you know? Data structures that used to be known-empty no longer are, and so on. The business consequences could be disastrous indeed. (And I don’t mean to understate the business risk of “redefining the problem,” but it is a Hobson’s Choice by now.)

Basically ... and this needs to be raised right now as a senior management issue ... “we are running out of track. We have been putting this thing off and trying different languages and so forth, but our data volume is catching up with us.”

Replies are listed 'Best First'.
Re^2: "my" slowing down programs?
by jf1 (Beadle) on Aug 17, 2015 at 19:07 UTC

    Completely agree with you here. This is an edge case. Indeed, the job of the script is to extract a subset of the columns from the original data set and reduce the amount of data.

    Presently, the desire to preserve the original ordering, together with some infrastructural limitations, hinders migration to Hadoop-like platforms.

    Nonetheless, the version presently in use is already quite fast compared to the one used before (I was able to speed up the Python version I inherited so that it takes about half of the original time, while also improving stability). Ideas of migrating to other languages (such as Perl, ...) were born out of academic interest (after seeing the improvements possible just in the legacy code) but haven't resulted in much progress yet. (Finally, another step forward came with a multithreaded Tcl implementation.)

    During the process I at least learned to admire the guys who created the string splitting and joining routines in the various scripting languages. These are enlightening examples of highly optimized code.

    P.S.: The Perl version is embarrassingly short, and the use of global variables is definitely no issue; I was just amazed to see that the use of lexical variables resulted in a performance penalty.
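
For readers who want a concrete picture of the job described in the reply above (extracting a subset of columns while preserving the original row order), here is a minimal sketch in Python, the language of the legacy script mentioned there. It is not the poster's code: the function name `extract_columns`, the tab separator, and the sample data are all assumptions made for illustration.

```python
import io


def extract_columns(lines, keep, sep="\t"):
    """Yield each input line reduced to the columns listed in `keep`.

    Lines are processed one at a time, in order, so the original row
    ordering is preserved and memory use stays flat.
    (Hypothetical helper; name and signature are illustrative only.)
    """
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        yield sep.join([fields[i] for i in keep])


# Tiny in-memory stand-in for the real input file (made-up data).
sample = io.StringIO("a\tb\tc\td\n1\t2\t3\t4\n5\t6\t7\t8\n")
reduced = list(extract_columns(sample, keep=[0, 2]))
print(reduced)
```

Almost all of the work here happens inside the built-in `split` and `join`, which is consistent with the remark above that those routines dominate this kind of script.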
