Re: "my" slowing down programs?

Your shop has created such an edge-case ... such an enormous volume of both variables and data that a single program is expected to process ... that I think your only realistic alternative (as a shop ...) is to: (a) in the short run, “throw silicon at it.” Then, (b) in some way, start fundamentally re-defining the problem and therefore its approach ... splitting or sharing the file such that multiple blade processors (not cores, not threads, not processes...) can tackle it in parallel, reducing the number of variables from 30-god-thousand to only what is required, and so on.

The trouble with “removing my” is that it really is a fundamental logic-change to the program. No matter how many years this piece of source-code has been buying your groceries, errors will be introduced, and when they are, how-the-hell will you know? Data structures that used to be known-empty no longer are, and so on. The business consequences could be disastrous indeed. (And I don’t mean to under-state the business risk of “redefining the problem,” but it is a Hobson’s Choice by now.)

Basically ... and this needs to be raised right-now as a senior management issue ... “we are running out of track. We have been putting this thing off and trying different languages and so forth, but our data volume is catching up with us.”

Re^2: "my" slowing down programs?
by jf1 (Beadle) on Aug 17, 2015 at 19:07 UTC

Completely agree with you here. This is an edge case. Indeed, the job of the script is to extract a subset of the columns from the original data set and so reduce the amount of data.
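To make the task concrete, here is a minimal sketch of that kind of column extraction. It is not the actual script: the tab separator, the column indices, and the name `extract_columns` are all assumptions made for illustration.

```python
def extract_columns(lines, keep, sep="\t"):
    """Yield each line reduced to the columns in `keep`, in input order.

    `keep` is a list of zero-based column indices (hypothetical; the real
    script's column selection is not described in this thread).
    """
    for line in lines:
        # Strip the trailing newline, split once, then rejoin the kept fields.
        fields = line.rstrip("\n").split(sep)
        yield sep.join(fields[i] for i in keep)


if __name__ == "__main__":
    sample = ["a\tb\tc\td\n", "1\t2\t3\t4\n"]
    for row in extract_columns(sample, keep=[0, 2]):
        print(row)
```

Because it is a generator that reads one line at a time, the rows come out in the same order they went in and the whole data set never has to sit in memory.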

Presently, the desire to preserve the original ordering, together with some infrastructural limitations, hinders migration to Hadoop-like platforms.
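For what it is worth, ordering and chunked processing are not necessarily at odds. The sketch below is only an illustration under assumptions: the chunk size, the worker count, and the helpers `chunked`, `reduce_chunk` and `process_in_order` are hypothetical. It relies on the fact that `Executor.map` in Python's `concurrent.futures` returns results in the order the inputs were submitted, regardless of which worker finishes first. A thread pool is used only to keep the example small; it does not stand in for the separate machines suggested above.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunked(lines, size):
    """Yield consecutive lists of at most `size` lines."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def reduce_chunk(chunk, keep=(0, 2), sep="\t"):
    """Hypothetical per-chunk work: keep only the selected columns."""
    out = []
    for line in chunk:
        fields = line.rstrip("\n").split(sep)
        out.append(sep.join(fields[i] for i in keep))
    return out


def process_in_order(lines, size=2, workers=4):
    """Process chunks concurrently and yield rows in the original order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in submission order, so the chunks
        # are reassembled in the order they were read.
        for reduced in pool.map(reduce_chunk, chunked(lines, size)):
            yield from reduced
```

Note that `Executor.map` by default submits every chunk up front, so on a very large file a real implementation would need to bound how many chunks are in flight at once.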

Nonetheless, the version presently in use is already quite fast compared to the one used before: I could speed up the Python version I inherited as legacy code so that it takes about 1/2 of the original time, while at the same time improving stability. Ideas of migrating to other languages (such as Perl, ...) were born out of academic interest (after seeing the improvements possible just in the legacy code) but haven't resulted in much progress yet. (Finally, another step forward came with a multithreaded Tcl implementation.)

During the process I at least learned to admire the people who wrote the string splitting and joining routines in the various scripting languages. These are enlightening examples of highly optimized code.
