http://www.perlmonks.org?node_id=1209455


in reply to creating and managing many hashes

HI Gtforce,

I suppose that this flat data file is coming from some fancy corporate sales/inventory DB. This may sound flippant, but buying some dinner and drinks for the person who generated file for you might yield the most efficient/effective solution for you! But I guess you have already considered that...

It sounds conceivable that your data processing could all be done in an Excel Spreadsheet with no Perl programming at all. I haven't done any serious spreadsheet work in years, but spreadsheets can be huge now, 2 million rows is possible.

Let's talk about Perl:
You are inexperienced at Perl and sounds like you have no SQL experience. However, I believe that a solution that involves learning the "least amount of new stuff" will involve learning a targeted subset combination of both Perl and SQL. Using an SQLite DB will simplify the data structures that the Perl code has to work with (less fancy Perl to learn). I believe that learning basic DBI will simplify your Perl code.

SQlite is the most used DB in the world because it is on every smart phone. SQlite doesn't require any fancy server setup and admin - it uses a simple file for its work. So huge admin hassles just disappear. You will need to learn how to create tables, insert new records, select (i.e. get) records from the DB. Only a very,very small subset of SQL needs to be learned. For the Perl I/F with SQLite, you will need to learn a subset of Perl data structures. I recommend only one: how handle an AoA, a 2D array structure or a reference to such a thing. Don't start with learning everything, just learn this fetchall_arrayref() function well.

From what I see so far, a basic idea could be:

The idea of this running for 4 hours is insane. Something is seriously wrong if this doesn't run in <4 minutes.