PerlMonks  

Re^2: CSV Manipulation

by Tanktalus (Canon)
on Nov 16, 2011 at 20:09 UTC ( #938459=note )


in reply to Re: CSV Manipulation
in thread CSV Manipulation

This. If I have a choice, and I usually do, I use DBD::CSV. It makes me think about the data as data rather than a string, and has the added advantage of making transitioning to SQLite or a full-fledged RDBMS (DB2, Oracle, etc.) easier.
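A minimal sketch of the DBD::CSV approach described above. The file name, column names, and sample data are made up for illustration; the table name maps to a `.csv` file in `f_dir` via the standard `f_ext` attribute.

```perl
use strict;
use warnings;
use DBI;
use File::Temp qw(tempdir);

# Hypothetical sample data: table "people" maps to "people.csv".
my $dir = tempdir( CLEANUP => 1 );
open my $fh, '>', "$dir/people.csv" or die "people.csv: $!";
print $fh "name,age\nalice,30\nbob,25\n";
close $fh;

my $dbh = DBI->connect( 'dbi:CSV:', undef, undef, {
    f_dir      => $dir,        # directory holding the CSV "tables"
    f_ext      => '.csv/r',    # map table names to *.csv files
    RaiseError => 1,
} );

# The point of the approach: the CSV is queried as data, with SQL,
# so a later move to SQLite or a full RDBMS changes little code.
my $rows = $dbh->selectall_arrayref(
    q{SELECT age FROM people WHERE name = 'bob'} );
print "bob is $rows->[0][0]\n";

$dbh->disconnect;
```

Swapping the DSN from `dbi:CSV:` to, say, `dbi:SQLite:dbname=...` is then the bulk of the migration work.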


Replies are listed 'Best First'.
Re^3: CSV Manipulation
by Tux (Monsignor) on Nov 17, 2011 at 09:47 UTC

    One note here. The OP did not mention the size of the CSV data file. DBD::CSV uses Text::CSV_XS under the hood, but it has to read the complete file into memory before it can do any database-like operations. With a 2 GB file, that might mean something like 20 GB of memory use (Perl data-structure overhead). When files are that big - again, I don't know how large the OP's file is - switching to basic streamed IO processing is usually a lot easier.
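    The streamed alternative mentioned above looks like this: `getline ()` pulls one row at a time, so memory use stays flat regardless of file size. The file name and data here are stand-ins for the OP's file.

```perl
use strict;
use warnings;
use Text::CSV_XS;
use File::Temp qw(tempdir);

# Stand-in for the OP's (possibly huge) file.
my $dir = tempdir( CLEANUP => 1 );
open my $out, '>', "$dir/big.csv" or die "big.csv: $!";
print $out "name,age\nalice,30\nbob,25\ncarol,41\n";
close $out;

my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });

open my $in, '<', "$dir/big.csv" or die "big.csv: $!";
my $header = $csv->getline($in);   # first row: column names
my $count  = 0;
while ( my $row = $csv->getline($in) ) {
    # Only this one row is in memory; process it here.
    $count++;
}
close $in;
print "processed $count data rows\n";
```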

    I fully agree though that DBD::CSV is the best step toward an RDBMS, where those memory limits do not apply (for the end-user script).

    YMMV

    update: I just did a quick test with the OP's data extended to a 1 MB CSV file. Reading that into memory using getline_all () resulted in a 10 MB data structure (as reported by Devel::Size::total_size ()).
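    The measurement above can be reproduced along these lines (the sample data here is made up; Devel::Size is a CPAN module, not core):

```perl
use strict;
use warnings;
use Text::CSV_XS;
use Devel::Size qw(total_size);

# Small in-memory CSV standing in for the 1 MB test file.
my $data = "name,age\nalice,30\nbob,25\n";
open my $fh, '<', \$data or die $!;

# getline_all () slurps every row into one array-of-arrays.
my $csv  = Text::CSV_XS->new({ binary => 1 });
my $rows = $csv->getline_all($fh);
close $fh;

# The structure is considerably larger than the raw bytes,
# which is where the roughly 10x overhead comes from.
printf "%d rows, %d raw bytes, %d bytes in memory\n",
    scalar @$rows, length $data, total_size($rows);
```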


    Enjoy, Have FUN! H.Merijn
