PerlMonks  

Re: Modify values of tied, split lines in a file

by sundialsvc4 (Monsignor)
on Oct 22, 2012 at 18:06 UTC (#1000396)


in reply to Modify values of tied, split lines in a file

My very candid opinion is that you are creating an unholy monster that you will regret for the entire brief remaining tenure of your employment. Don’t tie to a file just to avoid reading the thing line-by-line and using split; or, better, using a CSV-file handling package of known provenance. Don’t try to “cut out seemingly-wasteful steps” only to have the program, for example, crash-and-burn in the middle and in so doing destroy the one file that is both your input and your output.
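To make the "read it line-by-line and split" alternative concrete, here is a minimal sketch using only core Perl. The helper name `rewrite_field`, the comma delimiter, and the upper-casing of one column are assumptions for illustration, not anything from the original scripts. The point is only that the input is opened read-only and the result goes to a separate file.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $in to $out, upper-casing one comma-separated field.
# A plain split is only safe when fields never contain quoted commas;
# otherwise a CSV module such as Text::CSV is the better tool.
sub rewrite_field {
    my ($in, $out, $col) = @_;
    open my $src, '<', $in  or die "Cannot read $in: $!";
    open my $dst, '>', $out or die "Cannot write $out: $!";
    while (my $line = <$src>) {
        chomp $line;
        my @fields = split /,/, $line, -1;   # -1 keeps trailing empty fields
        $fields[$col] = uc $fields[$col] if defined $fields[$col];
        print {$dst} join(',', @fields), "\n";
    }
    close $src;
    close $dst or die "Cannot close $out: $!";
    return;
}
```

Only one line is held in memory at a time, so the size of the input does not matter.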

Step back completely from your present approach and reconsider the whole thing. You are being led on into unknown territory by the allure of the unfamiliar. There are no words of warning strong enough to use here.


Re^2: Modify values of tied, split lines in a file
by glemley8 (Acolyte) on Oct 22, 2012 at 18:51 UTC
    I appreciate your honest criticism, but it is extremely vague and not very constructive. You've told me what not to do, but not why, and you've offered no suggestions.

    I'm working on a set of scripts that were jury-rigged together and my task is to streamline them. They do currently work, but in a very poor manner. The input file is opened, read, and closed repeatedly during the process, which I'm working to minimize. I understand that it is not good practice to load very large files into memory, which is why I'm using Tie::File to read the input line-by-line, and that is working very well. There is no risk of losing my input file, since my script creates a copy and works from it.
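    For readers following along, the copy-then-tie approach described above might look roughly like the sketch below. The helper name `edit_copy`, the comma delimiter, and the "set one column to a new value" edit are assumptions for illustration; the actual scripts are not shown in this thread. File::Copy and Tie::File both ship with Perl.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use Tie::File;

# Hypothetical helper: copy $original to $work, then edit the copy in place
# through Tie::File, replacing field $col of every comma-separated line.
sub edit_copy {
    my ($original, $work, $col, $new_value) = @_;
    copy($original, $work) or die "Copy failed: $!";
    tie my @lines, 'Tie::File', $work or die "Cannot tie $work: $!";
    for my $i (0 .. $#lines) {
        my @fields = split /,/, $lines[$i], -1;   # -1 keeps trailing empty fields
        next if $col > $#fields;                  # skip lines that are too short
        $fields[$col] = $new_value;
        $lines[$i] = join ',', @fields;           # assignment rewrites that line in the file
    }
    untie @lines;
    return;
}
```

    One caveat with this pattern: when an assignment changes a line's length, Tie::File has to shift the rest of the file, so many such edits on a large file can be slow.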

    You seem to think I'm going to blindly forward my work without any debugging or testing. For each addition, I run multiple tests to make sure the script is moving in a positive direction and that the output data is verified. Your philosophy seems to be "if it ain't broke, don't fix it". My job is to fix it and perfect it.

    I invite any constructive suggestions.

      And I will take those rebuttals at face value and now try to address them as best I can ... since now I see that you were not the source of your problem.   Your situation is a familiar one, and if you interpret what I have said (alas, reasonably so ...) as a personal affront, then I now personally and publicly apologize for it.

      (Let me reiterate that: my initial response, I now see, was that of a horse’s asterisk.)   :*{   Okay, I said it first. I am sorry. May we please move on?

      The use of Tie::File is basically an inefficient way to handle the input file, but for the moment “it is one that works” and I am not personally familiar with whether or not it loads the entire file in memory. If it does, then that part of the program must immediately be replaced at whatever the cost might be, because it just might be the camel’s straw.

      In any case, the notion of modifying the file, if it remains a file in its present form, should be immediately and categorically excluded.   You need to consume a file as input, and to produce a file as output, without altering the input and with complete replacement of the output.   That is, if the output file in question must be of the same format and cannot possibly be, say, an SQLite database file instead.
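      A minimal sketch of that "input untouched, output completely replaced" pattern, using only core modules. The helper name `transform_file` and the per-line callback are assumptions for illustration. The output is written to a temporary file in the same directory and only renamed over the target once everything has been written, so a crash part-way through leaves any previous output file as it was.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempfile);

# Hypothetical helper: run $transform over every line of $in and publish the
# result as $out. $in is opened read-only and is never modified.
sub transform_file {
    my ($in, $out, $transform) = @_;
    open my $src, '<', $in or die "Cannot read $in: $!";

    # Create the temp file next to the target so rename() stays on one
    # filesystem; a crash before the rename leaves any old $out untouched.
    my ($tmp_fh, $tmp_name) =
        tempfile('outXXXXXX', DIR => dirname($out), UNLINK => 0);

    while (my $line = <$src>) {
        print {$tmp_fh} $transform->($line);
    }
    close $src;
    close $tmp_fh or die "Cannot close $tmp_name: $!";
    rename $tmp_name, $out or die "Cannot rename $tmp_name to $out: $!";
    return;
}
```

      Two things to be aware of: File::Temp creates the file with restrictive permissions, so a chmod may be wanted before the rename, and a failed run leaves a stray temp file behind that needs cleaning up.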

      I cordially suggest that your task is destined to be more than “streamline.”   The best strategy would be to work with a file format that is specifically designed to be a read/write file, such as SQLite.   You definitely do not want to be working in terms of explicit print statements, even if they work “at the moment,” because they are destined to be maintenance PITAs forevermore.  
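      As a rough idea of what the SQLite route could look like, here is a sketch using DBI with the DBD::SQLite driver (both from CPAN, not core Perl). The table name `records`, its columns, and the helper names are all hypothetical; the real data layout is not described in this thread.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical schema: one row per record, keyed by an id column.
sub open_store {
    my ($db_file) = @_;
    my $dbh = DBI->connect("dbi:SQLite:dbname=$db_file", '', '',
        { RaiseError => 1, AutoCommit => 1 });
    $dbh->do('CREATE TABLE IF NOT EXISTS records '
           . '(id INTEGER PRIMARY KEY, name TEXT, value TEXT)');
    return $dbh;
}

# Change one value in place; the transaction is all-or-nothing.
sub update_value {
    my ($dbh, $id, $value) = @_;
    $dbh->begin_work;
    my $ok = eval {
        $dbh->do('UPDATE records SET value = ? WHERE id = ?',
            undef, $value, $id);
        $dbh->commit;
        1;
    };
    if (!$ok) {
        my $err = $@;
        eval { $dbh->rollback };   # best effort; report the original error
        die $err;
    }
    return;
}
```

      With this layout, modifying a value is a single UPDATE rather than a rewrite of the whole file, and an interrupted update is rolled back instead of leaving a half-written file.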

      The present modus operandi of this collection of legacy scripts is ... doomed, unsalvageable.   And so, not to be continued.   Deeper cuts, made carefully but made once, will lift this long-standing headache out of its present mire and could well dramatically transform it.   I suggest that you need to advocate for permission to make this deeper approach.

      (Please re-read the next to last sentence of paragraph #2.)

        (Damn!)   Why did Perlmonks log me out?   Those words above were mine.   But I can’t edit them now unless The Gods give them to me.
Re^2: Modify values of tied, split lines in a file
by tbone654 (Sexton) on Oct 22, 2012 at 19:59 UTC

    The first thing I would consider is whether the script is, and always will be, run on a server I have control over. If not, or if you're not sure, avoid creating a new file or database, or using modules that you may regret later. KISS principle.

    Next I would consider whether it is or will become a web app... Then same as above try to keep it lean. (as few modules as possible)

    Then, if you still have to write a file, consider whether it really needs to be a CSV file. In my work (stock indexes) I find it's easier to use SDF to write the file then read it back in and split.

    I do large scrapes and manipulate years' worth of data line by line without creating a file, and that's by choice, without much penalty. For what it's worth.
