PerlMonks  

Re: Re: Re: Cannot read in multiple lines

by DamnDirtyApe (Curate)
on Nov 04, 2002 at 09:28 UTC ( [id://210146] )


in reply to Re: Re: Cannot read in multiple lines
in thread Cannot read in multiple lines

Why read the whole file into an array?

To prevent the file being read every time the function is called. Of course, it isn't strictly necessary, but it seemed appropriate here.

Why process each record twice?

If you mean why I read every record in and then split them all, I felt it simplified the lookups. You could split a record only once it's been matched instead, but this does it all up front. It still runs in O(n) time, so unless the table is huge, it shouldn't be a big deal.
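
For comparison, here is a minimal sketch of the alternative mentioned above: read the records once, but split only the line that matches. The colon-separated layout with the ID in the first field, the sample data, and the name lookup are assumptions made for illustration, not the format from the thread.

```perl
use strict;
use warnings;

# Hypothetical sample data: colon-separated records with the ID first.
my $data = "101:apple:red\n102:banana:yellow\n103:cherry:dark red\n";

# Read the records once (an in-memory filehandle stands in for the file).
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
chomp( my @records = <$fh> );
close $fh;

# Look a record up by ID, fully splitting only the line that matches.
sub lookup {
    my ($id) = @_;
    for my $line (@records) {
        my ($rec_id) = split /:/, $line, 2;    # peel off just the ID
        next unless $rec_id eq $id;
        return [ split /:/, $line ];
    }
    return;    # not found
}

my $fields = lookup('102');
print join( ' | ', @$fields ), "\n" if $fields;
```

The array holds plain strings rather than an array of arrays, so the up-front cost is one chomp per line.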

Why perform a numeric comparison?

Because the IDs appeared to be numeric. Is this somehow worse than using eq?

Why the following line? $|++ ;

Part of my emacs template for Perl files; it prevents output buffering. For more info, see perlvar or Suffering from Buffering.

If you've got reasons why these methods are poor, I'm all ears.


_______________
DamnDirtyApe
Those who know that they are profound strive for clarity. Those who
would like to seem profound to the crowd strive for obscurity.
            --Friedrich Nietzsche

Replies are listed 'Best First'.
Re:x4 Cannot read in multiple lines
by grinder (Bishop) on Nov 04, 2002 at 14:00 UTC

    Ok, let's try and take these one at a time.

    Why read the whole file into an array?
    To prevent the file being read every time the function is called

    A tough call either way. We don't know how often this function is called, nor how big the file is. But I suspect that what chromatic was getting at is that during a linear scan of a file, if the record exists, you'll find it on average halfway through the file. You can then last out of the loop to save on unnecessary I/O (well, the I at any rate)...
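
    A rough sketch of that early exit, assuming colon-separated records with the ID first (the data and the name find_record are made up for the example):

    ```perl
    use strict;
    use warnings;

    # Hypothetical records "1:item1" .. "10:item10", one per line.
    my $data = join '', map { "$_:item$_\n" } 1 .. 10;

    sub find_record {
        my ($wanted) = @_;
        open my $fh, '<', \$data or die "Cannot open: $!";
        my ( $found, $lines_read ) = ( undef, 0 );
        while ( my $line = <$fh> ) {
            $lines_read++;
            chomp $line;
            my ( $id, @rest ) = split /:/, $line;
            if ( $id eq $wanted ) {
                $found = [ $id, @rest ];
                last;    # stop reading as soon as the record turns up
            }
        }
        close $fh;
        return ( $found, $lines_read );
    }

    my ( $rec, $n ) = find_record('4');
    print "found @$rec after reading $n of 10 lines\n" if $rec;
    ```

    A miss still costs a full pass over the file, which is where the options below come in.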

    If the file is not too big, it's probably no big deal, because the operating system will have paged the file into memory and thus it will be accessed quickly. On the other hand, if the same IDs are repeatedly hit then, regardless of size, dominus' Memoize will be a big win. Finally, if the file is large and IDs are found all the way through, it would be better to preprocess the file into a DBM file and treat the data as a disk-backed hash, keyed by ID. Each step up represents a higher engineering cost - and I don't know where the sweet spot is to be found.
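
    To make the last two options concrete, here is a small sketch under a few assumptions: slow_lookup is a stand-in for the real scan, SDBM_File is just one of the DBM flavours that ship with Perl, and the records are invented.

    ```perl
    use strict;
    use warnings;
    use Fcntl;                       # O_RDWR, O_CREAT
    use File::Temp qw(tempdir);
    use Memoize;
    use SDBM_File;

    # 1. Memoize: repeated lookups of the same ID are served from a cache.
    my $scans = 0;
    sub slow_lookup {
        my ($id) = @_;
        $scans++;                    # stands in for a full scan of the file
        return "record for $id";
    }
    memoize('slow_lookup');
    for ( 1 .. 3 ) { my $r = slow_lookup(7) }
    print "scans performed: $scans\n";

    # 2. DBM: preprocess once, then use the data as a disk-backed hash.
    my $dir = tempdir( CLEANUP => 1 );
    tie my %by_id, 'SDBM_File', "$dir/records", O_RDWR | O_CREAT, 0666
        or die "Cannot tie DBM file: $!";
    $by_id{$_} = "record for $_" for 1 .. 5;    # hypothetical preprocessing
    my $three = $by_id{3};
    print "ID 3 => $three\n";
    untie %by_id;
    ```

    Memoize needs no change to the calling code; the DBM route needs a separate preprocessing step whenever the source file changes.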

    Why process each record twice?
    it simplified the lookups.

    Sure, but at a tremendous up-front cost. An array of arrays of scalars takes up far, far more room than an array of scalars, and you're performing calculations on all the unmatched records for absolutely zero gain. This is more pessimisation than optimisation.

    Why perform a numeric comparison?
    Because the IDs appeared to be numeric.

    Maybe so, but in the event of an ID containing non-numeric characters your script will spit out warnings (if you're running with -w, which is usually a Good Idea). eq behaves gracefully in the face of random input.
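
    A quick way to see the difference, using a made-up ID with trailing junk. The warning handler is only there to count warnings for the demonstration.

    ```perl
    use strict;
    use warnings;

    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    my ( $wanted, $id ) = ( '42', '42abc' );    # hypothetical IDs

    # == numifies both sides: '42abc' becomes 42, with a warning.
    my $num_match = ( $id == $wanted ) ? 1 : 0;

    # eq compares the strings as they are, silently.
    my $str_match = ( $id eq $wanted ) ? 1 : 0;

    print "==: $num_match, eq: $str_match, warnings: ",
        scalar(@warnings), "\n";
    ```

    So besides the noise, == can report a match that is not one.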

    Why the following line? $|++ ;
    Part of my emacs template for Perl files; it prevents output buffering.

    I know about suffering from buffering, but it never affects me, nor do I put $|++ in my scripts (or very rarely). By disabling buffering, you are preventing your script from running at its best. Buffering is Good! Imagine if, when typing in a post here, you had to get up, run around your chair, and sit down again between each keystroke. That's an awful lot of effort for a small amount of work. I would switch things around. Turn on buffering (actually, do nothing, because it's on by default) and only turn it off when you need to.
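
    One way to do that is to localise the change, so only the stretch of code that needs unbuffered output pays for it. A minimal sketch (the progress message is just an example):

    ```perl
    use strict;
    use warnings;

    print "flag at start: $|\n";         # 0, i.e. STDOUT is buffered
    {
        local $| = 1;                    # autoflush STDOUT in this block only
        print "working...";              # shows up immediately, no newline needed
        print " done\n";
    }
    print "flag afterwards: $|\n";       # back to its previous value
    ```

    Note that $| only affects the currently selected output handle, which is STDOUT unless select has been called.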


    print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'
Re: Re: Re: Re: Cannot read in multiple lines
by Angel (Friar) on Nov 05, 2002 at 01:00 UTC
    Ok, so let's say you did set $/ to undef because the template scheme you are using is dumb and you have just discovered the error of your ways. How do you redefine it? I tried local $/ = "/n" and that did not work. I am rewriting the template parts to get around that, but how do I do it, for future reference?
    Angel

      You've just got the slash backwards: local $/ = "\n" ;
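
      If the undef is confined to a block with local, there is nothing to put back afterwards. A small sketch with an in-memory file (the sample text is made up):

      ```perl
      use strict;
      use warnings;

      my $text = "line one\nline two\n";

      my $whole = do {
          local $/;    # undef inside this block only: slurp mode
          open my $in, '<', \$text or die "Cannot open: $!";
          <$in>;
      };

      # Outside the block $/ is "\n" again, so reads go line by line.
      open my $fh, '<', \$text or die "Cannot open: $!";
      my @lines = <$fh>;
      close $fh;

      print "slurped ", length($whole), " chars; then read ",
          scalar(@lines), " lines\n";
      ```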


