No qualms at all. I'm trying to get some existing data
into a database, and I've got a lot of it. The file structure
is a bit odd: I need to read N lines of header,
then M lines of secondary data, before looping through
the bulk line by line for a while and then going back to the
header structure.
I don't want to read it all into memory, because
the file is about 160 MB with about 8 million lines. The
header is always a fixed number of lines, the secondary
data is optional but also a fixed number of lines, and the
bulk of the data is usually somewhere between 100 and
10,000 lines.
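The shape I'm dealing with is roughly this - a minimal sketch, not my real code ($n, $m, and the helper subs like has_secondary are stand-ins for format details I haven't shown):

    open my $in, '<', $file or die "can't open $file: $!";
    my $pending;                                   # one-line pushback buffer
    my $read = sub {
        return scalar <$in> unless defined $pending;
        my $l = $pending; undef $pending; return $l;
    };
    until ( eof($in) && !defined $pending ) {
        my @header    = map { $read->() } 1 .. $n;     # N fixed header lines
        my @secondary = has_secondary( \@header )      # optional, fixed size
                      ? map { $read->() } 1 .. $m
                      : ();
        while ( defined( my $line = $read->() ) ) {
            if ( is_header($line) ) {                  # next group begins
                $pending = $line;                      # push it back
                last;
            }
            process_line($line);                       # 100 .. 10,000 of these
        }
    }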
I was just surprised that this wasn't as easy or neat
to do as I expected. I'm quite pleased with my original
map one-liner, but nobody has really commented on whether
it was all that bad.
Have fun,
rdw
I was just surprised that this wasn't as easy or neat to do as I expected. I'm quite pleased with my original map one-liner, but nobody has really commented on whether it was all that bad.
OK, I'll comment on that. In my mind, map is a way to go from X to f(X) for a bunch of X's. If f(X) doesn't depend on X, it makes my brain go tilt a bit, but I can probably get used to it. Hence, I'll almost certainly try a different solution before I accept the void-arg map alternative.
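For concreteness, I'm guessing the construct in question is something like this (your original one-liner isn't shown here, so this is an assumption):

    # read exactly $n lines; the 1..$n list is never looked at
    my @header = map { scalar <IN> } 1 .. $n;

    # the same thing spelled as a plain loop
    my @header2;
    push @header2, scalar <IN> for 1 .. $n;

The block's result never mentions $_ or the loop list at all, which is exactly what triggers the tilt.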
Hmm. What you probably have, really, is a state machine. I could see a big switch statement based on state (reading header A, reading header B, in the body) with eof(IN) at the top, and if eof is detected while in header A or B, then carp out.
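Something like this, with an if/elsif chain standing in for the switch and croak to bail out (the state names, $n, $m, and the save_*/is_*/process_line subs are all invented for the illustration; it assumes IN is already open and $n > 1):

    use Carp;

    my $state = 'header';            # header | secondary | body
    my $count = 0;                   # lines consumed in the current state
    while (1) {
        if ( eof(IN) ) {                             # eof(IN) at the top
            croak "file ended while reading $state" if $state ne 'body';
            last;
        }
        my $line = <IN>;
        if ( $state eq 'header' ) {
            save_header($line);
            ( $state, $count ) = ( 'secondary', 0 ) if ++$count == $n;
        }
        elsif ( $state eq 'secondary' ) {
            if ( $count == 0 && !is_secondary($line) ) {
                $state = 'body';                     # optional block is absent
                process_line($line);
            }
            else {
                save_secondary($line);
                ( $state, $count ) = ( 'body', 0 ) if ++$count == $m;
            }
        }
        else {                                       # body
            if ( is_header($line) ) {                # next record group begins
                ( $state, $count ) = ( 'header', 1 );
                save_header($line);
            }
            else {
                process_line($line);
            }
        }
    }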
See, I get worried about when the unusual happens. Maybe it's just my 30 years
of programming, but any time I see someone write a "read 10 lines here" loop,
I think "what if there aren't 10 lines?" That's what makes me good at QA. :)
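In that spirit, even a plain "read N lines" step can check its own assumption - a tiny sketch:

    use Carp;

    sub read_lines {                 # read exactly $n lines or complain
        my ( $fh, $n ) = @_;
        my @lines;
        for my $i ( 1 .. $n ) {
            my $line = <$fh>;
            croak "wanted $n lines, got only " . ( $i - 1 )
                unless defined $line;
            push @lines, $line;
        }
        return @lines;
    }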
-- Randal L. Schwartz, Perl hacker
Thanks for that - I appreciate your comments. I'll
confess to being a bit of a map abuser - I
tend to use it whenever I need to create a list or
hash, although I do try to comment serious abuse
whenever possible.
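(The sort of thing I mean - trivial made-up examples, not code from the site:)

    my %seen   = map { $_ => 1 }      @words;   # list -> set-style lookup hash
    my %length = map { $_ => length } @words;   # list -> derived-value hash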
Your QA point is a good one too - I work for a well-known
and well-established website built almost entirely
with Perl; going through some old code I've found
all sorts of mistakes, and now I'm trying to import all
the millions of lines of bad data it's created into
a replacement system.
Most of the mistakes are due to
bad assumptions - often using regexps to pull parts out of
strings but never testing whether the match
succeeded, so the code picks up a leftover value of $1 from
some earlier match. I sometimes wish there were many more
warnings about that sort of thing.
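To make the trap concrete, here's a made-up example of the pattern I keep finding:

    "id=42" =~ /id=(\d+)/;      # succeeds; $1 is now "42"
    "no id" =~ /id=(\d+)/;      # fails, but $1 keeps its old value...
    print "got $1\n";           # ...so this happily prints "got 42"

    # the safe version tests the match before touching $1
    if ( "no id" =~ /id=(\d+)/ ) {
        print "got $1\n";
    }

The second match fails silently, so $1 still holds the value from the first one.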
Have fun,
rdw
Would it be faster to just use the Linux/Unix command line?
tail -n +$start $fileName | head -n $length
where $start = 100 and $length = 10000 - 100.
That way the file is never read into Perl itself, so very little memory is used. I've used this to fetch blocks of lines from a file with more than 100 million lines; the average time to get results was about a minute.
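If you'd rather drive it from inside Perl, a piped open does the same thing (a sketch; it assumes GNU tail's -n +N form and a made-up process_line sub):

    # stream lines $start .. $start + $length - 1 without slurping the file
    open my $fh, '-|', "tail -n +$start $fileName | head -n $length"
        or die "can't start pipeline: $!";
    process_line($_) while <$fh>;
    close $fh;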