
Re^2: Parsing a Large file with no reason

by mrras25 (Acolyte)
on Jan 28, 2010 at 21:04 UTC

in reply to Re: Parsing a Large file with no reason
in thread Parsing a Large file with no reason

Not a job offer; I'm already on a job. The only thing I can come up with is doing it in batches: first grab the lv's and parse them down by lvgroup_id, place that into a hash, then step through the hash to split the lvgroup_id and step through the file again to get the vggroup_id. I still think there should be an easier way. I was hoping to learn how to find a line and then step back up a few lines to get the information I need, or how to find the line "-El lv" and then process everything until the next "-El lv", but I am not sure of the right syntax.

Re^3: Parsing a Large file with no reason
by roboticus (Chancellor) on Jan 29, 2010 at 02:29 UTC


    If you really want to step back a few lines, you can keep a buffer of the last few lines read. However, I'd suggest parsing out the elements as you find them and storing them only once you determine the record is "interesting". If it turns out not to be an interesting record, clear your list of elements and keep going. Something like this (see Note 1):

    use strict;
    use warnings;

    my %largerHash;   # Place to accumulate interesting records
    my %elements;     # Place to accumulate data into records

    while (<DATA>) {
        if (/^(yabba|dabba|doo)\s+(.*)/) {
            # We only care about some of the fields
            $elements{$1} = $2;
        }
        elsif (/End of record ID:\s+(.*)/) {
            my $id = $1;   # Save the ID before the next match resets $1
            if ($id =~ /^foo/) {
                # Interesting record (starts with foo), so store a copy of it
                $largerHash{$id} = { %elements };
            }
            # Since we found end of record, clear our workspace
            %elements = ();
        }
    }

    __DATA__
    Record 1
    scooby 7
    dooby 8
    yabba Fred
    dabba Wilma
    End of record ID: cupcake
    Record 2
    doo not fold spindle staple or mutilate
    dabba Barney
    yabba Dino
    End of record ID: foobar

    In this example, we collect a couple of fields in record 1, but at the end of the record we find that it isn't interesting (its ID, cupcake, doesn't start with foo), so we discard the elements we collected. Then we collect more items from record 2, and at the end of that record we find that it is interesting (its ID is foobar), so we add the elements to the larger hash that you want to process after parsing the data.
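
    The same accumulate-as-you-go idea also fits the case raised in the question, where a record starts at a marker line (such as "-El lv") and runs until the next one. A minimal sketch, assuming a made-up layout in which the marker line carries the record name and the lines after it are key/value pairs. The sample data and the `parse_blocks` helper are hypothetical, not the real file format:

```perl
use strict;
use warnings;

# Hypothetical helper: split lines into records that each begin at a
# "-El lv" marker line and run until the next marker (or end of input).
sub parse_blocks {
    my @lines = @_;
    my %records;     # record name => { key => value, ... }
    my $current;     # name of the record being filled; undef before the first marker

    for my $line (@lines) {
        chomp $line;
        if ($line =~ /^-El lv\s+(\S+)/) {
            # Marker line: start a new record
            $current = $1;
            $records{$current} = {};
        }
        elsif (defined $current && $line =~ /^\s*(\w+)\s+(.*)/) {
            # Detail line belonging to the current record
            $records{$current}{$1} = $2;
        }
    }
    return \%records;
}

# Made-up sample input, for illustration only
my @sample = (
    "-El lv lv01\n",
    "  lvgroup_id 17\n",
    "  size 10G\n",
    "-El lv lv02\n",
    "  lvgroup_id 42\n",
);

my $records = parse_blocks(@sample);
for my $name (sort keys %$records) {
    print "$name: lvgroup_id=$records->{$name}{lvgroup_id}\n";
}
```

    Tracking only the name of the current record means there is no need to look ahead for the next marker; a new marker line simply redirects where the following lines are stored.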

    Note 1: Untested and quite possibly bad syntax, as I've been wrestling a bunch of .Net and C++ code for the last couple of weeks.
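
    For the step-back approach mentioned at the top of this reply, a rolling buffer of the last few lines is enough. A minimal sketch, assuming you want the two lines before each match. The `grep_with_context` helper, the buffer size, and the sample lines are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: return each line matching $pattern together with
# up to $keep lines that came immediately before it.
sub grep_with_context {
    my ($pattern, $keep, @lines) = @_;
    my @buffer;    # the last $keep lines seen so far
    my @hits;      # one array ref per match: [ preceding lines..., matching line ]

    for my $line (@lines) {
        chomp $line;
        if ($line =~ $pattern) {
            push @hits, [ @buffer, $line ];
        }
        # Remember this line, then drop the oldest once the buffer is over size
        push @buffer, $line;
        shift @buffer while @buffer > $keep;
    }
    return @hits;
}

# Made-up sample input, for illustration only
my @sample = ("alpha\n", "beta\n", "gamma\n", "target one\n", "delta\n", "target two\n");

for my $hit (grep_with_context(qr/^target/, 2, @sample)) {
    print join(" | ", @$hit), "\n";
}
```

    Because the buffer never holds more than a few lines, this works on a large file without reading it all into memory; swap the sample array for a `while (<$fh>)` loop to use it on a real filehandle.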


    Insert witty banter here.
