PerlMonks  

Re: Matching over multiple lines in a scalar

by gjb (Vicar)
on Oct 27, 2002 at 22:23 UTC (#208388)


in reply to Matching over multiple lines in a scalar

I'd suggest a slightly different approach. Its advantage is that the input can be read line by line, so there's no need to hold all the data in memory (which is nice if you have a lot of data).

#!perl
use strict;

my %data;
my ($key, $data);
while (<DATA>) {
    chomp($_);
    if (/^(\d+):\s*(.+)$/) {
        $data{$key} = $data if defined $key;
        $key = $1;
        $data = $2;
    } else {
        $data .= " $_";
    }
}
$data{$key} = $data if defined $key;

foreach my $key (sort {$a <=> $b} keys %data) {
    print "$key: '$data{$key}'\n";
}

__DATA__
3: Tag <test> found 1
Tag <test> found 2
5: Tag <test> found 3
7: Tag <test> found 4
14: Tag <test> found 5
16: Tag <test> found 6
18: Tag <test> found 7
21: Tag <test> found 8
25: Tag <test> found 9
27: Tag <test> found 10
29: Tag <test> found 11
32: Tag <test> found 12
34: Tag <test> found 13
49: Tag <test> found 14
80: Tag <test> found 15
98: Tag <test> found 16
Tag <test> found 17

(Struck out: Essentially, this is a finite state machine with two states, new-line and continue-line, represented by the if and the else part with the variable $key playing the role of state variable.)

Essentially, this is a finite state machine with three states: initial, new-line and continue-line. The last two are represented by the if and the else branch, and the variable $key plays the role of state variable, distinguishing the initial state (undef) from the other two.
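For what it's worth, here is a minimal sketch of the same three states wrapped in a sub, so it can run over any filehandle, including an in-memory one opened on a scalar. The name parse_records and the sample string in $input are made up for illustration and are not from the code above. Lines seen before the first key are skipped explicitly here; the effect is the same as in the original, where anything collected before the first key is overwritten when that key arrives.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): applies the same
# three-state logic to any filehandle and returns a hash reference.
sub parse_records {
    my ($fh) = @_;
    my (%records, $key, $text);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^(\d+):\s*(.+)$/) {
            # new-line state: store the finished record, start the next one
            $records{$key} = $text if defined $key;
            ($key, $text) = ($1, $2);
        }
        elsif (defined $key) {
            # continue-line state: append to the current record
            $text .= " $line";
        }
        # initial state (no key seen yet): the line is skipped
    }
    # flush the last record, if there was one
    $records{$key} = $text if defined $key;
    return \%records;
}

# Assumed sample input held in a scalar. Opening a filehandle on a
# reference to the scalar lets the same line-by-line loop run over it.
my $input = "3: Tag <test> found 1\nTag <test> found 2\n5: Tag <test> found 3\n";
open my $fh, '<', \$input or die "Cannot open in-memory filehandle: $!";
my $records = parse_records($fh);
close $fh;

foreach my $k (sort { $a <=> $b } keys %$records) {
    print "$k: '$records->{$k}'\n";
}
```

With the sample string above this should print one line for key 3, holding both "found 1" and "found 2", and one line for key 5. Passing a filehandle rather than a string keeps the line-by-line property: the same sub works on a real file without reading it all into memory.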

(I modified the data slightly to be able to check that the data actually ends up with the right key in the hash.)

Hope this helps, -gjb-

Update: this explanation is more precise than the version I struck out.

Replies are listed 'Best First'.
Re: Re: Matching over multiple lines in a scalar
by Rich36 (Chaplain) on Oct 27, 2002 at 22:41 UTC

    The data actually comes in as a scalar from another sub and isn't all that large, so I don't think it's worth splitting it and handling it that way. It's a cool approach to the problem, though, and I may end up adopting it if the data gets too large. Thanks.


    «Rich36»
