
Parsing multiline string line by line

by flamey (Scribe)
on Feb 19, 2009 at 12:27 UTC ( [id://745018]=perlquestion )

flamey has asked for the wisdom of the Perl Monks concerning the following question:

open( INFILE, $fileName ) || die $!;
while( <INFILE> ) {
    ...
}
close INFILE;
I parse text files line by line, as in the sample code above. Is there a similarly simple and pretty way to parse a multiline string one line at a time?

Say I have:
my $str = "First line\nsecond line\netc. lines here";
and on each iteration I want to get "First line", then "second line", then "etc. lines here"...

Replies are listed 'Best First'.
Re: Parsing multiline string line by line
by AnomalousMonk (Archbishop) on Feb 19, 2009 at 13:30 UTC
    Or you can use the same file-access mechanisms to access the string if a reference to the string is passed to open() instead of a file name:
    >perl -wMstrict -e
      "my $filelike = qq{Line 1\nLine 2\nLine etc.\n};
       open my $fh, '<', \$filelike or die $!;
       while (<$fh>) { print }
       close $fh or die $!;
      "
    Line 1
    Line 2
    Line etc.
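    Applied to the string from the original question, the same idea might look like the sketch below. It assumes a Perl built with PerlIO (5.8 or later); the variable names `$fh`, `$line` and `@lines` are only illustrative.

```perl
use strict;
use warnings;

my $str = "First line\nsecond line\netc. lines here";

# Open a read-only filehandle on the scalar itself (note the backslash).
open my $fh, '<', \$str or die "Cannot open in-memory file: $!";

my @lines;
while ( my $line = <$fh> ) {
    chomp $line;            # strip the trailing newline, if there is one
    push @lines, $line;
    print "Got: $line\n";
}
close $fh or die "Cannot close in-memory file: $!";
```

    The loop body is the same as it would be for a real file, so code written for `while (<INFILE>)` can usually be reused unchanged.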
      Ooh! A new thing! I never knew this before. (It's not mentioned in the crib-sheet page I tend to use.)

      Does it do anything with references to other types as well?

      --
      use JAPH;
      print JAPH::asString();

        The manpage for open gives open FILEHANDLE,MODE,REFERENCE as a way to call open, but, oddly, neither the manpage nor perlopentut seems to discuss it further (as far as I can see from a very quick scan of those two documents). Anyone know where this may be discussed?
      Aha! I saw it done this way a long time ago in some code online and remembered that it was possible, but figured I'd find it again when I needed it... and then I couldn't find any examples! Thank you very much! :-)
Re: Parsing multiline string line by line
by olus (Curate) on Feb 19, 2009 at 12:41 UTC

    You can split the string on newlines into an array, and then iterate through it in a foreach loop.

    my @lines = split /\n/, $str;
    foreach my $line (@lines) {
        ...
    }

      Note that this removes the LF at the end of each line but not the CR, so if the newlines are in DOS format you will be left with a CR at the end of each line, and it won't work with old Mac strings at all. I personally prefer the following variation:

      for (split /^/, $str) { ... }

      It works just like while (<>)

      Updated: fixed, thanks to ikegami

        split /\n/ and split /^/ both run fine on Windows and old Macs, for two different reasons:

        In Windows, the CRLF gets converted to LF on read, so there's no CR in the string to split.

        In old Macs, \n is redefined to CR.

        If you meant there would be problems parsing files from one system on a different system, you need to add Unix files to the list.
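        For the cross-system case, one option is to split on all three conventions explicitly. This is only a sketch: the sample string is made up, and the hex escapes are used instead of \n and \r so that the pattern means the same bytes on every platform.

```perl
use strict;
use warnings;

# Hypothetical sample mixing DOS (CRLF), Unix (LF) and old Mac (CR) endings.
my $mixed = "dos line\x0D\x0Aunix line\x0Aold mac line\x0Dlast line";

# Try CRLF first so a DOS ending is not treated as two separate breaks.
my @lines = split /\x0D\x0A|\x0A|\x0D/, $mixed;

print "[$_]\n" for @lines;
```

        Like any split with no limit, this drops trailing empty fields, so blank lines at the very end of the string are lost.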

        [split /^/] works just like while (<>)

        Indeed. It keeps the trailing newlines, and it keeps trailing blank lines. split /\n/ does neither.
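        A small comparison makes the difference visible. The sample string is invented for illustration and ends in several blank lines.

```perl
use strict;
use warnings;

# Hypothetical sample: two text lines, one blank line between them,
# and two blank lines at the end.
my $str = "one\n\ntwo\n\n\n";

my @no_newlines   = split /\n/, $str;   # newlines removed, trailing blanks dropped
my @with_newlines = split /^/,  $str;   # newlines kept, trailing blanks kept

printf "split /\\n/ gave %d fields\n", scalar @no_newlines;
printf "split /^/  gave %d fields\n",  scalar @with_newlines;
```

        split treats /^/ as if it were /^/m, which is why it breaks at the start of every line rather than only at the start of the string.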

      Yeah, I was actually hoping to avoid doing this; I probably just didn't make that clear in the question. But I do appreciate the reply :-) Thanks!
Re: Parsing multiline string line by line
by perreal (Monk) on Feb 19, 2009 at 13:12 UTC
    Or, you can save a little memory like this:
    use strict;
    use warnings;
    my $str = "First line\nsecond line\netc. lines here";
    while($str =~ /([^\n]+)\n?/g){
        print "LINE: $1\n";
    }
    but you will probably need to rework it if you need to process empty lines as well.
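    One possible way to keep empty lines is to use two alternatives, as in the sketch below. This is not the pattern from the post above, just a variation on it; the sample string and the `@lines` array are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample with an empty second line.
my $str = "First line\n\nthird line\nlast line";

my @lines;
# Alternative 1: a possibly empty line that ends in a newline.
# Alternative 2: a non-empty final line with no trailing newline.
while ( $str =~ /([^\n]*)\n|([^\n]+)/g ) {
    my $line = defined $1 ? $1 : $2;
    push @lines, $line;
    print "LINE: '$line'\n";
}
```

    Requiring the newline in the first alternative means every match consumes at least one character, so the loop cannot get stuck on an empty match.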

    edit: deleted /s modifier.
      Why did you use the /s modifier? That only changes '.' from matching anything but a newline to also matching it. I don't think it is doing anything useful there.
        Yes, you are right. Somehow I always use /s whenever I think about newlines. Edited. Thanks.
      I was sure there would be some way of doing it with a regex, but I don't quite understand what's happening here: on the first iteration it will match a string of non-newline chars (the first line), but how does it know to continue with the "second (etc.) line" on the remaining iterations of the while loop?

      Thanks for the reply!!
        Perhaps it's a truism that for any computational process, there is a regex of arbitrary complexity which will solve it.
        check this, it is explained in the global matching section.
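        In short, a //g match in scalar context remembers where it stopped, and the next match on the same string starts from there. The built-in pos() reports that offset, so the loop can be instrumented as in this sketch (the `@offsets` array is only there for illustration):

```perl
use strict;
use warnings;

my $str = "First line\nsecond line\netc. lines here";

my @offsets;
while ( $str =~ /([^\n]+)\n?/g ) {
    # pos($str) is where the next //g match on $str will start.
    push @offsets, pos($str);
    print "LINE: $1 (next search starts at offset ", pos($str), ")\n";
}
```

        When the match finally fails, the stored position is reset, which is what ends the while loop and lets a later //g match on $str start again from the beginning.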
Re: Parsing multiline string line by line
by perlsaran (Sexton) on Feb 19, 2009 at 14:27 UTC
    Is this something you folks need?
    $/ = undef;  # set input record separator to undef (default: \n)
    open("FIN","<raja.txt") or die "cant open :$!\n";
    $content = <FIN>;  # whole file content in a single scalar variable
    close(FIN);
    foreach $line (split /\n/ ,$content) {
        print $line;
    }
      hmm, this looks prettier than what perreal suggested above. thanks!
      If you're slurping, might as well assign to an array:

      $/ = undef;

      open "FIN", "<raja.txt" or die "cant open :$!";
      my @fin = <FIN>;
      close FIN;

      print for @fin;

        But that puts the whole file into the first element of the array. (Change your print to print "|$_|\n" for @fin; and you'll see only one element.)

        You probably want this:

        $/ = undef;
        open "FIN", "<raja.txt" or die "cant open :$!";
        my @fin = split/\n/,<FIN>;
        close FIN;
        print "|$_|\n" for @fin;

        Or to also get rid of blank lines:

        ... my @fin = grep {$_} split /\n/, <FIN>;

        Update: Square brackets in my original post created a link I didn't intend. I changed them to pipes. And alternatively, you could just print scalar @fin to confirm how many items you slurped.
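        One caveat about the grep {$_} filter above: it tests truthiness, so it also drops a line that contains only "0". If that matters, testing the length instead is one alternative. The sketch below uses a made-up string rather than a file so that it runs on its own.

```perl
use strict;
use warnings;

# Hypothetical content: a line holding just "0", plus blank lines.
my $content = "alpha\n\n0\nbeta\n\n";

my @truthy    = grep { $_ }     split /\n/, $content;  # drops "" and "0"
my @non_empty = grep { length } split /\n/, $content;  # drops only ""

print scalar(@truthy),    " lines kept by the truthiness test\n";
print scalar(@non_empty), " lines kept by the length test\n";
```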

Node Type: perlquestion [id://745018]
Approved by Corion
Front-paged by Corion