Parsing multiline string line by line

flamey has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Parsing multiline string line by line by AnomalousMonk (Archbishop) on Feb 19, 2009 at 13:30 UTC
Or you can use the same file-access mechanisms to access the string if a reference to the string is passed to `open()` instead of a file name: `>perl -wMstrict -e "my $filelike = qq{Line 1\nLine 2\nLine etc.\n}; open my $fh, '<', \$filelike or die $!; while (<$fh>) { print } close $fh or die $!; "` Output: `Line 1`, `Line 2`, `Line etc.` (one per line).
Re^2: Parsing multiline string line by line by wol (Hermit) on Feb 19, 2009 at 13:59 UTC
Ooh! A new thing! I never knew this before. (It's not mentioned in the crib-sheet page I tend to use.) Does it do anything with references to other types as well? -- use JAPH; print JAPH::asString();
Re^3: Parsing multiline string line by line by AnomalousMonk (Archbishop) on Feb 19, 2009 at 21:01 UTC
The manpage for open gives open FILEHANDLE,MODE,REFERENCE as a way to call open, but, oddly, neither the manpage nor perlopentut seems to discuss it further (as far as I can see from a very quick scan of those two documents). Anyone know where this may be discussed?
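For readers following along, here is a minimal sketch of the `open FILEHANDLE,MODE,REFERENCE` form with a scalar reference, covering both reading and writing. The variable names and sample text are made up for illustration, and it assumes Perl 5.8 or later, where in-memory filehandles on scalars are available.

```perl
use strict;
use warnings;

# Read from a string as if it were a file (sample text is hypothetical).
my $input = "Line 1\nLine 2\nLine etc.\n";
open my $in, '<', \$input or die "cannot open in-memory handle: $!";
chomp(my @lines = <$in>);          # one element per line, newlines removed
close $in or die $!;

# Write to a string through a filehandle.
my $output = '';
open my $out, '>', \$output or die "cannot open in-memory handle: $!";
print {$out} "got: $_\n" for @lines;
close $out or die $!;

print $output;
```

The read side behaves like `while (<$fh>)` on a real file; the write side collects everything printed to `$out` in `$output`.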
Re^2: Parsing multiline string line by line by flamey (Scribe) on Feb 19, 2009 at 14:25 UTC
aha! i saw doing it this way a long time ago in some code online, and remembered that it was possible, but figured i'll find it again when I need it... but i couldn't find any examples again! thank you very much! :-)
Re: Parsing multiline string line by line by olus (Curate) on Feb 19, 2009 at 12:41 UTC
You can `split` the string on newlines into an array, and then iterate through it in a `foreach` loop. `my @lines = split /\n/, $str; foreach my $line (@lines) { ... }`
Re^2: Parsing multiline string line by line by zwon (Abbot) on Feb 19, 2009 at 14:15 UTC
Note that this will remove the LF at the end of each string, ~~but not CR, so if newlines in DOS format you will have CR in the end of the line, and it wouldn't work with old Mac strings at all~~. I personally prefer the following variation: `for (split /^/, $str) { ... }` It works just like `while (<>)`. Updated: fixed, thanks to ikegami.
Re^3: Parsing multiline string line by line by ikegami (Patriarch) on Feb 19, 2009 at 14:40 UTC
`split /\n/` and `split /^/` both run fine on Windows and old Macs, for two different reasons. In Windows, the CRLF gets converted to LF on read, so there's no CR in the string to split. In old Macs, `\n` is redefined to CR. If you meant there would be problems parsing files from one system on a different system, you need to add unix files to the list. Quoting: "[`split /^/`] works just like `while (<>)`". Indeed. It keeps the trailing newlines, and it keeps trailing blank lines. `split /\n/` does neither.
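The difference described above can be checked with a short sketch. The sample string is hypothetical: three lines of text followed by two blank lines.

```perl
use strict;
use warnings;

# Hypothetical sample: three lines of text followed by two blank lines.
my $str = "one\ntwo\nthree\n\n\n";

my @plain = split /\n/, $str;   # newlines removed, trailing empty fields dropped
my @kept  = split /^/,  $str;   # each element keeps its "\n"; blank lines survive

printf "split /\\n/ gives %d elements\n", scalar @plain;
printf "split /^/  gives %d elements\n", scalar @kept;
```

With this input, `split /\n/` yields 3 elements and `split /^/` yields 5, because `split` treats `/^/` as if it were `/^/m` and so breaks the string at the start of every line.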
Re^2: Parsing multiline string line by line by flamey (Scribe) on Feb 19, 2009 at 14:15 UTC
yeah, I was actually hoping to avoid doing this, just probably didn't make it clear in the question. but I do appreciate the reply :-) thanks!
Re: Parsing multiline string line by line by perreal (Monk) on Feb 19, 2009 at 13:12 UTC
Or, you can save a little memory like this: `use strict; use warnings; my $str = "First line\nsecond line\netc. lines here"; while($str =~ /([^\n]+)\n?/g){ print "LINE: $1\n"; }` but probably you need to work on it if you need to process empty lines as well. edit: deleted /s modifier.
Re^2: Parsing multiline string line by line by metaperl (Curate) on Feb 19, 2009 at 14:47 UTC
Why did you use the `/s` modifier? That only changes '.' from matching anything but a newline to also matching it. I don't think it is doing anything useful there.
Re^3: Parsing multiline string line by line by perreal (Monk) on Feb 19, 2009 at 15:09 UTC
yes you are right. somehow I always use s whenever I think about new lines. edited. thanks.
Re^2: Parsing multiline string line by line by flamey (Scribe) on Feb 19, 2009 at 14:21 UTC
i was sure there would be some way of doing it with regex, but i don't quite understand what's happening here: first iteration it will match a string of non-newline chars (first line), but how does it know to continue on the "second (etc.) line" on the rest of the iterations of the while loop? thanks for reply!!
Re^3: Parsing multiline string line by line by mikelieman (Friar) on Feb 19, 2009 at 14:42 UTC
Perhaps it's a truism that for any computational process, there is a regex of arbitrary complexity which will solve it.
Re^3: Parsing multiline string line by line by perreal (Monk) on Feb 19, 2009 at 15:16 UTC
check this, it is explained in the global matching section.
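A small sketch of the mechanism in question: in scalar context, a `/g` match remembers where it stopped, and the next iteration resumes from that offset, which `pos()` reports. The `@seen` array is only there for illustration.

```perl
use strict;
use warnings;

my $str = "First line\nsecond line\netc. lines here";
my @seen;   # illustrative record of each match and the offset after it

# In scalar context, /g resumes from pos($str), the offset where the
# previous successful match ended. When a match finally fails, pos() is
# reset and the loop ends.
while ($str =~ /([^\n]+)\n?/g) {
    push @seen, [ $1, pos($str) ];
    printf "matched <%s>, next search starts at offset %d\n", $1, pos($str);
}
```

Each pass consumes one line plus its optional newline, so the reported offsets are 11, 23 and 38 for this string.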
Re: Parsing multiline string line by line by perlsaran (Sexton) on Feb 19, 2009 at 14:27 UTC
Is this something you folks need? `$/ = undef; open("FIN","<raja.txt") or die "cant open :$!\n"; $content = <FIN>; close(FIN); foreach $line (split /\n/ ,$content) { print $line; }` Here `$/ = undef` sets the input record separator (default `\n`) to undef, so `<FIN>` reads the whole file content into a single scalar variable.
Re^2: Parsing multiline string line by line by flamey (Scribe) on Feb 19, 2009 at 15:05 UTC
hmm, this looks prettier than what perreal suggested above. thanks!
Re^2: Parsing multiline string line by line by Jayson (Novice) on Feb 19, 2009 at 19:54 UTC
If you're slurping, might as well assign to an array: `$/ = undef; open "FIN", "<raja.txt" or die "cant open :$!"; my @fin = <FIN>; close FIN; print for @fin;`
Re^3: Parsing multiline string line by line by hbm (Hermit) on Feb 19, 2009 at 21:36 UTC
But that puts the whole file into the first element of the array. (Change your print to `print "\|$_\|\n" for @fin;` and you'll see only one element.) You probably want this: `$/ = undef; open "FIN", "<raja.txt" or die "cant open :$!"; my @fin = split/\n/,<FIN>; close FIN; print "\|$_\|\n" for @fin;` Or to also get rid of blank lines: `... my @fin = grep {$_} split /\n/, <FIN>;` Update: Square brackets in my original post created a link I didn't intend. I changed them to pipes. And alternatively, you could just `print scalar @fin` to confirm how many items you slurped.
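One caveat on the `grep {$_}` filter: it tests truthiness, so besides empty lines it also drops any line that is exactly `0`. A minimal sketch with a hypothetical in-memory sample in place of the file, comparing it with a length-based filter:

```perl
use strict;
use warnings;

# Hypothetical content standing in for the slurped file.
my $content = "10\n0\n\n20\n";

my @truthy   = grep { $_ }        split /\n/, $content;   # drops "" and also "0"
my @nonblank = grep { length $_ } split /\n/, $content;   # drops only ""

print "truthy:   @truthy\n";
print "nonblank: @nonblank\n";
```

If lines consisting of a single `0` can occur in the data, the length-based test is the safer way to discard only the blank lines.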
Re^4: Parsing multiline string line by line by Jayson (Novice) on Feb 20, 2009 at 14:35 UTC

