http://www.perlmonks.org?node_id=1006961

johnfl68 has asked for the wisdom of the Perl Monks concerning the following question:

Hello:

I have sets of messages stored in strings, each anywhere from 8 lines (with an LF at the end of each line) to about 50 lines.

I need to split them into chunks of 14 lines.

I know I can split the string into an array of individual lines on the line feed "\n", use scalar on the array to find out how many lines there are, and then join the lines back into strings of 14 lines per block.

@descriptarray = split("\n",$description);
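A minimal sketch of the split-and-rebuild approach described above, for comparison with the replies. The helper name chunk_lines and the sample data are made up for illustration; only $description comes from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: split a string into chunks of $n lines each.
sub chunk_lines {
    my ($text, $n) = @_;
    # Split *after* every "\n" so each line keeps its line feed.
    my @lines = split /(?<=\n)/, $text;
    my @chunks;
    # splice removes up to $n lines from the front on each pass.
    push @chunks, join( '', splice( @lines, 0, $n ) ) while @lines;
    return @chunks;
}

my $description = join '', map { "Line $_\n" } 1 .. 30;   # made-up sample data
my @chunks = chunk_lines( $description, 14 );
print scalar(@chunks), " chunks\n";
```

Because splice simply returns whatever is left when fewer than 14 lines remain, the short final block needs no special handling.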

Is there a shorter way to do the split at every 14th line feed?

Thanks as always!

John

Re: Split string after 14 Line Feeds?
by LanX (Saint) on Dec 04, 2012 at 00:35 UTC
    A more general approach with "nested" iterators:

    sub read_chunk {
        my ($fh,$mod) = @_;
        my $chunk;
        while ( <$fh> ) {
            $chunk .= $_;
            return $chunk unless $. % $mod;
        }
        return $chunk;
    }

    open my $fh, "<", \$string;
    while ( my $chunk = read_chunk( $fh, 14 ) ) {
        print $chunk;
        print "-"x5,"\n";
    }

    This works with anything you can open as a filehandle, even strings.
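A variant sketch of the same chunk-reading idea, not the code from the reply above: it counts lines itself instead of relying on $. and tests the result with defined, so it does not depend on how many lines were read from the handle earlier. The sample $string is made up for illustration.

```perl
use strict;
use warnings;

# Read up to $size lines from $fh; returns undef once the handle is exhausted.
sub read_chunk {
    my ($fh, $size) = @_;
    my $chunk;
    for ( 1 .. $size ) {
        my $line = <$fh>;
        last unless defined $line;
        $chunk .= $line;
    }
    return $chunk;
}

my $string = join '', map { "Line $_\n" } 1 .. 30;   # made-up sample data
open my $fh, '<', \$string or die "Cannot open in-memory handle: $!";

my @chunks;
while ( defined( my $chunk = read_chunk( $fh, 14 ) ) ) {
    push @chunks, $chunk;
}
close $fh;

print "-----\n$_" for @chunks;
```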

    Cheers Rolf

Re: Split string after 14 Line Feeds?
by johngg (Canon) on Dec 03, 2012 at 23:47 UTC

    Another way would be to accumulate the chunks as you read the file.

    $ for i in `seq 1 13`; do echo Line $i; done | perl -e '
    while ( not eof STDIN )
    {
        my $buf;
        $buf .= $_ for map { eof STDIN ? () : scalar <> } 1 .. 5;
        print $buf;
        print q{+} x 10, qq{\n};
    }'
    Line 1
    Line 2
    Line 3
    Line 4
    Line 5
    ++++++++++
    Line 6
    Line 7
    Line 8
    Line 9
    Line 10
    ++++++++++
    Line 11
    Line 12
    Line 13
    ++++++++++
    $

    I hope this is of interest.

    Cheers,

    JohnGG

Re: Split string after 14 Line Feeds? (//g)
by tye (Sage) on Dec 03, 2012 at 23:17 UTC
    my @chunks = $description =~ /^((?:[^\n]*\n){14}|.*)*/gs;

    - tye        

      Thanks. I think I see where you are going with this, but the array is coming up empty. I'll try to go over it again; regexes make my head hurt. :(

        Indeed. I conflated two similar techniques: getting a list of matches from /(...)/g and getting a list of matches from /(...)(...)(...)/. You don't get a list of matches from /(...)*/ (nor from /(...)*/g).

        What I should have written was:

        my @chunks = $description =~ /\G((?:[^\n]*\n){14}|.+)/gs;

        (tested even; works even)

        Update: Changed the last * to + to eliminate an extra empty string in the result. I just noticed it; it is due to a "quirk" in Perl regex processing (something I think we should just 'fix', but that is a story for another node, one I've written at least once already).
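A small sketch to see the corrected pattern at work on a 30-line string. The sample data is made up for illustration; the regex is the one quoted above.

```perl
use strict;
use warnings;

my $description = join '', map { "Line $_\n" } 1 .. 30;   # made-up sample data

# Each match is either exactly 14 newline-terminated lines or,
# when fewer than 14 remain, whatever non-empty text is left (.+ with /s).
my @chunks = $description =~ /\G((?:[^\n]*\n){14}|.+)/gs;

printf "chunk %d has %d line feeds\n", $_ + 1, ( $chunks[$_] =~ tr/\n// )
    for 0 .. $#chunks;
```

In list context with /g and a single capture group, the match returns one element per successful match, which is why the whole alternation sits inside one pair of capturing parentheses.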

        - tye        

Re: Split string after 14 Line Feeds?
by AnomalousMonk (Archbishop) on Dec 04, 2012 at 09:27 UTC

    An approach using split (although conceptually similar to tye's), splitting on groups of three lines because fourteen lines would make for a very tedious example.

    >perl -wMstrict -le "my $s = qq{foo \n bar \n baz \n fee \n fie \n foe \n aa \n bb \n cc \n}; print qq{[[$s]]}; ;; my @fields = split m{ (?: [^\n]* \n){3} \K }xms, $s; print qq{[[$_]]} for @fields; "
    [[foo
     bar
     baz
     fee
     fie
     foe
     aa
     bb
     cc
    ]]
    [[foo
     bar
     baz
    ]]
    [[ fee
     fie
     foe
    ]]
    [[ aa
     bb
     cc
    ]]

    Update: Also tested/works in cases in which: last 'line' does not end in newline; exactly 3 lines processed; fewer than 3 lines processed; lines consist only of newlines; etc. – I think, in fact, all possible cases.

Re: Split string after 14 Line Feeds?
by Cody Fendant (Hermit) on Dec 04, 2012 at 11:30 UTC

    Just because this solution hasn't been hinted at yet:

    while (<DATA>) {
        chomp;
        if ( $. % 14 == 0 ) {
            print "$_ -- do something here\n";
        }
        else {
            print "$_\n";
        }
    }
    __DATA__
    line 1
    line 2
    line 3
    line 4
    line 5
    line 6
    line 7
    line 8
    line 9
    line 10
    line 11
    line 12
    line 13
    line 14
    line 15
    line 16
    line 17
    line 18
    line 19
    line 20
    line 21
    line 22
    line 23
    line 24
    line 25
    line 26
    line 27
    line 28
    line 29
    line 30

    Don't forget this kind of solution will leave you with extra unprocessed lines whenever the total number of lines isn't a multiple of 14, so wrap those up after you're finished with the loop.
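One way the wrap-up step might look, as a sketch: accumulate lines in a buffer, emit it at every 14th line, and flush whatever is left after the loop. The in-memory sample input and the variable names are assumptions for illustration, not part of the reply above.

```perl
use strict;
use warnings;

my $input = join '', map { "line $_\n" } 1 .. 30;   # made-up sample data
open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";

my @chunks;
my $buf = '';
while ( my $line = <$fh> ) {
    $buf .= $line;
    if ( $. % 14 == 0 ) {      # a full block of 14 lines is ready
        push @chunks, $buf;
        $buf = '';
    }
}
# Flush the leftover lines when the total is not a multiple of 14.
push @chunks, $buf if length $buf;
close $fh;

print "--- chunk ---\n$_" for @chunks;
```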