PerlMonks  

Rosetta code: Split an array into chunks

by eyepopslikeamosquito (Archbishop)
on Sep 25, 2010 at 09:29 UTC ( [id://861938]=perlmeditation )

Following the educational tradition of earlier meditations of this kind, this meditation describes an arbitrary problem to be solved in different ways and in different languages.

The Problem

Given a list of strings, for example ("a", "bb", "c", "d", "e", "f", "g", "h"), and a chunksize, for example 3, write a subroutine to return a multi-line string, for example:

a bb c
d e f
g h
The output string must contain a single space between each array element and a newline every chunksize items. Note that no trailing space is permitted on any line and the last line must be properly newline-terminated.
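The rules above can be turned into a small checker, which is handy when comparing the solutions in this thread. This is only a sketch: `check_chunked` is a hypothetical helper name, and it assumes a non-empty list whose elements are non-empty and contain no whitespace.

```python
def check_chunked(output, vals, n):
    # Hypothetical helper: True if `output` obeys the stated rules for `vals`.
    # Assumes non-empty elements that contain no whitespace.
    if not output.endswith("\n"):
        return False                      # last line must end with a newline
    lines = output[:-1].split("\n")
    tokens = []
    for k, line in enumerate(lines):
        words = line.split(" ")
        if "" in words:
            return False                  # leading, trailing, or doubled space
        is_last = (k == len(lines) - 1)
        if len(words) != n and not (is_last and len(words) < n):
            return False                  # wrong number of items on this line
        tokens.extend(words)
    return tokens == list(vals)
```

It accepts the example output shown above and rejects the same string with a trailing space on a line or without the final newline.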

Perl

Here was my first Perl attempt:

use strict;
use warnings;

sub chunk_array {
    my ($n, @vals) = @_;
    my $str;
    my $i = 0;
    for my $v (@vals) {
        ++$i;
        $str .= $v . ( ($i % $n) ? " " : "\n" );
    }
    substr($str, -1, 1) = "\n";
    return $str;
}

my $v1 = chunk_array(3, "a", "bb", "c", "d", "e", "f", "g", "h");
$v1 eq "a bb c\nd e f\ng h\n" or die "error: '$v1'\n";
print $v1;
my $v2 = chunk_array(3, "a", "bb", "c", "d", "e", "f");
$v2 eq "a bb c\nd e f\n" or die "error: '$v2'\n";
print $v2;

I trust this initial solution will clarify the problem specification.

Being dissatisfied with this ugly first attempt, I next took a look at the core List::Util and the non-core List::MoreUtils modules, writing two different solutions using List::MoreUtils, one using natatime, the other using part, as shown below:

use List::MoreUtils qw(part natatime);

sub chunk_array {
    my ($n, @vals) = @_;
    my $str;
    my $iter = natatime($n, @vals);
    while ( my @line = $iter->() ) {
        $str .= join(" ", @line) . "\n";
    }
    return $str;
}

sub chunk_array {
    my ($n, @vals) = @_;
    my $i = 0;
    return join "", map { join(" ", @$_)."\n" } part { $i++/$n } @vals;
}

Python

For cheap thrills, I hacked out a Python itertools-based solution.

from itertools import *

def group(n, iterable):
    args = [iter(iterable)] * n
    return izip_longest(*args)

def chunk_array(n, vals):
    return "".join(" ".join(x for x in i if x!=None)+"\n" for i in group(n, vals))
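The snippet above targets Python 2, where the function is called izip_longest. For readers on Python 3, a minimal equivalent sketch might look like this, using itertools.zip_longest; the variable name `row` is my own choice, and the `is not None` filter assumes None never appears as a real element.

```python
from itertools import zip_longest

def group(n, iterable):
    # n references to the SAME iterator, so each output tuple pulls n items
    args = [iter(iterable)] * n
    return zip_longest(*args)   # short final tuple is padded with None

def chunk_array(n, vals):
    return "".join(
        " ".join(x for x in row if x is not None) + "\n"
        for row in group(n, vals)
    )

print(chunk_array(3, ["a", "bb", "c", "d", "e", "f", "g", "h"]), end="")
```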

Discussion

I've derived little enjoyment so far from any of my solutions and accordingly encourage you to show us a more elegant way to solve this simple problem.

Please feel free to contribute more Perl solutions or a solution in any language you fancy. I'm especially eager to admire a Perl 6 solution.

Update: See also Re: How to Split on specific occurrence of a comma (2020)

Replies are listed 'Best First'.
Re: Rosetta code: Split an array into chunks
by TimToady (Parson) on Sep 25, 2010 at 14:10 UTC
    I hope you can admire this one:
    my @l = <a bb c d e f g h>;
    while @l.munch(3) -> $_ { .say }

      Yes, I do admire this simple and clear code.

      Choosing good names is an art, and when it comes to linguistic craft and agonizing over names, I claim that TimToady has no equal. :) As a former Pac-Man player, I appreciate the evocative munch more than List::MoreUtils' hard-to-pronounce natatime.

      With functional-style languages, you need a lot of names, so choosing good and consistent names is crucial. Certainly, Haskell has a vast number of functions in its core library; related Haskell names here appear to be take, drop and splitAt from Data.List; more specific to the problem at hand, though not in core, is hackage Data.List.Split's splitEvery aka chunk. Does Perl 6 have equivalents to Haskell's take, drop, splitAt, splitEvery?

      While salivating over munch, I hungrily searched for its documentation. Googling for "Perl 6 munch" left me unsatisfied, turning up only the Edvard Munch Museum tourist attraction in Oslo, among various IRC logs and blog entries, but no hits from the official Perl 6 docs. And I couldn't find it in S32-setting-library-Containers ... though I did find vaguely related Perl 6 names there, namely: shift, pop, splice, zip. Does anyone know where munch is officially documented?

      Update: I just checked an online dictionary and it appears that munch and chomp mean almost the same thing:

      chomp - To chew or bite on noisily
      munch - To chew or eat (food) audibly or with pleasure
      A mouthwatering challenge is to explain to a Perl 6 newbie how to remember which one does what. :)

        Hmm... If you let me google that for you, "perl 6" munch, I find Perl 6 documentation on the first page of hits and again on the second page. Unfortunately, it looks like both of those mention/use .munch rather than documenting it.

        But it also found http://irclog.perlgeek.de/perl6/2010-06-04 where TimToady discusses .munch. #i_2403514 appears to be where .munch was invented... less than 4 months ago.

        - tye        

Re: Rosetta code: Split an array into chunks
by GrandFather (Saint) on Sep 25, 2010 at 09:54 UTC

    I'd have thought the idiomatic Perl solution would splice:

    sub chunky {
        my ($n, @strs) = @_;
        my $str = '';

        $str .= join (' ', splice @strs, 0, $n) . "\n" while @strs;
        return length $str ? $str : "\n";
    }
    True laziness is hard work
      Exactly, my first thought was also splice

      my take on it

      use strict;
      use warnings;
      use Test::More;

      sub chunks {
          my ( @list ) = @_ ;
          my $count = 3;
          my $str = "";
          while (my @cols = splice @list, 0, $count ) {
              $str .= "@cols\n";
          }
          return $str;
      }

      # ========= Tests
      is( chunks("a", "bb", "c", "d", "e", "f", "g", "h") => <<'__CHUNK__', "Rosetta's Example" );
      a bb c
      d e f
      g h
      __CHUNK__

      is( chunks() => "", "Empty" );

      done_testing;

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

        Even shorter and faster (and most probably also memory efficient)

        use strict;
        use warnings;
        use Test::More;

        sub chunks {
            my $str = "";
            while (my @cols = splice @_, 0, 3) {
                $str .= "@cols\n";
            }
            return $str;
        }

        # ========= Tests
        my @list = ("a", "bb", "c", "d", "e", "f", "g", "h");
        my @old  = @list;

        my $text = "a bb c\n" . "d e f\n" . "g h\n" ;

        is( chunks(@list) => $text, "Rosetta's Example");
        is( chunks() => "", "Empty string" );
        is( chunks("") => "\n", "Empty List" );
        is_deeply( \@list, \@old, "Non destructive! yeah ... :)");

        done_testing;

        NB: @_ is an array of aliases, not an alias itself. Destroying it doesn't affect the argument given :)

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery

        Why splice when you can regex?

        #!/usr/bin/perl

        use strict; # https://perlmonks.org/?node_id=861938
        use warnings;

        sub chunk_array {
            my $n = shift() - 1;
            "@_\n" =~ s/(?: \S*){$n}\K /\n/gr;
        }

        my $v1 = chunk_array(3, "a", "bb", "c", "d", "e", "f", "g", "h");
        $v1 eq "a bb c\nd e f\ng h\n" or die "error: '$v1'\n";
        print $v1;
        print "------------------------------\n";
        my $v2 = chunk_array(3, "a", "bb", "c", "d", "e", "f");
        $v2 eq "a bb c\nd e f\n" or die "error: '$v2'\n";
        print $v2;

        Outputs:

        a bb c
        d e f
        g h
        ------------------------------
        a bb c
        d e f

        Another use of the \K and /r idioms that make Perl so great!
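For comparison, here is a rough Python rendering of the same regex idea. Python's re module has no \K, so this sketch captures the first n items in a group and puts them back in the replacement. It uses \S+ rather than \S*, so it assumes non-empty elements without whitespace; the name `pattern` is just for illustration.

```python
import re

def chunk_array(n, vals):
    text = " ".join(vals) + "\n"
    # n items separated by single spaces, followed by the space to replace
    pattern = r"((?:\S+ ){%d}\S+) " % (n - 1)
    # keep the captured items, turn the following space into a newline
    return re.sub(pattern, lambda m: m.group(1) + "\n", text)

print(chunk_array(3, ["a", "bb", "c", "d", "e", "f", "g", "h"]), end="")
```

As with the Perl version, a short final group never matches the pattern, so it keeps the newline that was appended up front.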

Re: Rosetta code: Split an array into chunks
by moritz (Cardinal) on Sep 25, 2010 at 10:15 UTC
    Here is a Perl 6 solution in a single statement. It's probably not the best solution, but it does work today in rakudo:
    my @l = <a bb c d e f g h>;
    sub chunky(@l, $len) {
        (0, $len ...^ *>=@l).map({@l[$_ .. ($_ + $len - 1 min @l.end)] ~ "\n" }).join
    }
    print chunky(@l, 3);

    The 0, $len ...^ *>=@l creates a list in steps of $len, up to (but excluding) the number of elements in @l.

    Inside the map there is an array slice. The min @l.end is only necessary because Rakudo doesn't clip array slices to the end of the list yet (which is a known and reported bug).

    I'm still looking for a nicer solution, will update my post if I find one.

    Update: Nicer solution:

    my @l = <a bb c d e f g h>;
    sub chunky(@l, $len) {
        (@l Z (' ' xx $len - 1, "\n") xx *).join.substr(0, -1) ~ "\n";
    }
    print chunky(@l, 3);

    This creates an infinite list (' ', ' ', "\n", ' ', ' ', "\n", ...), and then zips the input list with it (zip takes one item from each list in turn). Zip stops when the shortest list is exhausted, so we don't have to worry about it looping forever.

    The result is then joined together, and the last separator is unconditionally replaced with a newline.
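The zip-with-a-repeating-separator idea carries over to other languages. A minimal Python sketch, assuming itertools.cycle as the stand-in for the infinite separator list (the name `seps` is mine):

```python
from itertools import cycle

def chunky(vals, n):
    # endless separators: n-1 spaces, then a newline, repeated
    seps = cycle([" "] * (n - 1) + ["\n"])
    # zip stops when vals runs out, so the endless cycle is harmless
    joined = "".join(v + s for v, s in zip(vals, seps))
    # the last separator is replaced unconditionally by a newline
    return joined[:-1] + "\n"

print(chunky(["a", "bb", "c", "d", "e", "f", "g", "h"], 3), end="")
```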

    Perl 6 - links to (nearly) everything that is Perl 6.
Re: Rosetta code: Split an array into chunks
by JavaFan (Canon) on Sep 25, 2010 at 10:56 UTC
    A solution that'll be hard to map to a different language (except perhaps Perl6):
    sub TIESCALAR{bless[0,$_[1]],$_[0]}
    sub FETCH{${$_[0]}[0]++%${$_[0]}[1]?" ":"\n"};
    sub chunk_array{tie local$",'main',shift;"@_\n";}

      Looks good,

      Can you explain how it works?

Re: Rosetta code: Split an array into chunks
by johngg (Canon) on Sep 25, 2010 at 18:11 UTC

    This seems to work.

    knoppix@Microknoppix:~$ perl -E '
    > @l = qw{ a bb c d e f g h };
    > say sub { join q{ }, @_ }->(
    >     grep defined, map shift @l, 1 .. 3
    > ) while @l;'
    a bb c
    d e f
    g h
    knoppix@Microknoppix:~$

    I also piped the output through hexdump to check that there were no trailing spaces (which you do get without the grep).

    Update: Using List::Util::min() to remove the need for the grep.

    knoppix@Microknoppix:~$ perl -MList::Util=min -E '
    > @l = qw{ a bb c d e f g h };
    > say sub { join q{ }, @_ }->(
    >     map shift @l, 1 .. min( 3, scalar @l )
    > ) while @l;'
    a bb c
    d e f
    g h
    knoppix@Microknoppix:~$

    Cheers,

    JohnGG

Re: Rosetta code: Split an array into chunks
by liverpole (Monsignor) on Sep 25, 2010 at 18:14 UTC

    Here's my obfuscated Perl solution:

    use strict;
    use warnings;

    my @list = ("a", "bb", "c", "d", "e", "f", "g", "h");
    my $chunky = chunky(3, @list);
    print "Chunky(@list) = \n$chunky\n";

    sub chunky {
        my ($n, $i, $_) = shift;
        while (\0) {
            $_ .= (shift or last) . (!@_? $@: chr 32/(++$i % $n? 1: 3))
        }
        $_.$/
    }

    __END__
    Chunky(a bb c d e f g h) =
    a bb c
    d e f
    g h

    Give that to anyone asking this question as homework ;-)


    s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
Re: Rosetta code: Split an array into chunks
by chb (Deacon) on Sep 27, 2010 at 07:42 UTC
    Common Lisp anyone ?
    (defun chunky (strings &optional (chunksize 3))
      (format nil "~{~A~%~}"
              (loop while strings collect
                    (reduce #'(lambda (a b) (concatenate 'string a " " b))
                            (loop repeat chunksize append
                                  (when strings (list (pop strings))))))))

    (chunky '("a" "bb" "c" "d" "e" "f" "g" "h"))
    "a bb c
    d e f
    g h
    "
Re: Rosetta code: Split an array into chunks
by paddy3118 (Acolyte) on Sep 26, 2010 at 18:09 UTC
    Another Python solution:
    >>> def formatter(data, chunksize=3):
            return ''.join( (' ' if i % chunksize else '\n') + d
                            for i, d in enumerate(data))[1:]

    >>> print formatter(("a", "bb", "c", "d", "e", "f", "g", "h"), 3)
    a bb c
    d e f
    g h
Re: Rosetta code: Split an array into chunks
by muba (Priest) on Sep 26, 2010 at 04:34 UTC
    muba@localhost ~ $ cat rosetta
    sub chunky {
        my ($n, @l) = @_;
        my @f = ( ( (' ')x ($n-1)), "\n");
        my $str = '';
        $str .= shift(@l) . do {push(@f, shift @f); @l ? $f[-1] : ""} while @l;
        return "$str\n";
    }

    print chunky 3, qw(a bb c d e f g h);
    muba@localhost ~ $ perl rosetta
    a bb c
    d e f
    g h
    muba@localhost ~ $
Re: Rosetta code: Split an array into chunks
by stefan k (Curate) on Sep 27, 2010 at 08:42 UTC
    Clojure?
    (defn pm-chunk [data chsize]
      (apply str
             (interleave
              (map #(apply str (interpose " " %))
                   (partition chsize chsize nil data))
              (repeat \newline))))
    Then use it at the REPL:
    user=> (print (pm-chunk ["a" "bb" "c" "d" "e" "f" "g" "h"] 3))
    a bb c
    d e f
    g h
    Please note that the rather ugly "apply str" at the top is necessary to fulfil the perlish requirements. Usually you'd stop at creating the right sequence, which is achieved after the call to interleave.

    Regards... stefan k
    you begin bashing the string with a +42 regexp of confusion

Re: Rosetta code: Split an array into chunks
by MonkOfAnotherSect (Sexton) on Sep 28, 2010 at 02:45 UTC
    The simplest Python version is probably:
    def chunk_array(n, vals):
        return "\n".join(" ".join(vals[i:i+n]) for i in range(0, len(vals), n))
    Given that you have to return one big string anyway, it's not worthwhile doing anything more fancy with generators/iterators.
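One caveat: joining the lines with "\n" leaves the final line without a newline, while the problem statement asks for a newline-terminated last line. A small variation on the same slicing idea, offered as a sketch rather than as the poster's intent, appends the newline to every line instead:

```python
def chunk_array(n, vals):
    # slice n items at a time; each line carries its own trailing newline
    return "".join(
        " ".join(vals[i:i + n]) + "\n"
        for i in range(0, len(vals), n)
    )

print(chunk_array(3, ["a", "bb", "c", "d", "e", "f", "g", "h"]), end="")
```

For an empty list this returns an empty string.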
Re: Rosetta code: Split an array into chunks
by karlgoethebier (Abbot) on Oct 23, 2021 at 17:06 UTC

    Stolen from brother BUK, our beloved former leader: print splice @a, 0, 5 for 1 .. 5; It rhymes somehow again! Not tested but I guess it works. I’m still on a mobile device and in a hurry.

    See also Generate a # between 1 and 50 in groups of 5 as well as the sources of List::MoreUtils::PP:

    sub natatime ($@)
    {
        my $n    = shift;
        my @list = @_;
        return sub {
            return splice @list, 0, $n;
        }
    }
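The closure trick in natatime is easy to mirror elsewhere. Here is a hedged Python sketch of the same shape: a function that copies its input and returns another function handing out the next n items on each call. The names `remaining` and `next_chunk` are my own, not part of any library.

```python
def natatime(n, items):
    remaining = list(items)      # private copy, like `my @list = @_`

    def next_chunk():
        chunk = remaining[:n]    # take up to n items from the front...
        del remaining[:n]        # ...and remove them, like splice
        return chunk             # empty list once everything is consumed

    return next_chunk

it = natatime(3, ["a", "bb", "c", "d", "e", "f", "g", "h"])
chunk = it()
while chunk:
    print(" ".join(chunk))
    chunk = it()
```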

    Best regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

Re: Rosetta code: Split an array into chunks
by Arunbear (Prior) on Feb 11, 2011 at 19:52 UTC
    Another Lisp variant:
    #lang scheme
    (require scheme/string)

    (define (chunky li sz [str ""])
      (let ([graft (lambda (s l)
                     (string-append s (string-join l " ") "\n"))])
        (if [<= (length li) sz]
            (graft str li)
            (let-values ([(x y) (split-at li sz)])
              (chunky y sz (graft str x))))))

    (display (chunky '("a" "bb" "c" "d" "e" "f" "g" "h") 3))

    +% mzscheme chunky.ss
    a bb c
    d e f
    g h
    +%
Re: Rosetta code: Split an array into chunks
by shevek (Beadle) on Oct 23, 2010 at 01:31 UTC
    How about an iterator for it?
    sub gen_group_array {
        my ($group,$array) = @_;
        my ($start,$end) = (0,$group - 1);
        return sub {
            my $str = join ' ',@$array[$start..$end];
            $start += $group;
            $end += $group;
            return $str;
        };
    }

    my @array = qw(a b c d e f);
    my $group_size = 2;
    my $grouping = gen_group_array($group_size,\@array);
    print $grouping->() . "\n" for 1 .. ($#array+1)/$group_size;
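As far as I can tell, the driver loop above runs ($#array+1)/$group_size times, so it suits lists whose length is a multiple of the group size. A generator-based sketch in Python that also yields a short final group could look like this; `gen_groups` is a hypothetical name, and itertools.islice does the chunking.

```python
from itertools import islice

def gen_groups(iterable, size):
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))   # next `size` items, fewer at the end
        if not chunk:
            return                       # input exhausted
        yield " ".join(chunk)

for line in gen_groups(["a", "b", "c", "d", "e", "f", "g"], 2):
    print(line)
```

Because it only pulls items on demand, it works on any iterable, not just lists.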

Node Type: perlmeditation [id://861938]
Approved by GrandFather
Front-paged by moritz