http://www.perlmonks.org?node_id=1011020


in reply to Re^5: join all elements in array reference by hash
in thread join all elements in array reference by hash

Thanks for your prompt response and for your patience and wisdom. I didn't expect so many quick and detailed responses during the Christmas season. I've broken my exceedingly long script (300 lines or so) into several subroutines for better readability.

I found the answer to my memory problems, "make an anonymous copy", in a column by Randal Schwartz: http://www.stonehenge.com/merlyn/UnixReview/col30.html. I basically needed to put [] around my @verses array so that the hash holds a reference to a copy of it rather than a reference to @verses itself.
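A minimal sketch of the difference between storing \@verses and storing [ @verses ]. The data, the 'shared'/'copied' keys, and the loop are hypothetical stand-ins, not the original script; the point is only that every \@verses entry points at the same array, so emptying or refilling @verses later changes all of them.

```perl
use strict;
use warnings;

my %bible;
my @verses;    # one array, reused for every chapter (hypothetical setup)

for my $chapter (1, 2) {
    @verses = ("chapter $chapter, verse 1", "chapter $chapter, verse 2");

    # Reference to the one shared @verses array.
    $bible{shared}{$chapter}{verses} = \@verses;

    # Reference to a fresh anonymous array holding a copy of the elements.
    $bible{copied}{$chapter}{verses} = [ @verses ];
}

@verses = ();    # clearing the working array empties every shared entry too

printf "shared chapter 1 has %d verses\n", scalar @{ $bible{shared}{1}{verses} };
printf "copied chapter 1 has %d verses\n", scalar @{ $bible{copied}{1}{verses} };
print  "copied chapter 1 starts: ", $bible{copied}{1}{verses}[0], "\n";
```

With the shared reference the first line reports 0 verses, which would look exactly like an empty array at join time; the copied entry still holds its 2 verses.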

In terms of the foreach stylistic point (@{$bible...), that was sort of the syntax I was looking for for my join, but my issue had been that I had an empty array. Would not the following be even cleaner? Comments on efficiency?

print join("\n\n",@{$bible{$qbook}{$chapter}{'verses'}});
That seems to be working for me now. I may have future questions regarding my little project but now I have some experience regarding "too little" and "too much" so I'll try to ensure I follow your advice of making it "just right" and using <readmore> as appropriate.

Re^7: join all elements in array reference by hash
by AnomalousMonk (Archbishop) on Dec 31, 2012 at 20:10 UTC
    In terms of the foreach stylistic point (@{$bible...), that was sort of the syntax I was looking for for my join... Would not the following be even cleaner? Comments on efficiency?

    print join("\n\n",@{$bible{$qbook}{$chapter}{'verses'}});

    The salient difference between a statement like
        print join("\n\n",@{$bible{$qbook}{$chapter}{'verses'}});
    and a loop like
        foreach my $verse (@{$bible{$qbook}{$chapter}{'verses'}}) {
            print $verse, "\n\n";
            }
    is that the join built-in builds a new string in memory holding all the elements of the array concatenated together, and the for-loop does not. In fact, the for-loop does not even create a copy of any element of the array, but rather aliases $verse to each element in turn.
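A small sketch of the aliasing behaviour, using a made-up two-element @verses rather than the real data structure: assigning to the loop variable changes the array element itself, whereas join returns a separate string and leaves the array alone.

```perl
use strict;
use warnings;

my @verses = ('in the beginning', 'and the earth was without form');

# $verse is an alias for each element in turn, not a copy,
# so assigning to it modifies @verses in place.
foreach my $verse (@verses) {
    $verse = ucfirst $verse;
}
print "$verses[0]\n";    # prints "In the beginning"

# join builds one new string; @verses itself is untouched by it.
my $text = join("\n\n", @verses);
print "$text\n";
```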

    I vaguely recall that the Bible consists of fewer than 900,000 words. Even with commentaries included and using the hairiest possible UTF encoding, it's hard for me to imagine the whole thing being longer than a few score MBs as a single string, and this is easily accommodated by Perl (in addition to whatever is still sitting in the array) on any remotely modern machine/OS I'm aware of. Furthermore, certain things, e.g., multi-line regex operations, often become quite simple with such a string. OTOH, you're only talking about join-ing all the verses of a single chapter, amounting to a still relatively short string, and modern operating systems are quite well adapted to I/O operations involving many, relatively short 'lines' of data.

    IOW, the join-versus-for-loop question is one of scalability, and the task you are dealing with does not seem likely to encounter scaling problems. (If you were dealing with genomics problems, the situation would be different; such problems very often involve processing files with many MB or GB of large records, so one must be very sensitive to scaling issues.)

    So for me, the chief considerations in dealing with your code would be readability and maintainability, with efficiency a distant third and scalability nowhere in sight. Based on these considerations, my personal preference would be the for-loop: it's highly idiomatic and familiar. (But I can't help saying that my guess would be that the for-loop would also be slightly more efficient in terms of speed and certainly in terms of memory usage.)
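Rather than guessing, the two approaches can be timed with the core Benchmark module. This is only a sketch: the sample data, the via_join/via_loop names, and the choice to print into an in-memory filehandle (so terminal or disk speed does not dominate) are all assumptions for illustration, and results on real output handles may differ.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical stand-in for one chapter of the real structure.
my %bible;
$bible{Genesis}{1}{verses} = [ map { "This is the text of verse $_." } 1 .. 50 ];
my ($qbook, $chapter) = ('Genesis', 1);

sub via_join {
    my $buf = '';
    open my $out, '>', \$buf or die "Cannot open in-memory handle: $!";
    print {$out} join("\n\n", @{ $bible{$qbook}{$chapter}{verses} });
    close $out;
    return $buf;
}

sub via_loop {
    my $buf = '';
    open my $out, '>', \$buf or die "Cannot open in-memory handle: $!";
    foreach my $verse (@{ $bible{$qbook}{$chapter}{verses} }) {
        print {$out} $verse, "\n\n";    # note: also emits a trailing "\n\n"
    }
    close $out;
    return $buf;
}

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese(-1, { join => \&via_join, loop => \&via_loop });
```

The two subs are not byte-for-byte equivalent: the loop writes a separator after the last verse as well, while join only puts it between verses.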

    HTH, and best wishes for the new year.

    Update: If you want to throw readability and maintainability to the four winds and go with cute, the following might be the most efficient of all:
        { local $, = "\n\n";  print @{$bible{$qbook}{$chapter}{'verses'}}; }
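A quick check, on made-up data, that the `$,` trick writes the same bytes as the join version. `$,` is the output field separator that print places between its arguments, and `local` restores the previous value when the block exits. The in-memory filehandles here are just a convenient way to capture the output for comparison.

```perl
use strict;
use warnings;

my @verses = ('verse one', 'verse two', 'verse three');

my ($with_join, $with_ofs) = ('', '');

# Capture the join version.
open my $fh1, '>', \$with_join or die "Cannot open in-memory handle: $!";
print {$fh1} join("\n\n", @verses);
close $fh1;

# Capture the version that sets the output field separator locally.
open my $fh2, '>', \$with_ofs or die "Cannot open in-memory handle: $!";
{
    local $, = "\n\n";    # in effect only inside this block
    print {$fh2} @verses;
}
close $fh2;

print $with_join eq $with_ofs ? "same output\n" : "different output\n";
```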