PerlMonks  

Clubbing array elements together:

by newbie1991 (Acolyte)
on Jan 29, 2013 at 11:24 UTC (#1015855)
newbie1991 has asked for the wisdom of the Perl Monks concerning the following question:

Here is what my current array looks like :

(blank) (blank) abcd efgh (blank) (blank) jklm nopq (blank) (blank)

My objective is to (a) delete the double blank elements and (b) combine all elements between each pair of double blanks into one element. I was using a foreach loop to check for blank elements, but that gets awkward when testing for consecutive blanks. Is there a more concise way? The sample output should be:

abcdefgh jklmnopq

Re: Clubbing array elements together:
by choroba (Abbot) on Jan 29, 2013 at 11:33 UTC
    You can profit from the input record separator $/:
    $/ = q();
    while (<>) {
        s/\n//g;
        print "$_\n";
    }
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
      How do you wanna apply this on arrays?

      Though I think you have the right idea: the array most likely originates from reading a file.

      So it's obviously an XY Problem and should be solved directly.
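If the array does originate from a file, the paragraph-mode idea can be tried on the input directly. The following is a minimal sketch, assuming the data is line-oriented with blank lines where the array had blank elements; `read_paragraphs` and the sample text are hypothetical and only stand in for the real file.

```perl
use strict;
use warnings;

# Hypothetical input standing in for the file the array was read from:
# two blank lines, "abcd", "efgh", two blank lines, "jklm", "nopq", two blank lines.
my $text = "\n\nabcd\nefgh\n\n\njklm\nnopq\n\n\n";

# Hypothetical helper: read records in paragraph mode and strip the newlines.
sub read_paragraphs {
    my ($input) = @_;
    open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";
    local $/ = '';                # empty string selects paragraph mode
    my @records;
    while ( my $para = <$fh> ) {
        $para =~ s/\n//g;         # join the lines of one paragraph
        push @records, $para if length $para;
    }
    close $fh;
    return @records;
}

my @combined = read_paragraphs($text);
print "$_\n" for @combined;
```

Note that paragraph mode treats any run of blank lines as one separator, so a single blank line also ends a record. That is a looser rule than "double blanks only".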

Re: Clubbing array elements together:
by Athanasius (Monsignor) on Jan 29, 2013 at 11:40 UTC

    Here is one approach:

    #! perl
    use Modern::Perl;
    use Data::Dump;

    my @array1 = (
        undef, undef, 'abcd', 'efgh', undef, undef, 'jklm', 'nopq', undef, undef,
    );
    my @array2 = split /\0\0/, join('', map { $_ // "\0" } @array1);
    @array2 = @array2[1 .. $#array2] unless $array2[0];

    dd @array1;
    dd @array2;

    Output:

    21:38 >perl 508_SoPW.pl
    (
      undef,
      undef,
      "abcd",
      "efgh",
      undef,
      undef,
      "jklm",
      "nopq",
      undef,
      undef,
    )
    ("abcdefgh", "jklmnopq")
    21:38 >

    Update 1: The above assumes that “blank” means undef. If it means '' (the empty string), then change the expression:

    map { $_ // "\0" } @array1

    to

    map { $_ eq '' ? "\0" : $_ } @array1

    (This originally read map { $_ || "\0" } @array1; it was amended on 30th January, 2013 to address the problem noted by The Perlman, below.)

    Update 2: As with muba’s solution below, the above assumes that the array begins with a double blank.

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

      Nice! But map { $_ || "\0" } @array1 will fail with anything false like "0".
      - Ron
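The difference between the two map expressions can be seen with an element such as '0'. This is a small sketch, assuming "blank" means the empty string; `club` is a hypothetical helper that wraps the split/join approach from the post above so both mappings can be compared.

```perl
use strict;
use warnings;

# Assumption: "blank" is the empty string, and '0' is a legitimate value.
my @array1 = ('', '', 'abcd', '0', '', '', 'jklm', 'nopq', '', '');

# Hypothetical helper: $mapper turns one element into itself or the "\0"
# placeholder; the rest is the split/join idea from the post above.
sub club {
    my ($mapper, @items) = @_;
    my @out = split /\0\0/, join('', map { $mapper->($_) } @items);
    shift @out if @out && $out[0] eq '';   # drop the leading empty field
    return @out;
}

my @with_or = club(sub { $_[0] || "\0" }, @array1);              # '0' is treated as blank
my @with_eq = club(sub { $_[0] eq '' ? "\0" : $_[0] }, @array1); # '0' is kept

# Make any stray placeholder visible when printing.
for my $result (\@with_or, \@with_eq) {
    print join(' | ', map { (my $s = $_) =~ s/\0/\\0/g; $s } @$result), "\n";
}
```

With `||`, the '0' is replaced by a placeholder, so it vanishes from the first group and a stray "\0" ends up at the front of the second one. The `eq ''` test keeps it as part of "abcd0".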

      Hi, can you explain to me what the code below is doing:

      dd @array1; dd @array2;

        To print out the contents of an array, you can use join: print join(', ', @array1), "\n";. However, for more complicated (i.e. nested) data structures, you need to write loops to iterate over the data to be printed. Fortunately, Perl has a module Data::Dumper which handles all of this for you:

        use Data::Dumper;
        ...
        print Dumper(\@array1), "\n";

        This is a core module, so it comes as part of your Perl installation. And there are other modules available on CPAN which do essentially the same job, but with some differences. My current favourite is Data::Dump, which can be used like this:

        use Data::Dump;
        ...
        dd @array1;

        See Data::Dump.

        Hope that helps,

        Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,
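To make the contrast between join and a dumper module concrete, here is a short sketch using only the core Data::Dumper module. The `%nested` structure is a made-up example, and the two `$Data::Dumper::...` settings are optional and only make the output compact and stable.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # print hash keys in sorted order
$Data::Dumper::Indent   = 1;   # compact, fixed indentation

my @array2 = ('abcdefgh', 'jklmnopq');

# A flat array prints fine with join ...
my $flat = join(', ', @array2);
print "$flat\n";

# ... but a nested structure (hypothetical example) is easier to inspect with Dumper.
my %nested = ( combined => \@array2, count => scalar @array2 );
my $dump = Dumper(\%nested);
print $dump;
```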

Re: Clubbing array elements together:
by ww (Bishop) on Jan 29, 2013 at 11:52 UTC

    Is your "(blank)" literal, undef, nul or what?

      blank is "", or just an element with nothing in it. My bad, should have specified!
      This is a special case. It is important that you read the documentation for $/. (Refer: perldoc -v $/)
      Bill
Re: Clubbing array elements together:
by sundialsvc4 (Abbot) on Jan 29, 2013 at 13:28 UTC

    I prefer to solve such problems using a Finite-State Machine (FSM) algorithm, because it represents a generalized and easily-adaptable approach to the problem.

    For example, from an initial state (that is to say, “while not in final state ...”), you read a line and have one of three possibilities: end-of-file, a blank string, or not-all-blanks. That would lead to final, skip_blanks, or first_string. And so on. The FSM can be drawn out on a piece of paper as the Wikipedia article describes. When you determine (on paper) that the graph is entirely correct, the programming is a snap.

    Since the FSM graph can be tricky to correctly design, I suggest that you build a test-suite using e.g. Test::Most to prove that it works as intended in all cases.

      sundialsvc4:

      "I prefer to solve such problems using a Finite-State Machine (FSM) algorithm, because it represents a generalized and easily-adaptable approach to the problem."

      Sounds really cool. I'd like very much to see how you code it.

      Thanks in advance for advice and best regards, Karl

      «The Crux of the Biscuit is the Apostrophe»
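No FSM code was posted in this thread. One way such a machine might look for the array in the question is sketched below. It is not the poster's implementation. It assumes that "blank" is the empty string, that a single blank is ignored, that two or more consecutive blanks close a group, and that the end of the input also closes a group. The state names (`collect`, `one_blank`, `separator`) and the function name `club_fsm` are hypothetical.

```perl
use strict;
use warnings;

# Transition table: state => { event => [ next_state, action ] }
my %table = (
    collect   => { text => [ 'collect', 'append' ], blank => [ 'one_blank', 'none'  ] },
    one_blank => { text => [ 'collect', 'append' ], blank => [ 'separator', 'flush' ] },
    separator => { text => [ 'collect', 'append' ], blank => [ 'separator', 'none'  ] },
);

sub club_fsm {
    my @input = @_;
    my @output;
    my $buffer = '';
    my $state  = 'collect';

    for my $item (@input) {
        my $event = $item eq '' ? 'blank' : 'text';
        my ( $next, $action ) = @{ $table{$state}{$event} };

        if ( $action eq 'append' ) {
            $buffer .= $item;
        }
        elsif ( $action eq 'flush' ) {
            push @output, $buffer if length $buffer;
            $buffer = '';
        }
        $state = $next;
    }

    # Assumption: end of input closes whatever group is still open.
    push @output, $buffer if length $buffer;
    return @output;
}

my @result = club_fsm( '', '', 'abcd', 'efgh', '', '', 'jklm', 'nopq', '', '' );
print "$_\n" for @result;
```

Keeping the transitions in a table means that a rule change, such as treating a single blank as an error, is an edit to one table row rather than to nested conditionals.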

Re: Clubbing array elements together:
by RichardK (Priest) on Jan 29, 2013 at 15:10 UTC

    Perhaps you're asking the wrong question :)

    Where did your data come from? And can you get it in the format you need, rather than having to munge it into shape later?

    e.g. If you're creating the array by reading a file, you can remove the blank lines when looping through lines.
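A sketch of that idea follows, assuming one element per line and an empty line for each blank. The function name `groups_from_lines` and the in-memory sample are hypothetical; a real script would open the actual file instead.

```perl
use strict;
use warnings;

# Hypothetical helper: build the combined groups while reading, so the
# intermediate array with blank elements is never created.
sub groups_from_lines {
    my ($fh) = @_;
    my @groups;
    my $current = '';
    my $blanks  = 0;          # number of consecutive blank lines seen

    while ( my $line = <$fh> ) {
        chomp $line;
        if ( $line eq '' ) {
            $blanks++;
            if ( $blanks == 2 ) {                 # a double blank closes the group
                push @groups, $current if length $current;
                $current = '';
            }
            next;
        }
        $blanks = 0;
        $current .= $line;
    }
    push @groups, $current if length $current;    # input ended inside a group
    return @groups;
}

# In-memory sample standing in for the real file.
my $sample = "\n\nabcd\nefgh\n\n\njklm\nnopq\n\n\n";
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
my @groups = groups_from_lines($fh);
close $fh;

print "$_\n" for @groups;
```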

Re: Clubbing array elements together:
by Utilitarian (Vicar) on Jan 29, 2013 at 15:13 UTC
    What's wrong with a foreach approach?
    $ perl -Mstrict -MData::Dumper -e '
    my @words=("",",","abcd","efgh","","","jklm","nopq","");
    my $combined;
    my @combinations;
    for my $word (@words){
        if ($word){
            $combined .=$word;
        }
        else{
            push @combinations, $combined if $combined;
            $combined="";
        }
    }
    @words=@combinations;
    print Dumper\@words;'
    $VAR1 = [
              ',abcdefgh',
              'jklmnopq'
            ];
    print "Good ",qw(night morning afternoon evening)[(localtime)[2]/6]," fellow monks."

      In itself, there is nothing wrong with a foreach loop. However, the original specification mentioned double blanks, and neglected to explicitly mention what should happen with a single blank. It can be inferred from "b) All elements between a set of double blanks should be combined together into one element" that single blanks should just be added to that one element. Or ignored, which boils down to the same. Your code, however, will act as soon as it finds a blank, no matter whether it's a double or single one:

      use Data::Dumper;

      # Two strings, single blank, two strings, double blank
      my @words=("abcd","efgh", "", "jklm","nopq", "", "");
      my $combined;
      my @combinations;
      for my $word (@words){
          if ($word){
              $combined .=$word;
          }
          else{
              push @combinations, $combined if $combined;
              $combined="";
          }
      }
      @words=@combinations;
      print Dumper\@words;

      $VAR1 = [
                'abcdefgh',
                'jklmnopq'
              ];

      Here's a little something that almost does what newbie1991 specified. It just doesn't deal well with arrays that don't start with a double blank: it will just pretend that the array did.

      use strict;
      use warnings;
      use Data::Dump 'pp';

      my @words1 = ("", "", "abcd", "efgh", "", "", "jklm", "nopq", "", "");
      my @words2 = ("", ",", "abcd", "efgh", "", "", "jklm", "nopq", "");
      my @words3 = ("", "", "abcd", "efgh", "", "jklm", "nopq", "", "", "rstu", "vwxy", "", "");

      print "Double blanks:\n";
      my @combined1 = process(@words1);
      pp \@combined1;

      print "\nArray doesn't start or end with double blanks:\n";
      my @combined2 = process(@words2);
      pp \@combined2;

      print "\nArray contains single blank:\n";
      my @combined3 = process(@words3);
      pp \@combined3;

      sub process {
          my @input = @_;
          my @output = ();
          my $buffer = undef;

          my $last = shift @input;    # We take the first element for whatever it is
          while ( defined(my $this = shift @input) ) {
              if ($last eq "" and $this eq "") {
                  # Double blanks
                  # When we run into the first double blank,
                  # $buffer will be undef.
                  # We don't want to push that.
                  push @output, $buffer if defined $buffer;
                  $buffer = "";
              } elsif ($this ne "") {
                  # Non-blank string
                  # ($buffer || "") to prevent "undefined value in concatenation" warning
                  $buffer = ($buffer || "") . $this;
              }
              $last = $this;
          }

          return @output;
      }

      Double blanks:
      ["abcdefgh", "jklmnopq"]

      Array doesn't start or end with double blanks:
      [",abcdefgh"]

      Array contains single blank:
      ["abcdefghjklmnopq", "rstuvwxy"]
Re: Clubbing array elements together:
by sundialsvc4 (Abbot) on Jan 29, 2013 at 16:53 UTC

    To my way of thinking, the presence of a blank line could be a significant and intentional part of the input.   It could also be “an intended stricture of the data,” i.e. “deviation from which indicates bad data,” if more-than-two or other-than-two nonblank strings occur betwixt the blanks (or if only one nonblank instead of two appear at the end).   I generally believe that it ought to be the program’s responsibility not only to do the right thing in all cases, but to detect and report anything that is designated to be not-the-right situation with regard to its own inputs.   (Otherwise, you might well have a malfunction ... or, worse yet, unrecognized incorrect-output ... because no one is in the position to detect the flaw or to call attention to it other than this program itself.)

    The FSM-approach that I outlined previously will, in a very implementable and adjustable way, enable this sort of thing to be done.   Obviously, every case is different, but I do find that this has consistently benefited my projects.

    I guess what I’m driving at is ... there is one sort of approach which says, “okay, it seems to work so I’m done.”   But there is also another approach which allows one to say, “because this program completed without error, we can assert, not only that the output is good, but that the inputs also were good.”   All 1,771,561 of them.   Extremely do-able, and of course, frequently beneficial.
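A sketch of what "detect and report" could mean for this data is shown below. The rules it enforces are assumptions drawn from the description above, not a stated specification: every group should contain exactly two non-blank strings, single blanks are suspicious, and the input should end with a double blank. The function name `club_strict` and the wording of the messages are hypothetical.

```perl
use strict;
use warnings;

# Returns two array refs: the combined strings and a list of problem reports.
sub club_strict {
    my @input = @_;
    my ( @output, @problems, @group );
    my $blank_run = 0;

    # Close the current group, reporting it if it is not a pair.
    my $close_group = sub {
        my ($pos) = @_;
        return unless @group;
        push @problems, sprintf( "group ending before index %d has %d strings, expected 2",
            $pos, scalar @group )
            if @group != 2;
        push @output, join( '', @group );
        @group = ();
    };

    for my $i ( 0 .. $#input ) {
        my $item = $input[$i];
        if ( $item eq '' ) {
            $blank_run++;
            $close_group->($i) if $blank_run == 2;
            next;
        }
        push @problems, "single blank before index $i" if $blank_run == 1;
        $blank_run = 0;
        push @group, $item;
    }

    if (@group) {
        push @problems, "input does not end with a double blank";
        $close_group->( scalar @input );
    }
    return ( \@output, \@problems );
}

my ( $out, $problems ) =
    club_strict( '', '', 'abcd', 'efgh', '', 'jklm', 'nopq', '', '', 'rstu', 'vwxy', '', '' );
print "output:  $_\n" for @$out;
print "problem: $_\n" for @$problems;
```

An empty problem list then supports the claim that the input matched the assumed structure, not merely that some output was produced.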
