LanX has asked for the wisdom of the Perl Monks concerning the following question:
Hi
First I thougt about writing a longer meditation about iterators and semipredicate problem, but maybe a simple seek for perl wisdom is a better start.
In gnu.emacs.help someone asked
How would you code this simple list compression problem in Ruby:
1.08 (**) Eliminate consecutive duplicates of list elements.
If a list contains repeated elements they should be replaced with
a single copy of the element. The order of the elements should not be
changed.
Example:
?- compress([a,a,a,a,b,c,c,a,a,d,e,e,e,e],X).
X = [a,b,c,a,d,e]
my first idea to solve it in perl was grep
DB<1> my $p; print grep { $p = $_ if $_ ne $p } (a,a,a,a,b,c,c,
+a,a,d,e,e,e,e);
abcade
but this approach fails for false list elements like e.g "0"!
A nice exmaple where a grep like map is better than the plain grep
DB<2> my $p; print map { $p ne $_ ? $p = $_ : () } (a,a,a,a,b,c,c,a
+,a,d,e,e,e,e)
abcade
But beside the need to lexicalize $p in the outer scope ...
using state wouldn't be more elegant :
DB<3> use feature "state"; print map { state $p; $p ne $_ ? $p = $_ :
+ () } (a,a,a,a,b,c,c,a,a,d,e,e,e,e)
abcade
... it's still buggy if the compressed list was starting with undef values.
Thats a variation of the old Semipredicate problem:
"One can't find an initial value which is not also potentially a part of the iterated list!"
So I had a look into List::Util and the only appropriate approach was reduce
The first call will be with $a and $b
set to the first two elements of the list, subsequent calls will be
done by setting $a to the result of the previous call and $b to the
next element in the list.
Looks perfect, but reduce {} LIST only returns a scalar and List::MoreUtils doesn't seem to provide anything better¹.
(Sure I could use $a as an array-ref for accumulation, not very elegant...)
Temporary conclusion:
There seem to be a lack of iterators allowing to compare successive elements
And now I'm wondering about the best design ...
could a special variable like $^PRE generally help augmenting existing iterators, without the need to define and appropiately name a whole new family of reduce-like iterators?
( IMHO too many specialized functions like in LISP are hard to remember.)
something like
print grep { $_ ne $^PRE } (a,a,a,a,b,c,c,a,a,d,e,e,e,e);
abcade
Thoughts?
¹) couldn't find much more on CPAN...
Re: reduce like iterators
by ELISHEVA (Prior) on Jan 03, 2011 at 18:27 UTC
|
This is fairly easy to implement and wrap in a sub, even with the undefs. Once the sub is written you are back to having a one-liner. What would you like that this wouldn't do?
use strict;
use warnings;
sub compress {
my $x;
map {
if (!defined($x)) {
defined($_) ? ($x = $_) : ()
} else {
defined($_) && ($x eq $_) ? () : ($x = $_)
}
} @_;
}
my @aData=(qw(a a a a b c c a a d e e e e), undef, undef, qw(f g g));
my @aCompressed = compress @aData;
print "compressed: @aCompressed\n";
# outputs
a b c a d e undef f g
Or if you want to play golf (though others I'm sure can do better)
sub compress { my$x;map{defined($x)?(defined($_)&&($x eq$_)?():($x=$_)
+):defined($_)?($x=$_):()}@_;
}
Update: added golf
Update:: fixed mistake - undefs were strings. | [reply] [d/l] [select] |
|
Sure, I also did a function like this in the emacs help group
But I was asking about designing a generic iterator solution, opening a variety of applications and augmenting readablity
already a reduce_list {BLOCK} LIST variant of reduce could be sufficient in this case
reduce_list { $a ne $b } qw(a a a a b c c a a d e e e e undef undef f g g)
another application would be Haskells group iterator which partitions into successive groups:
group [1, 1, 1, 1, 2, 3, 3, 1, 1, 4, 5, 5, 5, 5]
==> [[1,1,1,1], [2], [3,3], [1,1], [4], [5,5,5,5]]
Simpling allowing access to the previous iteration easily would help implementing this with part from List::MoreUtils
part { $a eq $b } LIST
or
part { $^PRE eq $_ } LIST
Without the need to invent and name a whole new group of iterators.
| [reply] [d/l] [select] |
|
Your reduce_list is named similarly to reduce, but bears no resemblance since it doesn't to reduce at all.
-
reduce allows an arbitrary state to be passed from one pass to another. Your reduce_list doesn't.
-
reduce can return any value, not just the input. reduce_list can return at most one scalar, and it can only be the input. (That's not very "listy"!)
Because of those reasons, reduce is a general purpose function. (It can implement any other function in List::Util.) Your reduce_list is just grep with access to the last element.
This accounts for the differences with what I suggested it should look like.
| [reply] [d/l] [select] |
|
|
|
Perl also lets you use parameters to roll your own block syntax. Here is a slightly different definition that would allow you to do arbitrary operations on the first element of each run in a list:
Or if you want to be able to work with the current run value ($b) and the previous run value $a, you could do something like this:
There is almost no limit to what you could create. For example, if you wanted to be able to group items, you could additionally track the number of items in each run and set $_ to number of items in the current run:
Would that do what you want?
Update: added yet another block iterator, this time one that could be used for generating groups as in the Haskell function above.
Update: put in readmore tags
Update: fixed 3rd example so it prints count of current (not previous) run.
| [reply] [d/l] [select] |
|
|
|
Re: reduce like iterators
by BrowserUk (Patriarch) on Jan 03, 2011 at 18:49 UTC
|
#! perl -slw
use strict;
sub adjacentPairs (&@) {
my $code = shift;
map {
local( $a, $b ) = @_[ $_-1, $_ ];
$code->();
} 1 .. @_;
}
print adjacentPairs{
defined $b && $a eq $b ? () : $a
} qw[a a a b c c a a d e e e e];
__END__
C:\test>compress
abcade
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] |
Re: reduce like iterators
by JavaFan (Canon) on Jan 03, 2011 at 19:05 UTC
|
You have to be careful with any state solution. They'll work fine until you go through the same code path again (due to a loop, the sub it's in called again, etc). Being a state variable, it'll retain whatever was in it last. So you have to be careful you aren't comparing the last element of the first list with the first element of the second list.
@cleaned = @uncleaned[0, grep {@uncleaned[$i] ne @uncleaned[$i-1]} 1..
+$#uncleaded];
It needs some additional hackery to deal with undefs correctly. | [reply] [d/l] [select] |
|
DB<3> use feature "state"; map { state $p; print $p;$p=$_} 1..3
12
DB<5> use feature "state"; for (1..3) {map { state $p; print $p;$p=$
+_} 1..3}
12312312
thanks a lot! :)
| [reply] [d/l] |
|
This particular problem is solved in Perl 6 by always cloning all blocks as closures the same way, so state is reset in loops as well as in map. (In fact, the for loop is defined in terms of map in Perl 6. So for loops return a list of values just like map does.) Perl 5 could probably move in the direction of fewer special-cased blocks that aren't true closures; this would have many subtle benefits. Not everything in Perl 6 can be borrowed back, but I suspect this is one of them.
| [reply] |
Re: reduce like iterators
by moritz (Cardinal) on Jan 03, 2011 at 19:20 UTC
|
I couldn't resist writing a "golfed" Perl 6 version, heavily inspired by functional programming techniques:
sub compress(*@a) {
flat @a Zxx 1, (@a Zne @a[1..*-1])
}
say compress(<a a a a b c c a a d e e e e>).perl;
The *-1 is a workaround for a rakudo bug, in a perfect Perl 6 implementation 1..* would work too.
| [reply] [d/l] [select] |
Re: reduce like iterators
by Anonymous Monk on Jan 29, 2024 at 09:59 UTC
|
| [reply] |
|
> Since somebody revived this really old node
well you did
> mention there's a recent ticket
created by you
> Also found a ticket in the queue of List::MoreUtils, but the author claims the functionality already exists in List::Util ?!
He's the maintainer, I think the authors abandoned it. And he most likely misunderstood the question.
I already contacted him privately regarding the scalar behavior of slide, his answer was kind of indicating that he would pass on the maintainership if asked.
| [reply] |
Re: reduce like iterators
by NERDVANA (Priest) on Jan 10, 2024 at 03:00 UTC
|
This probably runs faster than anything with map or grep, and isn't much longer than the fully golfed ones.
sub compress {
my @x;
!@x || $_ ne $x[-1] and push @x, $_ for @{$_[0]};
\@x
}
| [reply] [d/l] |
Re: reduce like iterators
by tybalt89 (Monsignor) on Jan 10, 2024 at 21:49 UTC
|
It's much simpler than the "semipredicate problem" because all you have to do
is set the previous value to "not the first value".
#!/usr/bin/perl
use strict; # https://perlmonks.org/?node_id=880228
use warnings;
sub squeeze
{
no warnings;
my $prev = not $_[0];
return grep +($_ ne $prev, $prev = $_)[0], @_;
}
| [reply] [d/l] |
Re: reduce like iterators
by furry_marmot (Pilgrim) on Jan 04, 2011 at 01:16 UTC
|
Hi LanX,
map{} will pass on undef's, but grep{} needs undef to tell whether something passed the filter. But you can stack any number of maps and greps together, which is how I came up with a workaround for it, using two maps and a grep.
The first map{}, reading from the bottom up, simply marks duplicates in-a-row (as opposed to any dupes at all) by changing them to 'aardvark'. Undef's and zeroes go through without problem. The second map{} changes actual undefined elements to the text 'undef'. Change these as suits your algorithm. The grep{} is where the duplicates/aardvarks are finally removed.
It seems to work, accommodating undefined values, while still being simple enough to use in a one-liner, and uses array refs so it doesn't pass whole lists back and forth.
use strict;
my @orig = ( qw(a a b b c 0 c d d u u),
(undef, undef, 'blink', 'blink'),
qw(0 0 v v w w a a a b b b c c c)
);
my @list;
nodupes (\@orig, \@list);
print join ' ', @list, "\n";
sub nodupes {
my ($ar1, $ar2) = @_;
my $p;
push @$ar2, grep{$_ ne 'aardvark'}
map{defined $_ ? $_ : 'undef'}
map { $p ne $_ ? $p = $_ : 'aardvark' }
@$ar1;
}
__END__
Prints --> a b c 0 c d u undef blink 0 v w a b c
And the one-liner -- actually broken up for easier viewing:
perl -e "@list = grep{$_ ne 'aardvark'}
map{defined $_ ? $_ : 'undef'}
map{$p ne $_ ? $p = $_ : 'aardvark'}
(qw(a a b b c 0 c u u), (undef, undef), qw(0 0 v v w w));
print join ' ', @list;"
Prints --> a b c 0 c u undef 0 v w
I hope you find this interesting/useful.
--marmot
UPDATE: I was thinking about this some more, and realized I had made it waaaay too complicated. The nodupes() below pretty much is a one-liner, and accomodates 0, "0", and undef just fine. And nodupes() can be used inline with other maps and greps. In the end, the grep was the only thing needed.
use strict;
sub nodupes {
my $p;
return grep{ $_ ne '~~' } map { $p ne $_ ? $p = $_ : '~~' } @_;
}
my @orig = ( qw(a a b b c 0 c d d u u),
(undef, undef, 'blink', 'blink'),
qw(0 0 v v "0" "0" "0" w w a a a b b b c c c)
);
my @new = nodupes @orig;
print join ' ', @new, "\n";
__END__
Prints --> a b c 0 c d u blink 0 v "0" w a b c
^^
There's an undef between these two spaces.
| [reply] [d/l] [select] |
|
Any reason not to just do:
my $p;
my @new = grep { $p = $_ or 1 if $p ne $_ } @orig;
? | [reply] [d/l] |
|
Ummm....because I didn't think of that? :-) Thanks! You just expanded my understanding of what you can do with grep.
--marmot
| [reply] |
|
Try @orig starting with undef.
Also see the OP for the "semi-predicate problem" discussion.
If this seems too theoretical for you, consider the practical task to do of a run-length encoding of sparsely set arrays. (undef is a real value)
| [reply] [d/l] [select] |
|
|
|
|
Re: reduce like iterators
by talexb (Chancellor) on Jan 11, 2024 at 14:34 UTC
|
#!/usr/bin/perl
# 2024-0111: From the node: "Eliminate consecutive duplicates of list
# elements. If a list contains repeated elements they should be
# replaced with a single copy of the element. The order of the
# elements should not be changed."
use strict;
use warnings;
{
my @orig = qw/a a a a b c c a a d e e e e/;
my @correct = qw/a b c a d e/;
my @soln;
for ( push ( @soln, shift @orig ); @orig; ) {
my $val = shift @orig;
if ( $soln[-1] ne $val ) { push ( @soln, $val ); }
}
print "Correct solution is @correct;\n";
print "My solution is @soln.\n";
}
Because my background (before Perl) is in C, that automatically looks like a for loop problem to me. In other words, you need to prime the pump so that you're not forced into doing something Really Clever for the first element.
And I could have written
push ( @soln, $val ) if ( $soln[-1] ne $val );
to make it more Perl-ish, but I'm not a fan of the postfix syntax. My OldSchool brain wants to see an if statement at the beginning of the line.
And, yes, this assumes that the list is string values (or stringable values), and does not deal with the undef value. That reminds me of a Google interview that I had a while back ("What about this ridiculous limitation to the solution?" "And how about this even more insane exception?").
Alex / talexb / Toronto
Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.
| [reply] [d/l] [select] |
|
push @soln, $val if $soln[-1] ne $val;
I know some folks like the extra brackets but I find removing them aids clarity.
| [reply] [d/l] [select] |
|
push @soln, $val if ( $soln[-1] ne $val );
and with the comma suggesting do this thing, then do that other thing, it could read like this:
push @soln # and then ..
$val if ( $soln[-1] ne $val );
Huh. That can't be right.
You could also use indentation to make it clearer ..
push @soln, $val
if ( $soln[-1] ne $val );
Anyway, it's a matter of taste, and my preferences comes from writing C in the 80's.
Alex / talexb / Toronto
Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.
| [reply] [d/l] [select] |
|
Re: reduce like iterators
by LanX (Saint) on Jan 15, 2024 at 17:56 UTC
|
Hello younger me :)
> and List::MoreUtils doesn't seem to provide anything better
This changed in the meantime, slide was added, but there are still issues.
The worst one is that it only returns scalars instead of lists like map does, which means we have to add an extra grep to filter undefined values :/
use v5.12;
use warnings;
use List::MoreUtils qw/slide/;
use Data::Dump;
my @x= split',', q(a,a,a,a,b,c,c,a,a,d,e,e,e,e);
dd grep defined, slide { $a ne $b ? $b : () } "", @x;
| [reply] [d/l] [select] |
|
> The worst one is that it only returns scalars instead of lists like map does, which means we have to add an extra grep to filter undefined values :/
That's most likely a bug in the XS implementation of slide.
I looked into the Pure Perl implementation in List::MoreUtils::PP, and it looked fine, it's literally using map
And a test reveals, that it really works like it should.
use v5.12;
use warnings;
use Data::Dumper;
BEGIN { $ENV{LIST_MOREUTILS_PP} = 1 }; # enforce PP version, comment f
+or XS
use List::MoreUtils qw/slide/;
my @x= split',', q(a,a,a,a,b,c,c,a,a,d,e,e,e,e);
print Dumper slide { $a ne $b ? $b : () } "", @x;
$VAR1 = 'a';
$VAR2 = 'b';
$VAR3 = 'c';
$VAR4 = 'a';
$VAR5 = 'd';
$VAR6 = 'e';
XS version
$VAR1 = 'a';
$VAR2 = undef;
$VAR3 = undef;
$VAR4 = undef;
$VAR5 = 'b';
$VAR6 = 'c';
$VAR7 = undef;
$VAR8 = 'a';
$VAR9 = undef;
$VAR10 = 'd';
$VAR11 = 'e';
$VAR12 = undef;
$VAR13 = undef;
$VAR14 = undef;
| [reply] [d/l] [select] |
|
| [reply] [d/l] |
|
Re: reduce like iterators
by ikegami (Patriarch) on Jan 03, 2011 at 18:51 UTC
|
my @b = uniq @a;
my %seen;
my @b = grep !$seen{$_}++, @a;
But if you really want to use reduce with just one expression, it's definitely possible.
my @b = @{( reduce { push @{$a->[1]}, $b if !$a->[0]{$b}++ } [ {}, []
+], @a; $a )->[1]};
| [reply] [d/l] [select] |
|
read the example again, it's not about uniq but identifying sequences!
update:
DB<1> use List::MoreUtils "uniq"; print uniq qw(a a a a b c c a a d
+e e e e)
abcde
| [reply] [d/l] |
|
my @b = @{ reduce { push @$a, $b if !@$a || $b ne $a->[-1]; $a } [], @
+a };
If one were to make a list version of reduce, the callback would need access to three variables: The list (say $_), the state (say $a), the current value (say $b). The problem could be solved as follows:
my @b = list_reduce { push @$_, $b if !@$_ || $b ne $_->[-1]; undef }
+undef, @a;
my @b = list_reduce { push @$_, $b if $a || $b ne $_->[-1]; 0 } 1, @a;
my @b = list_reduce { push @$_, $b if defined($b) && $b ne $a; $b } un
+def, @a;
my @b = list_reduce { push @$_, grep defined && $_ ne $a, $b; $b } und
+ef, @a;
(@a may not start with undef for the last two to work properly.)
Update: Added everything after the first block of code.
| [reply] [d/l] [select] |
|
|
|
|
|
|
Re: reduce like iterators
by Anonymous Monk on Jan 03, 2011 at 19:43 UTC
|
| [reply] |
|
|