Re: Shorten a list...
by toolic (Bishop) on Oct 16, 2011 at 00:10 UTC
One way is to keep track of the group count in a hash:
use warnings;
use strict;
my %groups;
while (<DATA>) {
    next unless /\S/;
    chomp;
    $groups{$_}++;
}
for (sort keys %groups) {
    print "$_\n" if $groups{$_} > 1;
}
__DATA__
Group 1
Group 2
Group 3
Group 3
Group 4
Group 5
Group 5
Group 5
Prints...
Group 3
Group 5
Thanks for the quick reply. I want the resultant list to read:
Group 3
Group 3
Group 5
Group 5
Group 5
Is this possible with the code you have given?
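One possible answer to the question above is a minimal sketch that keeps the hash of counts from toolic's code and repeats each line as many times as it was seen. The sub name repeated_groups and the sample list are made up for illustration and are not part of the original post.

```perl
use strict;
use warnings;

# Return every non-blank line that occurs more than once, repeated as
# many times as it occurs, with the groups in (string) sorted order.
sub repeated_groups {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        next unless $line =~ /\S/;    # skip blank lines
        $count{$line}++;
    }
    return map  { ($_) x $count{$_} }  # list repetition: one copy per occurrence
           grep { $count{$_} > 1 }
           sort keys %count;
}

# Sample input, already chomped (hypothetical; mirrors the data above).
my @input = ('Group 1', 'Group 2', 'Group 3', 'Group 3',
             'Group 4', 'Group 5', 'Group 5', 'Group 5');
print "$_\n" for repeated_groups(@input);
```

Because the keys are sorted as strings, "Group 10" would come before "Group 2"; a numeric sort on the group number would be needed if that matters.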
Re: Shorten a list...
by eyepopslikeamosquito (Archbishop) on Oct 16, 2011 at 10:35 UTC
Here's a version that preserves the line order within the file
while allowing out-of-order lines (for example, "Group 1"
appears on the first and last lines below) and making the empty
lines that separate groups optional (for example, there are blank lines within the "Group 5" lines below):
use strict;
use warnings;
my %seen;
print map  { $seen{$_} > 1 ? $_ x $seen{$_} . "\n" : () }
      grep { not $seen{$_}++ }
      grep { !/^\s*$/ } <DATA>;
__DATA__
Group 1
Group 2
Group 3
Group 3
Group 3
Group 4
Group 5
Group 5
Group 5
Group 5
Group 1
Running the above program produces:
Group 1
Group 1
Group 3
Group 3
Group 3
Group 5
Group 5
Group 5
Group 5
Thanks so much for all of the feedback. As I have been working on this, the format has been changed so that the original input list now looks like:
Group 1
name
Group 2
name
Group 3
name
name
Group 4
name
Group 5
name
name
name
Your suggestions so far have been really helpful. Can anyone help me now to print only the groups with multiple entries (Group 3 and Group 5) in this format:
Group 3
name
name
Group 5
name
name
name
Again, your help is greatly appreciated.
There may be other ways to do this. If the file is large (MB or GB), you might not want this method: it collects every group that is going to be printed in an array, @data, before printing anything.
#!/usr/bin/perl
use strict;
use warnings;
my (@buffer, @data);
while (<DATA>) {
    if (/^Group/) {
        push @data, [@buffer] if @buffer > 2;
        @buffer = $_;
    }
    else {
        push @buffer, $_;
    }
}
push @data, [@buffer] if @buffer > 2;
{
    local $" = '';
    print join("\n", map "@$_", @data);
}
Chris
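For the large-file case mentioned above, one option is to print each group as soon as it is complete, so that only one group is held in memory at a time. The following is a rough sketch, not the poster's code: the sub name print_big_groups, its parameters, and the in-memory sample are all assumptions made for illustration, and skipping blank lines is likewise an assumption about the input.

```perl
use strict;
use warnings;

# Read "Group" header lines followed by name lines from $in and print,
# to $out, each group that has at least $min names. A group is printed
# as soon as the next header (or end of input) is reached.
sub print_big_groups {
    my ($in, $out, $min) = @_;
    my @buffer;                          # header line plus its names
    my $flush = sub {
        print {$out} @buffer if @buffer > $min;
        @buffer = ();
    };
    while (my $line = <$in>) {
        next unless $line =~ /\S/;       # ignore blank lines (assumption)
        $flush->() if $line =~ /^Group/; # a new header closes the old group
        push @buffer, $line;
    }
    $flush->();                          # do not forget the last group
    return;
}

# Hypothetical sample read through an in-memory filehandle.
my $sample = "Group 1\nname\nGroup 3\nname\nname\n";
open my $in, '<', \$sample or die "Cannot open in-memory file: $!";
print_big_groups($in, \*STDOUT, 2);
```

With $min set to 2 the test `@buffer > $min` matches the `@buffer > 2` check in the code above. Unlike that code, this sketch does not put a blank line between the printed groups.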
Re: Shorten a list...
by Cristoforo (Curate) on Oct 16, 2011 at 01:31 UTC
My program reads the data in paragraph mode and chomps each paragraph. After chomping, only groups with more than one line still contain a newline.
#!/usr/bin/perl
use strict;
use warnings;
use 5.014;
{
    local $/ = "";
    while (<DATA>) {
        chomp;
        print "$_\n" if tr/\n//; # if 1 or more newlines
    }
}
__DATA__
Group 1

Group 2

Group 3
Group 3

Group 4

Group 5
Group 5
Group 5
prints:
Group 3
Group 3
Group 5
Group 5
Group 5
Re: Shorten a list...
by chromatic (Archbishop) on Oct 16, 2011 at 00:10 UTC
Re: Shorten a list...
by ambrus (Abbot) on Oct 16, 2011 at 09:28 UTC
sort | uniq -D
Re: Shorten a list...
by sundialsvc4 (Abbot) on Oct 16, 2011 at 13:43 UTC
A list which appears to the user as, say:
(1,2), (1,4), (1,7), (2,1), (3,5), (4,161), (4,1991)
could be physically represented in the actual application like this (“Holy LISP, Batman!!”):
(1, (2,4,7)), (2, (1)), (3, (5)), (4, (161, 1991))
An abstract data type could be constructed which knew about this internal efficiency without exposing it to its clients, providing them a “list of 2-tuples” interface while, unbeknownst to them, actually storing it and/or indexing it in a more efficient way.
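As a rough illustration of that idea, here is a small sketch in Perl. The PairList package, its methods, and the by_first field are hypothetical names invented for this example; the grouped storage is a plain hash of arrays, which is only one of several ways such a type could be built.

```perl
use strict;
use warnings;

# Hypothetical PairList: clients see a list of 2-tuples, while the
# object stores each first element once, with all of its second
# elements collected in an array (a hash of arrays).
package PairList;

sub new {
    my ($class) = @_;
    return bless { by_first => {} }, $class;
}

# Add one (first, second) pair; returns the object to allow chaining.
sub add {
    my ($self, $first, $second) = @_;
    push @{ $self->{by_first}{$first} }, $second;
    return $self;
}

# Expand the grouped storage back into [first, second] pairs,
# ordered numerically by first element.
sub pairs {
    my ($self) = @_;
    my $store = $self->{by_first};
    return map { my $k = $_; map { [ $k, $_ ] } @{ $store->{$k} } }
           sort { $a <=> $b } keys %$store;
}

package main;

my $list = PairList->new;
$list->add(@$_) for [1,2], [1,4], [1,7], [2,1], [3,5], [4,161], [4,1991];
print join(', ', map { "($_->[0],$_->[1])" } $list->pairs), "\n";
```

Callers only ever use add and pairs, so the internal layout could later be swapped for an index or a sorted structure without changing client code.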