Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"

Re: treat pairs, set as one item: looking for better way

by kcott (Chancellor)
on Oct 19, 2013 at 07:00 UTC ( #1058880=note: print w/replies, xml ) Need Help??

in reply to treat pairs, set as one item: looking for better way

G'day remiah,

I wasn't entirely certain how @key_array factored into your sets: they're clearly not pairs (as per the title); however, they are included in the output you show. That led me to thinking whether you're really dealing with just pairs or if that was meant as a simplified example. Anyway, the following code should handle whatever you intended.

#!/usr/bin/env perl -l use strict; use warnings; use List::Util qw{first}; my @key_array = 'a' .. 'i'; my @set_array = ( ['a', 'b'], ['e','f'], ['f','g'] ); my %sets; for my $set_ref (@set_array, map {[$_]} @key_array) { my $key = first { exists $sets{item}{$_} } @$set_ref; my $set_data = defined $key ? $sets{item}{$key} : do { push @{$sets{data}}, {}; $sets{data}[-1] }; for my $item (@$set_ref) { $sets{item}{$item} = $set_data unless exists $sets{item}{$item +}; ++$set_data->{$item}; } } for (sort { @$b <=> @$a } map { [ sort keys %$_ ] } @{$sets{data}}) { print join ',' => @$_; }


e,f,g a,b c d h i

So, all the sets are in the @{$sets{data}} array. Each element is a hash. Each hash has the items of the set as its keys and a count of those items as the values. I wasn't sure if you wanted a count (some of your commented out code indicated you might); the code above hasn't used the value: ignore it if you don't need it.

$sets{data} looks like this:

[ { a => 2, b => 2 }, { e => 2, f => 3, g => 2 }, { c => 1 }, { d => 1 }, { h => 1 }, { i => 1 }, ]

Individual items are the keys of the %{$sets{item}} hash; the values are references to the elements in the @{$sets{data}} array.

$sets{item} looks like this:

{ a => $sets{data}[0], b => $sets{data}[0], c => $sets{data}[2], d => $sets{data}[3], e => $sets{data}[1], f => $sets{data}[1], g => $sets{data}[1], h => $sets{data}[4], i => $sets{data}[5], }

As you can see, 'a' and 'b' point to a single set; 'c' and 'd' point to two different sets; 'e', 'f' and 'g' point to a single set, and so on.

You were also asking about sorting. I've sorted the sets by the number of elements to get the largest sets first (descending, numerical):

sort { @$b <=> @$a }

I've also sorted the items within each set (ascending, string):

sort keys %$_

You should note that the data you provided as input was inherently sorted (i.e. 'a' .. 'z' and the set pairs). If your real input is unordered, and you want sets of the same size to be ordered, you'll need an additional sort.

-- Ken

Replies are listed 'Best First'.
Re^2: treat pairs, set as one item: looking for better way
by remiah (Hermit) on Oct 20, 2013 at 03:49 UTC

    Hello kcott
    I could get really deep responses with this thread.

    I didn't notice that this is graph, until LanX said that, and your item and data are "adjacency list" with reference + count information. "item" is vertex which has reference to edge information,and "data" is edge data storage.

    Your usage of reference: reference from "item" to "data" seems to me an essence I would like to use by myself...

    I would like to have some more time to stick around this code, and thanks for your kind explanation.


Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1058880]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others romping around the Monastery: (2)
As of 2018-05-27 12:37 GMT
Find Nodes?
    Voting Booth?