Re: Optimizing a naive clustering algorithm

by RichardK (Parson)
in reply to Optimizing a naive clustering algorithm

I haven't read about the concept (yet!) so I'm just commenting on your code.

Copying and manipulating those hashes in max_diff is going to be slow because of all the memory copies, and if I've understood correctly you don't need to do it that way. Wouldn't something like this give you the number you need?

sub max_diff {
    ...
    my $count = 0;
    for (keys %{$hash1}) {
        $count++ unless exists $hash2->{$_};
    }
    for (keys %{$hash2}) {
        $count++ unless exists $hash1->{$_};
    }
    return $count;
}
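For anyone who wants to try the counting idea on its own, here is a minimal self-contained sketch. The name count_unshared_keys and the two-hash-reference signature are my assumptions, since the argument handling of the original max_diff is elided above; the sample data is made up too.

```perl
use strict;
use warnings;

# Hypothetical helper (not the original max_diff): counts the keys that
# appear in exactly one of two hashes, without copying either hash.
# Assumes both arguments are hash references.
sub count_unshared_keys {
    my ($hash1, $hash2) = @_;
    my $count = 0;
    # grep in scalar context returns the number of matching keys.
    $count += grep { !exists $hash2->{$_} } keys %{$hash1};
    $count += grep { !exists $hash1->{$_} } keys %{$hash2};
    return $count;
}

# Made-up sample data: apple is only in %left, fig and kiwi only in %right.
my %left  = (apple => 1, pear => 1, plum => 1);
my %right = (pear => 1, plum => 1, fig => 1, kiwi => 1);
print count_unshared_keys(\%left, \%right), "\n";   # prints 3
```

Each hash is only read, never copied or modified, so the cost is one pass over each key list plus a hash lookup per key.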

Re^2: Optimizing a naive clustering algorithm
by BUU (Prior) on Apr 15, 2014 at 18:23 UTC
    Ha, yes, I think you're right. I don't think it solves the overall problem, but it's a good catch.
