Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister

Re^2: Extracting Unique Characters from a Array

by Fletch (Chancellor)
on Sep 07, 2006 at 17:24 UTC ( #571753=note: print w/replies, xml ) Need Help??

in reply to Re: Extracting Unique Characters from a Array
in thread Extracting Unique Characters from a Array

Yeah, let's turn a single O(n) scan with a hash (the right way to do it) into an O(n log n) sort followed by another O(n) scan. Aside from the added inefficiency you've lost the original ordering if that needed to be preserved.

Update: Ah, a followup did note that the list was to be sorted. Still makes more sense to do the O(n) cull of duplicates first and then the O(n log n) sort of the smaller list.

When in doubt:

#!/usr/bin/perl use Benchmark qw( timethese cmpthese ); use constant SIZE => 6_000; use constant COUNT => 500; my $count = shift || COUNT; my $size = shift || SIZE; my @source = map { int( rand($size) ) } 1 .. $size; cmpthese( $count, { sort_first => sub { my @sorted = sort @source; my $le = undef; my @uniq; for (@sorted) { if ( $le != $_ ) { $le = $_; push @uniq, $_; } } }, cull_first => sub { my %seen; my @uniq = grep { !$seen{$_}++ } @source; my @sorted = sort @uniq; } } ); exit 0; __END__

The cull_first version is from 8-20% faster for lists of from 5 to several thousand items.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://571753]
[stevieb]: berrybrew has been updated with the new perl checksums. Now I'm going to implement dynamic loading of perls thanks to their new JSON releases file
[LanX]: GotToBTru ++ # erix & MySql ... talk about a database trigger xD

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (12)
As of 2017-03-30 16:01 GMT
Find Nodes?
    Voting Booth?
    Should Pluto Get Its Planethood Back?

    Results (360 votes). Check out past polls.