Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister
 
PerlMonks  

Re: Bidirectional lookup algorithm? (Updated: further info.)

by soonix (Chancellor)
on Jan 05, 2015 at 10:22 UTC ( [id://1112165]=note: print w/replies, xml ) Need Help??


in reply to Bidirectional lookup algorithm? (Updated: further info.)

would using a database be overkill? I saw that DBD::SQLite can work in-memory:
my $dbh = DBI->connect("dbi:SQLite::memory:"});

Replies are listed 'Best First'.
Re^2: Bidirectional lookup algorithm? (Updated: further info.)
by BrowserUk (Patriarch) on Jan 05, 2015 at 13:15 UTC

    That certainly is worth considering as it reduces the memory requirement to ~1/6th compared to the single hash method.

    However, the penalty is that the lookups run ~30x more slowly, so a job that currently takes ~8hrs would require 10 days! Hence, it will be a last resort for when I cannot find anything quicker.

    Single hash:

    #! perl -slw use strict; use Time::HiRes qw[ time ]; sub mem{ `tasklist /nh /fi "PID eq $$"` =~ m[(\S+ K)$] } our $S //= 1; my $scale = chr( ord('a') + $S - 1 ); print "Start: mem: ", mem; my( $i, %lookup ) = 1; $lookup{ $_ } = $i, $lookup{ $i++ } = $_ for 'aaaaa' .. $scale . 'zzzz +'; print scalar keys %lookup; print "Hash built: mem: ", mem; my $start = time; for( 'aaaaa' .. $scale . 'zzzz' ) { my $addr = $lookup{ $_ }; # print $addr; } printf "Fetch by Name took %f seconds \n", time() - $start; $start = time; for( 1 .. 26**4 * $S ) { my $name = $lookup{ $_ }; # print $name; } printf "Fetch by Addr took %f seconds \n", time() - $start; print "Hash queried: check mem: ", mem; __END__ C:\test>1112165-hash -S=1 Start: mem: 8,032 K 913952 Hash built: mem: 169,116 K Fetch by Name took 0.217840 seconds Fetch by Addr took 0.234923 seconds Hash queried: check mem: 169,140 K C:\test>1112165-hash -S=2 Start: mem: 8,008 K 1827904 Hash built: mem: 329,660 K Fetch by Name took 0.445586 seconds Fetch by Addr took 0.512336 seconds Hash queried: check mem: 329,684 K C:\test>1112165-hash -S=3 Start: mem: 7,984 K 2741856 Hash built: mem: 524,328 K Fetch by Name took 0.699219 seconds Fetch by Addr took 0.717473 seconds Hash queried: check mem: 524,352 K C:\test>1112165-hash -S=4 Start: mem: 7,972 K 3655808 Hash built: mem: 651,160 K Fetch by Name took 0.976562 seconds Fetch by Addr took 1.065514 seconds Hash queried: check mem: 651,184 K

    Sqlite & :memory:

    #! perl -slw use strict; use Time::HiRes qw[ time ]; use DBI; use constant { CREATE => 'create table DB ( NAME test primary key not null, ADDR +int not null )', INSERT => 'insert into DB ( NAME, ADDR ) values ( ?, ? )', INDEX => 'create index ADDR_IDX on DB ( ADDR );', QUERYBYNAME => 'select ADDR from DB where NAME = ?', QUERYBYADDR => 'select NAME from DB where ADDR = ?', }; sub mem{ `tasklist /nh /fi "PID eq $$"` =~ m[(\S+ K)$] } our $S //= 1; my $scale = chr( ord('a') + $S - 1 ); print "Start: mem: ", mem; my $dbh = DBI->connect( 'dbi:SQLite:dbname=:memory:','','' ) or die DB +I::errstr; $dbh->do( CREATE ) or die DBI::errstr; print "DB created: mem: ", mem; my $ins = $dbh->prepare( INSERT ) or die DBI->errstr; my $i = 1; for( 'aaaaa' .. $scale . 'zzzz' ) { $ins->execute( $_, $i++ ); } print "DB populated: mem: ", mem; $dbh->do( INDEX ); print "DB indexed: mem: ", mem;; my $start = time; my $sth = $dbh->prepare( QUERYBYNAME ) or die DBI->errstr; for( 'aaaaa' .. $scale . 'zzzz' ) { $sth->execute( $_ ) or die DBI::errstr; my $ref = $sth->fetch; my $addr = $ref->[0]; # print $addr; } printf "Fetch by Name took %f seconds \n", time() - $start; $start = time; $sth = $dbh->prepare( QUERYBYADDR ) or die DBI->errstr; for( 1 .. 26**4 * $S ) { $sth->execute( $_ ) or die DBI::errstr; my $ref = $sth->fetch; my $name = $ref->[0]; # print $name; } printf "Fetch by Addr took %f seconds \n", time() - $start; print "DB queried: mem: ", mem; __END__ C:\test>for /l %s in (1,1,4) do 1112165-sql -S=%s C:\test>1112165-sql -S=1 Start: mem: 9,656 K DB created: mem: 11,072 K DB populated: mem: 30,796 K DB indexed: mem: 39,384 K Fetch by Name took 6.933570 seconds Fetch by Addr took 7.299877 seconds DB queried: mem: 39,412 K C:\test>1112165-sql -S=2 Start: mem: 9,700 K DB created: mem: 11,116 K DB populated: mem: 50,088 K DB indexed: mem: 65,400 K Fetch by Name took 14.425735 seconds Fetch by Addr took 14.653500 seconds DB queried: mem: 65,428 K C:\test>1112165-sql -S=3 Start: mem: 9,656 K DB created: mem: 11,068 K DB populated: mem: 69,544 K DB indexed: mem: 92,320 K Fetch by Name took 21.599556 seconds Fetch by Addr took 21.844892 seconds DB queried: mem: 92,348 K C:\test>1112165-sql -S=4 Start: mem: 9,668 K DB created: mem: 11,084 K DB populated: mem: 88,800 K DB indexed: mem: 119,372 K Fetch by Name took 30.339751 seconds Fetch by Addr took 30.587817 seconds DB queried: mem: 119,400 K

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re^2: Bidirectional lookup algorithm? (Updated: further info.)
by Anonymous Monk on Jan 05, 2015 at 11:36 UTC
    that probably isn't faster than using a single %hash

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1112165]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others surveying the Monastery: (4)
As of 2025-07-08 14:14 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?
    erzuuliAnonymous Monks are no longer allowed to use Super Search, due to an excessive use of this resource by robots.