Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

Re^2: Bidirectional lookup algorithm? (Updated: further info.)

by BrowserUk (Patriarch)
on Jan 06, 2015 at 02:17 UTC ( [id://1112288]=note: print w/replies, xml ) Need Help??


in reply to Re: Bidirectional lookup algorithm? (Updated: further info.)
in thread Bidirectional lookup algorithm? (Updated: further info.)

If you look at my code in Re^2: Bidirectional lookup algorithm? (Updated: further info.), using a hash does lookups by name in: 0.976562/3655808 = 0.000000267 secs; and by addr in 1.065514/3655808 = 0.000000291.

C:\test>1112165-hash -S=4 Start: mem: 7,972 K 3655808 Hash built: mem: 651,160 K Fetch by Name took 0.976562 seconds Fetch by Addr took 1.065514 seconds Hash queried: check mem: 651,184 K

And using an in-memory sqlite DB does them ~30 times slower at: by name in: 30.339751/3655808 = 0.0000083; and by addr in: 30.587817/3655808 = 0.0000084 for a memory reduction of 5/6ths.

C:\test>1112165-sql -S=4 Start: mem: 9,668 K DB created: mem: 11,084 K DB populated: mem: 88,800 K DB indexed: mem: 119,372 K Fetch by Name took 30.339751 seconds Fetch by Addr took 30.587817 seconds DB queried: mem: 119,400 K

And using a disk-based sqllite db does them ~57 times slower at: by name in: 6.867170/456976 = 0.0000150; and by addr in: 6.949115/456976 = 0.0000152 seconds for a memory reduction of ~3/5ths:

C:\test>1112165-sql Start: mem: 9,620 K DB created: mem: 11,464 K DB populated: mem: 29,788 K DB indexed: mem: 45,540 K Fetch by Name took 6.867170 seconds Fetch by Addr took 6.949115 seconds DB queried: mem: 45,564 K

The (single, and possible non representative depending upon where in the strings the chosen keys are (beginning middle or end)) timings you've posted are 800x slower again than the memoryDB, or nearly 5000 times slower than the hashes.

The net result would be that my current 8 hour runs would require 4.5 years. And that's before I added the extra data that the space reduction would permit.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^3: Bidirectional lookup algorithm? (Updated: further info.)
by flexvault (Monsignor) on Jan 06, 2015 at 11:10 UTC

    BrowserUk,

    When programming my database engine, I found that 'index' did best with smaller scalars ( reason for mentioning 'database sharding' ). My times were close to the worst case ( worst case is 'not found' ).

    I found that with a 8-byte key, the 'index' technique I used was 4.7 times faster searching 512 byte records than searching 1,024 byte records. Naturally, 'index' searching a 5MM+ byte record is going to be real slow!

    My attempt was to give you another approach, not a final solution.

    Good Luck...Ed

    "Well done is better than well said." - Benjamin Franklin

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1112288]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others wandering the Monastery: (4)
As of 2024-04-24 13:03 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found