
Re^5: Finding Nearly Identical Sets (Updated:4200/sec)

by BrowserUk (Pope)
on Oct 04, 2016 at 14:56 UTC (#1173261=note)

in reply to Re^4: Finding Nearly Identical Sets (Updated:4200/sec)
in thread Finding Nearly Identical Sets

I'm not sure an in-memory solution will work because of parallel processing.

Hm. The primary reason -- there are others -- for using parallel processing is: speed.

I pretty much guarantee that you will not be able to achieve 500/s using a disk-based file or DB, let alone 5000/s -- disk access is at least 100,000 times slower than memory -- which means you now need 10 processors instead of one just to get back to par.

And if 5000/s isn't enough? Put the bitmaps in shared memory (NOT threads::shared) and run multiple threads...
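To make "the bitmaps" concrete, here is a minimal sketch of an in-memory comparison, assuming each set is encoded as a fixed-length bit string with one bit per possible member. The helper names set_to_bitmap and bitmap_distance and the 64-bit universe are hypothetical, chosen for illustration; this is not the code posted earlier in the thread, and it does not cover the shared-memory part.

```perl
use strict;
use warnings;

# Hypothetical helper: encode a set of small non-negative integers as a
# bit string. The string is pre-sized so every bitmap has the same length
# and the bitwise string XOR below lines up byte for byte.
sub set_to_bitmap {
    my ( $nbits, @members ) = @_;
    my $bitmap = "\0" x int( ( $nbits + 7 ) / 8 );
    vec( $bitmap, $_, 1 ) = 1 for @members;
    return $bitmap;
}

# Hypothetical helper: number of members present in exactly one of the
# two sets. XOR leaves a 1 bit wherever the bitmaps differ, and the
# '%32b*' checksum template of unpack counts those set bits.
sub bitmap_distance {
    my ( $x, $y ) = @_;
    return unpack '%32b*', $x ^ $y;
}

my $s1 = set_to_bitmap( 64, 1, 5, 9, 40 );
my $s2 = set_to_bitmap( 64, 1, 5, 9, 41 );

print bitmap_distance( $s1, $s2 ), "\n";    # prints 2
```

Two sets would count as "nearly identical" when this distance falls under whatever threshold the application picks. The whole comparison is a string XOR plus a bit count with no I/O involved, which is the reason for keeping the bitmaps in memory.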

Anyway, good luck with the project whichever way you choose to go :)

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
In the absence of evidence, opinion is indistinguishable from prejudice.