PerlMonks

Re: Fast algorithm for 2d array queries

by oiskuu (Hermit)
on Feb 07, 2014 at 23:57 UTC (#1073935)

in reply to Fast algorithm for 2d array queries

A problem rather similar to "Comparing two arrays", don't you think? A sparse matrix again, this time 3k by 3M booleans with density 1/60. The exact same solution is viable, too: pack your int-vectors as "l/l", merge-sort the n query vectors and scan. Perhaps an 8-core machine would achieve 4k queries per sec.
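A minimal sketch of the merge-and-scan idea, written in Python rather than with the packed Perl vectors mentioned above. The function name `most_frequent` and the toy data are hypothetical, and the query is assumed to be "which column index occurs in the most of the n selected vectors"; each vector is assumed sorted and free of duplicates.

```python
import heapq
from itertools import groupby

def most_frequent(vectors):
    """Merge sorted int vectors and scan for the value with the longest run.

    Assumes every vector is sorted ascending with no duplicates, so the run
    length of a value in the merged stream equals the number of vectors
    that contain it.
    """
    best_value, best_count = None, 0
    # heapq.merge lazily yields the k-way merge of already-sorted inputs.
    for value, run in groupby(heapq.merge(*vectors)):
        count = sum(1 for _ in run)
        if count > best_count:  # strict '>' keeps the smallest value on ties
            best_value, best_count = value, count
    return best_value, best_count

# Hypothetical toy data: each list holds the column indices set in one row.
rows = [
    [2, 5, 9, 14],
    [1, 5, 9, 20],
    [5, 7, 14, 21],
]
print(most_frequent(rows))
```

The scan is a single pass over the merged stream, so the cost is dominated by the total number of elements across the n vectors.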

A speed vs. memory trade-off is also possible. Pairwise intersections of your 3000 vectors amount to 4.5M vectors of size ~833 each. A lookup with small n == 4 involves 6 pair combinations, i.e. merging 6*833 == 5k elements instead of 4*50k == 200k elements. About 30-fold speed-up at the cost of 14 GB of memory.
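The figures above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes the set bits are spread uniformly and independently (so the expected overlap of two vectors is 50k * 50k / 3M) and that each stored index takes 4 bytes; the variable names are illustrative.

```python
from math import comb

rows = 3_000            # number of vectors (from the post)
cols = 3_000_000        # booleans per vector (from the post)
per_row = cols // 60    # density 1/60 gives ~50k set bits per vector

pairs = comb(rows, 2)                  # number of pairwise intersections
pair_size = per_row * per_row / cols   # expected overlap, assuming independence

n = 4
merged_pairs = comb(n, 2) * pair_size  # elements merged using precomputed pairs
merged_plain = n * per_row             # elements merged using the raw vectors

bytes_total = pairs * pair_size * 4    # assumption: 32-bit integers

print(f"pairs:        {pairs:,}")
print(f"pair size:    {pair_size:.0f}")
print(f"merge sizes:  {merged_pairs:.0f} vs {merged_plain:,}")
print(f"memory:       {bytes_total / 2**30:.1f} GiB")
```

Under these assumptions the raw element ratio is 200k / 5k = 40, which the post states more conservatively as about 30-fold, presumably leaving room for overhead.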

A GPU-based solution would be quite interesting, but for that you really ought to ask on another forum.

Replies are listed 'Best First'.
Re^2: Fast algorithm for 2d array queries
by BrowserUk (Pope) on Feb 08, 2014 at 00:30 UTC
Perhaps an 8-core machine to achieve 4k queries per sec.

Prove it :)

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

See the referenced thread. I later implemented a partial SSE version as well (merge using SSE, scan not optimized). Result:

```
Total 301068 elements in 30 vectors
timethis for 5:  5 wallclock secs ( 5.31 usr +  0.00 sys =  5.31 CPU) @ 439.36/s (n=2333)
```

Update:

```
Total 200752 elements in 4 vectors
timethis for 5:  6 wallclock secs ( 5.28 usr +  0.00 sys =  5.28 CPU) @ 1326.33/s (n=7003)
```

Update2: Right you are, BrowserUk, I was considering small n case only.

```
Total 50063728 elements in 1000 vectors
timethis for 5:  6 wallclock secs ( 6.32 usr +  0.00 sys =  6.32 CPU) @  0.63/s (n=4)
```

IMO, that thread is not applicable to this problem.

Prove this assertion!

Let me rephrase my challenge.

Are you really suggesting that you can merge sort 500 to 1000 sets of 50,000 numbers, then scan the resulting 25 to 50 million element array to count the most frequent constituent, all in 2 milliseconds?
