Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

Re^5: Modified Binary Search

by BrowserUk (Pope)
on Jan 15, 2010 at 08:01 UTC ( #817560=note: print w/replies, xml ) Need Help??


in reply to Re^4: Modified Binary Search
in thread Modified Binary Search

Hm. You have a big list and a small list of dates, both sorted (or readily sortable).

And for each date in the small list, you want to find the subset of the big list that are 1 year either side.

And you search anew for each item in the small list?

Here's a one-pass algorithm:

my @bigSorted = ...; my @smallSorted = ...; my %ranges; my( $lo, $hi ) = ( 0, 0 ); for ( @smallSorted ) { my( $early, $late ) = calcBoundaryDates( $_ ); ++$lo while $bigList[ $lo ] lt $early; $hi //= $lo; ++$hi while $bigList[ $hi ] le $late; $ranges{ $_ } = $hi - $lo; }

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^6: Modified Binary Search
by Limbic~Region (Chancellor) on Jan 15, 2010 at 14:19 UTC
Re^6: Modified Binary Search
by jdporter (Canon) on Aug 11, 2011 at 16:02 UTC

    That works fine if you're doing it as a one-time deal for all elements of Tbl_B. LR's problem description seems to say that the search needs to be done for the modified elements each time Tbl_B is modified. Since the above technique does linear search, it's not going to be optimal.

      LR's problem description seems to say ...

      Hm. That's not how I read it, nor do I see how you could possibly derive that interpretation from the OP?

      Besides Limbic~Region 'imself seemed to think it was okay and he is not exactly known for accepting incomplete solutions.

      Since the above technique does linear search, it's not going to be optimal.

      Perhaps what he saw that you missed, is that whilst it is a linear search, it is a linear search once over the 5 million. Ie. O(5E6).

      Whereas, if you use a binary search, you'd do 600,000 * 5e6 * log 5e6 = O(20,096,910,013,008)

      600,000 * log 5e6 = O(48,494,850)


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
      jdporter,
      You and BrowserUk are both correct. In the real world problem, anytime there was a change the process had to begin anew. In my "snapshot in time" use case, I only had to run it once. My goal was to demonstrate that changing the behavior of the vendor provided C compiled stored procedure was to our advantage. Everyone was convinced but after a year, I am still waiting to see the change.

      Cheers - L~R

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://817560]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others avoiding work at the Monastery: (2)
As of 2021-06-20 01:39 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    What does the "s" stand for in "perls"? (Whence perls)












    Results (93 votes). Check out past polls.

    Notices?