in reply to Re: bitvector > global minimum, in thread bitvector > global minimum
Ok, yes, this is the first level that I probably didn't explain as well as I should. But that is what I was talking about, and my complaint is that this runs in O(X) time, where X = A[i,j]. Now, if you preprocess this one more time to get from:
100101000100
A[0]        A[7]
1  001010  0
to
1  0  0
# in this array you need to do 3 jumps (for my $p ( $s .. $e ));
# the reduction follows from the Cartesian tree that follows from
# the bitvector 100101000100
you would need to do only 3 jumps, and so on and so on. This iterative preprocessing will in the end place a tree on top of my vector, and searching such a binary tree is faster. So what I need is a better preprocessing of my initial 100101000100, so that I can save it in a Sparse table and, in just 2 comparisons (steps), do what you achieved here in 7 (X) steps. Or, if something totally crazy is suggested that will blow my mind straight downunder :) I will not complain :)
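For reference, the standard Sparse table construction gives exactly this "answer any range-minimum query with 2 comparisons" behaviour, at the cost of O(n log n) precomputed entries. A minimal sketch in Python (rather than Perl), assuming the bitvector has already been reduced to an array of integer values A as described above:

```python
def build_sparse_table(a):
    """Precompute minima of all power-of-two ranges:
    st[k][i] = min(a[i : i + 2**k])."""
    n = len(a)
    st = [list(a)]                      # level 0: ranges of length 1
    k = 1
    while (1 << k) <= n:
        prev = st[k - 1]
        half = 1 << (k - 1)
        # Each length-2^k minimum is the min of two length-2^(k-1) minima.
        st.append([min(prev[i], prev[i + half])
                   for i in range(n - (1 << k) + 1)])
        k += 1
    return st

def rmq(st, s, e):
    """Minimum of a[s..e] inclusive: one min over two precomputed entries."""
    k = (e - s + 1).bit_length() - 1    # largest power of two <= range length
    return min(st[k][s], st[k][e - (1 << k) + 1])
```

The two overlapping power-of-two windows always cover the whole range, which is why the query is just one min over two table entries; the open problem in the thread is how to store those entries in fewer bits than full integers.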
Thank You !!
baxy
Re^3: bitvector > global minimum by BrowserUk (Pope) on Oct 03, 2011 at 19:04 UTC 
I think that this may actually be a live sighting of the mythical "PM XY problem".
Is it fair to sum up your question as:
You don't want to use the straight forward linear calculation because it will be too slow, but you don't want to build the obvious tree structure because it takes too much memory? Is there some magical third way?
You've described the query you need to satisfy as: given a range of positions (n,m), where along it lies the minimum. How many of those queries do you have to satisfy?
Are they effectively random queries, i.e. random start position and random length?
Or are you calculating one (or a few) length(s) for all start positions?
Or all lengths for all start positions?
You mentioned 300GB. Is that a single huge bitvector, or many short vectors?
Basically, what I'm getting at here is that a clearer description of what this data is, how big it is, the nature of the required processing, etc. (rather than just your current approach to this very specific problem) might trigger a different or more innovative approach to the overall problem.
Examine what is said, not who speaks. Silence betokens consent. Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Fair enough !
"You don't want to use the straight forward linear calculation because it will be too slow, but you don't want to build the obvious tree structure because it takes too much memory? Is there some magical third way?"
A: Correct, and there is some magical third way: doing O(1) lookups in the Sparse table. The problem is to reduce the ST entries to bits (0, 1).
"You've described the query you need to satisfy as: given a range of positions (n,m), where along it lies the minimum. How many of those queries do you have to satisfy?
Are they effectively random queries, i.e. random start position and random length?
Or are you calculating one (or a few) length(s) for all start positions?
Or all lengths for all start positions?"
A: All possible queries and all possible lengths!
"You mentioned 300GB. Is that a single huge bitvector, or many short vectors?"
A: One big vector and many short vectors, due to the reduction that I described in the earlier post.
So this is a really big string of 0's and 1's, and I have no idea what it is for; I only know that it is used in string matching and alignment. The thing is, this is a classical LCA problem, because you have an ordered structure in the background (that you cannot change, because then the whole concept of the algorithm falls apart) and you need to solve it.
Solving it means converting it to an array of integers, and if this is done then the memory requirements go through the roof. The reason why I tried to simplify things is that when you give an extensive description of the problem, people (me being the first) lose interest in messing with it, because if you are not in the field there is a bunch of strange stuff going on that needs to be accounted for, and then it all comes down to Google, Sedgewick, Gusfield, Fischer..... which is what I am trying to avoid; been there, done that. Now I'm looking for a person to give an opinion on this particular subproblem, which I feel could do the trick. If not, then back to the drawing board for me :). So I avoided the higher purpose and just focused on the small bit, which may not be well explained, but I feel it is simple enough to at least receive a couple of suggestions (like the ones you guys made), and this is what I prefer, and find more informative, than being referred to a Google result.
So thank you, and any suggestion, comment, or even a statement that the problem as given cannot be solved, is more than welcome!
Cheers baxy

I don't think that I understand the question. Help me out... because this just seems too simple: successive table lookups that are combined together, with no input vector bit used more than once.
I will proffer that O(n) notation may not be the best
for describing what is efficient or not. In an abstract sense,
the number of operations is paramount. In an implementation, how
fast/slow each operation is also matters. Sometimes using more really
fast operations is "better" than fewer slow operations.
Re: bitvector > global minimum sounded pretty good to me. It appears to me that every
bit in the bit vector will need to be examined. If this is not done
bit by bit, then why not do it in groups of bits as a lookup table?
A binary search tree that takes into account all possible variations
of a really, really long vector could be huge!
If I do this 8 bits at a time, then there is a table of 256 possible
results, and I think the result can be calculated one lookup at a time.
01234567
10010010
Bits 2 and 5 are the lowest min values; that whole vector '10010010' could be an index
into an array of structs that lists the values 2 and 5. I think there needs to
be an extra test of '0' at the end of the 8 bits, and a way to propagate
results to the next 8 bits to keep track of the "best" so far. The implementation
may work better if the bits are inverted before the table lookup. And I think that
probably one more bit needs to be added to the lookup table to help stitch the previous result together with the current result.
I am thinking that a static table would suffice. A dynamic table would perhaps be better (shorter)?
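As a sketch of this byte-at-a-time idea (in Python rather than Perl, and assuming the usual +1/-1 "excess" reading of the bits, 1 adding one and 0 subtracting one, which is an assumption consistent with the Cartesian-tree framing earlier in the thread): a static 256-entry table summarises each byte once, and the per-byte summaries are stitched together while scanning, so no input bit is examined more than once.

```python
def byte_summary(b):
    """For one byte (MSB first): (total excess change,
    minimum running excess within the byte, bit position of that minimum)."""
    excess = 0
    best = 0          # running minimum; 0 means "no new minimum inside this byte"
    best_pos = -1
    for pos in range(8):
        bit = (b >> (7 - pos)) & 1
        excess += 1 if bit else -1
        if excess < best:
            best, best_pos = excess, pos
    return excess, best, best_pos

# Static 256-entry table, built once.
TABLE = [byte_summary(b) for b in range(256)]

def global_min(data):
    """Bit position and value of the minimum running excess over all bytes."""
    base = 0                  # excess at the start of the current byte
    best_val, best_pos = 0, -1
    for i, b in enumerate(data):
        total, mn, pos = TABLE[b]
        # Stitch: a byte-local minimum only wins after adding the carried base.
        if pos >= 0 and base + mn < best_val:
            best_val, best_pos = base + mn, i * 8 + pos
        base += total
    return best_pos, best_val
```

The "stitch" mentioned in the post corresponds here to carrying `base` (the running excess) and `best_val` across byte boundaries; a packed C implementation would fold the three table fields into one machine word per entry.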

Hm. Whilst looking up some of your terminology (LCA; Sparse Table) I ran across this. Maybe you've seen it already, but it sounds vaguely related.
Maybe you'll understand it well enough to work out how to implement it. Assuming it is actually useful.