in reply to bitvector > global minimum
This seems too easy (which usually means I didn't understand the question, but...):
#! perl -slw
use strict;

# Return the position in $s .. $e where the running sum
# (+1 for a set bit, -1 for a clear bit) is lowest.
sub minpos {
    my( $v, $s, $e ) = @_;
    my( $n, $min, $o ) = ( 0, 1e30, 0 );
    for my $p ( $s .. $e ) {
        $n += vec( $v, $p, 1 ) ? 1 : -1;
        ( $min, $o ) = ( $n, $p ) if $n < $min;
    }
    return $o;
}

my $vec = pack 'b*', '100101000100';
my( $s, $e ) = ( 1 , 7 );
printf "The minima between %d - %d is at %d\n",
    $s, $e, minpos( $vec, $s, $e );
__END__
C:\test>junk5
The minima between 1 - 7 is at 7
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re^2: bitvector > global minimum by baxy77bax (Chaplain) on Oct 03, 2011 at 18:05 UTC 
OK, yes, this is the first level, which I probably didn't explain as well as I should have. But this is exactly what I was talking about and complaining about: it runs in O(X) time, where X = A[i,j]. Now if you preprocessed this one more time to get from:
100101000100
A[0] A[7]
1 - 001010 - 0
to
1 - 0 - 0 -> in this array you need to do 3 jumps (for my $p ( $s .. $e ))
# the reduction follows from the Cartesian tree that follows from the bitvector 100101000100 #
you would need to do only 3 jumps, and so on and so on. This iterative preprocessing will in the end place a tree on top of my vector, and searching such a binary tree is faster. So what I need is a better preprocessing of my initial 100101000100, so that I can save it in a sparse table and in just 2 comparisons (steps) do what you achieved here in 7 (X) steps. Or if something totally crazy is suggested that will blow my mind straight downunder :) I will not complain :)
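For reference, the sparse-table lookup described above might look roughly like the sketch below. It is an illustration under assumptions, not anyone's actual code from this thread: the names prefix_sums, build_sparse_table and minpos_st are made up, and the table stores one integer position per entry (n * log2(n) of them), which is exactly the memory cost being complained about. A query compares two precomputed candidates, whatever the range length.

```perl
use strict;
use warnings;

# Running sum over the bits: +1 for a set bit, -1 for a clear bit.
sub prefix_sums {
    my( $v, $nbits ) = @_;
    my( $n, @sum ) = ( 0 );
    for my $p ( 0 .. $nbits - 1 ) {
        $n += vec( $v, $p, 1 ) ? 1 : -1;
        push @sum, $n;
    }
    return \@sum;
}

# $st->[$k][$i] = position of the leftmost minimum of the running sum
# in the window $i .. $i + 2**$k - 1.
sub build_sparse_table {
    my( $sum ) = @_;
    my $len = @$sum;
    my @st = ( [ 0 .. $len - 1 ] );
    for( my $k = 1; ( 1 << $k ) <= $len; ++$k ) {
        my $half = 1 << ( $k - 1 );
        for my $i ( 0 .. $len - ( 1 << $k ) ) {
            my( $l, $r ) = ( $st[ $k - 1 ][ $i ], $st[ $k - 1 ][ $i + $half ] );
            $st[ $k ][ $i ] = $sum->[ $r ] < $sum->[ $l ] ? $r : $l;
        }
    }
    return \@st;
}

# Two overlapping power-of-two windows cover $s .. $e; compare their minima.
sub minpos_st {
    my( $sum, $st, $s, $e ) = @_;
    my $k = 0;
    ++$k while ( 1 << ( $k + 1 ) ) <= $e - $s + 1;
    my( $l, $r ) = ( $st->[ $k ][ $s ], $st->[ $k ][ $e - ( 1 << $k ) + 1 ] );
    return $sum->[ $r ] < $sum->[ $l ] ? $r : $l;
}

my $vec = pack 'b*', '100101000100';
my $sum = prefix_sums( $vec, 12 );
my $st  = build_sparse_table( $sum );
printf "The minimum between %d and %d is at %d\n",
    1, 7, minpos_st( $sum, $st, 1, 7 );
```

The sums here are counted from bit 0 rather than from the start of the query, but that only shifts every value in the range by the same constant, so the position of the minimum is the same as in the linear scan.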
Thank You !!
baxy

I think that this may actually be a live sighting of the mythical "PM XY problem".
Is it fair to sum up your question as:
You don't want to use the straightforward linear calculation because it will be too slow, but you don't want to build the obvious tree structure because it takes too much memory? Is there some magical third way?
You've described the query you need to satisfy as: given a range of positions (n,m), where along it lies the minimum? How many of those queries do you have to satisfy?
Are they effectively random queries, i.e. random start position and random length?
Or are you calculating one (or a few) length(s) for all start positions?
Or all lengths for all start positions?
You mentioned 300GB. Is that a single huge bitvector, or many short vectors?
Basically, what I'm getting at is that a clearer description of what this data is, how big it is, the nature of the required processing, etc., rather than just your current approach to this very specific problem, might trigger a different or more innovative approach to the overall problem.

Fair enough !
"You don't want to use the straightforward linear calculation because it will be too slow, but you don't want to build the obvious tree structure because it takes too much memory? Is there some magical third way?"
A: Correct, and there is some magical third way: doing O(1) lookups in the sparse table. The problem is to reduce the ST entries to bits (0,1).
"You've described the query you need to satisfy as: given a range of positions (n,m), where along it lies the minimum? How many of those queries do you have to satisfy?
Are they effectively random queries, i.e. random start position and random length?
Or are you calculating one (or a few) length(s) for all start positions?
Or all lengths for all start positions?"
A: All possible queries and all possible lengths!
"You mentioned 300GB. Is that a single huge bitvector, or many short vectors?"
A: One big and many short vectors, due to the reduction that I described in the earlier post.
So this is a really big string of 0's and 1's, and I have no idea what it is for; I only know that it is used in string matching and alignment. The thing is, this is a classical LCA problem, because you have an ordered structure in the background (that you cannot change, because then the whole concept of the algorithm falls apart) and you need to solve it.
Solving it means converting it to an array of integers, and once that is done the memory requirements jump through the roof. The reason I have tried to simplify things is that when you give an extensive description of the problem, people, me being the first, lose interest in messing with it, because if you are not in the field there is a bunch of strange stuff going on that needs to be accounted for, and then it all comes down to Google, Sedgewick, Gusfield, Fischer ..., which is what I am trying to avoid. Been there, done that, and now I'm looking for a person to give an opinion on this particular subproblem that I feel could do the trick. If not, then it's back to the drawing board for me :). So I avoided the higher purpose and just focused on the small bit, which may not be well explained, but I feel it is simple enough to at least receive a couple of suggestions (like the ones you guys made), and this is what I prefer, and find more informative, than being referred to a Google site.
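To make the memory trade-off concrete, here is a rough sketch of one middle road, offered as an assumption-laden illustration rather than a solution: keep the bits as they are, store a few integers per block of bits, build the sparse table over blocks only, and scan bits directly inside the partial blocks at either end of a query. The names build_index, scan_block, minpos_blocks and the block size $BLOCK are all hypothetical. The published +-1 RMQ schemes replace the in-block scan with precomputed lookup tables to get true O(1) queries; this sketch does not attempt that, so a query costs two table lookups plus up to two block scans.

```perl
use strict;
use warnings;

my $BLOCK = 4;    # bits per block: a hypothetical value, tune for real data

# One pass over the bits. Per block we keep only three integers:
# the running sum just before the block, and the value and position
# of the leftmost minimum of the running sum inside the block.
sub build_index {
    my( $v, $nbits ) = @_;
    my( $n, @base, @bmin, @bpos ) = ( 0 );
    for my $p ( 0 .. $nbits - 1 ) {
        my $blk = int( $p / $BLOCK );
        $base[ $blk ] = $n if $p % $BLOCK == 0;
        $n += vec( $v, $p, 1 ) ? 1 : -1;
        ( $bmin[ $blk ], $bpos[ $blk ] ) = ( $n, $p )
            if !defined $bmin[ $blk ] or $n < $bmin[ $blk ];
    }
    # Sparse table over blocks: $st[$k][$i] is the block holding the
    # leftmost minimum among blocks $i .. $i + 2**$k - 1.
    my @st = ( [ 0 .. $#bmin ] );
    for( my $k = 1; ( 1 << $k ) <= @bmin; ++$k ) {
        my $half = 1 << ( $k - 1 );
        for my $i ( 0 .. @bmin - ( 1 << $k ) ) {
            my( $l, $r ) = ( $st[ $k - 1 ][ $i ], $st[ $k - 1 ][ $i + $half ] );
            $st[ $k ][ $i ] = $bmin[ $r ] < $bmin[ $l ] ? $r : $l;
        }
    }
    return { base => \@base, bmin => \@bmin, bpos => \@bpos, st => \@st };
}

# Scan one block from its start, reporting the leftmost minimum of the
# running sum among positions $lo .. $hi (both inside block $blk).
sub scan_block {
    my( $v, $idx, $blk, $lo, $hi ) = @_;
    my( $n, $min, $pos ) = ( $idx->{base}[ $blk ] );
    for my $p ( $blk * $BLOCK .. $hi ) {
        $n += vec( $v, $p, 1 ) ? 1 : -1;
        ( $min, $pos ) = ( $n, $p )
            if $p >= $lo and ( !defined $min or $n < $min );
    }
    return ( $min, $pos );
}

sub minpos_blocks {
    my( $v, $idx, $s, $e ) = @_;
    my( $bs, $be ) = ( int( $s / $BLOCK ), int( $e / $BLOCK ) );
    my $left_end = $bs == $be ? $e : ( $bs + 1 ) * $BLOCK - 1;
    my( $min, $pos ) = scan_block( $v, $idx, $bs, $s, $left_end );
    if( $be - $bs > 1 ) {    # whole blocks in between: two table lookups
        my( $lo, $hi, $k ) = ( $bs + 1, $be - 1, 0 );
        ++$k while ( 1 << ( $k + 1 ) ) <= $hi - $lo + 1;
        my $row = $idx->{st}[ $k ];
        for my $blk ( $row->[ $lo ], $row->[ $hi - ( 1 << $k ) + 1 ] ) {
            ( $min, $pos ) = ( $idx->{bmin}[ $blk ], $idx->{bpos}[ $blk ] )
                if $idx->{bmin}[ $blk ] < $min;
        }
    }
    if( $be > $bs ) {        # partial block on the right
        my( $rmin, $rpos ) = scan_block( $v, $idx, $be, $be * $BLOCK, $e );
        ( $min, $pos ) = ( $rmin, $rpos ) if $rmin < $min;
    }
    return $pos;
}

my $vec = pack 'b*', '100101000100';
my $idx = build_index( $vec, 12 );
printf "The minimum between %d and %d is at %d\n",
    1, 7, minpos_blocks( $vec, $idx, 1, 7 );
```

With n bits and blocks of B bits, the index holds three integers per block plus roughly (n/B) * log2(n/B) table entries, so a larger B trades memory for longer scans at the ends of each query.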
So thank you, and any suggestion, comment, or even saying that the problem as given cannot be solved is more than welcome!
Cheers baxy



