in reply to How best to fill missing values in a sparse matrix?
I'm a bit confused about your logic. You say that you want to look ahead for a good value, but you also seem to be relying on that first good row for missing values (as indicated by your sample output and your mention of the fact that the first line is guaranteed to be good). You also say that the first line is guaranteed to be filled, but your sample input has out-of-range values. Does "filled" mean "has a value" (even an out-of-range one), or does it mean "has an in-range value"? What determines when you look ahead and when you look behind?
Also, how many lines do you typically need to look ahead before you find a complete set of good values? And do you even need a complete set, or do you just need in-range values for the columns explicitly specified in the current line? Put another way, will your output file have 50 values for each row, or will it have only two values if the input row had only two?
Assuming you always want to look forward for the next best value, why not use the *nix command tac to put the file in reverse order (last line first)?
This would allow a much simpler look-behind. Once you have read enough lines to fill your default array with a good value for each column, your script would never need to hold more than two lines in memory at a time (the composite good-value array and the current line), no matter how sparse the data is or how far away the next good value is.
Of course, if you are processing a real-time feed, or do not have access to tac or an equivalent, then you may have no choice but to look ahead, since you never really know what the "next" good value is until it arrives. But if you don't really need to do it that way, there is no sin in taking the easy way out and using the tools at hand.
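To make the reverse-order idea concrete, here is a minimal sketch in Python (not your code, and not tied to your file format). It assumes each row has already been parsed into a list with None for a missing cell, and that "good" is decided by a caller-supplied predicate; the names fill_reversed, is_good and in_range, and the 0 to 100 range, are all made up for illustration. Rows are fed in last-first, as tac would deliver them, so the "next" good value has always been seen already.

```python
from typing import Callable, Iterable, Iterator, List, Optional

Row = List[Optional[float]]


def fill_reversed(rows_last_first: Iterable[Row],
                  is_good: Callable[[float], bool]) -> Iterator[Row]:
    """Yield filled rows, still in last-first order.

    Only the carry array and the current row are held in memory.
    """
    carry: Row = []  # most recent good value seen so far, per column
    for row in rows_last_first:
        # Grow the carry array if this row is wider than any seen so far.
        if len(carry) < len(row):
            carry.extend([None] * (len(row) - len(carry)))
        out: Row = []
        for col, value in enumerate(row):
            if value is not None and is_good(value):
                carry[col] = value  # remember it for earlier rows
            # A good cell yields itself; a bad or missing cell yields the
            # carried value, or None if no good value has been seen yet.
            out.append(carry[col])
        yield out


# Hypothetical sample data: None is missing, 999.0 is out of range.
rows = [
    [1.0, 2.0, 3.0],
    [None, 999.0, 3.5],
    [1.5, None, None],
    [2.0, 2.5, 4.0],
]


def in_range(v: float) -> bool:
    return 0.0 <= v <= 100.0


filled = list(fill_reversed(reversed(rows), in_range))
filled.reverse()  # back to original order (a second tac, in file terms)
```

Note that a row near the end of the file with no good value after it comes out with None in that column, so some other rule is still needed for those cells.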
Update: realized that there are lots of unanswered questions here and added some. Also withdrew the suggestion of using tac: whether you read forwards or backwards there are missing values, so one will still need a multi-line look-ahead to construct a full set of good values.