
I'm gonna try to keep this short b/c it seems I get into trouble when I "yammer on".

In re: the non-integer values for x and y. The full explanation is mathematically forbidding but has to do with the nature of the detecting instrument: coordinate systems are actually transformed somewhat during data processing.

But to cut it short: in my first version of the program I converted these values to integers anyway (a sanctioned move - I didn't just decide to do that on my own). I muddied the issue here by posting the full precision values in this post.

In re: what I am trying to do. Essentially I am trying to find places on the detectors where "hits" represented by the pixel values seem to "bunch up." These "hits" represent places on the detectors where photons have struck. A number of "hits" at the same x or y value that goes over a predetermined value _may_ indicate something that needs to be looked at more closely (e.g. by human eyes).

The first version of the program took a list of observation sessions, represented by numbers, as input. For each observation session, it did a database call to find out which of the seven detectors/CCDs were involved.

THEN, for each detector in that observation, it did a database call to pull in the data for the "hits", populating an array for the x axis and one for the y axis of the detector.

Then it iterated over those built-up arrays for x and y, kind of doing a histogram in memory (repeat for each detector, then move on to the next observation) ...
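For what it's worth, that in-memory histogram step can be sketched roughly like this. It is a minimal illustration, not the actual program: the names (@x_hits, %count, $threshold) and the sample values are made up, and the coordinates are assumed to be integers already.

```perl
use strict;
use warnings;

# Hypothetical x coordinates for one detector in one observation.
my @x_hits = (101, 102, 102, 102, 350, 350, 102, 7);

# Hypothetical cutoff: more hits than this at one coordinate gets flagged.
my $threshold = 3;

# Histogram: count how many hits landed on each coordinate value.
my %count;
$count{$_}++ for @x_hits;

# Keep only the coordinates whose count goes over the cutoff.
my @suspect = sort { $a <=> $b }
              grep { $count{$_} > $threshold } keys %count;

print "x = $_ has $count{$_} hits\n" for @suspect;
```

The same loop would be repeated for the y array, then for the next detector, and so on.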

I must emphasize: this approach worked. But it's apparently inefficient, mostly because of the time spent doing db calls (total run time: 19 minutes). So I figured out how to pull all the data in first, in one big lump. This takes only 2 minutes.

Every line of that lump of data looks like this:

$observation, $detector, $x_coord, $y_coord

Now I keep getting stuck trying to get the big lump to do what I want:

... to give me an array of the x values and an array of the y values for a SINGLE detector in a SINGLE observation. And so on, through the lump, until I am done. I need to examine the DISTRIBUTION of values in x and y axes of each detector, in each observation, individually.
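One possible way to carve the lump up like that is a hash of hashes keyed by observation and then by detector, with an x array and a y array at the bottom. This is only a sketch under assumptions: the @lump array, the comma-separated string format of each record, the sample values, and the use of int() to drop the fractional part are all made up for illustration, not taken from the real program.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the lump: "observation, detector, x, y" per record.
my @lump = (
    "1001, 0, 12.7, 40.2",
    "1001, 0, 12.1, 41.9",
    "1001, 3, 200.5, 8.0",
    "1002, 0, 55.5, 60.4",
);

# $data{$obs}{$det}{x} and $data{$obs}{$det}{y} each hold an array reference.
my %data;
for my $line (@lump) {
    my ($obs, $det, $x, $y) = split /\s*,\s*/, $line;
    # Assumption: truncating with int() stands in for the integer conversion.
    push @{ $data{$obs}{$det}{x} }, int $x;
    push @{ $data{$obs}{$det}{y} }, int $y;
}

# Walk the structure: one x array and one y array per detector per observation.
for my $obs (sort { $a <=> $b } keys %data) {
    for my $det (sort { $a <=> $b } keys %{ $data{$obs} }) {
        my @x = @{ $data{$obs}{$det}{x} };
        my @y = @{ $data{$obs}{$det}{y} };
        print "obs $obs, detector $det: x = (@x), y = (@y)\n";
    }
}
```

Inside the inner loop, @x and @y belong to a SINGLE detector in a SINGLE observation, so the distribution check can be run on each pair separately.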

Maybe I should be satisfied with my 19 minute runtime, and leave the data munging / structures alone until I am more experienced ... ? I don't know.

Do I need a data structure? I don't know that either. It feels like I do, because without one I don't know how to "address" subsets of the lump of data.

I hope that's clearer, anyway. I don't know why I am so stuck, and I am sorry I am.

In reply to Re^2: structuring data: aka walk first, grok later by chexmix
in thread structuring data: aka walk first, grok later by chexmix
