Re: How can I improve the efficiency of this very intensive code?

by sk (Curate)
on Aug 06, 2005 at 21:23 UTC

in reply to How can I improve the efficiency of this very intensive code?

I feel a hash might not be required for your task. Your recordID can act as an index into an array, so why put the records in a hash and lose their order? You get direct access by index in an array anyway, with no hash-table overhead.

That said, I would use an n x n matrix (square is not a requirement; the dimensions will of course depend on the number of records) to keep track of scores. Consider the following table.

The values inside the cells are the scores. Now, if you want the best matching score (the max value), an O(n) max scan over a row gives you the answer for one record, and you have to do that n times, once for each record in your first file.
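A minimal sketch of that idea, assuming the scores already sit in an array of arrays where row i belongs to record i of the first file and column j to record j of the second. The score values and the names `@score` and `best_in_row` are invented for illustration and are not from the original code.

```perl
use strict;
use warnings;

# Hypothetical scores: $score[$i][$j] is the score of record $i (file 1)
# against record $j (file 2). Values are invented for illustration.
my @score = (
    [ 3, 9, 1 ],
    [ 7, 2, 5 ],
    [ 4, 4, 8 ],
);

# One linear pass over a row: returns (column index, value) of the maximum.
sub best_in_row {
    my ($row) = @_;
    my $best = 0;
    for my $j ( 1 .. $#$row ) {
        $best = $j if $row->[$j] > $row->[$best];
    }
    return ( $best, $row->[$best] );
}

for my $i ( 0 .. $#score ) {
    my ( $j, $max ) = best_in_row( $score[$i] );
    print "record $i: best match is record $j (score $max)\n";
}
```

Each row costs one pass, so the whole matrix is scanned once and nothing is sorted.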

Sorting to find the max/min is overkill. I might be misunderstanding your problem, so please correct me if I am wrong.



Re^2: How can I improve the efficiency of this very intensive code?
by clearcache (Beadle) on Aug 06, 2005 at 21:41 UTC

    I was thinking about the use of arrays, but the ids are pretty big numbers, so I wouldn't use them alone as array indices. I could always use $. (the input line number) rather than the id when I read in the file, however.
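    A rough sketch of the `$.` approach, assuming one record per line in an "id epoch_seconds" layout. That layout, the sample data, and the name `@records` are assumptions for illustration; the real file format is not shown in the thread. An in-memory filehandle stands in for the log file so the snippet runs on its own.

```perl
use strict;
use warnings;

# Hypothetical input: one record per line, "id epoch_seconds".
my $data = "9000000123 1123363380\n9000000456 1123363395\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my @records;
while ( my $line = <$fh> ) {
    chomp $line;
    my ( $id, $secs ) = split ' ', $line;
    # $. is the current input line number (1-based); subtract 1 for a
    # 0-based array slot. The large id is kept as data, not as an index.
    $records[ $. - 1 ] = { id => $id, secs => $secs };
}
close $fh;

printf "slot %d holds id %s\n", $_, $records[$_]{id} for 0 .. $#records;
```

    The array stays dense no matter how large the ids are, and the id is still available in each slot.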

    My ranking is based on the # of seconds from the last log entry in one file to the first log entry in the second file. So I score by looking at the # of seconds between each pair of records. My ability to identify a "strong match" comes from the rate of concurrent users in the application that my data comes from. With low concurrent users, I'll have lots of strong matches: records that clearly line up. With high concurrent users and lots of log file entries, I've got to get a little creative.

    I was sorting b/c my hash is being used to store # of elapsed seconds...not a true "rank" in terms of 1, 2, 3, etc.

    I'm considering the use of arrays, but don't want to lose the elapsed seconds as data quite yet b/c that will be used in the next step to figure out the best match from the remaining data.
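    One way arrays could keep the elapsed seconds around, sketched under assumptions: the timestamps are epoch seconds, a smaller non-negative gap means a better match, and a negative gap (second record starting before the first one ends) is not a candidate. The names `@last_a`, `@first_b`, `@elapsed` and `closest_in_row` and all values are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical epoch seconds: last log entry per record in file 1,
# first log entry per record in file 2.
my @last_a  = ( 100, 250, 400 );
my @first_b = ( 105, 260, 390 );

# $elapsed[$i][$j] = seconds from the end of record $i to the start of record $j.
# Assumption: a negative gap is not a candidate, so it is stored as undef.
my @elapsed;
for my $i ( 0 .. $#last_a ) {
    for my $j ( 0 .. $#first_b ) {
        my $gap = $first_b[$j] - $last_a[$i];
        $elapsed[$i][$j] = $gap >= 0 ? $gap : undef;
    }
}

# Linear scan for the smallest defined gap in a row; the row itself is untouched.
sub closest_in_row {
    my ($row) = @_;
    my $best;
    for my $j ( 0 .. $#$row ) {
        next unless defined $row->[$j];
        $best = $j if !defined $best || $row->[$j] < $row->[$best];
    }
    return $best;    # undef when the row has no candidate
}

for my $i ( 0 .. $#elapsed ) {
    my $j = closest_in_row( $elapsed[$i] );
    if ( defined $j ) {
        print "record $i: closest is record $j ($elapsed[$i][$j] seconds)\n";
    }
    else {
        print "record $i: no candidate\n";
    }
}
```

    Because the scan only reads the matrix, every elapsed value is still there for a later pass over the remaining data.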

