Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

Re^2: Most efficient record selection method?

by Kraythorne (Sexton)
on Feb 13, 2009 at 12:09 UTC ( [id://743591]=note: print w/replies, xml ) Need Help??


in reply to Re: Most efficient record selection method?
in thread Most efficient record selection method?

Ok.

Variants, = mixture of values contained in each field.

Sample records = Live records containing these values

I don't care about the combination, but we do need to have the minimum number of records so:

[A1,B1],[A1,B2],[A2,B1],[A2,B2] (variants over 2 fields)

could be resolved in 3 records as you have said:

record 1 - contains [A1,B1] record 2 - contains [A1,B2] record 3 - contains [A2,B1]

HOWEVER, if there are 2 records that contain:

record 1 - [A1,B1] record 2 - [A2,B2]

Then these need to be identified and used in preference to the first combination as that will be the minimum number of records I need to represent all the variants in each of the fields.

This gets more complicated as the number of fields and files being interrogated increases

Replies are listed 'Best First'.
Re^3: Most efficient record selection method?
by ELISHEVA (Prior) on Feb 13, 2009 at 13:48 UTC
    Ok. Here is a start on an algorithm. The sample code contains two fake files (represented by arrays of lines) to demonstrate how to maintain state as we pass from one file to the next. It also contains a very simple optimizer that can eliminate lines that add only one new value in favor of lines that add two or more new values at once.

    However, that optimization is not going to find the actual smallest sample if the number of selected fields is greater than 2. It is also possible that we might find a record with two new values and then later on a record with the same two values plus a new value in a third field. You will need to replace the simple optimizer with a more general purpose optimizer. Alas, coming up with a general solution is more work than I have time for.

    Best, beth

    Update: Fixed small bug

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://743591]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others goofing around in the Monastery: (6)
As of 2024-04-18 09:49 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found