When in doubt, sort

by Thelonius (Priest)
on Mar 23, 2002

in reply to Up for Critique

A generally good strategy for a bulk load is to parse the data and write it out to one file per table. Then sort each file on the field(s) that you are doing your SELECT on.
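
To make the parse, split and sort step concrete, here is a minimal sketch in Python. The function name, the tab-delimited output, the `.tsv` suffix and the `key_index` parameter are all assumptions for illustration, not anything from the original code. It also sorts in memory, which assumes each table's rows fit in RAM; for a large load you would more likely write the files unsorted and hand them to an external sort.

```python
import csv
import os
from collections import defaultdict


def write_sorted_table_files(records, out_dir, key_index=0):
    """Write one tab-delimited file per table, sorted on the lookup field.

    records:   iterable of (table_name, row) pairs, where row is a list of
               strings produced by whatever parser you already have.
    key_index: position of the field the SELECT would have used
               (hypothetical; adjust per table as needed).
    Returns a dict mapping each table name to the path that was written.
    """
    # Group the parsed rows by destination table.
    by_table = defaultdict(list)
    for table, row in records:
        by_table[table].append(row)

    paths = {}
    for table, rows in by_table.items():
        # Sort on the lookup field so equal keys end up next to each other.
        rows.sort(key=lambda row: row[key_index])
        path = os.path.join(out_dir, table + ".tsv")
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
            writer.writerows(rows)
        paths[table] = path
    return paths
```

Each resulting file can then be fed to the database's bulk loader, or walked row by row in key order.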

If the database is empty before your run (it's not clear from your description), then you don't need to create the indexes until the load is done. After the sort, any duplicates will be in consecutive records, so you can detect them in a single pass over each file.
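
Because equal keys are adjacent once the rows are sorted, duplicate detection is just a matter of comparing neighbours. A rough sketch of that pass in Python follows; the function name and the `key_index` parameter are made up for the example, and it assumes the rows are already sorted on that field.

```python
from itertools import groupby


def find_duplicate_keys(sorted_rows, key_index=0):
    """Return (key, count) for every key that appears more than once.

    sorted_rows must already be sorted on row[key_index]; groupby only
    merges runs of adjacent equal keys, which is exactly why the sort
    has to come first.
    """
    duplicates = []
    for key, group in groupby(sorted_rows, key=lambda row: row[key_index]):
        count = sum(1 for _ in group)
        if count > 1:
            duplicates.append((key, count))
    return duplicates
```

For example, calling it on rows sorted by name would report each name that occurs in two or more consecutive records, and you can decide there whether to skip, merge or log them before loading.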

