I can't imagine code like that working in this application since the files can be quite large. Here is an outline of how the script works:
- Prepare several DB SELECT statement handles that will be used inside the loop to get useful information.
- Create tied hashes for caching that information so that I don't have to hit the database every time I need the id of some frequently used term.
- Create an IO object that will parse the file line by line and hand back information about the line in an OO way.
- Loop using the IO object's next_feature method. Do lots of bookkeeping using the tied hashes. Write output to several (about 10) files that will later be loaded into PostgreSQL using COPY FROM STDIN.
- Close open files, destroy DB statement handles, and load data into database.
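The cached-lookup idea in the second and third steps can be sketched like this (in Python rather than Perl, with the stdlib sqlite3 module standing in for the real database; the table, column names, and SELECT are invented for illustration). The point is the same as a tied hash: the database is hit at most once per term, and every later lookup is a plain hash access.

```python
import sqlite3

class IdCache(dict):
    """Memoize term -> id lookups so each term queries the DB at most once."""
    def __init__(self, conn, sql):
        super().__init__()
        self.conn = conn
        self.sql = sql  # a parameterized SELECT, prepared-statement style

    def __missing__(self, term):
        # Only reached on a cache miss: run the query, remember the result.
        row = self.conn.execute(self.sql, (term,)).fetchone()
        self[term] = row[0] if row else None
        return self[term]

# Toy database standing in for the real one (schema is made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE term (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO term (name) VALUES (?)", [("gene",), ("mRNA",)])

cache = IdCache(conn, "SELECT id FROM term WHERE name = ?")
print(cache["gene"])  # first access runs the SELECT
print(cache["gene"])  # second access is served from the in-memory cache
```

A tied hash in Perl gives the same interface with the caching logic hidden behind the tie, which is why the main loop can stay free of explicit query calls.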
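The main loop in the fourth and fifth steps might look roughly like the following sketch. The FeatureIO class, the next_feature interface, and the tab-delimited input layout are stand-ins for whatever the real parser returns; the one concrete detail is the output format, since PostgreSQL's COPY FROM STDIN in its default text format expects tab-separated columns with \N for NULL.

```python
import io

class FeatureIO:
    """Minimal parser that reads a file line by line and hands back one
    feature per next_feature() call (field layout invented for this sketch)."""
    def __init__(self, handle):
        self.handle = handle

    def next_feature(self):
        for line in self.handle:
            line = line.rstrip("\n")
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            name, ftype, start, end = line.split("\t")
            return {"name": name, "type": ftype, "start": int(start), "end": int(end)}
        return None  # end of file

def copy_row(*fields):
    """Format one row for COPY ... FROM STDIN: tab-separated, \\N for NULL."""
    return "\t".join(r"\N" if f is None else str(f) for f in fields) + "\n"

# StringIO objects stand in for the input file and one of the ~10 output files.
data = io.StringIO("# comment\nabc1\tgene\t100\t900\nabc1.t1\tmRNA\t100\t900\n")
out = io.StringIO()
parser = FeatureIO(data)
while (feat := parser.next_feature()) is not None:
    out.write(copy_row(feat["name"], feat["type"], feat["start"], feat["end"]))
print(out.getvalue(), end="")
```

Because only one feature is held in memory at a time, this style works on files far too large to slurp, which is the constraint mentioned at the top.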
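For the final load step, the only runnable part shown here is building the COPY statement itself; the driver call in the comment is a hedged sketch (psycopg2's copy_expert is one real way to stream a file to COPY in Python, but the connection details and table names are assumptions, not part of the original script).

```python
def copy_sql(table, columns):
    """Build a COPY ... FROM STDIN statement for one of the output files."""
    return "COPY {} ({}) FROM STDIN".format(table, ", ".join(columns))

print(copy_sql("feature", ["name", "type", "fstart", "fend"]))

# Sketch of the load itself (not run here; requires a live PostgreSQL
# connection, e.g. via psycopg2):
#
#   with open("feature.copy") as fh:
#       cur.copy_expert(copy_sql("feature", ["name", "type", "fstart", "fend"]), fh)
#   conn.commit()
```

Loading each file with a single COPY, after all statement handles are finished and files are closed, keeps the bulk load as one fast server-side operation instead of millions of INSERTs.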